Re: 【DISCUSS】The configuration of Commitlog archiving

guo Maxwell Tue, 08 Oct 2024 08:54:51 -0700

Thanks Jon

But should we at least add the ability to dynamically disable archiving that
Stefan mentioned before?


On Wed, Sep 25, 2024 at 01:16, Jon Haddad <[email protected]> wrote:

> I categorize this as Not a Problem.
>
> So far the two main justifications have been operator error and a fairly
> convoluted series of steps to an already compromised database.  I don't
> view either of them as a reason to inconvenience users.
>
> If someone wants to avoid the shell command, what's wrong with CDC?
>
> Jon
>
>
>
>
> On Tue, Sep 24, 2024 at 9:21 AM guo Maxwell <[email protected]> wrote:
>
>> Hello, are there any new updates? 🤔
>>
>> On Wed, Sep 18, 2024 at 4:06 PM guo Maxwell <[email protected]> wrote:
>>
>>> Do you have any new updates on this DISCUSS thread?
>>>
>>> - The reason this pattern is popular is it allows extension of
>>> functionality ahead of the database. Some people copy to a NAS/SAN. Some
>>> people copy to S3. Some people copy to their own object storage that isn’t
>>> s3 compatible. “Compress and move” is super limiting, because “move” varies
>>> remarkably between environments.
>>>
>>> Yes, this approach is indeed very flexible, but would it be more
>>> appropriate to decouple archiving to heterogeneous storage and leave that
>>> to other systems, so that Cassandra itself only does compression and
>>> copying (file linking, like sstable incremental backup)?
>>>
>>>
>>> On Thu, Sep 5, 2024 at 04:18, Štefan Miklošovič <[email protected]> wrote:
>>>
>>>>
>>>> On Wed, Sep 4, 2024 at 8:34 PM Jon Haddad <[email protected]> wrote:
>>>>
>>>>> I thought about this a bit over the last few days, and there's
>>>>> actually quite a few problems present that would need to be addressed.
>>>>>
>>>>> *Insecure JMX*
>>>>>
>>>>> First off - if someone has access to JMX, the entire system is already
>>>>> compromised.  A bad actor can mess with the cluster topology, truncate
>>>>> tables, and do a ton of other disruptive stuff.  But if we're going to go
>>>>> down this path I think we should apply your logic consistently to avoid
>>>>> creating a "solution" that has the same "problem" as we do now.  I use
>>>>> quotes because I'm not entirely convinced the root cause of the problem is
>>>>> enabling some shell access, but I'll entertain it for the sake of the
>>>>> discussion.
>>>>>
>>>>> *Dynamic Configuration and Shell Scripts*
>>>>>
>>>>> Let's pretend that somehow an open JMX isn't already a *massive*
>>>>> security flaw by itself.  Once an attacker has control of a system, the
>>>>> next phase of the attack relies on them dynamically changing the
>>>>> configuration to point to a different shell script, or to execute
>>>>> arbitrary shell scripts.
>>>>>
>>>>> I agree with the general idea that we don't want this - so in my mind
>>>>> the necessary solution here is to disable the ability to change the commit
>>>>> log archiving behavior at runtime.
>>>>>
>>>>> The idea that commit log archiving (and many other config settings)
>>>>> would be dynamically configurable is a massive security flaw that should
>>>>> be disallowed.  If you want to take this a step further and claim there's a
>>>>> flaw with shell scripts in general, I'll even entertain that for a minute,
>>>>> but we need to examine if the proposed solution of moving code to Java
>>>>> actually solves the problem.
>>>>>
>>>>> *Dynamic Configuration and Java Code*
>>>>>
>>>>> Let's say we've removed the ability to use shell scripts, and we've
>>>>> gotten people to rewrite their shell code with java code, but we've left
>>>>> the dynamic configuration in.  Going back to my original email, I mentioned
>>>>> copying commit logs off the node and into an object store.  If someone is
>>>>> able to change the parameter dynamically at runtime, they could just as
>>>>> easily point to a public S3 bucket and commit logs would be archived there
>>>>> which is just as bad as the shell version.  So if we are to convert this
>>>>> functionality to Java, we should also be making best practice
>>>>> recommendations on what users should and should not do.
>>>>>
>>>>
>>>> I think what you meant here is that if we allowed people to provide a
>>>> pluggable way how stuff is copied over and they would code it up, put that
>>>> JAR on the class path, Cassandra (re)started etc, then someone might
>>>> reconfigure this custom solution in runtime? Yeah, we do not want this. We
>>>> can make it pluggable, but not reconfigurable. To have it pluggable and not
>>>> reconfigurable, then to replace it with something else, an attacker would
>>>> basically need to restart Cassandra with a rogue JAR on the class path. In
>>>> order to do that, I think that at this point it would be beyond any
>>>> salvation and the system is completely compromised anyway.
>>>>
>>>>
>>>>>
>>>>>
>>>>> *Apply All Operational Best Practices*
>>>>>
>>>>> There's been a variety of examples of how a user can further
>>>>> compromise a machine once they have JMX, working in tandem with shell
>>>>> scripts, but I hope at this point you can see that the issue is
>>>>> fundamentally more complex than simply disallowing shell scripts.  The
>>>>> issue is present in the Java examples as well, and is strongly tied to the
>>>>> issue of dynamic config.  If we're to design this the "right" way, I think
>>>>> we'd want these properties:
>>>>>
>>>>> * Commit log archiving should only have the ability to compress and
>>>>> move files to a staging location
>>>>> * Once the files are moved to the staging location, the file should be
>>>>> moved somewhere else by a script NOT run as the C* user.
>>>>>
>>>>> * The commit log archive configuration should not be dynamically
>>>>> updatable, nor should any config affecting directories
>>>>>
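
The "compress and move to a staging location" property listed above can be sketched roughly as follows. This is only an illustration in Python, not Cassandra code: the function name `archive_to_staging`, the segment file name, and the directory layout are all assumptions, and the sketch leaves the original segment in place rather than deleting it. Shipping the staged file elsewhere would be the job of a separate process not running as the C* user.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def archive_to_staging(segment: Path, staging_dir: Path) -> Path:
    """Gzip one commit log segment into staging_dir and return the new path.

    Hypothetical helper for illustration only.
    """
    staging_dir.mkdir(parents=True, exist_ok=True)
    final_path = staging_dir / (segment.name + ".gz")
    tmp_path = staging_dir / (segment.name + ".gz.tmp")
    # Write under a temporary name first so that a separate uploader
    # process never picks up a partially written file.
    with segment.open("rb") as src, gzip.open(tmp_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Rename into place once the compressed file is complete.
    tmp_path.replace(final_path)
    return final_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        segment = Path(root) / "CommitLog-7-1.log"  # hypothetical segment name
        segment.write_bytes(b"example mutation bytes")
        archived = archive_to_staging(segment, Path(root) / "staging")
        print(archived.name)
```
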
>>>>
>>>> This would essentially copy the logic we have for snapshots as Jordan
>>>> mentioned. I do not mind having it like that. It is a good question for
>>>> what exactly we need to have it reconfigurable. Why is it like that? People
>>>> do not want to restart a whole cluster consisting of 100 nodes when the
>>>> destination of the archived commit logs changed? How often is this
>>>> happening that we need to expose ourselves to the problems related to that?
>>>>
>>>>
>>>>>
>>>>> Moving the shell configuration to Java code is a half measure that's
>>>>> only solving a tiny problem in a massive chain of events and security
>>>>> holes.
>>>>>
>>>>> Jon
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Sep 3, 2024 at 4:15 AM Štefan Miklošovič <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Scott is right that this also comes from us having an MBean method
>>>>>> which allows commands to be changed at runtime. The solution to that was
>>>>>> to prevent them from being changed dynamically by means of a
>>>>>> configuration property, which is actually set to false by default, so
>>>>>> FQL archiving is only ever possible if an operator explicitly enables it.
>>>>>>
>>>>>> However, even if commands were not modifiable at runtime via JMX and
>>>>>> an operator had to enable command execution explicitly, that
>>>>>> still would not make it 100% secure, because an attacker does not need to
>>>>>> change / modify cassandra.yaml, where the script to execute is configured,
>>>>>> just the content of the script which is executed.
>>>>>>
>>>>>> So, introducing a property similar to the one for FQL would in
>>>>>> this context mean that it would be used for disabling commitlog
>>>>>> archiving / restoring altogether, while for FQL it would still do its
>>>>>> thing, it would just not archive it. The whole of commitlog archiving /
>>>>>> restoring is now based on commands being executed, so disabling command
>>>>>> execution practically means we disable this whole feature as such.
>>>>>>
>>>>>> We could indeed make it flat out impossible to execute anything but
>>>>>> these scripts might contain some custom logic, like uploading to various
>>>>>> cloud storages (AWS, Azure, GCP or something completely custom), people
>>>>>> have their own "storage solutions" like remove the old logs when new come
>>>>>> in etc. so by disabling this altogether we would make it impossible and
>>>>>> users would need to accommodate that which would break their existing
>>>>>> solutions.
>>>>>>
>>>>>> What I find confusing is that commitlog_archiving.properties is used
>>>>>> both for restoration AS WELL AS for archiving. If we're ever going to
>>>>>> change how this works, I think that it should be somehow logically split
>>>>>> into archiving and restoring parts.
>>>>>>
>>>>>> So, we might introduce a property in cassandra.yaml to disable
>>>>>> commitlog_archiving.properties altogether and we might deprecate
>>>>>> commitlog_archiving.properties way of doing this (still keep it there for
>>>>>> legacy reasons), add a new cassandra.yaml configuration section for that
>>>>>> and there make the archiving and the restoration pluggable. By default we
>>>>>> would provide "cp $from $to" implemented by Cassandra itself without any
>>>>>> process invocation. Then we might eventually drop
>>>>>> commitlog_archiving.properties but if the maintenance of that is cheap I
>>>>>> would just keep it, we would just flip the switch so a new way of doing
>>>>>> that would be preferable and the old way of doing it (via properties)
>>>>>> would need to be explicitly enabled.
>>>>>>
>>>>>> On Tue, Sep 3, 2024 at 11:55 AM guo Maxwell <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Thank you very much for everyone's replies, they are all very
>>>>>>> valuable feedback to me.
>>>>>>>
>>>>>>>> I don't really understand what benefit adding restrictions would
>>>>>>>> serve.  Would it be hard coded in C* itself, or configurable?  If it's
>>>>>>>> configurable, then are we just making users enter their commands twice?
>>>>>>>> This is meant to be used by an operator, so who's actually protected
>>>>>>>> by an allow-list?
>>>>>>>>
>>>>>>>
>>>>>>> I agree with you too, so I prefer idea 2: implement commitlog
>>>>>>> archiving in C* itself (not archiving via a user-defined shell command),
>>>>>>> and deprecate the commitlog_archiving.properties
>>>>>>> <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28>
>>>>>>> configuration through which we currently set the commitlog archiving
>>>>>>> properties. This view may be similar to Scott's.
>>>>>>>
>>>>>>>> If I want to use rclone or aws-cli to archive my commit logs that's
>>>>>>>> my prerogative.
>>>>>>>>
>>>>>>>
>>>>>>> Yes, calling aws-cli from the shell command is very flexible. But as
>>>>>>> far as I know cassandra-medusa can also do this, and to me letting other
>>>>>>> tools do this work may be better; for example, we can upload more than
>>>>>>> one log (if the logs are not big) in a single RPC to improve write
>>>>>>> throughput.
>>>>>>>
>>>>>>> I think we can divide this big task into several subtasks:
>>>>>>>
>>>>>>>    - Add the feature that Stefan mentioned before (CASSANDRA-18550
>>>>>>>    <https://issues.apache.org/jira/browse/CASSANDRA-18550>) for commitlog
>>>>>>>    archiving in 5.x, and possibly deprecate the original
>>>>>>>    commitlog_archiving.properties.
>>>>>>>    - Add a built-in archiving feature for Cassandra (commitlog / query
>>>>>>>    log / or even sstable) in the long run, such as in 6.0.
>>>>>>>
>>>>>>> I can prepare a CEP if necessary. Looking forward to your feedback.
>>>>>>>
>>>>>>>
>>>>>>> We can divide this task into several subtasks and complete them step
>>>>>>> by step
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Sep 3, 2024 at 00:55, Jordan West <[email protected]> wrote:
>>>>>>>
>>>>>>>> +1 to Scott’s comments. Once you expose those YAML config params
>>>>>>>> outside of a single node which many of us do, this becomes an RCE
>>>>>>>> attack vector. Something more structured as Scott proposes, similar to
>>>>>>>> snapshots, would be preferred. Would recommend a CEP.
>>>>>>>>
>>>>>>>> Jordan
>>>>>>>>
>>>>>>>> On Fri, Aug 30, 2024 at 20:58 C. Scott Andreas <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I appreciate this report and would love to work toward the
>>>>>>>>> direction it recommends.
>>>>>>>>>
>>>>>>>>> I’m also familiar with past concerns raised by others with our FQL
>>>>>>>>> configuration parameters that allow passing shell commands for FQL
>>>>>>>>> segment archival.
>>>>>>>>>
>>>>>>>>> We bias toward ensuring an MBean exists for dynamic modification
>>>>>>>>> of yaml parameters. When we couple dynamic configuration updates and
>>>>>>>>> arbitrary shell command execution, we introduce vectors for arbitrary
>>>>>>>>> code execution, data exfiltration, and data compromise that have a
>>>>>>>>> lower bar to achieve than local file write.
>>>>>>>>>
>>>>>>>>> I agree that we should work toward removing operator-provided
>>>>>>>>> shell commands in yaml.
>>>>>>>>>
>>>>>>>>> For concerns like archival, these seem like areas that Cassandra
>>>>>>>>> could easily accomplish itself without shelling out to
>>>>>>>>> gzip/zstd/lz4-compress a file. Introducing a new config structure that
>>>>>>>>> declares an archival format, accompanying implementations for
>>>>>>>>> compression/decompression, and deprecation of the prior approach
>>>>>>>>> sounds both reasonable and desirable to me.
>>>>>>>>>
>>>>>>>>> – Scott
>>>>>>>>>
>>>>>>>>> —
>>>>>>>>> Mobile
>>>>>>>>>
>>>>>>>>> On Aug 30, 2024, at 10:25 PM, Bowen Song via dev <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> 
>>>>>>>>>
>>>>>>>>> I'm not sure what is the concern here. Is it a malicious user
>>>>>>>>> exploiting this? Or human error with unintended consequences?
>>>>>>>>>
>>>>>>>>> For a malicious user, in order to exploit this, an attacker needs to
>>>>>>>>> be able to write to the config file. The config file on Linux by
>>>>>>>>> default is owned by the root user and has the -rw-r--r-- permission,
>>>>>>>>> which means the attacker must either gain root access to the system or
>>>>>>>>> have the ability to write arbitrary files on the filesystem. With
>>>>>>>>> either of these permissions, they can already do almost anything they
>>>>>>>>> want (e.g. modify a SUID executable file). They wouldn't even need to
>>>>>>>>> exploit this to run a script or dangerous command. So this sounds like
>>>>>>>>> a non-issue to me, at least on Linux-based OSes.
>>>>>>>>>
>>>>>>>>> For human error, if the operator puts "rm -rf" in it, the software
>>>>>>>>> should treat it as the operator actually wanting to do that. I
>>>>>>>>> personally don't like software attempting to outsmart humans, which
>>>>>>>>> often ends up interfering with legitimate use cases. The best thing
>>>>>>>>> software can do is log it, so there's some traceability if and when
>>>>>>>>> things go wrong.
>>>>>>>>>
>>>>>>>>> So, IMO, there's nothing wrong with the implementation in
>>>>>>>>> Cassandra.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 30/08/2024 17:13, guo Maxwell wrote:
>>>>>>>>>
>>>>>>>>>     Commitlog has the ability to archive log files, see
>>>>>>>>> CommitLogArchiver.java
>>>>>>>>> <https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java>.
>>>>>>>>> We can archive and restore the commitlog by
>>>>>>>>> configuring archive_command and restore_command in
>>>>>>>>> commitlog_archiving.properties
>>>>>>>>> <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28>.
>>>>>>>>> The archive_command and restore_command can be any Linux/Unix
>>>>>>>>> shell command.  However, I found that the shell command can
>>>>>>>>> actually be filled with any script, even "*rm -rf*". I have
>>>>>>>>> tested this, and it succeeded, with my test file being
>>>>>>>>> deleted.
>>>>>>>>>
>>>>>>>>>     Personally, I think this is dangerous behavior, because
>>>>>>>>> there are no system-level restrictions and users are allowed to do
>>>>>>>>> anything in these shell commands. So here I want to discuss with you:
>>>>>>>>> is it necessary to impose any restrictions on use, or do we need a
>>>>>>>>> new way of archiving/restoring the commitlog?
>>>>>>>>>
>>>>>>>>> Of course, before that, I would also like to ask, how many people
>>>>>>>>> are using archive and restore of commitlog? It seems that the
>>>>>>>>> commitlog archive code has not been updated for a long time.
>>>>>>>>>
>>>>>>>>> I have two ideas.
>>>>>>>>> One is to restrict the command content based on
>>>>>>>>> the existing usage, such as strictly only allowing the current
>>>>>>>>> cp/mv/ln %path to %name. Other redundant strings in the command are
>>>>>>>>> not allowed.
>>>>>>>>> The other: as far as I have roughly investigated, the archiving of
>>>>>>>>> MySQL and pg does not give users too much room (I am talking about
>>>>>>>>> letting users define their own archiving command), and archives
>>>>>>>>> directly to a designated location. For us, I feel that we can follow
>>>>>>>>> C*'s incremental backup of sstables and add a hardlink to the
>>>>>>>>> commitlog at the specified location, but this may change the original
>>>>>>>>> configuration method, such as setting the archive location and
>>>>>>>>> restore location of the node through nodetool, and deprecate the
>>>>>>>>> commitlog_archiving.properties
>>>>>>>>> <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28>
>>>>>>>>> configuration.
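
The two ideas above might look roughly like this. It is a minimal sketch in Python for illustration only: the regular expression, the function names and the directory layout are assumptions, not an existing Cassandra feature. `%path` and `%name` are the placeholders used in commitlog_archiving.properties.

```python
import os
import re
import tempfile
from pathlib import Path

# Idea 1: accept only "<cp|mv|ln> %path <absolute dir>/%name" and nothing else.
# The pattern is illustrative; a real allow-list would need more care
# (for example, it does not reject ".." inside the directory part).
ALLOWED_COMMAND = re.compile(r"(?:/bin/)?(?:cp|mv|ln) %path /[\w./-]+/%name")


def is_allowed_archive_command(command: str) -> bool:
    """Return True only if the whole command matches the allow-list pattern."""
    return ALLOWED_COMMAND.fullmatch(command.strip()) is not None


# Idea 2: no command at all; hard-link the segment into an archive directory,
# in the spirit of sstable incremental backups.
def hardlink_archive(segment: Path, archive_dir: Path) -> Path:
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / segment.name
    os.link(segment, target)  # no data copied; both must be on one filesystem
    return target


if __name__ == "__main__":
    print(is_allowed_archive_command("/bin/ln %path /backup/%name"))
    print(is_allowed_archive_command("cp %path /backup/%name; rm -rf /"))
    with tempfile.TemporaryDirectory() as root:
        segment = Path(root) / "CommitLog-7-1.log"  # hypothetical segment name
        segment.write_bytes(b"example mutation bytes")
        linked = hardlink_archive(segment, Path(root) / "archive")
        print(os.path.samefile(segment, linked))
```
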
>>>>>>>>>
>>>>>>>>> I am just putting forward some views  here, and looking forward to
>>>>>>>>> your feedback. 😀
>>>>>>>>>
>>>>>>>>>
