+1 to Scott’s comments. Once you expose those YAML config params outside of
a single node which many of us do, this becomes an RCE attack vector.
Something more structured as Scott proposes, similar to snapshots, would be
preferred. Would recommend a CEP.

Jordan

On Fri, Aug 30, 2024 at 20:58 C. Scott Andreas <csco...@icloud.com> wrote:

> I appreciate this report and would love to work toward the direction it
> recommends.
>
> I’m also familiar with past concerns raised by others with our FQL
> configuration parameters that allow passing shell commands for FQL segment
> archival.
>
> We bias toward ensuring an MBean exists for dynamic modification of yaml
> parameters. When we couple dynamic configuration updates and arbitrary
> shell command execution, we introduce vectors for arbitrary code execution,
> data exfiltration, and data compromise that have a lower bar to achieve
> than local file write.
>
> I agree that we should work toward removing operator-provided shell
> commands in yaml.
>
> For concerns like archival, these seem like areas that Cassandra could
> easily accomplish itself without shelling out to gzip/zstd/lz4-compress a
> file. Introducing a new config structure that declares an archival format,
> accompanying implementations for compression/decompression, and deprecation
> of the prior approach sounds both reasonable and desirable to me.
>
> – Scott
>
> —
> Mobile
>
> On Aug 30, 2024, at 10:25 PM, Bowen Song via dev <dev@cassandra.apache.org>
> wrote:
>
> 
>
> I'm not sure what is the concern here. Is it a malicious user exploiting
> this? Or human error with unintended consequences?
>
> For malicious user, in order to exploit this, an attacker needs to be able
> to write to the config file. The config file on Linux by default is owned
> by the root user and has the -rw-r--r-- permission, that means the attacker
> must either gain root access to the system or has the ability to write
> arbitrary file on the filesystem. With either of these permission, they can
> already do almost anything they want (e.g. modify a SUID executable file).
> They wouldn't even need to exploit this to run a script or dangerous
> command. So this sounds like a non-issue to me, at least on Linux-based
> OSes.
>
> For human error, if the operator puts "rm -rf" in it, the software should
> treat it as the operator actually wants to do that. I personally don't like
> software attempting to outsmart human, which often ends up interfering with
> legitimate use cases. The best thing a software can do is log it, so
> there's some traceability if and when things go wrong.
>
> So, IMO, there's nothing wrong with the implementation in Cassandra.
>
>
> On 30/08/2024 17:13, guo Maxwell wrote:
>
>     Commitlog has the ability of archive  log file, see
> CommitLogArchiver.java
> <https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java>,
> we can achieve the purpose of archive and restore commitlog by
> configuring archive_command and restore_command in
> commitlog_archiving.properties
> <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28>
> .The archive_command and restore_command can be some linux/unix shell
> command.  However, I found that the shell command can actually be filled
> with any script, even if "*rm -rf"* .I have tested this situation and it
> finally succeeded with my test file being deleted.
>
>     Personally, I think it is a dangerous behavior, because if there are
> no system-level restrictions and users are allowed to do anything in these
> shell commands. So here I want to discuss with you whether it is
> necessary to impose any restrictions on use, or do we need a new way of
> archiving/restoring commitlog?
>
> Of course, before that, I would also like to ask, how many people are
> using archive and restore of commitlog? It seems that the commitlog archive
> code has not been updated for a long time.
>
> I have two ideas.
> One is to make some restrictions on the command context based on the
> existing usage methods, such as strictly only allowing the current cp/mv/ln
> %path to %name.Other redundant strings in the command are not allowed.
> Another one , As I roughly investigated the archive of mysql and pg. They
> do not give users too much space (I am talking about letting users define
> their own archiving command ), and archive directly to a designated
> location. For us, I feel that we can refer to c * Incremental backup of
> sstable,  add a hardlink to the commitlog to the specified location, but
> this place may modify the original configuration method, such as setting
> the archive location and restoring location of the node through nodetool
> and deprecate the  commitlog_archiving.properties
> <https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties#L28>
>  configuration.
>
> I am just putting forward some views  here, and looking forward to your
> feedback. 😀
>
>

Reply via email to