Hi Joel,

Thanks for the meaningful feedback, see follow-up below:
> I think I'm still trying to form a mental model of where Sidecar's
> responsibilities start and end. It sounds like for this proposal, its scope
> is basically applying the "current" configuration to the node(s), but the
> configuration management itself needs to be done above Sidecar.

Yes, the main goal of this CEP is to make Sidecar aware of the node
configuration, so we can start exploring things like rolling config updates
and config drift detection within Sidecar. We still need an external
orchestrator to bootstrap the config, though.

> With the introduction of hashes, you may have a path towards hardening this
> by enabling the node to validate the integrity of its configuration at
> runtime, which will be more useful as Sidecar makes automated deployments
> of configuration easier.

I agree! This CEP is a step in that direction: it allows operators to
express the desired config, which we can then compare against the running
config from the `system_views.settings` virtual table to detect and report
any discrepancies.

> Is there a write-up somewhere of how Sidecar auth works end-to-end?

I don't think there is a formal doc available, but you can find more
information in the JIRA issue and the commit that introduced it:

- https://issues.apache.org/jira/browse/CASSSIDECAR-161
- https://github.com/apache/cassandra-sidecar/commit/5a19e3448038fa4b2e9f497ab94dbbe911f44c29

> Not directly related: in that section it says "Sidecar does not depend on
> Cassandra being running", but in the auth section it says "Permissions are
> resolved from Sidecar's sidecar_internal.role_permissions_v1 tables". Can
> Sidecar access tables on the node directly, without going through the main
> Cassandra process?

That's a good point. When the managed node is offline, it's possible to
access the authorization tables through other nodes by specifying multiple
CQL contact points in sidecar.yml.
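For illustration, a sidecar.yml fragment along these lines would let Sidecar fall back to peer nodes for its CQL session when the local node is down. The key names below (driver_parameters, contact_points) are an assumption on my part and may differ between Sidecar versions, so please treat this as a sketch and check the sidecar.yaml template shipped with your release for the exact schema:

```yaml
# Illustrative sidecar.yml fragment -- key names may vary by Sidecar version.
driver_parameters:
  contact_points:
    - "10.0.0.1:9042"  # the locally managed node
    - "10.0.0.2:9042"  # peer nodes that can still serve the
    - "10.0.0.3:9042"  # authorization tables if 10.0.0.1 is offline
```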
Alternatively, it's possible to set up admin_identities in sidecar.yml, allowing admins to perform config changes in the event that the auth tables are unavailable. I have updated the authorization section with a note about this.

Cheers,

Paulo

On Wed, Mar 25, 2026 at 6:16 PM Joel Shepherd <[email protected]> wrote:

> Thanks, Paulo - I think I'm still trying to form a mental model of where
> Sidecar's responsibilities start and end. It sounds like for this proposal,
> its scope is basically applying the "current" configuration to the
> node(s), but the configuration management itself needs to be done above
> Sidecar. Makes sense.
>
> A couple questions below for my own learning, and a response as well...
> On 3/18/2026 3:14 PM, Paulo Motta wrote:
>
> Thanks for the feedback Joel! See follow-up below:
>
>
> * Authorization - What are the authorization controls.
>
> Good call! This will use Sidecar's existing authorization mechanism per
> HTTP endpoint. Two new permissions will be added: CONFIGURATION:READ and
> CONFIGURATION:MODIFY to control reading or updating configs. I've updated
> the doc with a new section about authorization.
>
> Is there a write-up somewhere of how Sidecar auth works end-to-end?
>
>
>
> * Integrity - I see you're using hashes for conflict detection. Have you
> considered using them as integrity checks as well: e.g., to guarantee
> the configuration deployed for the node/instance to load at runtime is
> the same configuration computed by the configuration manager (I think
> that's the right component)?
>
> The deployed configuration is guaranteed to be the same configuration
> computed by the configuration manager, since the runtime configuration is
> always refreshed during instance startup. If the runtime configuration is
> corrupted it would be overwritten by the recomputed config. Let me know if
> this covers the scenario you have in mind or if I missed something.
>
> I guess I was thinking a little beyond the scope of this change, to: how
> does the node know that the configuration hasn't been altered from what
> it's intended to be, regardless of whether that's through cosmic ray
> flipping a bit, or a misbehaving file system corrupting bytes, or manual
> modification? With the introduction of hashes, you may have a path towards
> hardening this by enabling the node to validate the integrity of its
> configuration at runtime, which will be more useful as Sidecar makes
> automated deployments of configuration easier. Color me paranoid.
>
> But, I acknowledge this is outside the scope of this CEP.
>
>
> * Rollback/forward - If I push a bad configuration change, how do I as
> the administrator respond to that?
>
> Change tracking is explicitly not a goal of this CEP to keep the scope
> limited. When a bad configuration is pushed, the operator would need to
> manually revert by submitting another PATCH request undoing the bad
> configuration. An external RCS would need to be used to keep track of
> config history if needed. I've added a new future work entry to support
> change tracking natively. I've also added an Operational Guide section with
> an overview of how this is expected to be used. Let me know if this makes
> sense.
>
> Thanks so much for doing that: it's really helpful to talk about how the
> user will work with the feature up-front.
>
> Not directly related: in that section it says "Sidecar does not depend on
> Cassandra being running", but in the auth section it says "Permissions are
> resolved from Sidecar's sidecar_internal.role_permissions_v1 tables". Can
> Sidecar access tables on the node directly, without going through the main
> Cassandra process?
>
> Thanks again -- Joel.
>
>
>
> On Tue, 17 Mar 2026 at 17:04 Joel Shepherd <[email protected]> wrote:
>
>> Hi Paulo - Interesting CEP, and potentially very useful: thanks!
>>
>> I was wondering about several things as I was reading through it:
>>
>> * Authorization - Particularly for operations that mutate configuration
>> (either in the store or at run-time for the node). What are the
>> authorization controls.
>>
>> * Integrity - I see you're using hashes for conflict detection. Have you
>> considered using them as integrity checks as well: e.g., to guarantee the
>> configuration deployed for the node/instance to load at runtime is the same
>> configuration computed by the configuration manager (I think that's the
>> right component)? This would be a guard against bugs, network gremlins,
>> file system gremlins, etc., quietly corrupting the configuration that the
>> node will eventually read.
>>
>> * Visibility - As an 'administrator' how do I determine how much of my
>> cluster is running on the latest configuration, and which nodes
>> specifically aren't? Is it up to me to implement that monitoring?
>>
>> * Rollback/forward - If I push a bad configuration change, how do I as
>> the administrator respond to that? For example, is there an assumption
>> that I'll be managing my configuration in an RCS somewhere and will be
>> expected to quickly retrieve a known-good older revision from it and push
>> it through sidecar? It might be helpful to have a "user experience" section
>> in the CEP to describe how you envision users managing their cluster's
>> configuration through this tool: what they're responsible for, what the
>> tool is responsible for.
>>
>> Thanks -- Joel.
>> On 3/17/2026 9:32 AM, Paulo Motta wrote:
>>
>>
>> Hi everyone,
>>
>> I'd like to propose CEP-62: Cassandra Configuration Management via
>> Sidecar for discussion by the community.
>>
>> CASSSIDECAR-266[1] introduced Cassandra process lifecycle management
>> capabilities to Sidecar, giving operators the ability to start and stop
>> Cassandra instances programmatically.
>> However, Sidecar currently has no way
>> to manipulate the configuration files that those instances consume at
>> startup.
>>
>> Many Cassandra settings (memtable configuration, SSTable settings,
>> storage_compatibility_mode) cannot be modified at runtime via JMX/CQL and
>> must be set in cassandra.yaml or JVM options files, requiring a restart to
>> take effect. Managing these files manually or through custom tooling is
>> cumbersome and lacks a stable API.
>>
>> This CEP extends Sidecar's lifecycle management by adding configuration
>> management capabilities for persisted configuration artifacts. It
>> introduces a REST API for reading and updating cassandra.yaml and JVM
>> options, a pluggable ConfigurationProvider abstraction for integration with
>> centralized configuration systems (etcd, Consul, or custom backends), and
>> version-aware validation to prevent startup failures.
>>
>> This CEP also serves as a prerequisite for future Cassandra upgrades via
>> Sidecar. For example, upgrading from Cassandra 4 to Cassandra 5 requires
>> updating storage_compatibility_mode in cassandra.yaml. The configuration
>> management capabilities introduced here will enable Sidecar to orchestrate
>> such upgrades by updating configuration artifacts alongside binary version
>> changes.
>>
>> The CEP is linked here:
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-62%3A+Cassandra+Configuration+Management+via+Sidecar
>>
>> Looking forward to your feedback!
>>
>> Thanks,
>>
>> Paulo
>>
>> [1] - https://issues.apache.org/jira/browse/CASSSIDECAR-266
>>
>>
