Thanks for the info. Yes, the 'cassandra_instances' block is one of the confusing aspects for me, and the underlying APIs all take the instanceId, which strongly implies an intention to manage N instances with one sidecar. I've never heard of running more than one Cassandra instance on a single host! Now I'm curious about that topology, particularly on large compute.
I guess you could also NFS mount other instances' data folders, but that seems like a 'bad idea'. We'll stick with our 1:1 topology with a little more confidence, thanks again.

Cheers,
Carl

________________________________
From: Štefan Miklošovič <[email protected]>
Sent: 16 October 2025 21:42
To: [email protected] <[email protected]>
Subject: Re: Intended Sidecar / Cassandra Topology

I would add that I think it is possible to manage more than one Cassandra instance: if you look into sidecar.yaml, cassandra_instances is a list (1). Sidecar still has to have access to the local disk of each instance it manages. So the mapping is not necessarily 1:1, it can be 1:N, with the caveat that all instances have to run on the same machine.

Regards

(1) https://github.com/apache/cassandra-sidecar/blob/trunk/conf/sidecar.yaml#L22

On Thu, Oct 16, 2025 at 2:03 AM Dinesh Joshi <[email protected]> wrote:
>
> The Cassandra Sidecar, as it is designed today, should be run locally to the
> Cassandra node that it is managing. It _could_ be used as a separate service
> but obviously functionality that it relies on for local directory access will
> not work. It is not currently intended to be used separately. A 1:1 topology
> is the right approach for now.
>
> Thanks,
>
> Dinesh
>
> On Wed, Oct 15, 2025 at 4:42 PM Sandland, Carl via user
> <[email protected]> wrote:
>>
>> Hi,
>>
>> We are investigating the use of sidecar with cassandra to deliver bulk
>> read/write workflows via spark/cassandra-analytics. I'm relatively new to
>> cassandra devops, but have started reading some of the code (sidecar in
>> particular).
>>
>> I'm quite confused as to the intended topology of a large scale cassandra
>> fleet (say 200+ nodes) and the sidecar process itself. We've currently
>> assumed that sidecar MUST be local and have access to the sidecar
>> "cassandra_instances.storage_dir" folder in order to carry out its bulk
>> read/write functionality. But after reading stuff here (and on dev), it
>> seems like a single sidecar process could manage N(200) instances of
>> cassandra, even remotely from a separate machine or network. If someone can
>> clarify the intended topology of sidecar:cassandra-node or point me in the
>> right direction for review, it would be appreciated.
>>
>> I see that 'sidecar' can be discussed as meaning many different things
>> (modules?), and one thing I've been asked to do is restrict access to only
>> parts of sidecar to clients (as we are paid to manage stuff for them). Is
>> there any intention to make "my sidecar" configurable? I.e., what parts of
>> sidecar should be running on this node, without rebuilding from source?
>> I could see a new block of config that enables/disables parts of the
>> api. At the moment, we are moving forward with one sidecar to one cassandra,
>> and intend to put sidecar upstream from an nginx server to limit access.
>>
>> Any pointers/reading material appreciated.
>>
>> Cheers,
>> Carl
