Thanks for bringing this to the list, Ekaterina! It’s worth noting that the two proposals don’t have to be in conflict: we could offer two template yaml files with the parameters grouped differently, and let users decide for themselves.

The proposals primarily define parameter names differently, with my proposal going by kind->place, and the other proposal maintaining (mostly) the existing name form (which is a bit more like place->kind). While the example yaml groups by kind, you can convert nested definitions into a ‘dot’ form (e.g. limits.concurrency.reads) for use in a different grouping.

One advantage of grouping parameters together is that it aids maintaining coherency of naming between systems, and also potentially permits a more succinct config file and better discovery. But it’s far from a silver bullet, as value judgements have to be made about where the grouping lines are.

I’m sure anything we settle on will be a huge improvement over the status quo, however.

From: Ekaterina Dimitrova <e.dimitr...@gmail.com>
Date: Thursday, 2 September 2021 at 16:32
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: [DISCUSS] CASSANDRA-15234

Hi team,

I would like to bring to the attention of the community CASSANDRA-15234, standardise config and JVM parameters. This is work we discussed back in Summer 2020 just before our first 4.0 Beta release. During the discussion we figured out that there is more than one option to do the job and not enough time to get user feedback and finish it so this was delayed post-4.0. And here I am, bringing it back to the table.

This work’s goal is:
- To standardize naming - that we did by agreeing to the form noun_verb
- Provision of values with units while maintaining backward compatibility.

Those two parts are more or less already done. More interesting is the third part - reorganizing the cassandra.yaml file.

My personal approach was to split it into sections, done here <https://github.com/ekaterinadimitrova2/cassandra/blob/b4eebe080835da79d032f9314262c268b71172a8/conf/cassandra.yaml> .

Another proposal is done by Benedict; grouping the config parameters. To make it clearer, he created a yaml <https://github.com/belliottsmith/cassandra/blob/5f80d1c0d38873b7a27dc137656d8b81f8e6bbd7/conf/cassandra_nocomment.yaml> with comments mostly stripped. In his version, there are basic settings for network, disk etc all grouped together, followed by operator tuneables mostly under limits within which we now have throughput, concurrency, capacity. This leads to settings for some features being kept separate (most notably for caching), but helps the operator understand what they have to play with for controlling resource consumption.

I am interested to hear what people think about the two options or if anyone has another idea to share, open discussion.

Thank you,
Ekaterina
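
For reference, here is a minimal sketch of the nested-to-‘dot’ conversion mentioned in the reply at the top of this thread. It is written in Python purely for illustration and does not reflect how Cassandra actually loads its configuration. The helpers `flatten` and `unflatten`, the `writes` key, and all values are hypothetical; only the name limits.concurrency.reads comes from the thread.

```python
def flatten(nested, prefix=""):
    """Turn a nested mapping into a flat {dotted.name: value} mapping."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-groups, extending the dotted prefix.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def unflatten(flat):
    """Rebuild the nested mapping from dotted names."""
    nested = {}
    for name, value in flat.items():
        *parents, leaf = name.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Hypothetical settings grouped by kind (values are made up).
grouped_by_kind = {"limits": {"concurrency": {"reads": 32, "writes": 32}}}

dotted = flatten(grouped_by_kind)
for name, value in dotted.items():
    print(f"{name}: {value}")
```

Because each dotted name carries its full path, the same settings could be listed in whatever order or grouping a given template file prefers, and `unflatten` recovers the nested form.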