Re: [DISCUSS] CASSANDRA-15234

2022-02-06 Thread Ekaterina Dimitrova
Hi everyone,

I have some good news and some news that is not so good, but not that bad
either. CASSANDRA-15234 was committed last night, so you will need to rebase
CCM, DTest and Trunk (if you use your own CCM and DTest branches). Before
running CI, push to GitHub in the order CCM -> DTest -> Trunk.
Unfortunately, we have new issues with CCM retagging and CircleCI (we solved
different but related ones before in CASSANDRA-16688
<https://issues.apache.org/jira/browse/CASSANDRA-16688>, and things were
working for some time). After retagging CCM, it seems CircleCI no longer picks
up the new tag, so if you want to use CircleCI for testing you will need to
add -e in requirements.txt:

-e git+https://github.com/riptano/ccm.git@cassandra-test#egg=ccm

The post-commit build in Jenkins looks fine; it picked up the new CCM tag.
I already have a ticket for this, CASSANDRA-17351
<https://issues.apache.org/jira/browse/CASSANDRA-17351>. If anyone has any
ideas on how to fix this permanently, they are all welcome.

About CASSANDRA-15234: the old and the new yaml and format should both work.
Besides making it smoother for our users to migrate to the new yaml whenever
they have time, this also means we did not have to update the parameters in
the DTest suite and can keep exercising the old ones (one more way of testing
backward compatibility, in addition to all the new tests added). BUT please
add any new tests using the new config so we can migrate in time.
You can find more details in the three primary classes DurationSpec,
DataStorageSpec and DataRateSpec (two of them are also extended; more
details in follow-up tickets and posts). Backward compatibility is provided
through annotations used in Config.java. There is a Converters enum to handle
the different cases for backward compatibility.

Another rule is to start adding any new config with only the smallest
supported units.
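
The idea of values that carry their unit, with old names still accepted, can
be sketched as follows. This is an illustration in Python only, not the actual
DurationSpec or Converters code; the setting names, the subset of units and
the function names are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Unit suffixes and their size in milliseconds (illustrative subset only).
_UNIT_TO_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}
_PATTERN = re.compile(r"^(\d+)(ms|s|m|h|d)$")


@dataclass(frozen=True)
class Duration:
    quantity: int
    unit: str

    def to_milliseconds(self) -> int:
        return self.quantity * _UNIT_TO_MS[self.unit]


def parse_duration(text: str) -> Duration:
    """Parse a new-style value such as '500ms' or '10s'."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"invalid duration: {text!r}")
    return Duration(int(match.group(1)), match.group(2))


# Hypothetical old name -> (new name, unit that the old name implied).
_LEGACY = {"example_timeout_in_ms": ("example_timeout", "ms")}


def load_setting(name: str, raw) -> Tuple[str, Duration]:
    """Accept an old-style bare number or a new-style value with a unit."""
    if name in _LEGACY:
        new_name, unit = _LEGACY[name]
        return new_name, Duration(int(raw), unit)
    return name, parse_duration(str(raw))
```

With this shape, an old entry such as example_timeout_in_ms: 200 and a new
entry such as example_timeout: 200ms load to the same value, which is the
kind of equivalence the backward compatibility layer is meant to provide.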

I will explain and add full details in the docs I am already working on as
part of CASSANDRA-17246
<https://issues.apache.org/jira/browse/CASSANDRA-17246>. I am also planning
two posts, one for the users and one for us, on how to use the new
framework/format and what to expect.

Enjoy the rest of your weekend, and apologies for the temporary CircleCI
inconvenience.

Best regards,
Ekaterina

On Mon, 25 Oct 2021 at 10:19, Ekaterina Dimitrova 
wrote:

> Thank you Benedict.
>
> Considering there were no objections I am closing the discussion and
> getting back to work on the ticket itself. Thank you all. Have a great week
> ahead.
>
> On Wed, 20 Oct 2021 at 18:06, [email protected] 
> wrote:
>
>> Thanks for moving this forwards Ekaterina.
>>
>> I think what we perhaps discovered is that there’s not really any
>> consensus about how to best do config files. I think in this situation it’s
>> best to defer to the one who’s actually putting in the time to _do_, so I
>> am more than happy to defer to your decisions.
>>
>> I’m sure everyone is looking forward to the improved consistency of this
>> work.
>>
>>
>> From: Ekaterina Dimitrova 
>> Date: Wednesday, 20 October 2021 at 22:27
>> To: [email protected] 
>> Subject: Re: [DISCUSS] CASSANDRA-15234
>> Hi everyone,
>>
>> I think it is time to summarize the discussion.
>>
>> First of all, thank you for all the valuable input, suggestions, concerns,
>> and comments!
>>
>> The things that I believe we all agree on:
>>
>>    - Simplicity for maintenance on our end - automation as much as possible
>>      so we don’t have to maintain more than one configuration file and our
>>      config is less prone to human errors while adding new features
>>    - Simplicity for our users - as simple and as unconfusing as possible,
>>      keeping in mind the users’ toolset
>>    - Simplicity for testing and verification of the different config file
>>      formats
>>
>>
>> It seems to me that most people want to see both proposed versions
>> committed (feel free to correct me if I am wrong), with a revision of the
>> default values and potentially with all parameters that are not really
>> mandatory to change commented out. Also, versions with stripped comments,
>> plus a way to maintain everything automatically, as much as possible.
>>
>> With that said it seems to me the current patch in CASSANDRA-15234 can be
>> committed after rebase and addressing any outstanding review comments. The
>> new version of cassandra.yaml, grouping the parameters can be added in a
>> new ticket by me or anyone with free cycles for that. It will require
>> additional work on the backward compatibility and the opportunity for
>> Cassandra to operate on all of the current versions but i

Re: [DISCUSS] CASSANDRA-15234

2021-10-25 Thread Ekaterina Dimitrova
Thank you Benedict.

Considering there were no objections I am closing the discussion and
getting back to work on the ticket itself. Thank you all. Have a great week
ahead.

On Wed, 20 Oct 2021 at 18:06, [email protected] 
wrote:

> Thanks for moving this forwards Ekaterina.
>
> I think what we perhaps discovered is that there’s not really any
> consensus about how to best do config files. I think in this situation it’s
> best to defer to the one who’s actually putting in the time to _do_, so I
> am more than happy to defer to your decisions.
>
> I’m sure everyone is looking forward to the improved consistency of this
> work.
>
>
> From: Ekaterina Dimitrova 
> Date: Wednesday, 20 October 2021 at 22:27
> To: [email protected] 
> Subject: Re: [DISCUSS] CASSANDRA-15234
> Hi everyone,
>
> I think it is time to summarize the discussion.
>
> First of all, thank you for all the valuable input, suggestions, concerns,
> and comments!
>
> The things that I believe we all agree on:
>
>    - Simplicity for maintenance on our end - automation as much as possible
>      so we don’t have to maintain more than one configuration file and our
>      config is less prone to human errors while adding new features
>    - Simplicity for our users - as simple and as unconfusing as possible,
>      keeping in mind the users’ toolset
>    - Simplicity for testing and verification of the different config file
>      formats
>
>
> It seems to me that most people want to see both proposed versions
> committed (feel free to correct me if I am wrong), with a revision of the
> default values and potentially with all parameters that are not really
> mandatory to change commented out. Also, versions with stripped comments,
> plus a way to maintain everything automatically, as much as possible.
>
> With that said it seems to me the current patch in CASSANDRA-15234 can be
> committed after rebase and addressing any outstanding review comments. The
> new version of cassandra.yaml, grouping the parameters can be added in a
> new ticket by me or anyone with free cycles for that. It will require
> additional work on the backward compatibility and the opportunity for
> Cassandra to operate on all of the current versions but it will be new
> additional opportunity which doesn’t disqualify the old ones so it seems as
> a fair game to be added at any point in time in the future as it won’t be a
> breaking change. We won’t replace anything. We will only add more options.
>
> If someone disagrees and wants to implement all possible options and
> functionalities at once, I will be happy to hand over the work and try to
> find the time to provide feedback/reviews later.
>
> Please do not hesitate to correct me if I misunderstood something.
>
> I will leave this discussion open until Monday and if there are no
> objections I will continue with CASSANDRA-15234 as per my proposal.
>
> Best regards,
>
> Ekaterina
>
> On Fri, 10 Sep 2021 at 20:18, Patrick McFadin  wrote:
>
> > Ah, I feel like cassandra.yaml discussions are such an evergreen topic.
> >
> > This was something brought up a while back, but I remember years ago we
> > talked about emulating the config options that some other databases have
> > done. Providing different versions of the config for different
> > approaches. For instance, MySQL has had 'my-small.cnf' with just the bare
> > minimum config and restricted parameters for something like a laptop. A
> > friendly option for newcomers would be a clearly labeled
> > 'cassandra-small.yaml' with just the bare minimum and good comments. Then
> > people new to Cassandra wouldn't have a panic moment wondering if they
> > have to know what concurrent compactors are and how many you actually
> > need? (Is there a right answer even???) It's tackling the way operators
> > approach config by the use case they are trying to satisfy. Run one node
> > on my laptop. Run a small cluster on a budget cloud server. Run any size
> > cluster on a ginormous server.
> >
> > Unfortunately, the cleaner solution would be how Apache HTTPD solved it
> > back in the day with include files. It made config management much easier
> > and the overwhelm factor much lower. Yaml doesn't support it and it would
> > all have to be custom code in the Cassandra config loader. Not the best
> > option really.
> >
> > Back to the original question, I think Ekaterina's sectioned version
> > could be used for new operators because there is a lot to learn looking
> > at the comments.  Publish the following options:
> >
> > cassandra-smal

Re: [DISCUSS] CASSANDRA-15234

2021-10-20 Thread [email protected]
Thanks for moving this forwards Ekaterina.

I think what we perhaps discovered is that there’s not really any consensus 
about how to best do config files. I think in this situation it’s best to defer 
to the one who’s actually putting in the time to _do_, so I am more than happy 
to defer to your decisions.

I’m sure everyone is looking forward to the improved consistency of this work.


From: Ekaterina Dimitrova 
Date: Wednesday, 20 October 2021 at 22:27
To: [email protected] 
Subject: Re: [DISCUSS] CASSANDRA-15234
Hi everyone,

I think it is time to summarize the discussion.

First of all, thank you for all the valuable input, suggestions, concerns,
and comments!

The things that I believe we all agree on:

   - Simplicity for maintenance on our end - automation as much as possible
     so we don’t have to maintain more than one configuration file and our
     config is less prone to human errors while adding new features
   - Simplicity for our users - as simple and as unconfusing as possible,
     keeping in mind the users’ toolset
   - Simplicity for testing and verification of the different config file
     formats


It seems to me that most people want to see both proposed versions
committed (feel free to correct me if I am wrong), with a revision of the
default values and potentially with all parameters that are not really
mandatory to change commented out. Also, versions with stripped comments,
plus a way to maintain everything automatically, as much as possible.

With that said it seems to me the current patch in CASSANDRA-15234 can be
committed after rebase and addressing any outstanding review comments. The
new version of cassandra.yaml, grouping the parameters can be added in a
new ticket by me or anyone with free cycles for that. It will require
additional work on the backward compatibility and the opportunity for
Cassandra to operate on all of the current versions but it will be new
additional opportunity which doesn’t disqualify the old ones so it seems as
a fair game to be added at any point in time in the future as it won’t be a
breaking change. We won’t replace anything. We will only add more options.

If someone disagrees and wants to implement all possible options and
functionalities at once, I will be happy to hand over the work and try to
find the time to provide feedback/reviews later.

Please do not hesitate to correct me if I misunderstood something.

I will leave this discussion open until Monday and if there are no
objections I will continue with CASSANDRA-15234 as per my proposal.

Best regards,

Ekaterina

On Fri, 10 Sep 2021 at 20:18, Patrick McFadin  wrote:

> Ah, I feel like cassandra.yaml discussions are such an evergreen topic.
>
> This was something brought up a while back, but I remember years ago we
> talked about emulating the config options that some other databases have
> done. Providing different versions of the config for different approaches.
> For instance, MySQL has had 'my-small.cnf' with just the bare minimum
> config and restricted parameters for something like a laptop. A friendly
> option for newcomers would be a clearly labeled  'cassandra-small.yaml'
> with just the bare minimum and good comments. Then people new to Cassandra
> wouldn't have a panic moment wondering if they have to know what concurrent
> compactors are and how many you actually need? (Is there a right answer
> even???) It's tackling the way operators approach config by the use case
> they are trying to satisfy. Run one node on my laptop. Run a small cluster
> on a budget cloud server. Run any size cluster on a ginormous server.
>
> Unfortunately, the cleaner solution would be how Apache HTTPD solved it back
> in the day with include files. It made config management much easier and
> the overwhelm factor much lower. Yaml doesn't support it and it would all
> have to be custom code in the Cassandra config loader. Not the best option
> really.
>
> Back to the original question, I think Ekaterina's sectioned version could
> be used for new operators because there is a lot to learn looking at the
> comments.  Publish the following options:
>
> cassandra-small.yaml: Just the 'Quickstart' section
> cassandra-medium.yaml: 'Quickstart' and 'Commonly used' with sane defaults
> cassandra-advanced.yaml: Every section
>
> The addition is a similarly named JVM properties file.
>
> As somebody who has been using Cassandra for a while and would like to have
> a more verbose version (especially for config management) Benedict's
> grouped version is fantastic. Just one option there:
>
> cassandra-full.yaml
>
> That's my idea to satisfy the various operators that approach a new
> install.
>
> Patrick
>
> On Fri, Sep 10, 2021 at 3:31 PM Jeremiah D Jordan <
> [email protected]>

Re: [DISCUSS] CASSANDRA-15234

2021-10-20 Thread Ekaterina Dimitrova
> checked
> > there were many fields that when commented out would not use a sensible
> > value, or would result in NPE’s because they didn’t have a code level
> > default.
> >
> > -Jeremiah
> >
> > > On Sep 10, 2021, at 1:24 PM, David Capwell  >
> > wrote:
> > >
> > > We can have both, but I would hope we do not have humans maintaining
> > > both.  If we maintain the commented one, and did something like the
> > > below while we compile then the burden to maintain doesn’t exist
> > >
> > > # remove comments and empty lines
> > > $ egrep -v '^[[:space:]]*#|^[[:space:]]*$' conf/cassandra.yaml.doc > conf/cassandra.yaml
> > >
> > > We do this right now with conf/hotspot_compiler so as long as our build
> > > maintains the other file +1
> > >
> > > Also, if you run the above command you will see we actually have a lot
> > > of things show (129 lines)… it would be nice to clean it up as only a
> > > small subset is required and most shown normal users won’t care
> > >
> > >> On Sep 3, 2021, at 6:45 AM, [email protected] wrote:
> > >>
> > >>> I think as the comments were stripped only for the POC. I guess many
> > of them will get back
> > >> in the actual doc version unfortunately.
> > >>
> > >> Well, I think the grouped format lends itself to much briefer
> comments,
> > with groups of related parameters getting an overall description. Even
> as a
> > developer who understands most of the toggles I found the old file very
> > hard to navigate.
> > >>
> > >> I also don’t see why we cannot have both heavily commented versions
> and
> > uncommented (or lightly commented) versions.
> > >>
> > >> I don’t personally see why multiple different config templates would
> be
> > confusing if they’re in a suitably labelled directory, even if we settle
> on
> > one for the default. It might even be nice to have a pared-down config
> that
> > has only those properties we expect the normal user to need, so it’s
> > particularly easy to navigate.
> > >>
> > >>
> > >> From: Ekaterina Dimitrova 
> > >> Date: Friday, 3 September 2021 at 14:40
> > >> To: [email protected] 
> > >> Subject: Re: [DISCUSS] CASSANDRA-15234
> > >>>>
> > >>>> It’s worth noting that the two don’t have to be in >conflict: we
> could
> > >>> offer two template yaml with the parameters grouped differently, for
> > users
> > >>> to decide for themselves.
> > >>
> > >> Sure, my only concern is that three versions of the yaml could bring
> > >> confusion (we will have backward compatibility to the current one for
> > some
> > >> time). But it might be only me. I am open for feedback
> > >>
> > >>
> > >>> If we can document this, it would be great as stuff >like “enabled”
> are
> > >>> inconsistent so not sure if I did it properly =D
> > >>>
> > >> Well, this is for now only in the ticket in the first version but no
> one
> > >> raised any concern. We will definitely have to update our docs on this
> > and
> > >> whatever else we came to agreement on - both for users and
> contributors.
> > >>
> > >>> though I will agree that it can be hard for some >tools (such
> > >>> as bash templating), but feel we can always find a >common ground
> > >> Valid point and I believe it is one of the reasons we delayed the
> > ticket,
> > >> in order to get feedback on that. I am really interested to hear what
> > >> concerns people might have.
> > >>
> > >>
> > >>> Opening up a 1500+ line .yaml file is very daunting, >even if most of
> > it is
> > >>> comments. Can't blame folks for being >overwhelmed at the prospect of
> > >> tuning
> > >>> Cassandra w/that as our operator config API. :)
> > >> I am all in for simplification and to make our users’ lives easier.
> But
> > at
> > >> this point we shouldn’t be comparing the length of the files I think
> as
> > the
> > >> comments were stripped only for the POC. I guess many of them will get
> > back
> > >> in the actual doc version unfortunately.
> > >>
> > >> Thank you all,
> > >> Ekaterina
> > >>

Re: [DISCUSS] CASSANDRA-15234

2021-09-10 Thread Patrick McFadin
Ah, I feel like cassandra.yaml discussions are such an evergreen topic.

This was something brought up a while back, but I remember years ago we
talked about emulating the config options that some other databases have
done. Providing different versions of the config for different approaches.
For instance, MySQL has had 'my-small.cnf' with just the bare minimum
config and restricted parameters for something like a laptop. A friendly
option for newcomers would be a clearly labeled  'cassandra-small.yaml'
with just the bare minimum and good comments. Then people new to Cassandra
wouldn't have a panic moment wondering if they have to know what concurrent
compactors are and how many you actually need? (Is there a right answer
even???) It's tackling the way operators approach config by the use case
they are trying to satisfy. Run one node on my laptop. Run a small cluster
on a budget cloud server. Run any size cluster on a ginormous server.

Unfortunately, the cleaner solution would be how Apache HTTPD solved it back
in the day with include files. It made config management much easier and
the overwhelm factor much lower. Yaml doesn't support it and it would all
have to be custom code in the Cassandra config loader. Not the best option
really.

Back to the original question, I think Ekaterina's sectioned version could
be used for new operators because there is a lot to learn looking at the
comments.  Publish the following options:

cassandra-small.yaml: Just the 'Quickstart' section
cassandra-medium.yaml: 'Quickstart' and 'Commonly used' with sane defaults
cassandra-advanced.yaml: Every section

The addition is a similarly named JVM properties file.

As somebody who has been using Cassandra for a while and would like to have
a more verbose version (especially for config management) Benedict's
grouped version is fantastic. Just one option there:

cassandra-full.yaml

That's my idea to satisfy the various operators that approach a new
install.
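
To illustrate the idea, the tiered files could be generated from one sectioned
source rather than maintained by hand. A minimal sketch in Python, assuming
the source has already been split into named sections; the 'Advanced' section
name used in the usage note and the function name are made up for the example.

```python
from typing import Dict

# File name -> sections it should contain, following the list above.
TIERS = {
    "cassandra-small.yaml": ["Quickstart"],
    "cassandra-medium.yaml": ["Quickstart", "Commonly used"],
}


def build_tiered_configs(sections: Dict[str, str]) -> Dict[str, str]:
    """Map each output file name to the concatenated text of its sections.

    `sections` maps a section title to its yaml text, in file order.
    """
    files = {
        name: "".join(sections[title] for title in titles)
        for name, titles in TIERS.items()
    }
    # The advanced file simply carries every section.
    files["cassandra-advanced.yaml"] = "".join(sections.values())
    return files
```

Called with sections titled 'Quickstart', 'Commonly used' and 'Advanced', this
returns three file bodies, each a superset of the previous one.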

Patrick

On Fri, Sep 10, 2021 at 3:31 PM Jeremiah D Jordan 
wrote:

> > Also, if you run the above command you will see we actually have a lot
> > of things show (129 lines)… it would be nice to clean it up as only a
> > small subset is required and most shown normal users won’t care
>
> +1 for this.  It would be good to clean up the config code and yaml such
> that only “things that are required to be changed” are not commented out in
> the file, and everything else is commented out by default.  Last I checked
> there were many fields that when commented out would not use a sensible
> value, or would result in NPE’s because they didn’t have a code level
> default.
>
> -Jeremiah
>
> > On Sep 10, 2021, at 1:24 PM, David Capwell 
> wrote:
> >
> > We can have both, but I would hope we do not have humans maintaining
> > both.  If we maintain the commented one, and did something like the below
> > while we compile then the burden to maintain doesn’t exist
> >
> > # remove comments and empty lines
> > $ egrep -v '^[[:space:]]*#|^[[:space:]]*$' conf/cassandra.yaml.doc > conf/cassandra.yaml
> >
> > We do this right now with conf/hotspot_compiler so as long as our build
> > maintains the other file +1
> >
> > Also, if you run the above command you will see we actually have a lot
> > of things show (129 lines)… it would be nice to clean it up as only a
> > small subset is required and most shown normal users won’t care
> >
> >> On Sep 3, 2021, at 6:45 AM, [email protected] wrote:
> >>
> >>> I think as the comments were stripped only for the POC. I guess many
> of them will get back
> >> in the actual doc version unfortunately.
> >>
> >> Well, I think the grouped format lends itself to much briefer comments,
> with groups of related parameters getting an overall description. Even as a
> developer who understands most of the toggles I found the old file very
> hard to navigate.
> >>
> >> I also don’t see why we cannot have both heavily commented versions and
> uncommented (or lightly commented) versions.
> >>
> >> I don’t personally see why multiple different config templates would be
> confusing if they’re in a suitably labelled directory, even if we settle on
> one for the default. It might even be nice to have a pared-down config that
> has only those properties we expect the normal user to need, so it’s
> particularly easy to navigate.
> >>
> >>
> >> From: Ekaterina Dimitrova 
> >> Date: Friday, 3 September 2021 at 14:40
> >> To: [email protected] 
> >> Subject: Re: [DISCUSS] CASSANDRA-15234
> >>>>
> >>>> It’s worth noting that the two don’t have to be in >

Re: [DISCUSS] CASSANDRA-15234

2021-09-10 Thread Jeremiah D Jordan
> Also, if you run the above command you will see we actually have a lot of 
> things show (129 lines)… it would be nice to clean it up as only a small 
> subset is required and most shown normal users won’t care

+1 for this.  It would be good to clean up the config code and yaml such that 
only “things that are required to be changed” are not commented out in the 
file, and everything else is commented out by default.  Last I checked there 
were many fields that when commented out would not use a sensible value, or 
would result in NPE’s because they didn’t have a code level default.
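
One way to catch the problem described above would be a check that lists every
setting that is neither required in the file nor given a code-level default.
The sketch below is only an illustration in Python, not Cassandra code; the
function name and the setting names in the usage note are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set


def settings_without_default(
    known: Iterable[str],
    code_defaults: Dict[str, Optional[object]],
    required_in_file: Set[str],
) -> List[str]:
    """Return settings that would end up unset (null) if commented out."""
    return sorted(
        name
        for name in known
        if name not in required_in_file and code_defaults.get(name) is None
    )
```

Given a required setting, a setting with a default of "10s" and a setting with
no default at all, only the last one is reported, so it is the one that needs
a code-level default before it can be commented out in the file.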

-Jeremiah

> On Sep 10, 2021, at 1:24 PM, David Capwell  wrote:
> 
> We can have both, but I would hope we do not have humans maintaining both.  
> If we maintain the commented one, and did something like the below while we 
> compile then the burden to maintain doesn’t exist
> 
> # remove comments and empty lines
> $ egrep -v '^[[:space:]]*#|^[[:space:]]*$' conf/cassandra.yaml.doc > conf/cassandra.yaml
> 
> We do this right now with conf/hotspot_compiler so as long as our build 
> maintains the other file +1
> 
> Also, if you run the above command you will see we actually have a lot of 
> things show (129 lines)… it would be nice to clean it up as only a small 
> subset is required and most shown normal users won’t care
> 
>> On Sep 3, 2021, at 6:45 AM, [email protected] wrote:
>> 
>>> I think as the comments were stripped only for the POC. I guess many of 
>>> them will get back
>>> in the actual doc version unfortunately.
>> 
>> Well, I think the grouped format lends itself to much briefer comments, with 
>> groups of related parameters getting an overall description. Even as a 
>> developer who understands most of the toggles I found the old file very hard 
>> to navigate.
>> 
>> I also don’t see why we cannot have both heavily commented versions and 
>> uncommented (or lightly commented) versions.
>> 
>> I don’t personally see why multiple different config templates would be 
>> confusing if they’re in a suitably labelled directory, even if we settle on 
>> one for the default. It might even be nice to have a pared-down config that 
>> has only those properties we expect the normal user to need, so it’s 
>> particularly easy to navigate.
>> 
>> 
>> From: Ekaterina Dimitrova 
>> Date: Friday, 3 September 2021 at 14:40
>> To: [email protected] 
>> Subject: Re: [DISCUSS] CASSANDRA-15234
>>>> 
>>>> It’s worth noting that the two don’t have to be in conflict: we could
>>>> offer two template yaml with the parameters grouped differently, for users
>>>> to decide for themselves.
>> 
>> Sure, my only concern is that three versions of the yaml could bring
>> confusion (we will have backward compatibility to the current one for some
>> time). But it might be only me. I am open for feedback
>> 
>> 
>>> If we can document this, it would be great as stuff like “enabled” are
>>> inconsistent so not sure if I did it properly =D
>>> 
>> Well, this is for now only in the ticket in the first version but no one
>> raised any concern. We will definitely have to update our docs on this and
>> whatever else we came to agreement on - both for users and contributors.
>> 
>>> though I will agree that it can be hard for some tools (such
>>> as bash templating), but feel we can always find a common ground
>> Valid point and I believe it is one of the reasons we delayed the ticket,
>> in order to get feedback on that. I am really interested to hear what
>> concerns people might have.
>> 
>> 
>>> Opening up a 1500+ line .yaml file is very daunting, even if most of it is
>>> comments. Can't blame folks for being overwhelmed at the prospect of
>>> tuning Cassandra w/that as our operator config API. :)
>> I am all in for simplification and to make our users’ lives easier. But at
>> this point we shouldn’t be comparing the length of the files I think as the
>> comments were stripped only for the POC. I guess many of them will get back
>> in the actual doc version unfortunately.
>> 
>> Thank you all,
>> Ekaterina
>> 
>> On Thu, 2 Sep 2021 at 20:07, Joshua McKenzie  wrote:
>> 
>>> Reading through the two, the grouping approach seems like it's a lot more
>>> friendly to newcomers as well as providing context specific cues for
>>> relationships between params you're editing. Showing and not telling, if
>>> you will.
>>> 
>>> Opening up a 1500+ line .yaml fil

Re: [DISCUSS] CASSANDRA-15234

2021-09-10 Thread David Capwell
We can have both, but I would hope we do not have humans maintaining both.  If 
we maintain the commented one and do something like the below at build time, 
then the maintenance burden doesn’t exist:

# remove comments and empty lines
$ egrep -v '^[[:space:]]*#|^[[:space:]]*$' conf/cassandra.yaml.doc > conf/cassandra.yaml

We do this right now with conf/hotspot_compiler so as long as our build 
maintains the other file +1

Also, if you run the above command you will see that we actually have a lot of 
things shown (129 lines)… it would be nice to clean it up, as only a small 
subset is required and normal users won’t care about most of what is shown.
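
For a build step that cannot shell out to egrep, the same filtering can be
sketched in Python. This is only an illustration of the command above; the
function name is made up, and \s is used as an approximation of the
[[:space:]] class.

```python
import re

# Matches lines that are blank or whose first non-space character is '#',
# mirroring the pattern passed to egrep -v.
_SKIP = re.compile(r"^\s*(#|$)")


def strip_comments(text: str) -> str:
    """Return text without comment-only lines and blank lines."""
    kept = [line for line in text.splitlines() if not _SKIP.match(line)]
    return "\n".join(kept) + ("\n" if kept else "")
```

Like the egrep version, this drops only whole-line comments; a comment that
trails a setting on the same line is kept.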

> On Sep 3, 2021, at 6:45 AM, [email protected] wrote:
> 
>> I think as the comments were stripped only for the POC. I guess many of them 
>> will get back
>> in the actual doc version unfortunately.
> 
> Well, I think the grouped format lends itself to much briefer comments, with 
> groups of related parameters getting an overall description. Even as a 
> developer who understands most of the toggles I found the old file very hard 
> to navigate.
> 
> I also don’t see why we cannot have both heavily commented versions and 
> uncommented (or lightly commented) versions.
> 
> I don’t personally see why multiple different config templates would be 
> confusing if they’re in a suitably labelled directory, even if we settle on 
> one for the default. It might even be nice to have a pared-down config that 
> has only those properties we expect the normal user to need, so it’s 
> particularly easy to navigate.
> 
> 
> From: Ekaterina Dimitrova 
> Date: Friday, 3 September 2021 at 14:40
> To: [email protected] 
> Subject: Re: [DISCUSS] CASSANDRA-15234
>>> 
>>> It’s worth noting that the two don’t have to be in conflict: we could
>>> offer two template yaml with the parameters grouped differently, for users
>>> to decide for themselves.
> 
> Sure, my only concern is that three versions of the yaml could bring
> confusion (we will have backward compatibility to the current one for some
> time). But it might be only me. I am open for feedback
> 
> 
>> If we can document this, it would be great as stuff like “enabled” are
>> inconsistent so not sure if I did it properly =D
>> 
> Well, this is for now only in the ticket in the first version but no one
> raised any concern. We will definitely have to update our docs on this and
> whatever else we came to agreement on - both for users and contributors.
> 
>> though I will agree that it can be hard for some tools (such
>> as bash templating), but feel we can always find a common ground
> Valid point and I believe it is one of the reasons we delayed the ticket,
> in order to get feedback on that. I am really interested to hear what
> concerns people might have.
> 
> 
>> Opening up a 1500+ line .yaml file is very daunting, even if most of it is
>> comments. Can't blame folks for being overwhelmed at the prospect of
>> tuning Cassandra w/that as our operator config API. :)
> I am all in for simplification and to make our users’ lives easier. But at
> this point we shouldn’t be comparing the length of the files I think as the
> comments were stripped only for the POC. I guess many of them will get back
> in the actual doc version unfortunately.
> 
> Thank you all,
> Ekaterina
> 
> On Thu, 2 Sep 2021 at 20:07, Joshua McKenzie  wrote:
> 
>> Reading through the two, the grouping approach seems like it's a lot more
>> friendly to newcomers as well as providing context specific cues for
>> relationships between params you're editing. Showing and not telling, if
>> you will.
>> 
>> Opening up a 1500+ line .yaml file is very daunting, even if most of it is
>> comments. Can't blame folks for being overwhelmed at the prospect of tuning
>> Cassandra w/that as our operator config API. :)
>> 
>> ~Josh
>> 
>> On Thu, Sep 2, 2021 at 1:48 PM David Capwell 
>> wrote:
>> 
>>> Thanks for bringing this back up; Caleb and I were talking about the lack
>>> of clarity with regard to CASSANDRA-16896, fleshing this out would make
>>> those configs nicer!
>>> 
>>>>  To standardize naming - that we did by agreeing to the form noun_verb
>>> 
>>> If we can document this, it would be great as stuff like “enabled” are
>>> inconsistent so not sure if I did it properly =D
>>> 
>>>> 
>>>>  Provision of values with units while maintaining backward
>>> compatibility.
>>> 
>>> +1
>>> 
>>> I really hate local_read_size_threshold_kb; I would 

Re: [DISCUSS] CASSANDRA-15234

2021-09-03 Thread [email protected]
> I think as the comments were stripped only for the POC. I guess many of them 
> will get back
> in the actual doc version unfortunately.

Well, I think the grouped format lends itself to much briefer comments, with 
groups of related parameters getting an overall description. Even as a 
developer who understands most of the toggles I found the old file very hard to 
navigate.

I also don’t see why we cannot have both heavily commented versions and 
uncommented (or lightly commented) versions.

I don’t personally see why multiple different config templates would be 
confusing if they’re in a suitably labelled directory, even if we settle on one 
for the default. It might even be nice to have a pared-down config that has 
only those properties we expect the normal user to need, so it’s particularly 
easy to navigate.


From: Ekaterina Dimitrova 
Date: Friday, 3 September 2021 at 14:40
To: [email protected] 
Subject: Re: [DISCUSS] CASSANDRA-15234
> It’s worth noting that the two don’t have to be in conflict: we could
> offer two template yaml with the parameters grouped differently, for users
> to decide for themselves.

Sure, my only concern is that three versions of the yaml could bring
confusion (we will have backward compatibility to the current one for some
time). But it might be only me. I am open for feedback


> If we can document this, it would be great as stuff like “enabled” are
> inconsistent so not sure if I did it properly =D
>
Well, this is for now only in the ticket in the first version but no one
raised any concern. We will definitely have to update our docs on this and
whatever else we came to agreement on - both for users and contributors.

> though I will agree that it can be hard for some tools (such
> as bash templating), but feel we can always find a common ground
Valid point and I believe it is one of the reasons we delayed the ticket,
in order to get feedback on that. I am really interested to hear what
concerns people might have.


> Opening up a 1500+ line .yaml file is very daunting, even if most of it is
> comments. Can't blame folks for being overwhelmed at the prospect of tuning
> Cassandra w/that as our operator config API. :)
I am all in for simplification and to make our users’ lives easier. But at
this point we shouldn’t be comparing the length of the files I think as the
comments were stripped only for the POC. I guess many of them will get back
in the actual doc version unfortunately.

Thank you all,
Ekaterina

On Thu, 2 Sep 2021 at 20:07, Joshua McKenzie  wrote:

> Reading through the two, the grouping approach seems like it's a lot more
> friendly to newcomers as well as providing context specific cues for
> relationships between params you're editing. Showing and not telling, if
> you will.
>
> Opening up a 1500+ line .yaml file is very daunting, even if most of it is
> comments. Can't blame folks for being overwhelmed at the prospect of tuning
> Cassandra w/that as our operator config API. :)
>
> ~Josh
>
> On Thu, Sep 2, 2021 at 1:48 PM David Capwell 
> wrote:
>
> > Thanks for bringing this back up; Caleb and I were talking about the lack
> > of clarity with regard to CASSANDRA-16896, fleshing this out would make
> > those configs nicer!
> >
> > >   To standardize naming - that we did by agreeing to the form noun_verb
> >
> > If we can document this, it would be great as stuff like “enabled” are
> > inconsistent so not sure if I did it properly =D
> >
> > >
> > >   Provision of values with units while maintaining backward
> > compatibility.
> >
> > +1
> >
> > I really hate local_read_size_threshold_kb; I would love
> > local_read_size_threshold: 10kb.  Once we have the infrastructure in
> place
> > (believe your patch before had these tools) I would love to switch!
> >
> >
> > > Another proposal is done by Benedict; grouping the config parameters.
> >
> > Yep, this is what triggered Caleb and I to talk about this thread!  To
> > group or not to group; that is the question
> >
> > Personally I like grouping from an organization point of view so am in
> > favor of that; though I will agree that it can be hard for some tools
> (such
> > as bash templating), but feel we can always find a common ground
> >
> >
> > > On Sep 2, 2021, at 8:44 AM, [email protected] wrote:
> > >
> > > Thanks for bringing this to the list Ekaterina!
> > >
> > > It’s worth noting that the two don’t have to be in conflict: we could
> > offer two template yaml with the parameters grouped differently, for
> users
> > to decide for themselves.
> > >
> > > The proposals pr

Re: [DISCUSS] CASSANDRA-15234

2021-09-03 Thread Joshua McKenzie
>
> at this point we shouldn’t be comparing the length of the files I think as
> the comments were stripped only for the POC

Ah - my misunderstanding then. I assumed we were relying on the local
context of the grouping to provide insight into the functionality of
parameters and removed the comments to that end; my point does not stand. :)

Re: multiple options of .yaml files, having to update multiple .yaml
template files on addition of new features or params will be another spot
for human error but we can do some simple build-time checking of that to
ensure the files stay in sync.
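
A minimal sketch of what such a build-time check could look like, assuming
the templates share the same parameter names and have already been parsed
into nested dicts (the YAML parsing step is not shown). The function names
key_paths and template_drift are hypothetical, not anything in the build
today:

```python
def key_paths(config, prefix=""):
    """Collect the dotted path of every leaf setting in a nested dict."""
    paths = set()
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Descend into a group and keep building the dotted path.
            paths |= key_paths(value, path)
        else:
            paths.add(path)
    return paths


def template_drift(reference, other):
    """Return (missing, extra): paths only in reference, paths only in other."""
    ref_paths, other_paths = key_paths(reference), key_paths(other)
    return sorted(ref_paths - other_paths), sorted(other_paths - ref_paths)
```

A build step could fail whenever either returned list is non-empty. If the
templates group parameters differently, the paths would first need mapping
to a common name, which this sketch does not attempt.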



On Fri, Sep 3, 2021 at 9:39 AM Ekaterina Dimitrova 
wrote:

> > It’s worth noting that the two don’t have to be in conflict: we could
> > offer two template yaml with the parameters grouped differently, for
> > users to decide for themselves.
>
> Sure, my only concern is that three versions of the yaml could bring
> confusion (we will have backward compatibility to the current one for some
> time). But it might be only me. I am open for feedback
>
>
> > If we can document this, it would be great as stuff like “enabled” are
> > inconsistent so not sure if I did it properly =D
> >
> Well, this is for now only in the ticket in the first version but no one
> raised any concern. We will definitely have to update our docs on this and
> whatever else we came to agreement on - both for users and contributors.
>
> > though I will agree that it can be hard for some tools (such
> > as bash templating), but feel we can always find a common ground
> Valid point and I believe it is one of the reasons we delayed the ticket,
> in order to get feedback on that. I am really interested to hear what
> concerns people might have.
>
>
> > Opening up a 1500+ line .yaml file is very daunting, even if most of it
> > is comments. Can't blame folks for being overwhelmed at the prospect of
> > tuning Cassandra w/that as our operator config API. :)
> I am all in for simplification and to make our users’ lives easier. But at
> this point we shouldn’t be comparing the length of the files I think as the
> comments were stripped only for the POC. I guess many of them will get back
> in the actual doc version unfortunately.
>
> Thank you all,
> Ekaterina
>
> On Thu, 2 Sep 2021 at 20:07, Joshua McKenzie  wrote:
>
> > Reading through the two, the grouping approach seems like it's a lot more
> > friendly to newcomers as well as providing context specific cues for
> > relationships between params you're editing. Showing and not telling, if
> > you will.
> >
> > Opening up a 1500+ line .yaml file is very daunting, even if most of it
> is
> > comments. Can't blame folks for being overwhelmed at the prospect of
> tuning
> > Cassandra w/that as our operator config API. :)
> >
> > ~Josh
> >
> > On Thu, Sep 2, 2021 at 1:48 PM David Capwell  >
> > wrote:
> >
> > > Thanks for bringing this back up; Caleb and I were talking about the
> lack
> > > of clarity with regard to CASSANDRA-16896, fleshing this out would make
> > > those configs nicer!
> > >
> > > >   To standardize naming - that we did by agreeing to the form
> noun_verb
> > >
> > > If we can document this, it would be great as stuff like “enabled” are
> > > inconsistent so not sure if I did it properly =D
> > >
> > > >
> > > >   Provision of values with units while maintaining backward
> > > compatibility.
> > >
> > > +1
> > >
> > > I really hate local_read_size_threshold_kb; I would love
> > > local_read_size_threshold: 10kb.  Once we have the infrastructure in
> > place
> > > (believe your patch before had these tools) I would love to switch!
> > >
> > >
> > > > Another proposal is done by Benedict; grouping the config parameters.
> > >
> > > Yep, this is what triggered Caleb and I to talk about this thread!  To
> > > group or not to group; that is the question
> > >
> > > Personally I like grouping from an organization point of view so am in
> > > favor of that; though I will agree that it can be hard for some tools
> > (such
> > > as bash templating), but feel we can always find a common ground
> > >
> > >
> > > > On Sep 2, 2021, at 8:44 AM, [email protected] wrote:
> > > >
> > > > Thanks for bringing this to the list Ekaterina!
> > > >
> > > > It’s worth noting that the two don’t have to be in conflict: we could
> > > offer two template yaml with the parameters grouped differently, for
> > users
> > > to decide for themselves.
> > > >
> > > > The proposals primarily define parameter names differently, with my
> > > proposal going by kind->place, and the other proposal maintaining
> > (mostly)
> > > the existing name form (which is a bit more like place->kind). While
> the
> > > example yaml groups by kind, you can convert nested definitions into a
> > > ‘dot’ form (e.g. limits.concurrency.reads) for use in a different
> > grouping.
> > > >
> > > > One advantage of grouping parameters together is that it aids
> > > maintaining coherency of naming between systems, and also poten

Re: [DISCUSS] CASSANDRA-15234

2021-09-03 Thread Ekaterina Dimitrova
> It’s worth noting that the two don’t have to be in conflict: we could
> offer two template yaml with the parameters grouped differently, for users
> to decide for themselves.

Sure, my only concern is that three versions of the yaml could bring
confusion (we will have backward compatibility to the current one for some
time). But it might be only me. I am open to feedback.


> If we can document this, it would be great as stuff like “enabled” are
> inconsistent so not sure if I did it properly =D
>
Well, this is for now only in the ticket in the first version but no one
raised any concern. We will definitely have to update our docs on this and
whatever else we came to agreement on - both for users and contributors.

> though I will agree that it can be hard for some tools (such
> as bash templating), but feel we can always find a common ground
Valid point and I believe it is one of the reasons we delayed the ticket,
in order to get feedback on that. I am really interested to hear what
concerns people might have.


> Opening up a 1500+ line .yaml file is very daunting, even if most of it is
> comments. Can't blame folks for being overwhelmed at the prospect of tuning
> Cassandra w/that as our operator config API. :)
I am all in for simplification and making our users’ lives easier. But at
this point I think we shouldn’t be comparing the length of the files, as the
comments were stripped only for the POC. I guess many of them will come back
in the actual doc version, unfortunately.

Thank you all,
Ekaterina

On Thu, 2 Sep 2021 at 20:07, Joshua McKenzie  wrote:

> Reading through the two, the grouping approach seems like it's a lot more
> friendly to newcomers as well as providing context specific cues for
> relationships between params you're editing. Showing and not telling, if
> you will.
>
> Opening up a 1500+ line .yaml file is very daunting, even if most of it is
> comments. Can't blame folks for being overwhelmed at the prospect of tuning
> Cassandra w/that as our operator config API. :)
>
> ~Josh
>
> On Thu, Sep 2, 2021 at 1:48 PM David Capwell 
> wrote:
>
> > Thanks for bringing this back up; Caleb and I were talking about the lack
> > of clarity with regard to CASSANDRA-16896, fleshing this out would make
> > those configs nicer!
> >
> > >   To standardize naming - that we did by agreeing to the form noun_verb
> >
> > If we can document this, it would be great as stuff like “enabled” are
> > inconsistent so not sure if I did it properly =D
> >
> > >
> > >   Provision of values with units while maintaining backward
> > compatibility.
> >
> > +1
> >
> > I really hate local_read_size_threshold_kb; I would love
> > local_read_size_threshold: 10kb.  Once we have the infrastructure in
> place
> > (believe your patch before had these tools) I would love to switch!
> >
> >
> > > Another proposal is done by Benedict; grouping the config parameters.
> >
> > Yep, this is what triggered Caleb and I to talk about this thread!  To
> > group or not to group; that is the question
> >
> > Personally I like grouping from an organization point of view so am in
> > favor of that; though I will agree that it can be hard for some tools
> (such
> > as bash templating), but feel we can always find a common ground
> >
> >
> > > On Sep 2, 2021, at 8:44 AM, [email protected] wrote:
> > >
> > > Thanks for bringing this to the list Ekaterina!
> > >
> > > It’s worth noting that the two don’t have to be in conflict: we could
> > offer two template yaml with the parameters grouped differently, for
> users
> > to decide for themselves.
> > >
> > > The proposals primarily define parameter names differently, with my
> > proposal going by kind->place, and the other proposal maintaining
> (mostly)
> > the existing name form (which is a bit more like place->kind). While the
> > example yaml groups by kind, you can convert nested definitions into a
> > ‘dot’ form (e.g. limits.concurrency.reads) for use in a different
> grouping.
> > >
> > > One advantage of grouping parameters together is that it aids
> > maintaining coherency of naming between systems, and also potentially
> > permits a more succinct config file and better discovery. But it’s far
> from
> > a silver bullet, as value judgements have to be made about where the
> > grouping lines are. I’m sure anything we settle on will be a huge
> > improvement over the status quo, however.
> > >
> > >
> > >
> > >
> > > From: Ekaterina Dimitrova 
> > > Date: Thursday, 2 September 2021 at 16:32
> > > To: [email protected] 
> > > Subject: [DISCUSS] CASSANDRA-15234
> > > Hi team,
> > >
> > > I would like to bring to the attention of the community
> CASSANDRA-15234,
> > > standardise config and JVM parameters.
> > >
> > > This is work we discussed back in Summer 2020 just before our first 4.0
> > > Beta release. During the discussion we figured out that there is more
> > than
> > > one option to do the job and not enough time to get user feedback and
> > > finish i

Re: [DISCUSS] CASSANDRA-15234

2021-09-02 Thread Joshua McKenzie
Reading through the two, the grouping approach seems like it's a lot more
friendly to newcomers as well as providing context specific cues for
relationships between params you're editing. Showing and not telling, if
you will.

Opening up a 1500+ line .yaml file is very daunting, even if most of it is
comments. Can't blame folks for being overwhelmed at the prospect of tuning
Cassandra w/that as our operator config API. :)

~Josh

On Thu, Sep 2, 2021 at 1:48 PM David Capwell 
wrote:

> Thanks for bringing this back up; Caleb and I were talking about the lack
> of clarity with regard to CASSANDRA-16896, fleshing this out would make
> those configs nicer!
>
> >   To standardize naming - that we did by agreeing to the form noun_verb
>
> If we can document this, it would be great as stuff like “enabled” are
> inconsistent so not sure if I did it properly =D
>
> >
> >   Provision of values with units while maintaining backward
> compatibility.
>
> +1
>
> I really hate local_read_size_threshold_kb; I would love
> local_read_size_threshold: 10kb.  Once we have the infrastructure in place
> (believe your patch before had these tools) I would love to switch!
>
>
> > Another proposal is done by Benedict; grouping the config parameters.
>
> Yep, this is what triggered Caleb and I to talk about this thread!  To
> group or not to group; that is the question
>
> Personally I like grouping from an organization point of view so am in
> favor of that; though I will agree that it can be hard for some tools (such
> as bash templating), but feel we can always find a common ground
>
>
> > On Sep 2, 2021, at 8:44 AM, [email protected] wrote:
> >
> > Thanks for bringing this to the list Ekaterina!
> >
> > It’s worth noting that the two don’t have to be in conflict: we could
> offer two template yaml with the parameters grouped differently, for users
> to decide for themselves.
> >
> > The proposals primarily define parameter names differently, with my
> proposal going by kind->place, and the other proposal maintaining (mostly)
> the existing name form (which is a bit more like place->kind). While the
> example yaml groups by kind, you can convert nested definitions into a
> ‘dot’ form (e.g. limits.concurrency.reads) for use in a different grouping.
> >
> > One advantage of grouping parameters together is that it aids
> maintaining coherency of naming between systems, and also potentially
> permits a more succinct config file and better discovery. But it’s far from
> a silver bullet, as value judgements have to be made about where the
> grouping lines are. I’m sure anything we settle on will be a huge
> improvement over the status quo, however.
> >
> >
> >
> >
> > From: Ekaterina Dimitrova 
> > Date: Thursday, 2 September 2021 at 16:32
> > To: [email protected] 
> > Subject: [DISCUSS] CASSANDRA-15234
> > Hi team,
> >
> > I would like to bring to the attention of the community CASSANDRA-15234,
> > standardise config and JVM parameters.
> >
> > This is work we discussed back in Summer 2020 just before our first 4.0
> > Beta release. During the discussion we figured out that there is more
> than
> > one option to do the job and not enough time to get user feedback and
> > finish it so this was delayed post-4.0 And here I am, bringing it back to
> > the table.
> >
> > This work’s goal is:
> >
> >   - To standardize naming - that we did by agreeing to the form noun_verb
> >   - Provision of values with units while maintaining backward
> >     compatibility.
> >
> >
> > Those two parts are more or less already done.
> >
> > More interesting is the third part - reorganizing the cassandra.yaml
> file.
> >
> > My personal approach was to split it into sections, done here
> > <https://github.com/ekaterinadimitrova2/cassandra/blob/b4eebe080835da79d032f9314262c268b71172a8/conf/cassandra.yaml>.
> >
> > Another proposal is done by Benedict; grouping the config parameters.
> >
> > To make it clearer, he created a yaml
> > <https://github.com/belliottsmith/cassandra/blob/5f80d1c0d38873b7a27dc137656d8b81f8e6bbd7/conf/cassandra_nocomment.yaml>
> > with comments mostly stripped.
> >
> > In his version, there are basic settings for network, disk etc all
> grouped
> > together, followed by operator tuneables mostly under limits within which
> > we now have throughput, concurrency, capacity. This leads to settings for
> > some features being kept separate (most notably for caching), but helps
> the
> > operator understand what they have to play with for controlling resource
> > consumption.
> >
> > I am interested to hear what people think about the two options or if
> > anyone has another idea to share, open discussion.
> >
> > Thank you,
> >
> > Ekaterina
>
>
> -
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>


Re: [DISCUSS] CASSANDRA-15234

2021-09-02 Thread David Capwell
Thanks for bringing this back up; Caleb and I were talking about the lack of 
clarity with regard to CASSANDRA-16896, fleshing this out would make those 
configs nicer!

>   To standardize naming - that we did by agreeing to the form noun_verb

If we can document this, it would be great as stuff like “enabled” are 
inconsistent so not sure if I did it properly =D

> 
>   Provision of values with units while maintaining backward compatibility.

+1

I really hate local_read_size_threshold_kb; I would love 
local_read_size_threshold: 10kb.  Once we have the infrastructure in place 
(believe your patch before had these tools) I would love to switch!
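
As a rough illustration of the difference (a sketch only, not the code from
the patch): the snippet below accepts the unit-suffixed form
local_read_size_threshold: 10kb and still honours the legacy
local_read_size_threshold_kb key. The helper names, the unit table, and the
choice of 1kb = 1024 bytes are assumptions made for the example.

```python
import re

# Hypothetical unit table; the units Cassandra actually accepts may differ.
_UNIT_BYTES = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}
_VALUE_RE = re.compile(r"^\s*(\d+)\s*([a-z]+)\s*$", re.IGNORECASE)


def parse_data_size(text):
    """Turn a string such as '10kb' into a number of bytes."""
    match = _VALUE_RE.match(str(text))
    if match is None:
        raise ValueError(f"expected <number><unit>, got {text!r}")
    quantity, unit = int(match.group(1)), match.group(2).lower()
    if unit not in _UNIT_BYTES:
        raise ValueError(f"unknown unit {unit!r} in {text!r}")
    return quantity * _UNIT_BYTES[unit]


def read_size_threshold(config):
    """Read the threshold in bytes, accepting the new or the legacy key."""
    if "local_read_size_threshold" in config:
        return parse_data_size(config["local_read_size_threshold"])
    if "local_read_size_threshold_kb" in config:
        # Legacy form: a bare integer whose unit lives in the key name.
        return int(config["local_read_size_threshold_kb"]) * _UNIT_BYTES["kb"]
    return None
```

With this shape, an old yaml using the _kb key and a new yaml using 10kb
resolve to the same number of bytes, which is the backward compatibility
being asked for.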


> Another proposal is done by Benedict; grouping the config parameters.

Yep, this is what triggered Caleb and I to talk about this thread!  To group or 
not to group; that is the question

Personally I like grouping from an organization point of view so am in favor of 
that; though I will agree that it can be hard for some tools (such as bash 
templating), but feel we can always find a common ground


> On Sep 2, 2021, at 8:44 AM, [email protected] wrote:
> 
> Thanks for bringing this to the list Ekaterina!
> 
> It’s worth noting that the two don’t have to be in conflict: we could offer 
> two template yaml with the parameters grouped differently, for users to 
> decide for themselves.
> 
> The proposals primarily define parameter names differently, with my proposal 
> going by kind->place, and the other proposal maintaining (mostly) the 
> existing name form (which is a bit more like place->kind). While the example 
> yaml groups by kind, you can convert nested definitions into a ‘dot’ form 
> (e.g. limits.concurrency.reads) for use in a different grouping.
> 
> One advantage of grouping parameters together is that it aids maintaining 
> coherency of naming between systems, and also potentially permits a more 
> succinct config file and better discovery. But it’s far from a silver bullet, 
> as value judgements have to be made about where the grouping lines are. I’m 
> sure anything we settle on will be a huge improvement over the status quo, 
> however.
> 
> 
> 
> 
> From: Ekaterina Dimitrova 
> Date: Thursday, 2 September 2021 at 16:32
> To: [email protected] 
> Subject: [DISCUSS] CASSANDRA-15234
> Hi team,
> 
> I would like to bring to the attention of the community CASSANDRA-15234,
> standardise config and JVM parameters.
> 
> This is work we discussed back in Summer 2020 just before our first 4.0
> Beta release. During the discussion we figured out that there is more than
> one option to do the job and not enough time to get user feedback and
> finish it so this was delayed post-4.0 And here I am, bringing it back to
> the table.
> 
> This work’s goal is:
> 
>   - To standardize naming - that we did by agreeing to the form noun_verb
>   - Provision of values with units while maintaining backward compatibility.
> 
> 
> Those two parts are more or less already done.
> 
> More interesting is the third part - reorganizing the cassandra.yaml file.
> 
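> My personal approach was to split it into sections, done here
> <https://github.com/ekaterinadimitrova2/cassandra/blob/b4eebe080835da79d032f9314262c268b71172a8/conf/cassandra.yaml>.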
> 
> Another proposal is done by Benedict; grouping the config parameters.
> 
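> To make it clearer, he created a yaml
> <https://github.com/belliottsmith/cassandra/blob/5f80d1c0d38873b7a27dc137656d8b81f8e6bbd7/conf/cassandra_nocomment.yaml>
> with comments mostly stripped.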
> 
> In his version, there are basic settings for network, disk etc all grouped
> together, followed by operator tuneables mostly under limits within which
> we now have throughput, concurrency, capacity. This leads to settings for
> some features being kept separate (most notably for caching), but helps the
> operator understand what they have to play with for controlling resource
> consumption.
> 
> I am interested to hear what people think about the two options or if
> anyone has another idea to share, open discussion.
> 
> Thank you,
> 
> Ekaterina





Re: [DISCUSS] CASSANDRA-15234

2021-09-02 Thread [email protected]
Thanks for bringing this to the list Ekaterina!

It’s worth noting that the two don’t have to be in conflict: we could offer two 
template yaml with the parameters grouped differently, for users to decide for 
themselves.

The proposals primarily define parameter names differently, with my proposal 
going by kind->place, and the other proposal maintaining (mostly) the existing 
name form (which is a bit more like place->kind). While the example yaml groups 
by kind, you can convert nested definitions into a ‘dot’ form (e.g. 
limits.concurrency.reads) for use in a different grouping.
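
To make the ‘dot’ form concrete, here is a small sketch that flattens a
nested definition into dotted names. Only the path limits.concurrency.reads
comes from the proposal; the function name, the second key and the values
are made up for the example.

```python
def flatten(config, prefix=""):
    """Map a nested dict to {'a.b.c': value} entries."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # A nested group: recurse and extend the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Hypothetical grouped settings, used only to exercise the function.
grouped = {"limits": {"concurrency": {"reads": 32, "writes": 32}}}
print(flatten(grouped))
# {'limits.concurrency.reads': 32, 'limits.concurrency.writes': 32}
```

The flat names can then be listed in any order or grouping, independent of
how the example yaml nests them.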

One advantage of grouping parameters together is that it aids maintaining 
coherency of naming between systems, and also potentially permits a more 
succinct config file and better discovery. But it’s far from a silver bullet, 
as value judgements have to be made about where the grouping lines are. I’m 
sure anything we settle on will be a huge improvement over the status quo, 
however.




From: Ekaterina Dimitrova 
Date: Thursday, 2 September 2021 at 16:32
To: [email protected] 
Subject: [DISCUSS] CASSANDRA-15234
Hi team,

I would like to bring to the attention of the community CASSANDRA-15234,
standardise config and JVM parameters.

This is work we discussed back in Summer 2020, just before our first 4.0
Beta release. During the discussion we figured out that there is more than
one option to do the job and not enough time to get user feedback and
finish it, so this was delayed until post-4.0. And here I am, bringing it
back to the table.

This work’s goal is:

   - To standardize naming - that we did by agreeing to the form noun_verb
   - Provision of values with units while maintaining backward compatibility.


Those two parts are more or less already done.

More interesting is the third part - reorganizing the cassandra.yaml file.

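My personal approach was to split it into sections, done here
<https://github.com/ekaterinadimitrova2/cassandra/blob/b4eebe080835da79d032f9314262c268b71172a8/conf/cassandra.yaml>.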

Another proposal is done by Benedict; grouping the config parameters.

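To make it clearer, he created a yaml
<https://github.com/belliottsmith/cassandra/blob/5f80d1c0d38873b7a27dc137656d8b81f8e6bbd7/conf/cassandra_nocomment.yaml>
with comments mostly stripped.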

In his version, there are basic settings for network, disk etc all grouped
together, followed by operator tuneables mostly under limits within which
we now have throughput, concurrency, capacity. This leads to settings for
some features being kept separate (most notably for caching), but helps the
operator understand what they have to play with for controlling resource
consumption.

I am interested to hear what people think about the two options or if
anyone has another idea to share, open discussion.

Thank you,

Ekaterina