Re: Changing the output of tooling between majors

2023-07-07 Thread Brandon Williams
On Fri, Jul 7, 2023 at 2:20 PM Miklosovic, Stefan
 wrote:
>
> Great thanks. That might work.
>
> So we do not change the default output unless there is json / yaml equivalent.
>
> Once there is, we are free to change the default output however we want.

Yes, exactly.  Then we have the best of both worlds: programmatic
access that isn't flimsy, and a pretty display however we want it.


Re: CASSANDRA-18554 - mTLS based client and internode authenticators

2023-07-07 Thread Jyothsna Konisa
Hi Yuki, Jeremiah & Christopher,

Thank you very much for the feedback.

Regarding removing superuser check for adding/removing identities, I have
relaxed that check and added permissions check instead. With this change
only users with appropriate permissions to add/drop identities can perform
that action.

About extending `Create Role` cqlsh statement, we have a couple of reasons
for not doing that. We designed the mTLS authenticator in such a way that a
single role can be associated with multiple identities, EX: there can be
several identities which are read_only users. Also, having a separate cqlsh
statement for identities makes it more pluggable and independent. If we
still think that extending the create role statement would be a convenient
feature, we can add it as required in the followup patches.

Christopher, I will be acting upon your feedback regarding having identity
in the cassandra.yaml optionally configurable.

Thanks,
Jyothsna Konisa.

On Thu, Jul 6, 2023 at 5:30 PM Dinesh Joshi  wrote:

> > On Jun 30, 2023, at 1:09 PM, Jeremiah Jordan 
> wrote:
> >
> > I don’t think users necessarily need to be able to update their own
> identities.  I just don’t want to have to use the super user role.  The
> super user role has all power over all things in the data base.  I don’t
> want to have to give that much power to the person who manages identities,
> I just want to give them the power to manage identities.
>
> Makes sense. I think Jyothsna already pushed an update to the PR to relax
> the restriction. Please feel free to take a look at it.
>
> Dinesh
>
>
>
>


Re: Changing the output of tooling between majors

2023-07-07 Thread Miklosovic, Stefan
Great thanks. That might work.

So we do not change the default output unless there is json / yaml equivalent.

Once there is, we are free to change the default output however we want.


From: Brandon Williams 
Sent: Friday, July 7, 2023 21:17
To: dev@cassandra.apache.org
Subject: Re: Changing the output of tooling between majors


On Fri, Jul 7, 2023 at 2:11 PM Miklosovic, Stefan
 wrote:
>
> Yes, that is true, but the original, unfixed, output, is still there. Are we 
> OK with that?

When we have a serialized output available, we do whatever we like to
the display output.


Re: Changing the output of tooling between majors

2023-07-07 Thread Brandon Williams
On Fri, Jul 7, 2023 at 2:11 PM Miklosovic, Stefan
 wrote:
>
> Yes, that is true, but the original, unfixed, output, is still there. Are we 
> OK with that?

When we have a serialized output available, we do whatever we like to
the display output.


Re: Changing the output of tooling between majors

2023-07-07 Thread Miklosovic, Stefan
Yes, that is true, but the original, unfixed, output, is still there. Are we OK 
with that?

Now the command "nodetool command" writes this:

someValue: 1
Another Value: 2
The Third Value: 3

You are saying: let's add a flag to this too, -j (as in JSON), so a user will get:

{
"some_value": 1,
"another_value": 2,
"the_third_value": 3
}

Correct?

But the original discrepancy, "someValue" instead of "Some Value", is still 
there.

Is this OK for everybody?

My aim is to fix the original output too; having a "-j" flag is just nice to 
have, another way to present the results. But do you mean that we are never 
going to touch the "someValue" output again?
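
To make the difference concrete, here is a minimal sketch in Python. It assumes the two outputs shown above (both are illustrative, "nodetool command" is not a real command) and a hypothetical parse_text helper standing in for somebody's scraping script. It only shows that renaming a label breaks a scraper's key lookup while the JSON key is unaffected.

```python
import json

# Illustrative default output, copied from the example in this message.
TEXT_OUTPUT = """someValue: 1
Another Value: 2
The Third Value: 3
"""

# Illustrative output of the proposed -j flag, copied from the example above.
JSON_OUTPUT = """{
"some_value": 1,
"another_value": 2,
"the_third_value": 3
}"""


def parse_text(output):
    """Screen-scrape 'Label: value' lines; keys are whatever labels get printed."""
    result = {}
    for line in output.splitlines():
        if ":" not in line:
            continue
        label, _, value = line.partition(":")
        result[label.strip()] = int(value.strip())
    return result


scraped = parse_text(TEXT_OUTPUT)
structured = json.loads(JSON_OUTPUT)

# Fixing the label "someValue" to "Some Value" changes the scraped key ...
renamed = parse_text(TEXT_OUTPUT.replace("someValue", "Some Value"))
print("someValue" in scraped, "someValue" in renamed)

# ... while a script reading the JSON key never sees the cosmetic change.
print(structured["some_value"])
```

A script written against scraped["someValue"] fails after the rename; one written against structured["some_value"] does not.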


From: Brandon Williams 
Sent: Friday, July 7, 2023 21:05
To: dev@cassandra.apache.org
Subject: Re: Changing the output of tooling between majors


On Fri, Jul 7, 2023 at 2:02 PM Miklosovic, Stefan
 wrote:
>
> There is just no clear path how to improve that over time and exposing the 
> same output via different format is not really solving it ... the 
> discrepancies are still there.

I'm not sure what you mean, can you explain?  In my mind, if we have a
serialized output format, we have divorced the display from the data
and so we should be free to modify how we display it all we like after
that point.


Re: Changing the output of tooling between majors

2023-07-07 Thread Brandon Williams
On Fri, Jul 7, 2023 at 2:02 PM Miklosovic, Stefan
 wrote:
>
> There is just no clear path how to improve that over time and exposing the 
> same output via different format is not really solving it ... the 
> discrepancies are still there.

I'm not sure what you mean, can you explain?  In my mind, if we have a
serialized output format, we have divorced the display from the data
and so we should be free to modify how we display it all we like after
that point.


Re: Changing the output of tooling between majors

2023-07-07 Thread Miklosovic, Stefan
Thank you Brandon for further clarification of your position on this.

While I get that the need for compatibility is real, I find having to preserve 
it across majors to be just too much. Are we all aware that if we can not change 
the output, this is a snowball getting bigger over time? After a long enough 
period, it will be so "conserved" that it will hurt usability, as it will also 
be hard to parse just visually.

There is just no clear path how to improve that over time and exposing the same 
output via different format is not really solving it ... the discrepancies are 
still there.

I welcome other people on this thread to tell us how they are parsing it, how 
frequently, and how important that is for them. As I said before, I have never 
met anybody who parses this output and to whom it actually matters. Do we 
have any proof this is happening at scale?


From: Brandon Williams 
Sent: Friday, July 7, 2023 20:39
To: dev@cassandra.apache.org
Subject: Re: Changing the output of tooling between majors


On Fri, Jul 7, 2023 at 10:21 AM Miklosovic, Stefan
 wrote:
>
> Anyway, the main question here is if we are OK to change the output in majors.

I think we always want to strive for compatibility whenever possible.
My personal litmus test is "can this information be obtained
elsewhere?" and if the answer is no, then the format shouldn't change
as it is very likely to at least cause friction for anyone screen
scraping to get it programmatically.  However, as you mentioned,
adding a serialized format provides another, superior method of
programmatic access, freeing us of the issues with cosmetic changes.
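
A rough sketch of what "divorcing the display from the data" could look like, in Python for brevity. The field names, labels, and both render functions are hypothetical and not taken from nodetool; the point is only that the display labels can change without touching the serialized keys.

```python
import json

# Hypothetical statistics gathered by a tool; names are illustrative only.
STATS = {"some_value": 1, "another_value": 2, "the_third_value": 3}


def render_json(stats):
    """Serialized output: the keys are the stable contract for scripts."""
    return json.dumps(stats, indent=2, sort_keys=True)


def render_display(stats, labels):
    """Human-facing output: labels are cosmetic and may change freely."""
    return "\n".join(f"{labels[key]}: {value}" for key, value in stats.items())


old_labels = {
    "some_value": "someValue",
    "another_value": "Another Value",
    "the_third_value": "The Third Value",
}
# The cosmetic fix touches only the label table.
new_labels = dict(old_labels, some_value="Some Value")

print(render_display(STATS, old_labels))
print(render_display(STATS, new_labels))
# The serialized form does not depend on the labels at all.
print(render_json(STATS))
```

With this split, the litmus test "can this information be obtained elsewhere?" is answered by render_json, and the display side is free to evolve.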


Re: Changing the output of tooling between majors

2023-07-07 Thread Brandon Williams
On Fri, Jul 7, 2023 at 10:21 AM Miklosovic, Stefan
 wrote:
>
> Anyway, the main question here is if we are OK to change the output in majors.

I think we always want to strive for compatibility whenever possible.
My personal litmus test is "can this information be obtained
elsewhere?" and if the answer is no, then the format shouldn't change
as it is very likely to at least cause friction for anyone screen
scraping to get it programmatically.  However, as you mentioned,
adding a serialized format provides another, superior method of
programmatic access, freeing us of the issues with cosmetic changes.


Re: [DISCUSS] Allow UPDATE on settings virtual table to change running configuration

2023-07-07 Thread Josh McKenzie
This really is great work Maxim; definitely appreciate all the hard work that's 
gone into it and I think the users will too.

In terms of where it should land, we discussed this type of question at length 
on the ML awhile ago and ended up codifying it in the wiki: 
https://cwiki.apache.org/confluence/display/CASSANDRA/Patching%2C+versioning%2C+and+LTS+releases

> When working on a ticket, use the following guideline to determine which 
> branch to apply it to (Note: See *How To Commit* for details on the commit 
> and merge process)
> 
>  • Bugfix: apply to oldest applicable LTS and merge up through latest GA to 
>    trunk
>    • In the event you need to make changes on the merge commit, merge with 
>      *-s ours* and revise the commit via *--amend*
>  • Improvement: apply to *trunk only (next release)*
>    • *Note: refactoring and removing dead code qualifies as an Improvement; 
>      our priority is stability on GA lines*
>  • New Feature: apply to *trunk only (next release)*
> 
> Our priority is to keep the 2 LTS releases and latest GA stable while 
> releasing new "latest GA" on a cadence that provides new improvements and 
> functionality to users soon enough to be valuable and relevant.
> 

So in this case, target whatever unreleased next feature release (i.e. SEMVER 
MAJOR || MINOR) we have on deck.
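
For illustration only, the guideline quoted above can be read as a small decision function. This is a sketch in Python; the branch names and the target_branches helper are placeholders, not the project's actual branch list or tooling.

```python
# Placeholder release lines, ordered oldest to newest and ending with trunk.
RELEASE_LINES = ["lts-old", "lts-new", "latest-ga", "trunk"]


def target_branches(ticket_type, release_lines, oldest_affected=None):
    """Return the branches a patch would be applied to under the guideline."""
    kind = ticket_type.lower()
    if kind == "bugfix":
        # Apply to the oldest applicable line and merge up through trunk.
        start = release_lines.index(oldest_affected) if oldest_affected else 0
        return release_lines[start:]
    if kind in ("improvement", "new feature"):
        # Improvements and new features go to trunk only (next release).
        return [release_lines[-1]]
    raise ValueError(f"unknown ticket type: {ticket_type}")


print(target_branches("Bugfix", RELEASE_LINES, oldest_affected="lts-new"))
print(target_branches("New Feature", RELEASE_LINES))
```

Under this reading, a new feature such as the updatable settings table lands on trunk only.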

On Thu, Jul 6, 2023, at 1:21 PM, Ekaterina Dimitrova wrote:
> Hi,
> 
> First of all, thank you for all the work! 
> I personally think that it should be ok to add a new column.
> 
> I will be very happy to see this landing in 5.0. 
> I am personally against porting this patch to 4.1. To be clear, I am sure you 
> did a great job and my response would be the same to every single person - 
> the configuration is quite wide-spread and the devil is in the details. I do 
> not see a good reason for exception here except convenience. There is no 
> feature flag for these changes too, right?
> 
> Best regards,
> Ekaterina
> 
> On Thursday, July 6, 2023, Miklosovic, Stefan wrote:
>> Hi Maxim,
>> 
>> I went through the PR and added my comments. I think David also reviewed it. 
>> All points you mentioned make sense to me but I humbly think it is necessary 
>> to have at least one additional pair of eyes on this as the patch is 
>> relatively impactful.
>> 
>> I would like to see an additional column in system_views.settings, named 
>> "mutable" and of type "boolean", to see which fields I am actually allowed 
>> to update as an operator.
>> 
>> It seems to me you agree with the introduction of this column (1) but there 
>> is no clear agreement on where we actually want to put it. You want this 
>> whole feature to be committed to the 4.1 branch as well, which is an 
>> interesting proposal. I was thinking that this work would go to 5.0 only. I 
>> am not completely sure it is necessary to backport this feature, but your 
>> argumentation here (2) is worth discussing further.
>> 
>> If we introduce this change to 4.1, that field would not be there but in 5.0 
>> it would. So that way we will not introduce any new column to 
>> system_views.settings.
>> We could also go with the introduction of this column to 4.1 if people are 
>> ok with that.
>> 
>> For simplicity, I am slightly leaning towards introducing this feature 
>> to 5.0 only.
>> 
>> (1) https://github.com/apache/cassandra/pull/2334#discussion_r1251104171
>> (2) https://github.com/apache/cassandra/pull/2334#discussion_r1251248041
>> 
>> 
>> From: Maxim Muzafarov 
>> Sent: Friday, June 23, 2023 13:50
>> To: dev@cassandra.apache.org
>> Subject: Re: [DISCUSS] Allow UPDATE on settings virtual table to change 
>> running configuration
>> 
>> Hello everyone,
>> 
>> 
>> Since there has been little feedback on which option to go with, and
>> little discussion of the pros and cons of each option, I tend to agree
>> with the view of this problem proposed by David :-) After a lot of
>> discussion on Slack, we settled on the @ValidatedBy annotation, which
>> points to a validation method for a property, and this will address all
>> our concerns and issues with validation.
>> 
>> I'd like to raise the visibility of these changes and try to find one
>> more committer to look at them:
>> https://issues.apache.org/jira/browse/CASSANDRA-15254
>> https://github.com/apache/cassandra/pull/2334/files
>> 
>> I'd really appreciate any kind of review in advance.
>> 
>> 
>> Despite the number of changes +2,043 −302 and the fact that most of
>> these additions are related to the tests themselves, I would like to
>> highlight the crucial design points which are required to make the
>> SettingsTable virtual table updatable. Some of these have already been
>> discussed in this thread, and I would like to provide a brief 

Re: Fwd: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-07 Thread Andrés de la Peña
I think 500 runs combining all configs could be reasonable, since it's
unlikely to have config-specific flaky tests. As in five configs with 100
repetitions each.
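
A back-of-the-envelope sketch of the numbers, in Python. It assumes a flaky test fails independently on each run with the same probability regardless of config, which is exactly the "no config-specific flakiness" assumption above; the 1% failure rate is made up for illustration.

```python
def total_runs(configs, repetitions_per_config):
    """Number of test executions for a given split."""
    return configs * repetitions_per_config


def detection_probability(failure_rate, runs):
    """Chance of at least one failure in `runs` independent executions."""
    return 1.0 - (1.0 - failure_rate) ** runs


# 500 iterations on each of 5 configs versus 100 iterations on each.
print(total_runs(5, 500))  # 2500
print(total_runs(5, 100))  # 500

# If flakiness does not depend on the config, only the total count matters:
# 5 x 100 runs catch a 1%-flaky test as often as 500 runs on a single config.
print(detection_probability(0.01, total_runs(5, 100)))
print(detection_probability(0.01, total_runs(1, 500)))
```

If a test is flaky only under one config, that config gets just 100 of the 500 runs and the detection chance for it drops accordingly.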

On Fri, 7 Jul 2023 at 16:14, Josh McKenzie  wrote:

> Maybe. Kind of depends on how long we write our tests to run doesn't it? :)
>
> But point taken. Any non-trivial test would start to be something of a
> beast under this approach.
>
> On Fri, Jul 7, 2023, at 11:12 AM, Brandon Williams wrote:
>
> On Fri, Jul 7, 2023 at 10:09 AM Josh McKenzie 
> wrote:
> > 3. Multiplexed tests (changed, added) run against all JDK's and a
> broader range of configs (no-vnode, vnode default, compression, etc)
>
> I think this is going to be too heavy...we're taking 500 iterations
> and multiplying that by like 4 or 5?
>
>
>


Re: Changing the output of tooling between majors

2023-07-07 Thread Brandon Williams
On Fri, Jul 7, 2023 at 10:21 AM Miklosovic, Stefan
 wrote:
> If that is the case, we should start to treat this problem completely 
> differently and we should not rely on the output of tooling at all and we 
> should either provide corresponding JMX method to retrieve it or we should 
> offer other formats tooling prints, like JSON or YAML.

We offer both JSON and YAML output in nodetool commands today, so I
would stick with those.  I want to shoot down JMX though, since that's
basically telling scripting languages to get bent.


Changing the output of tooling between majors

2023-07-07 Thread Miklosovic, Stefan
Hi list,

I want to clarify the policy we have when we want to change the output of the 
tooling (nodetool or tools/bin etc.).

I am not sure it is written down anywhere explicitly, but what I have gathered 
from the gossip over the years is that we should not change the output (e.g. 
changing the names of fields etc.) in minors, but for majors (4.0 -> 5.0), this 
is OK, correct?

For example, when some tool prints this:

thisIsAStatistic: 10

and we see that all other lines in that output print it like this:

This Is Another Statistic: abc

scratching the itch is almost irresistible so we want to change the output to:

This Is a Statistic: 10

This is the natural way fixes are done. We are improving the output, making 
it consistent etc.

Someone may argue that we are changing a "public API", that people are actually 
parsing the output like this, and that we had better not change it because we 
might break "the scripts" for somebody.

While I get this for minors and it is understandable that minors should stay 
the same, is this relevant for majors? Because if we care about majors too in 
this situation, how are we supposed to evolve the output over time? Is it 
supposed to be just frozen forever? I do not buy this argument. For minors, 
fine. But for majors, I do not think so.

I feel like "do not break the output because it is an API" is more or less an 
urban legend we keep repeating to ourselves. I have yet to meet somebody who is 
stressing over the fact that her output changed *between majors*.

If that is the case, we should start to treat this problem completely 
differently and we should not rely on the output of tooling at all and we 
should either provide corresponding JMX method to retrieve it or we should 
offer other formats tooling prints, like JSON or YAML.

Anyway, the main question here is if we are OK to change the output in majors.

Regards

Re: Fwd: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-07 Thread Josh McKenzie
Maybe. Kind of depends on how long we write our tests to run doesn't it? :)

But point taken. Any non-trivial test would start to be something of a beast 
under this approach.

On Fri, Jul 7, 2023, at 11:12 AM, Brandon Williams wrote:
> On Fri, Jul 7, 2023 at 10:09 AM Josh McKenzie  wrote:
> > 3. Multiplexed tests (changed, added) run against all JDK's and a broader 
> > range of configs (no-vnode, vnode default, compression, etc)
> 
> I think this is going to be too heavy...we're taking 500 iterations
> and multiplying that by like 4 or 5?
> 


Re: Fwd: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-07 Thread Brandon Williams
On Fri, Jul 7, 2023 at 10:09 AM Josh McKenzie  wrote:
> 3. Multiplexed tests (changed, added) run against all JDK's and a broader 
> range of configs (no-vnode, vnode default, compression, etc)

I think this is going to be too heavy...we're taking 500 iterations
and multiplying that by like 4 or 5?


Re: Fwd: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-07 Thread Josh McKenzie
> I wouldn’t also advocate not running JDK17 tests until we enable all test 
> suites post-commit
What about this:
1. Pre-commit suite runs against just 1 JDK (latest supported / most common / 
default)
2. Pre-commit suite runs against default config
3. *Multiplexed tests* (changed, added) run against all JDK's and a broader 
range of configs (no-vnode, vnode default, compression, etc)

That might give us a compromise that's the best of both worlds where we don't 
run redundant tests we don't expect to have changed but we also don't just 
assume if it works on one JDK/config it'll work on all.
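
As a sketch of the three points above (Python, illustrative only): the full suite runs once on the base JDK and config, and only changed or added tests are multiplexed across everything. The JDK and config names echo this thread, but which of them counts as the default is an assumption, and plan_jobs is a hypothetical helper, not part of any CI tooling.

```python
from itertools import product

# Names echo the thread; the choice of defaults is an assumption.
ALL_JDKS = ["jdk8", "jdk11", "jdk17"]
ALL_CONFIGS = ["vnode-default", "no-vnode", "compression"]
DEFAULT_JDK = "jdk11"
DEFAULT_CONFIG = "vnode-default"


def plan_jobs(all_tests, changed_tests):
    """Return (test, jdk, config) jobs for one pre-commit run."""
    # Points 1 and 2: the whole suite, one JDK, default config.
    jobs = [(test, DEFAULT_JDK, DEFAULT_CONFIG) for test in all_tests]
    # Point 3: changed or added tests against every JDK and config.
    for job in product(changed_tests, ALL_JDKS, ALL_CONFIGS):
        if job not in jobs:
            jobs.append(job)
    return jobs


jobs = plan_jobs(["TestA", "TestB", "TestC"], ["TestB"])
print(len(jobs))
for job in jobs:
    print(job)
```

With three tests and one of them changed, this gives 11 jobs instead of the 27 a full cross-product would need.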

This doesn't solve the "a test will now fail that you didn't change or add 
because you changed functionality" shape of things, but... my intuition is that 
should be a fairly rare failure if it passes in the base / default case. Rare 
enough that it should hopefully be quickly caught by the post-analysis Jenkins 
-> JIRA parsing script here: 
https://github.com/apache/cassandra-builds/blob/trunk/jenkins-jira-integration/jenkins_jira_integration.py.
 

On Wed, Jul 5, 2023, at 5:09 PM, Ekaterina Dimitrova wrote:
> “I'm curious what it triggers for you Brandon, Berenguer, Andres, Ekaterina, 
> and Mick (when you're back from the mountains ;)). ”
> We already have a mandatory minimum set of pre-commit tests in CircleCI. People 
> can manually trigger other tests if they feel they might have broken 
> something. The only config-variation tests that are currently mandatory in 
> the pre-commit CircleCI workflow are those that were never added to Jenkins 
> (for example, system keyspaces, and oa unit tests). Probably the only 
> combination that we might want to reconsider is with/without vnodes? 
> 
> I wouldn’t also advocate not running JDK17 tests until we enable all test 
> suites post-commit. Reminder - those that still have failing tests we are 
> actively working on are disabled in Jenkins to reduce the noise until we are 
> ready to fully switch from 8+11 to 11+17. Probably also in the future, when 
> we work to introduce new JDK versions, we would again want to run tests and 
> see whether we regress the people who are dealing with all the 
> maintenance/problems in the background. 
> 
> Another twist - I think Jenkins dev so far triggers all tests as post-commit 
> does, no? Probably that can change to mimic what we agreed on for CircleCI. I 
> am sure the devil will be again in the details, but just a thing to consider. 
> 
> “ If a failure makes it to post-commit, it's much more expensive to root 
> cause and figure out with much higher costs to the community's collective 
> productivity.”
> 
> Totally agree. And my hope is that running tests in a loop pre-commit 
> should help us deal with that to some extent. It is always easy for the 
> author to do a fix while their thoughts are still on the topic. It also 
> reduces the time people spend on bisecting and doing archeology later. On 
> Derek’s point about flakiness being also attributed to tests sometimes - when 
> we get bitten a few times pre-commit and have to improve our tests to make 
> them more deterministic, I believe we will learn a thing or two, and those 
> types of things will happen less in time.
> 
> Best regards,
> Ekaterina
> 
> ---------- Forwarded message ---------
> From: *Josh McKenzie* 
> Date: Wed, 5 Jul 2023 at 8:25
> Subject: Re: [DISCUSS] Formalizing requirements for pre-commit patches on new 
> CI
> To: dev 
> 
> 
>> choose a consistent, representative subset of stable tests that we feel give 
>> us a reasonable level of confidence in return for a reasonable amount of 
>> runtime
>> 
>> ...
>> Currently a dtest is being run in j8 w/wo vnodes, j8/j11 w/wo vnodes and 
>> j11 w/wo vnodes. That is 6 times total. I wonder about that ROI.
>> ...
>> test with the default number of vnodes, test with the default compression 
>> settings, and test with the default heap/off-heap buffers.
>> 
> If I take these at face value to be true (I happen to agree with them, so I'm 
> going to do this :)), what falls out for me:
>  1. Pre-commit should be an intentional smoke-testing suite, much smaller 
> relative to post-commit than it is today
>  2. We should aggressively cull all low-signal pre-commit tests, suites, and 
> configurations that aren't needed to keep post-commit stable
> High signal in pre-commit (indicative; non-exhaustive):
>  1. Only the most commonly used JDK (JDK11 atm?)
>  2. Config defaults (vnodes, compression, heap/off-heap buffers, memtable 
> format, sstable format)
>  3. Most popular / general / run-of-the-mill linux distro (debian?)
> Low signal in pre-commit (indicative; non-exhaustive):
>  1. No vnodes
>  2. JDK8; JDK17
>  3. Non-default settings (Compression off. Fully mmap, no mmap. Trie 
> memtables or sstables, cdc enabled)
> 
> So this shape of thinking - I'm curious what it triggers for you Brandon, 
> Berenguer, Andres, Ekaterina, and Mick (when you're back from the mountains 
> ;)). You