Re: Cassandra CI Status

2020-01-23 Thread Mick Semb Wever


> >  - parallelise dtests (because 12 hours is wild)
> 
> That's one word for it. :)
> 
>  We used to take an ad hoc crack at sorting the individual test times by
> longest, taking the top N, and seeing if there was LHF to shave off there.
> Being on a flight atm, not having that data handy right now, and that not
> being in the linked logs from that pipeline run here (awesome work btw!),
> do we think that might be something worth doing periodically on the project?


Yes I think so! Maybe even the longest dtest(s) could be an entry in the progress 
report? Especially now that we can rewrite dtests into either quick "unit" tests 
using jvm-dtests or diagnostic events.
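
As a rough sketch of how that could be surfaced (the report glob and layout are 
illustrative, this just assumes JUnit-style XML somewhere under the build dir):

    #!/usr/bin/env python
    # Sketch: scan aggregated JUnit XML reports and print the N slowest tests.
    import glob, sys
    import xml.etree.ElementTree as ET

    def slowest(report_glob="build/test/output/**/*.xml", top_n=20):
        times = []
        for path in glob.glob(report_glob, recursive=True):
            try:
                root = ET.parse(path).getroot()
            except ET.ParseError:
                continue  # skip truncated or partial reports
            for case in root.iter("testcase"):
                name = "%s.%s" % (case.get("classname", "?"), case.get("name", "?"))
                times.append((float(case.get("time", 0) or 0), name))
        for secs, name in sorted(times, reverse=True)[:top_n]:
            print("%8.1fs  %s" % (secs, name))

    if __name__ == "__main__":
        slowest(top_n=int(sys.argv[1]) if len(sys.argv) > 1 else 20)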

While we focus on dtest execution time, it would also be nice to shore up the 
flaky unit tests (there's only a handful), so that more steps in the pipeline 
hard-fail (and fail fast), giving faster feedback to the contributor/reviewer.

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Update defaults for 4.0?

2020-01-23 Thread Jeremy Hanna
I've previously created
https://issues.apache.org/jira/browse/CASSANDRA-14902 for updating the
compaction_throughput_in_mb default.  I created
https://issues.apache.org/jira/browse/CASSANDRA-15521 for updating the
num_tokens default, https://issues.apache.org/jira/browse/CASSANDRA-15522
for updating the [roles|permissions|credentials]_validity_in_ms defaults,
and https://issues.apache.org/jira/browse/CASSANDRA-15523 for updating the
default snitch to GossipingPropertyFileSnitch.
I'm unable to create an epic in the project - not sure if that has to do
with project permissions.  Could someone create an epic and link these
tickets as subtasks?
Jon - would you mind creating the ticket around JVM defaults?  Are you
thinking of the default GC and settings for a better out of the box
experience?
Thanks all,
Jeremy

On Fri, Jan 24, 2020 at 1:57 PM Jon Haddad  wrote:

> Yes. please do. We should also update our JVM defaults.
>
> On Thu, Jan 23, 2020, 9:28 PM Jeremy Hanna 
> wrote:
>
> > To summarize this thread, I think people are generally okay with updating
> > certain defaults for 4.0 provided we make sure it doesn't unpleasantly
> > surprise cluster operators.  I think with the num_tokens and
> > compaction_throughput_in_mb we could go with a release note for the
> reasons
> > in my last email.  I also agree that we should consider bumping
> > roles_validity_in_ms, permissions_validity_in_ms, and
> >credentials_validity_in_ms along with the default snitch (going to
> GPFS
> > as the default) as that gives people a DC aware default at least to
> start.
> >
> > Is everyone okay if I create tickets for each of these and link them with
> > an epic so that we can discuss them separately?
> >
> > Thanks,
> >
> > Jeremy
> >
> > On Thu, Jan 23, 2020 at 5:34 AM Alex Ott  wrote:
> >
> > > In addition to these, maybe we could consider changing others as well?
> > > Like:
> > >
> > > 1. bump roles_validity_in_ms, permissions_validity_in_ms, and
> > >credentials_validity_in_ms as well - maybe at least to a minute or two. I
> > >have seen multiple times when authentication was failing under the
> > heavy
> > >load because queries to system tables were timing out - with these
> > >defaults people may still have the possibility to get updates to
> > >roles/credentials faster when specifying _update_interval_ variants
> of
> > >these configurations.
> > > 2. change default snitch from SimpleSnitch to
> > GossipingPropertyFileSnitch -
> > >we're anyway saying that SimpleSnitch is only appropriate for
> > >single-datacenter deployments, and for real production we need to
> use
> > >GossipingPropertyFileSnitch - why not set it as the default?
> > >
> > >
> > > Jeremy Hanna  at "Wed, 22 Jan 2020 11:22:36 +1100" wrote:
> > >  JH> I mentioned this in the contributor meeting as a topic to bring up
> > on
> > > the list - should we
> > >  JH> take the opportunity to update defaults for Cassandra 4.0?
> > >
> > >  JH> The rationale is two-fold:
> > >  JH> 1) There are best practices and tribal knowledge around certain
> > > properties where people
> > >  JH> just know to update those properties immediately as a starting
> > > point.  If it's pretty much
> > >  JH> a given that we set something as a starting point different than
> the
> > > current defaults, why
> > >  JH> not make that the new default?
> > >  JH> 2) We should align the defaults with what we test with.  There may
> > be
> > > exceptions if we
> > >  JH> have one-off tests but on the whole, we should be testing with
> > > defaults.
> > >
> > >  JH> As a starting point, compaction throughput and number of vnodes
> seem
> > > like good candidates
> > >  JH> but it would be great to get feedback for any others.
> > >
> > >  JH> For compaction throughput (
> > > https://jira.apache.org/jira/browse/CASSANDRA-14902), I've made
> > >  JH> a basic case on the ticket to default to 64 just as a starting
> point
> > > because the decision
> > >  JH> for 16 was made when spinning disk was most common.  Hence most
> > > people I know change that
> > >  JH> and I think without too much bikeshedding, 64 is a reasonable
> > > starting point.  A case
> > >  JH> could be made that empirically the compaction throughput throttle
> > may
> > > have less effect
> > >  JH> than many people think, but I still think an updated default would
> > > make sense.
> > >
> > >  JH> For number of vnodes, Michael Shuler made the point in the
> > discussion
> > > that we already test
> > >  JH> with 32, which is a far better number than the 256 default.  I
> know
> > > many new users that
> > >  JH> just leave the 256 default and then discover later that it's
> better
> > > to go lower.  I think
> > >  JH> 32 is a good balance.  One could go lower with the new algorithm
> but
> > > I think 32 is much
> > >  JH> better than 256 without being too skewed, and it's what we
> currently
> > > test.
> > >
> > >  JH> Jeff brought up a good point that we want to be careful 

Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-23 Thread Jeff Jirsa
100% agree

François and team wrote a doc on testing and gating commits
Blake wrote a doc on testing and gating commits
Every release there’s a thread on testing and gating commits

People are the common factor every time. Nobody wants to avoid merging their 
patch because someone broke a test elsewhere. 

We can’t technically block merging with the repo as it exists now, so it’s 
always going to come down to people and peer pressure, until or unless someone 
starts reverting commits that break tests.

(Of course, someone could write a tool that automatically reverts new commits 
as long as tests fail)
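
(A very rough sketch of what such a tool could look like; the Jenkins job URL 
and the "revert HEAD of trunk" policy are purely illustrative, not a proposal:)

    #!/usr/bin/env python
    # Sketch: if the last completed CI build is not green, prepare a revert of
    # the most recent commit on trunk. A human still reviews and pushes it.
    import json, subprocess, urllib.request

    JOB = "https://builds.apache.org/job/Cassandra-trunk/lastCompletedBuild/api/json"

    def last_build_green():
        with urllib.request.urlopen(JOB) as resp:
            return json.load(resp).get("result") == "SUCCESS"

    if __name__ == "__main__":
        if last_build_green():
            print("Build is green; nothing to do")
        else:
            subprocess.check_call(["git", "fetch", "origin", "trunk"])
            subprocess.check_call(["git", "checkout", "trunk"])
            subprocess.check_call(["git", "revert", "--no-edit", "HEAD"])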

On Jan 23, 2020, at 5:54 PM, Joshua McKenzie  wrote:
> 
> 
>> 
>> 
>> I am reacting to what I currently see
>> happening in the project; tests fail as the norm and this is kinda seen as
>> expected, even though it goes against the policies as I understand it.
> 
> After over half a decade seeing us all continue to struggle with this
> problem, I've come around to the school of "apply pain" (I mean that as
> light-hearted as you can take it) when there's a failure to incent fixing;
> specifically in this case, the only idea I can think of is preventing merge
> w/any failing tests on a PR. We go through this cycle as we approach each
> major release: we have the gatekeeper of "we're not going to cut a release
> with failing tests obviously", and we clean them up. After the release, the
> pressure is off, we exhale, relax, and flaky test failures (and others)
> start to creep back in.
> 
> If the status quo is the world we want to live in, that's totally fine and
> no judgement intended - we can build tooling around test failure history
> and known flaky tests etc to optimize engineer workflows around that
> expectation. But what I keep seeing on threads like this (and have always
> heard brought up in conversation) is that our collective *moral* stance is
> that we should have green test boards and not merge code if it introduces
> failing tests.
> 
> Not looking to prescribe or recommend anything, just hoping that
> observation above might be of interest or value to the conversation.
> 
>> On Thu, Jan 23, 2020 at 4:17 PM Michael Shuler 
>> wrote:
>> 
>>> On 1/23/20 3:53 PM, David Capwell wrote:
>>> 
>>> 2) Nightly build email to dev@?
>> 
>> Nope. builds@c.a.o is where these go.
>> https://lists.apache.org/list.html?bui...@cassandra.apache.org
>> 
>> Michael
>> 
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>> 
>> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Update defaults for 4.0?

2020-01-23 Thread Jon Haddad
Yes. please do. We should also update our JVM defaults.

On Thu, Jan 23, 2020, 9:28 PM Jeremy Hanna 
wrote:

> To summarize this thread, I think people are generally okay with updating
> certain defaults for 4.0 provided we make sure it doesn't unpleasantly
> surprise cluster operators.  I think with the num_tokens and
> compaction_throughput_in_mb we could go with a release note for the reasons
> in my last email.  I also agree that we should consider bumping
> roles_validity_in_ms, permissions_validity_in_ms, and
>credentials_validity_in_ms along with the default snitch (going to GPFS
> as the default) as that gives people a DC aware default at least to start.
>
> Is everyone okay if I create tickets for each of these and link them with
> an epic so that we can discuss them separately?
>
> Thanks,
>
> Jeremy
>
> On Thu, Jan 23, 2020 at 5:34 AM Alex Ott  wrote:
>
> > In addition to these, maybe we could consider changing others as well?
> > Like:
> >
> > 1. bump roles_validity_in_ms, permissions_validity_in_ms, and
> >credentials_validity_in_ms as well - maybe at least to a minute or two. I
> >have seen multiple times when authentication was failing under the
> heavy
> >load because queries to system tables were timing out - with these
> >defaults people may still have the possibility to get updates to
> >roles/credentials faster when specifying _update_interval_ variants of
> >these configurations.
> > 2. change default snitch from SimpleSnitch to
> GossipingPropertyFileSnitch -
> >we're anyway saying that SimpleSnitch is only appropriate for
> >single-datacenter deployments, and for real production we need to use
> >GossipingPropertyFileSnitch - why not set it as the default?
> >
> >
> > Jeremy Hanna  at "Wed, 22 Jan 2020 11:22:36 +1100" wrote:
> >  JH> I mentioned this in the contributor meeting as a topic to bring up
> on
> > the list - should we
> >  JH> take the opportunity to update defaults for Cassandra 4.0?
> >
> >  JH> The rationale is two-fold:
> >  JH> 1) There are best practices and tribal knowledge around certain
> > properties where people
> >  JH> just know to update those properties immediately as a starting
> > point.  If it's pretty much
> >  JH> a given that we set something as a starting point different than the
> > current defaults, why
> >  JH> not make that the new default?
> >  JH> 2) We should align the defaults with what we test with.  There may
> be
> > exceptions if we
> >  JH> have one-off tests but on the whole, we should be testing with
> > defaults.
> >
> >  JH> As a starting point, compaction throughput and number of vnodes seem
> > like good candidates
> >  JH> but it would be great to get feedback for any others.
> >
> >  JH> For compaction throughput (
> > https://jira.apache.org/jira/browse/CASSANDRA-14902), I've made
> >  JH> a basic case on the ticket to default to 64 just as a starting point
> > because the decision
> >  JH> for 16 was made when spinning disk was most common.  Hence most
> > people I know change that
> >  JH> and I think without too much bikeshedding, 64 is a reasonable
> > starting point.  A case
> >  JH> could be made that empirically the compaction throughput throttle
> may
> > have less effect
> >  JH> than many people think, but I still think an updated default would
> > make sense.
> >
> >  JH> For number of vnodes, Michael Shuler made the point in the
> discussion
> > that we already test
> >  JH> with 32, which is a far better number than the 256 default.  I know
> > many new users that
> >  JH> just leave the 256 default and then discover later that it's better
> > to go lower.  I think
> >  JH> 32 is a good balance.  One could go lower with the new algorithm but
> > I think 32 is much
> >  JH> better than 256 without being too skewed, and it's what we currently
> > test.
> >
> >  JH> Jeff brought up a good point that we want to be careful with
> defaults
> > since changing them
> >  JH> could come as an unpleasant surprise to people who don't explicitly
> > set them.  As a
> >  JH> general rule, we should always update release notes to clearly state
> > that a default has
> >  JH> changed.  For these two defaults in particular, I think it's safe.
> > For compaction
> >  JH> throughput I think a release note is sufficient in case they want to
> > modify it.  For number
> >  JH> of vnodes, it won't affect existing deployments with data - it would
> > be for new clusters,
> >  JH> which would honestly benefit from this anyway.
> >
> >  JH> The other point is whether it's too late to go into 4.0.  For these
> > two changes, I think
> >  JH> significant testing can still be done with these new defaults before
> > release and I think
> >  JH> testing more explicitly with 32 vnodes in particular will give
> people
> > more confidence in
> >  JH> the lower number with a wider array of testing (where we don't
> > already use 32 explicitly).
> >
> >  JH> In summary, are people okay with considering updating 

Re: Update defaults for 4.0?

2020-01-23 Thread Jeremy Hanna
To summarize this thread, I think people are generally okay with updating
certain defaults for 4.0 provided we make sure it doesn't unpleasantly
surprise cluster operators.  I think with the num_tokens and
compaction_throughput_in_mb we could go with a release note for the reasons
in my last email.  I also agree that we should consider bumping
roles_validity_in_ms, permissions_validity_in_ms, and
credentials_validity_in_ms, along with changing the default snitch (going to
GPFS as the default), as that gives people a DC-aware default at least to start.
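
For concreteness, the cassandra.yaml entries in question would look something 
like the following (the values are just the starting points discussed here, the 
exact numbers can be settled on the individual tickets, and the current defaults 
noted in the comments are as I understand them):

    num_tokens: 32                                # currently 256
    compaction_throughput_mb_per_sec: 64          # currently 16
    endpoint_snitch: GossipingPropertyFileSnitch  # currently SimpleSnitch
    roles_validity_in_ms: 60000                   # currently 2000
    permissions_validity_in_ms: 60000             # currently 2000
    credentials_validity_in_ms: 60000             # currently 2000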

Is everyone okay if I create tickets for each of these and link them with
an epic so that we can discuss them separately?

Thanks,

Jeremy

On Thu, Jan 23, 2020 at 5:34 AM Alex Ott  wrote:

> In addition to these, maybe we could consider changing others as well?
> Like:
>
> 1. bump roles_validity_in_ms, permissions_validity_in_ms, and
>credentials_validity_in_ms as well - maybe at least to a minute or two. I
>have seen multiple times when authentication was failing under the heavy
>load because queries to system tables were timing out - with these
>defaults people may still have the possibility to get updates to
>roles/credentials faster when specifying _update_interval_ variants of
>these configurations.
> 2. change default snitch from SimpleSnitch to GossipingPropertyFileSnitch -
>we're anyway saying that SimpleSnitch is only appropriate for
>single-datacenter deployments, and for real production we need to use
>GossipingPropertyFileSnitch - why not set it as the default?
>
>
> Jeremy Hanna  at "Wed, 22 Jan 2020 11:22:36 +1100" wrote:
>  JH> I mentioned this in the contributor meeting as a topic to bring up on
> the list - should we
>  JH> take the opportunity to update defaults for Cassandra 4.0?
>
>  JH> The rationale is two-fold:
>  JH> 1) There are best practices and tribal knowledge around certain
> properties where people
>  JH> just know to update those properties immediately as a starting
> point.  If it's pretty much
>  JH> a given that we set something as a starting point different than the
> current defaults, why
>  JH> not make that the new default?
>  JH> 2) We should align the defaults with what we test with.  There may be
> exceptions if we
>  JH> have one-off tests but on the whole, we should be testing with
> defaults.
>
>  JH> As a starting point, compaction throughput and number of vnodes seem
> like good candidates
>  JH> but it would be great to get feedback for any others.
>
>  JH> For compaction throughput (
> https://jira.apache.org/jira/browse/CASSANDRA-14902), I've made
>  JH> a basic case on the ticket to default to 64 just as a starting point
> because the decision
>  JH> for 16 was made when spinning disk was most common.  Hence most
> people I know change that
>  JH> and I think without too much bikeshedding, 64 is a reasonable
> starting point.  A case
>  JH> could be made that empirically the compaction throughput throttle may
> have less effect
>  JH> than many people think, but I still think an updated default would
> make sense.
>
>  JH> For number of vnodes, Michael Shuler made the point in the discussion
> that we already test
>  JH> with 32, which is a far better number than the 256 default.  I know
> many new users that
>  JH> just leave the 256 default and then discover later that it's better
> to go lower.  I think
>  JH> 32 is a good balance.  One could go lower with the new algorithm but
> I think 32 is much
>  JH> better than 256 without being too skewed, and it's what we currently
> test.
>
>  JH> Jeff brought up a good point that we want to be careful with defaults
> since changing them
>  JH> could come as an unpleasant surprise to people who don't explicitly
> set them.  As a
>  JH> general rule, we should always update release notes to clearly state
> that a default has
>  JH> changed.  For these two defaults in particular, I think it's safe.
> For compaction
>  JH> throughput I think a release note is sufficient in case they want to
> modify it.  For number
>  JH> of vnodes, it won't affect existing deployments with data - it would
> be for new clusters,
>  JH> which would honestly benefit from this anyway.
>
>  JH> The other point is whether it's too late to go into 4.0.  For these
> two changes, I think
>  JH> significant testing can still be done with these new defaults before
> release and I think
>  JH> testing more explicitly with 32 vnodes in particular will give people
> more confidence in
>  JH> the lower number with a wider array of testing (where we don't
> already use 32 explicitly).
>
>  JH> In summary, are people okay with considering updating these defaults
> and possibly others
>  JH> in the alpha stage of a new major release?  Are there other
> properties to consider?
>
>  JH> Jeremy
>  JH> -
>  JH> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>  JH> For additional commands, e-mail: 

Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-23 Thread Joshua McKenzie
>
> I am reacting to what I currently see
> happening in the project; tests fail as the norm and this is kinda seen as
> expected, even though it goes against the policies as I understand it.

After over half a decade seeing us all continue to struggle with this
problem, I've come around to the school of "apply pain" (I mean that as
light-hearted as you can take it) when there's a failure to incent fixing;
specifically in this case, the only idea I can think of is preventing merge
w/any failing tests on a PR. We go through this cycle as we approach each
major release: we have the gatekeeper of "we're not going to cut a release
with failing tests obviously", and we clean them up. After the release, the
pressure is off, we exhale, relax, and flaky test failures (and others)
start to creep back in.

If the status quo is the world we want to live in, that's totally fine and
no judgement intended - we can build tooling around test failure history
and known flaky tests etc to optimize engineer workflows around that
expectation. But what I keep seeing on threads like this (and have always
heard brought up in conversation) is that our collective *moral* stance is
that we should have green test boards and not merge code if it introduces
failing tests.

Not looking to prescribe or recommend anything, just hoping that
observation above might be of interest or value to the conversation.

On Thu, Jan 23, 2020 at 4:17 PM Michael Shuler 
wrote:

> On 1/23/20 3:53 PM, David Capwell wrote:
> >
> > 2) Nightly build email to dev@?
>
> Nope. builds@c.a.o is where these go.
> https://lists.apache.org/list.html?bui...@cassandra.apache.org
>
> Michael
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: Cassandra CI Status

2020-01-23 Thread Joshua McKenzie
>
>  - parallelise dtests (because 12 hours is wild)

That's one word for it. :)

We used to take an ad hoc crack at sorting the individual test times by
longest, taking the top N, and seeing if there was LHF to shave off there.
Being on a flight atm, not having that data handy right now, and that not
being in the linked logs from that pipeline run here (awesome work btw!),
do we think that might be something worth doing periodically on the project?

On Thu, Jan 23, 2020 at 8:37 AM Mick Semb Wever  wrote:

>
> > > If I don't hear any objection, I'll commit this. Off this, as it
> > > aggregates test reports, it's now possible to start test posting
> emails
> > > with the test report summary, as well as bringing in the dtest builds
> > > into the pipeline.
> >
> >
> > Based on the pipeline approach I've gotten notifications to slack and
> > email working.
> >
> > Does anyone object if I send these to #cassandra-builds (a brand new
> > slack room) and to bui...@cassandra.apache.org ?
> >
> > This is not meant as anything perfect or finished, just to get
> > something out there, on which a pragmatic discussion can continue…
>
>
> Closing the loop on this^, these commits have been made and each of our
> release branches have a pipeline build.
>
> An example of a pipeline build result is
>
> https://builds.apache.org/blue/organizations/jenkins/Cassandra-trunk/detail/Cassandra-trunk/6/pipeline
>
> Notifications from these are being sent to slack #cassandra-builds and to
> the builds@c.a.o ML
>
> Next steps (I see) are:
>  - remove scm polling on the other builds, so that only pipelines trigger
> off new commits,
>  - look into posting test result summary from each pipeline back to jira
> as a comment (there will be a separate comment for each release branch a
> patch is pushed to),
>  - stabilise disk usage on the jenkins nodes (nodes fall over from full
> disks), some discussion on builds@a.o is happening on this,
>  - parallelise dtests (because 12 hours is wild)
>
>
> And thanks to David, Josh, Dinesh, and Michael, for your input.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-23 Thread Michael Shuler

On 1/23/20 3:53 PM, David Capwell wrote:


2) Nightly build email to dev@?


Nope. builds@c.a.o is where these go.
https://lists.apache.org/list.html?bui...@cassandra.apache.org

Michael

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-23 Thread David Capwell
Thanks for the link. I have reached out to infra and will update this
thread as I hear back.

Looking around other Apache projects, there are workarounds, so I don't
actually see this as a blocker, more as limiting the possible implementations.

So assuming we have a solution which enables CI builds on PRs (which are
publicly accessible), are there other concerns?  The main points I see are
"don't force me" and JIRA noise (comments may be better outside of JIRA,
but I have heard strong "everything in JIRA"... I am not touching this
debate in this thread and will let someone else start that topic...)

My thought on the "don't force me" concern is that we could have a compromise:

1) for PRs, we have CI which annotates results back into JIRA
2) a nightly build email to dev@?

Thoughts?
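
For (1), the mechanics are simple; a rough sketch (issue key, credentials, and 
summary text are placeholders, the endpoint is JIRA's standard comment API):

    #!/usr/bin/env python
    # Sketch: post a CI result summary back to the ticket as a JIRA comment.
    import os
    import requests

    JIRA = "https://issues.apache.org/jira"

    def post_ci_summary(issue_key, summary):
        resp = requests.post(
            "%s/rest/api/2/issue/%s/comment" % (JIRA, issue_key),
            json={"body": summary},
            auth=(os.environ["JIRA_USER"], os.environ["JIRA_PASS"]),
        )
        resp.raise_for_status()

    if __name__ == "__main__":
        post_ci_summary(
            "CASSANDRA-15508",  # example ticket only
            "CI run finished: unit + jvm-dtest green, 2 python dtest failures.\n"
            "Full report: https://builds.apache.org/...",
        )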

On Thu, Jan 23, 2020 at 12:51 PM Michael Shuler 
wrote:

> On 1/23/20 2:13 PM, Mick Semb Wever wrote:
> >
> >>> ASF policy is that patches from contributors who haven't filed an ICLA
> >>> cannot be automatically run through any ASF CI system. It's up to a
> >>> committer (or someone who has filed an ICLA) to trigger the test run on
> >>> the patch.
> >>
> >> I couldn't find this CI+ICLA policy info in the general ASF docs nor a
> >> google search to locate it. Link?
> >
> >
> >
> https://issues.apache.org/jira/browse/INFRA-18748?focusedCommentId=16885091&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16885091
>
> Well, OK then. This would appear to negate the "PRs make things easier
> for non-committer contributors" feature request, at least with regards
> to ASF CI. Darn, I was hoping we could hook into builds.a.o with PRs at
> some point to actually make build/test easier for everyone, which is
> generally where this thread started.
>
> Thanks for the JIRA link - would be good to see this formalized
> somewhere in policy docs, if it's what INFRA is actually enforcing.
>
> Michael
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-23 Thread Michael Shuler

On 1/23/20 2:13 PM, Mick Semb Wever wrote:



ASF policy is that patches from contributors who haven't filed an ICLA
cannot be automatically run through any ASF CI system. It's up to a
committer (or someone who has filed an ICLA) to trigger the test run on
the patch.


I couldn't find this CI+ICLA policy info in the general ASF docs nor a
google search to locate it. Link?



https://issues.apache.org/jira/browse/INFRA-18748?focusedCommentId=16885091&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16885091


Well, OK then. This would appear to negate the "PRs make things easier 
for non-committer contributors" feature request, at least with regards 
to ASF CI. Darn, I was hoping we could hook into builds.a.o with PRs at 
some point to actually make build/test easier for everyone, which is 
generally where this thread started.


Thanks for the JIRA link - would be good to see this formalized 
somewhere in policy docs, if it's what INFRA is actually enforcing.


Michael

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-23 Thread Mick Semb Wever


> > ASF policy is that patches from contributors who haven't filed an ICLA
> > cannot be automatically run through any ASF CI system. It's up to a
> > committer (or someone who has filed an ICLA) to trigger the test run on
> > the patch.
> 
> I couldn't find this CI+ICLA policy info in the general ASF docs nor a 
> google search to locate it. Link?


https://issues.apache.org/jira/browse/INFRA-18748?focusedCommentId=16885091&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16885091

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-23 Thread Jordan West
On Thu, Jan 23, 2020 at 9:09 AM Jeff Jirsa  wrote:

> On Thu, Jan 23, 2020 at 6:18 AM Jeremiah Jordan 
> wrote:
>
> > It is the reviewer's and author's job to make sure CI ran and didn’t
> > introduce new failing tests; it doesn’t matter how they were run. It is
> > just as easy to let something through when “pr triggered” tests have a
> > failure as it is tests manually linked from a JIRA comment, if the author
> > and reviewer think the failures are not new.
> >
>
> Agreed. Any committer who commits while tests are broken is ignoring
> policy. Moving patch submission from one system to another won't somehow
> make committers adhere to policy.
>

Agreed, but at the same time, having a mechanism that reduces the amount of
manual work required by the reviewer/committer (running tests, viewing
reported results) should increase the likelihood that the policy is adhered to,
if it’s not being adhered to currently. I would welcome information being
automatically added from CI during review, and if PRs are an easy way to
accomplish that in the future, then +1.


>
> >
> > If someone wants to set up some extra niceties, like auto-triggered builds
> > or something, to happen if people use the PR workflow, then I see no
> > problem there. But I don’t think we need to force use of PRs.
> >
> > This is why I don’t think we need to “switch” to using PR’s. There is no
> > need to switch. People can “also” use PRs. If someone who likes the PR
> > workflow sets up some more nice stuff to happen when it is used, that
> would
> > probably encourage more people to do things that way. But it doesn’t need
> > to be forced.
> >
>
> Agreed.
>
+1. I don’t think we need a requirement yet as much as encouragement. If some
folks start using this approach, hopefully its overall use will spread as
others see the benefits. I used it for the first time on my most recent
ticket and have found it convenient so far.

We can always revisit a requirement if failures on trunk continue to be a
problem. Better visibility would be a good start.

Jordan


Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-23 Thread mck


> I am fine with Jenkins or CircleCI; though I feel CircleCI is more effort
> for each member.


ASF policy is that patches from contributors who haven't filed an ICLA cannot 
be automatically run through any ASF CI system. It's up to a committer (or 
someone who has filed an ICLA) to trigger the test run on the patch.



> The finer granularity of code review comments can also be on the github
> > commit pages, again you don't need to open the PR
> 
> 
> I just tested this out and what I see is the following
> 
> 1) branch has a jira link, so PR comments make it into jira
> 2) going to the commit and commenting *does not* go into jira.


(2) is The Right Thing™ IMHO. Keep low-level code review out of the jira 
ticket. I think a number of people complained about the noise of PR comments 
coming back into the jiras. Another reason to do the review on the commit 
instead of the PR, but again just an opinion and one way to do it.

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-23 Thread David Capwell
>
> CircleCI can build github forked branches.


Yes it can, but we currently require each member of the community to set up
their own CircleCI in order to test Cassandra (and a non-paid account will
have many tests failing).  I looked into CircleCI JIRA integration and it
seems that we would need every member to go into their settings to enable
JIRA integration (get the correct tokens, link, etc.).  Assuming we went this
route, we would also require all members to tag commits so that CircleCI
knows to forward results (which is step one in what I proposed).

I am fine with Jenkins or CircleCI; though I feel CircleCI is more effort
for each member.

The finer granularity of code review comments can also be on the github
> commit pages, again you don't need to open the PR


I just tested this out and what I see is the following

1) branch has a jira link, so PR comments make it into jira
2) going to the commit and commenting *does not* go into jira.

Here is the test case (maybe it eventually will and it's just slower via this
route?).

https://issues.apache.org/jira/browse/CASSANDRA-15508
https://github.com/dcapwell/cassandra/commit/0a9bec8f2d0b0eb0c96b85d03dafbee317270eeb

So, given the stance that all conversations go into JIRA, in order to have
any conversation in the commit, someone will need to copy this also into
JIRA.


> Can’t you currently open a PR with the right commit message, do the
> review there with all comments posted back to JIRA, run CI on it and then
> merge it closing the PR?  This is the basic workflow you are proposing yes?

Yep.

It is the reviewer's and author's job to make sure CI ran and didn’t introduce
> new failing tests; it doesn’t matter how they were run.


I agree it doesn't matter how it runs; I am just proposing a small change
which makes it easier to automate so we can have these results posted back
into JIRA.  This would give more visibility to reviewers and authors, and
maintain the history better, since there would be a reference in JIRA showing
whether this was known to be stable pre-merge; the current process requires
more effort from both parties, so I'm trying to help lower this.


> It is just as easy to let something through when “pr triggered” tests have
> a failure as it is tests manually linked from a JIRA comment, if the author
> and reviewer think the failures are not new.


I agree that human judgement is the current gatekeeper for merging or not,
but as I understand it failing tests are supposed to be a blocker for commit,
and failing tests are a blocker for release.  Given this, aren't we supposed
to fix these things right now, before committing/releasing?  Wouldn't an
automated system which annotates JIRA only help bring visibility to this?

If someone wants to set up some extra niceties, like auto-triggered builds or
> something, to happen if people use the PR workflow, then I see no problem
> there. But I don’t think we need to force use of PRs.


I am more than willing to help set this up; I started this conversation to
make sure we are all on the same page (or not).

But it doesn’t need to be forced.

I am fine not forcing anything, but I am reacting to what I currently see
happening in the project; tests fail as the norm and this is kinda seen as
expected, even though it goes against the policies as I understand it.

If there are alternatives to make this better, I would be glad to listen;
just proposing the one which I use frequently as my attempt to get the ball
rolling.

On Thu, Jan 23, 2020 at 9:09 AM Jeff Jirsa  wrote:

> On Thu, Jan 23, 2020 at 6:18 AM Jeremiah Jordan 
> wrote:
>
> > Can’t you currently open a PR with the right commit message, do the
> > review there with all comments posted back to JIRA, run CI on it and then
> > merge it closing the PR?  This is the basic workflow you are proposing
> yes?
> >
> >
> Yes you can.
>
>
> > It is the reviewer's and author's job to make sure CI ran and didn’t
> > introduce new failing tests; it doesn’t matter how they were run. It is
> > just as easy to let something through when “pr triggered” tests have a
> > failure as it is tests manually linked from a JIRA comment, if the author
> > and reviewer think the failures are not new.
> >
>
> Agreed. Any committer who commits while tests are broken is ignoring
> policy. Moving patch submission from one system to another won't somehow
> make committers adhere to policy.
>
>
> >
> > If someone wants to set up some extra niceties, like auto-triggered builds
> > or something, to happen if people use the PR workflow, then I see no
> > problem there. But I don’t think we need to force use of PRs.
> >
> > This is why I don’t think we need to “switch” to using PR’s. There is no
> > need to switch. People can “also” use PRs. If someone who likes the PR
> > workflow sets up some more nice stuff to happen when it is used, that
> would
> > probably encourage more people to do things that way. But it doesn’t need
> > to be forced.
> >
>
> Agreed.
>


Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-23 Thread Jeff Jirsa
On Thu, Jan 23, 2020 at 6:18 AM Jeremiah Jordan 
wrote:

> Can’t you currently open a PR with the right commit message, do the
> review there with all comments posted back to JIRA, run CI on it and then
> merge it closing the PR?  This is the basic workflow you are proposing yes?
>
>
Yes you can.


> It is the reviewer's and author's job to make sure CI ran and didn’t
> introduce new failing tests; it doesn’t matter how they were run. It is
> just as easy to let something through when “pr triggered” tests have a
> failure as it is tests manually linked from a JIRA comment, if the author
> and reviewer think the failures are not new.
>

Agreed. Any committer who commits while tests are broken is ignoring
policy. Moving patch submission from one system to another won't somehow
make committers adhere to policy.


>
> If someone wants to set up some extra niceties, like auto-triggered builds
> or something, to happen if people use the PR workflow, then I see no
> problem there. But I don’t think we need to force use of PRs.
>
> This is why I don’t think we need to “switch” to using PR’s. There is no
> need to switch. People can “also” use PRs. If someone who likes the PR
> workflow sets up some more nice stuff to happen when it is used, that would
> probably encourage more people to do things that way. But it doesn’t need
> to be forced.
>

Agreed.


Re: Cassandra CI Status

2020-01-23 Thread Mick Semb Wever


> > If I don't hear any objection, I'll commit this. Off this, as it 
> > aggregates test reports, it's now possible to start test posting emails 
> > with the test report summary, as well as bringing in the dtest builds 
> > into the pipeline. 
> 
> 
> Based on the pipeline approach I've gotten notifications to slack and 
> email working.
> 
> Does anyone object if I send these to #cassandra-builds (a brand new 
> slack room) and to bui...@cassandra.apache.org ?
> 
> This is not meant as anything perfect or finished, just to get 
> something out there, on which a pragmatic discussion can continue…


Closing the loop on this^, these commits have been made and each of our release 
branches have a pipeline build.

An example of a pipeline build result is 
 
https://builds.apache.org/blue/organizations/jenkins/Cassandra-trunk/detail/Cassandra-trunk/6/pipeline
 

Notifications from these are being sent to slack #cassandra-builds and to the 
builds@c.a.o ML

Next steps (I see) are:
 - remove scm polling on the other builds, so that only pipelines trigger off 
new commits,
 - look into posting test result summary from each pipeline back to jira as a 
comment (there will be a separate comment for each release branch a patch is 
pushed to),
 - stabilise disk usage on the jenkins nodes (nodes fall over from full disks), 
some discussion on builds@a.o is happening on this,
 - parallelise dtests (because 12 hours is wild)
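
On the dtest parallelisation item, a first cut could be to split the dtest list 
into N shards balanced by historical runtime (a small sketch only; the timings 
input format and shard count are illustrative):

    #!/usr/bin/env python
    # Sketch: greedy longest-processing-time packing of dtests into N shards.
    # Expects one "seconds test_id" pair per line in the input file.
    import sys

    def make_shards(timing_lines, num_shards=8):
        tests = []
        for line in timing_lines:
            secs, test_id = line.split(None, 1)
            tests.append((float(secs), test_id.strip()))
        shards = [{"secs": 0.0, "tests": []} for _ in range(num_shards)]
        for secs, test_id in sorted(tests, reverse=True):  # longest first
            target = min(shards, key=lambda s: s["secs"])
            target["secs"] += secs
            target["tests"].append(test_id)
        return shards

    if __name__ == "__main__":
        with open(sys.argv[1]) as f:
            shards = make_shards(f)
        for i, shard in enumerate(shards):
            print("shard %d (~%.0f min, %d tests)" % (i, shard["secs"] / 60, len(shard["tests"])))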


And thanks to David, Josh, Dinesh, and Michael, for your input.

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-23 Thread Jeremiah Jordan
Can’t you currently open a PR with the right commit message, do the review 
there with all comments posted back to JIRA, run CI on it, and then merge it, 
closing the PR?  This is the basic workflow you are proposing, yes?

It is the reviewer's and author's job to make sure CI ran and didn’t introduce new 
failing tests; it doesn’t matter how they were run. It is just as easy to let 
something through when “PR triggered” tests have a failure as it is with tests 
manually linked from a JIRA comment, if the author and reviewer think the 
failures are not new.

If someone wants to set up some extra niceties, like auto-triggered builds or 
something, to happen if people use the PR workflow, then I see no problem 
there. But I don’t think we need to force the use of PRs.

This is why I don’t think we need to “switch” to using PRs. There is no need 
to switch. People can “also” use PRs. If someone who likes the PR workflow sets 
up some more nice stuff to happen when it is used, that would probably 
encourage more people to do things that way. But it doesn’t need to be forced.


> On Jan 22, 2020, at 9:53 PM, David Capwell  wrote:
> 
> Sorry Jeremiah, I don't understand your comment, would it be possible to
> elaborate more?
> 
> About the point on not forbidding as long as the review and testing needs
> are met, could you define what that means to you?
> 
> There are a few questions I ask myself
> 
> "Does the current process stop code which breaks the build from merging?"
> And
> "How long does it take for regressions to get noticed"
> 
> If I take myself as an example, I added a test which always failed in
> CircleCI (I assume Jenkins as well), this got merged, and the jira to fix
> it was around 3 months later.  I am personally trying to find ways to
> detect issues faster, but also see that test fail frequently (unit, jvm
> dtest, python dtest, etc.) so it's easy for this to slip through.
> 
> My mind set is that by switching to PRs (even if all the conversations are
> in JIRA) we can setup automation which helps detect issues before merging.
> 
>> On Wed, Jan 22, 2020, 7:00 PM J. D. Jordan 
>> wrote:
>> 
>> Doesn’t this github review workflow as described work right now?  It’s
>> just not the “only” way people do things?
>> 
>> I don’t think we need to forbid other methods of contribution as long as
>> the review and testing needs are met.
>> 
>> -Jeremiah
>> 
 On Jan 22, 2020, at 6:35 PM, Yifan Cai  wrote:
>>> 
>>> +1 nb to the PR approach for reviewing.
>>> 
>>> 
>>> And thanks David for initiating the discussion. I would like to put my 2
>>> cents in it.
>>> 
>>> 
>>> IMO, review comments are better associated with the changes, precisely
>> to
>>> the line level, if they are put in the PR rather than in the JIRA
>> comments.
>>> Discussions regarding each review comments are naturally organized into
>>> this own dedicated thread. I agree that JIRA comments are more suitable
>> for
>>> high-level discussion regarding the design. But review comments in PR can
>>> do a better job at code-level discussion.
>>> 
>>> 
>>> Another benefit is to relieve reviewers’ workload. In the PR approach, we can
>>> leverage the PR build step to perform an initial qualification. The
>> actual
>>> review can be deferred until the PR build passes. So reviewers are sure
>>> that the change is good at a certain level, i.e. it builds and the tests
>> can
>>> pass. Right now, contributors volunteer for providing the link to CI test
>>> (however, one still needs to open the link to see the result).
>>> 
 On Wed, Jan 22, 2020 at 3:16 PM David Capwell 
>> wrote:
 
 Thanks for the links Benedict!
 
 Been reading the links and see the following points being made
 
 *) enabling the spark process would lower the process to enter the
>> project
 *) high level discussions should be in JIRA [1]
 *) not desirable to annotation JIRA and Github; should only annotate
>> JIRA
 (reviewer, labels, etc.)
 *) given the multi branch nature, pull requests are not intuitive [2]
 *) merging is problematic and should keep the current merge process
 *) commits@ is not usable with PRs
 *) commits@ is better because of PRs
 *) people are more willing to nit-pick with PRs, less likely with
>> current
 process [3]
 *) opens potential to "prevent commits that don't pass the tests" [4]
 *) prefer the current process
 http://cassandra.apache.org/doc/latest/development/patches.html [5]
 *) current process is annoying since you have to take the link in github
 and attach to JIRA for each comment in review
 *) missed notifications, more trust in commits@
 *) if someone rewrites history, comments could be hard to see
 *) its better to leave comments in the source code so 

Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-23 Thread Mick Semb Wever


> My mind set is that by switching to PRs (even if all the conversations are
> in JIRA) we can setup automation which helps detect issues before merging.


CircleCI can build github forked branches. AFAIK there's no reason to open a PR.

The finer granularity of code review comments can also be on the github commit 
pages; again you don't need to open the PR (though you will lose the 
discussion thread across commits, but I find it easy enough to just reference the 
previous comment to create the continuity).

PRs are also a bit of a headache to close: if you forget to put the "closes 
…" text into the commit message, then only the author can close them. (I'm not 
sure if this situation has been improved or not?)

regards,
Mick

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org