Re: Wrapping up tick-tock

2017-01-17 Thread Stefan Podkowinski
Unfortunately it's hard to find a definition for "stable" in the context of new
Cassandra releases. It depends a lot on the features used and the use cases. New
features will likely contain bugs, while the core features should remain
stable if there hasn't been a major rewrite or refactoring like in 3.0.

If you talk to long-term Cassandra users, most of them will form their
opinion on release stability based on how long the release branch has been
available and how long it has received bug fixes, and also based on Jira
activity and known issues. New users will probably just start with the latest
version, but I still think they understand that newer releases will be
less stable than releases that have been maintained for a while. That's why I
find it less important to rubber-stamp a release as "stable", and more
important to give a clear picture of the first .0 release date, the EOL date
and the included features, and let users pick their poison, as it's a
trade-off after all. We could, however, list new features as "beta" there, if
that would help users decide whether their next mission-critical application
should be based on MV or CDC (just to give an example).



On Sat, Jan 14, 2017 at 7:07 PM, Anuj Wadehra <anujw_2...@yahoo.co.in.invalid> wrote:

> Hi,
> Now that we are rethinking versioning and release frequency, there exists
> an opportunity to make life easier for Cassandra users.
> How often are mailing lists discussing:
> "Which Cassandra version is stable for production?" OR "Is x version stable?"
> Your release version should indicate your confidence in the stability of
> the release: is it a bug fix or a feature release, and are there any breaking
> changes or not.
>
> +1 semver and alpha/beta/GA releases
> So that you don't find every second Cassandra user asking about the latest
> stable Cassandra version.
> Thanks
> Anuj
>
>   On Sat, 14 Jan, 2017 at 1:04 AM, Jeff Jirsa <jji...@gmail.com> wrote:
>  Mick proposed it (semver) in one of the release proposals, and I dropped
> the ball on sending out the actual "vote on which release plan we want to
> use" email, because I messed up and got busy.
>
>
>
> On Fri, Jan 13, 2017 at 11:26 AM, Russell Bradberry <rbradbe...@gmail.com>
> wrote:
>
> > Has any thought been given to SemVer?
> >
> > http://semver.org/
> >
> > -Russ
> >
> > On 1/13/17, 1:57 PM, "Jason Brown" <jasedbr...@gmail.com> wrote:
> >
> > It's fine to limit the minimum time between major releases to six months,
> > but I do not think we should force a major just because n months have
> > passed. I think we should up the major only when we have significant
> > (possibly breaking) changes/features. It would seem odd to have a 6.0
> > that's basically the same as 4.0 (in terms of features and protocol/format
> > compatibility).
> >
> > Thoughts?
> >
> > On Wed, Jan 11, 2017 at 1:58 AM, Stefan Podkowinski <spo...@gmail.com> wrote:
> >
> >> I honestly don't understand the release cadence discussion. The 3.x branch
> >> is far from production ready. Is this really the time to plan the next
> >> major feature releases on top of it, instead of focusing on stabilizing 3.x
> >> first? Who knows how long that would take, even if everyone would
> >> exclusively work on bug fixing (which I think should happen).
> >>
> >> On Tue, Jan 10, 2017 at 4:29 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:
> >>
> >> > I don't see why it has to be one extreme (yearly) or another (monthly).
> >> > When you had originally proposed Tick Tock, you wrote:
> >> >
> >> > "The primary goal is to improve release quality.  Our current major “dot
> >> > zero” releases require another five or six months to make them stable
> >> > enough for production.  This is directly related to how we pile features in
> >> > for 9 to 12 months and release all at once.  The interactions between the
> >> > new features are complex and not always obvious.  2.1 was no exception,
> >> > despite DataStax hiring a full time test engineering team specifically for
> >> > Apache Cassandra."
> >> >
> >> > I agreed with you at the time that the yearly cycle was too long to be
> >> > adding features before cutting a release, and still do now.
> > 

Re: CASSANDRA-10993 Approaches

2016-08-18 Thread Stefan Podkowinski
From my perspective, one of the most important reasons for RxJava would be
the strategic option to integrate reactive streams [1] into the overall
Cassandra architecture at some point in the future. Reactive streams would
allow us to design back pressure fundamentally differently compared to what
we do in the current work-queue based execution model. Think about the
optimizations currently deployed to walk a thin line between throughput,
latency and GC pressure. Or about the lack of coordination between individual
processes such as compactions, streaming and client requests that will
affect each other; where we can just hope that clients back off due to
latency-aware policies, that streams will eventually time out, or that
compactions hopefully get enough work done at some point. We even have
to tell people to tune batch sizes to not overwhelm nodes in the cluster.
Squeezing out n% during performance tests is nice, but IMO 10993 should
also address how to get more control over system resource usage, and a
reactive stream based approach could help with that.

[1] https://github.com/ReactiveX/RxJava/wiki/Reactive-Streams
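
To make the back pressure point concrete, here's a minimal sketch of the
demand-based flow control the reactive streams spec defines (this assumes
the org.reactivestreams interfaces on the classpath; the class and method
names are illustrative only, not actual Cassandra code):

import org.reactivestreams.Subscriber;
import org.reactivestreams.Subscription;

// A subscriber only ever receives as many items as it has requested, so a
// slow consumer implicitly throttles its producer instead of letting an
// unbounded work queue pile up.
class ThrottledConsumer implements Subscriber<Object> {
    private Subscription subscription;

    @Override public void onSubscribe(Subscription s) {
        subscription = s;
        s.request(1); // initial demand: a single item
    }

    @Override public void onNext(Object item) {
        process(item);            // hypothetical processing step
        subscription.request(1);  // signal new demand only once there is capacity
    }

    @Override public void onError(Throwable t) { /* propagate the failure */ }
    @Override public void onComplete() { /* stream finished */ }

    private void process(Object item) { /* ... */ }
}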


On Wed, Aug 17, 2016 at 9:54 PM, Jake Luciani  wrote:

> I think I outlined the tradeoffs I see between rolling our own vs. using a
> reactive framework in
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-10528
>
> My view is we should try to utilize the existing before we start writing
> our own. And even if we do write our own keep it reactive since reactive
> APIs are going to be adopted in the Java 9 spec.  There is an entire
> community out there thinking about asynchronous programming that we can tap
> into.
>
> I don't buy the argument (yet) that Rx or other libraries lack the control
> we need. In fact these APIs are quite extensible.
>
> On Aug 17, 2016 3:08 PM, "Tyler Hobbs"  wrote:
>
> > In the spirit of the recent thread about discussing large changes on the
> > Dev ML, I'd like to talk about CASSANDRA-10993, the first step in the
> > "thread per core" work.
> >
> > The goal of 10993 is to transform the read and write paths into an
> > event-driven model powered by event loops.  This means that each request
> > can be handled on a single thread (although typically broken up into
> > multiple steps, depending on I/O and locking) and the old mutation and read
> > thread pools can be removed.  So far, we've prototyped this with a couple
> > of approaches:
> >
> > The first approach models each request as a state machine (or composition
> > of state machines).  For example, a single write request is encapsulated in
> > a WriteTask object which moves through a series of states as portions of
> > the write complete (allocating a commitlog segment, syncing the commitlog,
> > receiving responses from remote replicas).  These state transitions are
> > triggered by Events that are emitted by, e.g., the
> > CommitlogSegmentManager.  The event loop that manages tasks, events,
> > timeouts, and scheduling is custom and is (currently) closely tied to a
> > Netty event loop.  Here are a couple of example classes to take a look at:
> >
> > WriteTask:
> > https://github.com/thobbs/cassandra/blob/CASSANDRA-10993-WIP/src/java/org/apache/cassandra/poc/WriteTask.java
> > EventLoop:
> > https://github.com/thobbs/cassandra/blob/CASSANDRA-10993-WIP/src/java/org/apache/cassandra/poc/EventLoop.java
> >
> > The second approach utilizes RxJava and the Observable pattern.  Where we
> > would wait for emitted events in the state machine approach, we instead
> > depend on an Observable to "push" the data/result we're awaiting.
> > Scheduling is handled by an Rx scheduler (which is customizable).  The code
> > changes required for this are, overall, less intrusive.  Here's a quick
> > example of what this looks like for high-level operations:
> > https://github.com/thobbs/cassandra/blob/rxjava-rebase/src/java/org/apache/cassandra/service/StorageProxy.java#L1724-L1732.
> >
> > So far we've benchmarked both approaches on in-memory reads to get an idea
> > of the upper-bound performance of both approaches.  Throughput appears to
> > be very similar with both branches.
> >
> > There are a few considerations up for debate as to which approach we should
> > go with that I would appreciate input on.
> >
> > First, performance.  There are concerns that going with Rx (or something
> > similar) may limit the peak performance we can eventually attain in a
> > couple of ways.  First, we don't have as much control over the event loop,
> > scheduling, and chunking of tasks.  With the state machine approach, we're
> > writing all of this, so it's totally under our control.  With Rx, a lot of
> > things are customizable or already have decent tools, but this may come up
> > short in critical ways.  Second, the overhead of the Observable machinery
> > may become significant as other bottlenecks are removed.  Of course,
> > WriteTask 

Re: Rough roadmap for 4.0

2016-11-04 Thread Stefan Podkowinski
There has been a lot of discussions about diversity and getting new
contributors and I think this aspect should be kept in mind as well when
talking about a roadmap, additionally to the listed tickets that are
already in the pipeline. What can inspiring developers contribute to 4.0
that would move the project forward to it’s goals and would be very likely
included in the final release? What should people work on that would not be
left ignored, because there’s no need for it or no time to really take care
of it?

The same applies to reviewing tickets, and I’m afraid the situation could get
even worse with the recent organisational changes. If there are no goals on
what should be included in 4.0, reviewers will probably just pick what they
personally find relevant or related to what they already worked on. Other
work will just be left ignored, and I think that’s the worst thing to do
while trying to build up a bigger developer community. Each contribution
deserves some kind of response, even if it’s just a “not relevant for the
next release, will look into it another time” type of reply. Having clear
goals or a certain theme for the release should make it easier to decide
what to review and what to decline. Does that make sense?

On Fri, Nov 4, 2016 at 3:47 AM, Nate McCall  wrote:

> It was brought up recently at the PMC level that our goals as a
> project are not terribly clear.
>
> This is a pretty good point as outside of Jira 'Fix Version' labelling
> (which we actually suck less at compared to a lot of other ASF
> projects) this really isn't tracked anywhere outside of general tribal
> knowledge about who is working on what.
>
> I would like to see us change this for two reasons:
> - it's important we are clear with our community about where we are going
> - we need to start working more closely together
>
> To that end, i've put together a list (in no particular order) of the
> *major* features in which I know folks are interested, have patches
> coming, are awaiting design review, etc.:
>
> - CASSANDRA-9425 Immutable node-local schema
> - CASSANDRA-10699 Strongly consistent schema alterations
> - CASSANDRA-12229 NIO streaming
> - CASSANDRA-8457 NIO messaging
> - CASSANDRA-12345 Gossip 2.0
> - CASSANDRA-9754 Birch trees
>
> What did I miss? What else would folks like to see? Specifically, this
> should be "new stuff that could/will break things" given we are upping
> the major version.
>
> To be clear, it's not my intention to set this in stone and then beat
> people about the head with it. More to have it there to point it at a
> high level and foster better communication with our users from the
> perspective of an open source project.
>
> Please keep in mind that given everything else going on, I think it's
> a fantastic idea to keep this list small and spend some time focusing
> on stability particularly as we transition to a new release process.
>
> -Nate
>


Re: Rough roadmap for 4.0

2016-11-16 Thread Stefan Podkowinski
From my understanding, this will also affect EOL dates of other branches:

"We will maintain the 2.2 stability series until 4.0 is released, and 3.0
for six months after that."


On Wed, Nov 16, 2016 at 5:34 AM, Nate McCall  wrote:

> Agreed. As long as we have a goal I don't see why we have to adhere to
> arbitrary date for 4.0.
>
> On Nov 16, 2016 1:45 PM, "Aleksey Yeschenko"  wrote:
>
> > I’ll comment on the broader issue, but right now I want to elaborate on
> > 3.11/January/arbitrary cutoff date.
> >
> > Doesn’t matter what the original plan was. We should continue with 3.X
> > until all the 4.0 blockers have been
> > committed - and there are quite a few of them remaining yet.
> >
> > So given all the holidays, and the tickets remaining, I’ll personally be
> > surprised if 4.0 comes out before
> > February/March and 3.13/3.14. Nor do I think it’s an issue.
> >
> > —
> > AY
> >
> > On 16 November 2016 at 00:39:03, Mick Semb Wever (m...@thelastpickle.com) wrote:
> >
> > On 4 November 2016 at 13:47, Nate McCall  wrote:
> >
> > > Specifically, this should be "new stuff that could/will break things"
> > > given we are upping
> > > the major version.
> > >
> >
> >
> > How does this co-ordinate with the tick-tock versioning¹ leading up to the
> > 4.0 release?
> >
> > To just stop tick-tock and then say yeehaa let's jam in all the breaking
> > changes we really want seems to be throwing away some of the learnt wisdom,
> > and not doing a very sane transition from tick-tock to
> > features/testing/stable². I really hope all this is done in a way that
> > continues us down the path towards a stable-master.
> >
> > For example, are we fixing the release of 4.0 to November? or continuing
> > tick-tocks until we complete the 4.0 roadmap? or starting the
> > features/testing/stable branching approach with 3.11?
> >
> >
> > Background:
> > ¹) Sylvain wrote in an earlier thread titled "A Home for 4.0"
> >
> > > And as 4.0 was initially supposed to come after 3.11, which is coming,
> > > it's probably time to have a home for those tickets.
> >
> > ²) The new versioning scheme slated for 4.0, per the "Proposal - 3.5.1"
> > thread
> >
> > > three branch plan with “features”, “testing”, and “stable” starting with
> > > 4.0?
> >
> >
> > Mick
> >
>


Re: Proposals for releases - 4.0 and beyond

2016-11-19 Thread Stefan Podkowinski
I’d like to suggest an option similar to what Jeremiah described, one that
would basically follow the Ubuntu LTS release model [1], but with shorter
time periods. The idea would be to do a stable release every 6 months with
one year of bug-fix support. At the same time, every third stable release
would serve as an LTS release and would be supported for 2 years.

Have a look at the following gist for illustration:
https://gist.github.com/spodkowinski/b9659169c73de3231f99bd17f74f5d1f

As you can see, although the support periods are relatively long, only 3
releases must be supported at the same time, which should be comparable to
what is done now.
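
For readers who don't follow the gist link, here's a rough sketch of how
that schedule plays out (all dates and labels are purely illustrative):

  Jan 2017  release A (LTS)  supported until Jan 2019
  Jul 2017  release B        supported until Jul 2018
  Jan 2018  release C        supported until Jan 2019
  Jul 2018  release D (LTS)  supported until Jul 2020

At any given moment, no more than three of these overlap in support.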

At the same time, we would also keep doing monthly releases, but they would
only serve as milestones for the next stable release. Call them “dev”, “beta”,
“testing” or whatever you like. Users would be able to start developing against
those dev releases and deploy to production with the next standard or LTS
release, once development is finished. Another option for users would be
to start a project with a standard release and later settle down on an LTS
release for maintenance only. It's pretty flexible from a user perspective,
easy to understand and not too much effort to implement from the
development side.

On Sat, Nov 19, 2016 at 12:49 AM, Jeff Jirsa 
wrote:

> With 3.10 voting in progress (take 3), 3.11 in December/January
> (probably?), we should solidify the plan for 4.0.
>
> I went through the archives and found a number of proposals. We (PMC) also
> had a very brief chat in private to make sure we hadn’t missed any, and
> here are the proposals that we’ve seen suggested.
>
> Option #1: Jon proposed [1] a feature release every 3 months and bugfixes
> for 6 months after that.
> Option #2: Mick proposed [2] bimonthly feature, semver, labelling release
> with stability/quality during voting, 3 GA branches at a time.
> Option #3: Sylvain proposed [3] feature / testing / stable branches, Y
> cadence for releases, X month rotation from feature -> testing -> stable ->
> EOL (X to be determined). This is similar to an Ubuntu/Debian like release
> schedule – I asked Sylvain for an example just to make sure I understood
> it, and I’ve copied that to github at [4].
> Option #4: Jeremiah proposed [5] keeping monthly cadence, and every 12
> months break off X.0.Y which becomes LTS (same as 3.0.x now). This
> explicitly excludes alternating tick/tock feature/bugfix for the monthly
> cadence on the newest/feature/4.x branch.
> Option #5: Jason proposed a revision to Jeremiah’s proposal such that
> releases to the LTS branches are NOT tied to a monthly cadence, but are
> released “as needed”, and the LTS branches are also “as needed”, not tied
> to a fixed (annual/semi-annual/etc) schedule.
>
> Please use this thread as an opportunity to discuss these proposals or
> feel free to make your own proposals. I think it makes sense to treat this
> like a nomination phase of an election – let’s allow at least 72 hours for
> submitting and discussing proposals, and then we’ll open a vote after that.
>
> - Jeff
>
> [1]: https://lists.apache.org/thread.html/0b2ca82eb8c1235a4e44a406080729be78fb539e1c0cbca638cfff52@%3Cdev.cassandra.apache.org%3E
> [2]: https://lists.apache.org/thread.html/674ef1c02997041af4b8950023b07b2f48bce3b197010ef7d7088662@%3Cdev.cassandra.apache.org%3E
> [3]: https://lists.apache.org/thread.html/fcc4180b7872be4db86eae12b538eef34c77dcdb5b13987235c8f2bd@%3Cdev.cassandra.apache.org%3E
> [4]: https://gist.github.com/jeffjirsa/9bee187246ca045689c52ce9caed47bf
> [5]: https://lists.apache.org/thread.html/0a3372b2f2b30fbeac04f7d5a214b203b18f3d69223e7ec9efb64776@%3Cdev.cassandra.apache.org%3E
>
>
>
>
>


Re: Wrapping up tick-tock

2017-01-11 Thread Stefan Podkowinski
I honestly don't understand the release cadence discussion. The 3.x branch
is far from production ready. Is this really the time to plan the next
major feature releases on top of it, instead of focusing on stabilizing 3.x
first? Who knows how long that would take, even if everyone would
exclusively work on bug fixing (which I think should happen).

On Tue, Jan 10, 2017 at 4:29 PM, Jonathan Haddad  wrote:

> I don't see why it has to be one extreme (yearly) or another (monthly).
> When you had originally proposed Tick Tock, you wrote:
>
> "The primary goal is to improve release quality.  Our current major “dot
> zero” releases require another five or six months to make them stable
> enough for production.  This is directly related to how we pile features in
> for 9 to 12 months and release all at once.  The interactions between the
> new features are complex and not always obvious.  2.1 was no exception,
> despite DataStax hiring a full time test engineering team specifically for
> Apache Cassandra."
>
> I agreed with you at the time that the yearly cycle was too long to be
> adding features before cutting a release, and still do now.  Instead of
> elastic banding all the way back to a process which wasn't working before,
> why not try somewhere in the middle?  A release every 6 months (with
> monthly bug fixes for a year) gives:
>
> 1. long enough time to stabilize (1 year vs 1 month)
> 2. not so long things sit around untested forever
> 3. only 2 releases (current and previous) to do bug fix support at any
> given time.
>
> Jon
>
> On Tue, Jan 10, 2017 at 6:56 AM Jonathan Ellis  wrote:
>
> > Hi all,
> >
> > We’ve had a few threads now about the successes and failures of the
> > tick-tock release process and what to do to replace it, but they all died
> > out without reaching a robust consensus.
> >
> > In those threads we saw several reasonable options proposed, but from my
> > perspective they all operated in a kind of theoretical fantasy land of
> > testing and development resources.  In particular, it takes around a
> > person-week of effort to verify that a release is ready.  That is, going
> > through all the test suites, inspecting and re-running failing tests to see
> > if there is a product problem or a flaky test.
> >
> > (I agree that in a perfect world this wouldn’t be necessary because your
> > test CI is always green, but see my previous framing of the perfect world
> > as a fantasy land.  It’s also worth noting that this is a common problem
> > for large OSS projects, not necessarily something to beat ourselves up
> > over, but in any case, that's our reality right now.)
> >
> > I submit that any process that assumes a monthly release cadence is not
> > realistic from a resourcing standpoint for this validation.  Notably, we
> > have struggled to marshal this for 3.10 for two months now.
> >
> > Therefore, I suggest first that we collectively roll up our sleeves to vet
> > 3.10 as the last tick-tock release.  Stick a fork in it, it’s done.  No
> > more tick-tock.
> >
> > I further suggest that in place of tick tock we go back to our old model of
> > yearly-ish releases with as-needed bug fix releases on stable branches,
> > probably bi-monthly.  This amortizes the release validation problem over a
> > longer development period.  And of course we remain free to ramp back up to
> > the more rapid cadence envisioned by the other proposals if we increase our
> > pool of QA effort or we are able to eliminate flaky tests to the point
> > that a long validation process becomes unnecessary.
> >
> > (While a longer dev period could mean a correspondingly more painful test
> > validation process at the end, my experience is that most of the validation
> > cost is “fixed” in the form of flaky tests and thus does not increase
> > proportionally to development time.)
> >
> > Thoughts?
> >
> > --
> > Jonathan Ellis
> > co-founder, http://www.datastax.com
> > @spyced
> >
>


Documentation contributors guide

2017-03-17 Thread Stefan Podkowinski
There's recently been a discussion about the wiki and how we should
continue to work on the documentation in general. One of my suggestions
was to start giving users a clearer guideline on how they can contribute
to our documentation, before having a technical discussion around tools
and wikis again.

I've now created a first version of such a guide, which can be found here:
https://github.com/spodkowinski/cassandra/blob/docs_gettingstarted/doc/source/development/documentation.rst

As you can see, there's a large part about using GitHub for editing on
the page. I'd like to know what you think about that and whether you'd
agree to accept PRs for such purposes.

I'd also like to add another section for committers that describes the
required steps to actually publish the latest trunk to our website. I
know that svn has been mentioned somewhere, but I would appreciate it if
someone either added that section or just shared some details in this
thread.

Cheers!



Re: Documentation contributors guide

2017-03-17 Thread Stefan Podkowinski
I don't see how that would be harder compared to merging a patch
attached to a jira ticket. If you wanted to merge my PR, you'd just have
to do something like this:

# download the pull request as a patch file
curl -o docs.patch \
  https://github.com/apache/cassandra/compare/trunk...spodkowinski:docs_gettingstarted.patch
# apply the patch, preserving authorship
git am docs.patch
# squash everything into a single commit on top of trunk
git reset --soft origin/trunk
git commit  # add a proper commit message and a "Merges #" text to automatically close the PR



On 03/17/2017 09:03 PM, Jeff Jirsa wrote:
> 
> 
> On 2017-03-17 12:33 (-0700), Stefan Podkowinski <s...@apache.org> wrote: 
> 
>> As you can see there's a large part about using GitHub for editing on
>> the page. I'd like to know what you think about that and if you'd agree
>> to accept PRs for such purposes.
>>
> 
> The challenge of github PRs isn't that we don't want them, it's that we can't 
> merge them - the apache github repo is a read only mirror (the master is on 
> ASF infrastructure). 
> 
> Personally, I'd rather have a Github PR than no patch, but I'd much rather 
> have a JIRA patch than a Github PR, because ultimately the committer is going 
> to have to manually transform the Github PR into a .patch file and commit it 
> with a special commit message to close the Github PR (or hope that the 
> contributor closes it for us, because committers can't even close PRs at this 
> point). 
> 
>> I'd also like to add another section for committers that describes the
>> required steps to actually publish the latest trunk to our website. I
>> know that svn has been mentioned somewhere, but I would appreciate if
>> someone either adds that section or just shares some details in this thread.
> 
> The repo is at https://svn.apache.org/repos/asf/cassandra/ - there's a doc at 
> https://svn.apache.org/repos/asf/cassandra/site/src/README that describes it. 
> 


Re: Documentation contributors guide

2017-03-20 Thread Stefan Podkowinski
As nobody seems to object, you can now find a patch for the proposed
document in the corresponding Jira ticket:
https://issues.apache.org/jira/browse/CASSANDRA-13256



On 03/17/2017 08:33 PM, Stefan Podkowinski wrote:
> There's recently been a discussion about the wiki and how we should
> continue to work on the documentation in general. One of my suggestions
> was to start giving users a clearer guideline on how they can contribute
> to our documentation, before having a technical discussion around tools
> and wikis again.
> 
> I've now created a first version of such a guide, which can be found here:
> https://github.com/spodkowinski/cassandra/blob/docs_gettingstarted/doc/source/development/documentation.rst
> 
> As you can see, there's a large part about using GitHub for editing on
> the page. I'd like to know what you think about that and whether you'd
> agree to accept PRs for such purposes.
> 
> I'd also like to add another section for committers that describes the
> required steps to actually publish the latest trunk to our website. I
> know that svn has been mentioned somewhere, but I would appreciate it if
> someone either added that section or just shared some details in this thread.
> 
> Cheers!
> 


Re: Testing and jira tickets

2017-03-16 Thread Stefan Podkowinski
Yes, failed test results need to be looked at by someone. But this is
already the case and won't change, no matter whether we run tests for each
patch and branch, or just once a day for a single dev branch. Having to
figure out exactly which commit causes a regression would take some
additional effort, but I don't think that would be the hardest part of
dealing with failed test results. I'd be happy to discuss other options,
but I'm pretty sure all of them will come with a price and we eventually
have to agree on something.


On 03/10/2017 03:43 PM, Josh McKenzie wrote:
>> I think we'd be able to figure out which one of them caused a regression
>> the day after.
> That sounds great in theory. In practice, that doesn't happen unless one
> person steps up and makes themselves accountable for it.
>
> For reference, take a look at: https://cassci.datastax.com/view/trunk/, and
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20cassandra%20and%20resolution%20%3D%20unresolved%20and%20labels%20in%20(%27test-fail%27%2C%20%27test-failure%27%2C%20%27testall%27%2C%20%27dtest%27%2C%20%27unit-test%27%2C%20%27unittest%27)%20and%20assignee%20%3D%20null%20order%20by%20created%20ASC
>
> We're thankfully still in a place where these tickets are at least being
> created, but unless there's a body of people that are digging in to fix
> those test failures they're just going to keep growing.
>
> On Fri, Mar 10, 2017 at 5:03 AM, Stefan Podkowinski <s...@apache.org> wrote:
>
>> If I remember correctly, the requirement of providing test results along
>> with each patch came from tick-tock, where the goal was to have
>> stable release branches at all times. Without CI for testing each
>> individual commit on all branches, this just won't work anymore. But
>> would that really be that bad? Can't we just get away with a single CI
>> run per branch and day?
>>
>> E.g. in the future we could commit to dev branches that are used to run
>> all tests automatically on Apache CI on a daily basis, and which are then
>> exclusively used for that. We don't have that many commits on a single
>> day, some of them rather trivial, and I think we'd be able to figure out
>> which one of them caused a regression the day after. If all tests
>> pass, we can merge dev manually, or even better, automatically. If anyone
>> wants to run tests on their own CI before committing to dev, that's fine
>> too and will help analyzing any regressions if they happen, as we then
>> don't have to look at those patches (and all earlier commits on dev).
>>
>>
>>
>> On 09.03.2017 19:51, Jason Brown wrote:
>>> Hey all,
>>>
>>> A nice convention we've stumbled into wrt patches submitted via Jira is
>>> to post the results of unit test and dtest runs to the ticket (to show the
>>> patch doesn't break things). Many contributors have used the
>>> DataStax-provided cassci system, but that's not the best long term
>>> solution. To that end, I'd like to start a conversation about what is the
>>> best way to proceed going forward, and then add it to the "How to
>>> contribute" docs.
>>>
>>> As an example, should contributors/committers run dtests and unit tests on
>>> *some* machine (publicly available or otherwise), and then post those
>>> results to the ticket? This could be a link to a build system, like what we
>>> have with cassci, or just upload the output of the test run(s).
>>>
>>> I don't have any fixed notions, and am looking forward to hearing other's
>>> ideas.
>>>
>>> Thanks,
>>>
>>> -Jason
>>>
>>> p.s. a big thank you to DataStax for providing the cassci system
>>>



Re: Testing and jira tickets

2017-03-10 Thread Stefan Podkowinski
If I remember correctly, the requirement of providing test results along
with each patch came from tick-tock, where the goal was to have
stable release branches at all times. Without CI for testing each
individual commit on all branches, this just won't work anymore. But
would that really be that bad? Can't we just get away with a single CI
run per branch and day?

E.g. in the future we could commit to dev branches that are used to run
all tests automatically on Apache CI on a daily basis, and which are then
exclusively used for that. We don't have that many commits on a single
day, some of them rather trivial, and I think we'd be able to figure out
which one of them caused a regression the day after. If all tests
pass, we can merge dev manually, or even better, automatically. If anyone
wants to run tests on their own CI before committing to dev, that's fine
too and will help analyzing any regressions if they happen, as we then
don't have to look at those patches (and all earlier commits on dev).



On 09.03.2017 19:51, Jason Brown wrote:
> Hey all,
> 
> A nice convention we've stumbled into wrt patches submitted via Jira is
> to post the results of unit test and dtest runs to the ticket (to show the
> patch doesn't break things). Many contributors have used the
> DataStax-provided cassci system, but that's not the best long term
> solution. To that end, I'd like to start a conversation about what is the
> best way to proceed going forward, and then add it to the "How to
> contribute" docs.
> 
> As an example, should contributors/committers run dtests and unit tests on
> *some* machine (publicly available or otherwise), and then post those
> results to the ticket? This could be a link to a build system, like what we
> have with cassci, or just upload the output of the test run(s).
> 
> I don't have any fixed notions, and am looking forward to hearing other's
> ideas.
> 
> Thanks,
> 
> -Jason
> 
> p.s. a big thank you to DataStax for providing the cassci system
> 


Re: Contribute to the Cassandra wiki

2017-03-13 Thread Stefan Podkowinski
Agreed. Let's not give up on this so quickly. My suggestion is to at
least provide a getting started guide for writing docs, before
complaining about too few contributions. I'll try to draft something up
this week.

What people are probably not aware of is how easy it is to contribute
docs through GitHub. Just clone our repo, create a document and add your
content. It's all possible through the GitHub web UI, including
reStructuredText support in the viewer/editor. I'd even suggest lowering
the barrier for contributing docs even further by accepting pull
requests for them, so we can have a fully GitHub based workflow for
casual contributors.


On 03/13/2017 05:55 PM, Jonathan Haddad wrote:
> Ugh... Let's put a few facts out in the open before we start pushing to
> move back to the wiki.
>
> First off, take a look at CASSANDRA-8700.  There's plenty of reasoning for
> why the docs are now located in tree.  The TL;DR is:
>
> 1. Nobody used the wiki.  Like, ever.  A handful of edits per year.
> 2. Docs in the wiki were out of sync w/ cassandra.  Trying to outline the
> difference in implementations w/ nuanced behavior was difficult /
> impossible.  With in-tree, you just check the docs that come w/ the version
> you installed.  And you get them locally.  Huzzah!
> 3. The in-tree docs are a million times better quality than the wiki *ever*
> was.
>
> I urge you to try giving the in-tree docs a chance.  It may not be the way
> *you* want it but I have to point out that they're the best we've seen in
> Cassandra world.  Making them prettier won't help anything.
>
> I do agree that the process needs to be a bit smoother for people to add
> stuff to the in-tree side.  For instance, maybe for every feature that's
> written we start creating a corresponding JIRA for the documentation.  Not
> every developer wants to write docs, and that's fair.  The accompanying
> JIRA would serve as a way for 2 or more people to collaborate on the
> feature & the docs in tandem.  It may also be beneficial to use the dev-ml
> to say "hey, i'm working on feature X, anyone want to help me write the
> docs for it?  check out CASSANDRA-XYZ"
>
> Part of CASSANDRA-8700 was to shut down the wiki.  I still advocate for
> this. At the very minimum we should make it read only with a big notice
> that points people to the in-tree docs.
>
> On Mon, Mar 13, 2017 at 8:49 AM Jeremy Hanna 
> wrote:
>
>> The moinmoin wiki was preferred but because of spam, images couldn’t be
>> attached.  The options were to use confluence or have a moderated list of
>> individuals be approved to update the wiki.  The decision was made to go
>> with the latter because of the preference to stick with moinmoin rather
>> than confluence.  That’s my understanding of the history there.  I don’t
>> know if people would like to revisit using one or the other at this point,
>> though it would take a bit of work to convert.
>>
>>> On Mar 13, 2017, at 9:42 AM, Nate McCall  wrote:
>>>
>>>> Isn't there a way to split tech docs (aka reference) and more
>>>> user-generated and use-case related/content oriented docs? And maybe to use
>>>> a more modern WIKI software or scheme. The CS wiki looks like 1998.
>>> The wiki is what ASF Infra provides by default. Agree that it is a bit
>>> "old-school."
>>>
>>> I'll ask around about what other projects are doing (or folks who are
>>> involved in other ASF projects, please chime in).
>>



Re: Official CS docs

2017-03-01 Thread Stefan Podkowinski
Hi Benjamin

I think the best way to catch up with the motivation behind this is by
reading the following dev post and linked jiras:

https://lists.apache.org/thread.html/029e1273675260630e4973ba301f71a8de5a9d7e294a7b2df6eed65f@%3Cdev.cassandra.apache.org%3E

What are your suggestions to improve the documentation? I think it's
fair to say that the official docs still leave a lot to be desired. But
wikis or any other publishing tools each have their own strengths and
drawbacks. Do you have any example project with a process that we should
follow instead? Did you have a look at the README file in the docs tree
and actually try to add or change any content? What would hold you back
from working from there and submitting a patch?



On 01.03.2017 11:10, benjamin roth wrote:
> Hi guys,
> 
> Is there a reason that the docs are part of the git repo?
> In my personal opinion this is very complicated and it puts the hurdle to
> contribute to docs very high.
> 
> There are so many questions on userlists that repeat over and over again
> and that could be put into a knowledge base.
> 
> But ...
> - Maintaining this in a repo is painful, complicated and slow.
> - I don't like to write docs that I can't preview instantly. I don't want
> to wait for a slow deployment process to see my result.
> - There are tons of solutions for agile and moderated document management
> like wikis or CMS.
> - Doc access is not bound to contribution access and can be handled in a
> more relaxed way.
> 
> One thing that supports my consideration is the fact that the official doc
> site is sparse and contains a lot of TODOs or "Under construction" entries.
> 
> IMHO Doc vs Source is like userlist vs devlist.
> 
> Any thoughts?
> 
> Cheers,
> Ben
> 


Re: [RESULT][VOTE] Ask Infra to move github notification emails to pr@

2017-07-28 Thread Stefan Podkowinski
Can we forward notifications for the new cassandra-dtest repo there as well?

On 24.03.2017 18:59, Jeff Jirsa wrote:
> With 6 binding +1s, 6 non-binding +1s, and no -1s of any kind, the vote 
> passes, I'll ask for a new mailing list and get this transitioned.
> 
> - Jeff
> 
> On 2017-03-20 15:32 (-0700), Jeff Jirsa  wrote: 
>> There's no reason for the dev list to get spammed every time there's a
>> github PR. We know most of the time we prefer JIRAs for real code PRs, but
>> with docs being in tree and low barrier to entry, we may want to accept
>> docs through PRs (see https://issues.apache.org/jira/browse/CASSANDRA-13256,
>> and comment on it if you disagree).
>>
>> To make that viable, we should make it not spam dev@ with every comment.
>> Therefore I propose we move github PR comments/actions to pr@ so as
>> not to clutter the dev@ list.
>>
>> Voting to remain open for 72 hours.
>>
>> - Jeff
>>




Re: sstabledump expects jna 5.1.0

2017-07-18 Thread Stefan Podkowinski
I haven't been able to reproduce this on Ubuntu or CentOS. Which OS do
you use? Did you install a pre-built package or a tarball?

On 18.07.2017 11:43, Micha wrote:
> Hello,
> 
> when calling sstabledump from cassandra 3.11 I get the error:
> 
> 
> "There is an incompatible JNA native library installed on this system
> Expected: 5.1.0
> Found: 4.0.0"
> 
> Maybe I overlooked something, but after searching I found the newest
> version to be 4.4, with 4.5 being the upcoming version.
> 
> My java version is 1.8.0_131, build 25.131-b11
> 
> Setting jna.nosys=true works however.
> 
> So, where does the required version 5.1 come from?
> 
> thanks,
>  Michael
> 
> 




CHANGES.txt

2017-07-18 Thread Stefan Podkowinski
Has there been any consensus in the past about what goes into
CHANGES.txt and what does not? My naive assumption was that the intended
audience for the file is users who want to know about changes between
new releases. With that in mind, I have occasionally skipped CHANGES.txt
for updates that have no relevance except for developers. For releases
such as 4.0, the list is already substantial and hard to digest. I'm
also wondering how informative entries such as "Remove unused
method" are for the general user. So my question is: should we add all
resolved tickets to CHANGES.txt, or should we try to keep the list less
verbose in the future?
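
For context, CHANGES.txt keeps one line per ticket under a version header;
a made-up excerpt to illustrate the signal-to-noise concern (ticket numbers
and the second entry are placeholders):

4.0
 * Remove unused method (CASSANDRA-...)           <- developer-facing only
 * Fix <some user-visible bug> (CASSANDRA-...)    <- relevant to users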





Re: CHANGES.txt

2017-07-21 Thread Stefan Podkowinski
Thanks for your responses! It seems like all of you prefer to have both
trivial and non-trivial updates in CHANGES.txt. I'm going to keep that in
mind, but will continue to omit entries for documentation edits.


On 18.07.2017 23:49, kurt greaves wrote:
> I agree that all patches should be added to changes.txt, just to rule out
> any ambiguities. When people look at Changes.txt it's usually to find
> something specific, not to browse the list of changes. Anything significant
> should make it into news.txt, which is more appropriate for users.
> changes.txt is more aimed at developers in my opinion.
> 
> On that note, messages like "Remove unused method" should be more specific,
> that's just a bad commit message in general, and doesn't give any context.
> 




Re: Guidelines on testing

2017-04-25 Thread Stefan Podkowinski
I don't see any reasons not to make this part of our guidelines. The
idea of having a list of what should be tested in each kind of test
makes sense. I also like the examples how to improve tests dealing with
global state.

Some of the integration test cases, such as "dry
start"/"restart"/"shutdown"/"upgrade", could use some further
description and how-to examples. Are there any existing tests we can
link for reference?

We also already have a testing related page in our documentation:
http://cassandra.apache.org/doc/latest/development/testing.html
Not sure if it would make sense to merge the two or create an additional
document.


On 24.04.2017 18:13, Blake Eggleston wrote:
> About a month ago, in the ‘Code quality, principles and rules’ thread, I’d 
> proposed adding some testing standards to the project in lieu of revisiting 
> the idea of removing singletons. The idea was that we could drive incremental 
> improvement of the test coverage and testability situation that could be 
> applied in day to day work. I’ve pushed a first draft to my repo here:
> 
> https://github.com/bdeggleston/cassandra/blob/testing-doc/TESTING.md
> 
> Please take a look and let me know what you think. With the blessing of the 
> pmc, I’d like this, or something like it, to be adopted as the reference for 
> contributors and reviewers when deciding if a contribution is properly tested.
> 
> Blake
> 


Re: New contribution - Burst Hour Compaction Strategy

2017-06-09 Thread Stefan Podkowinski
Hello Pedro

Thanks for being interested in contributing to Apache Cassandra.
Creating a new compaction strategy is not an easy task, and there are
several things you can do to make it easier for other developers
to understand what you're up to.

First of all, if using GitHub, changes to the code base should be done
in a separate branch of your own fork of the Apache repository.
This will make it possible for others to quickly compare your changes to
the current code base using the web interface. Technically, using a new
repo works as well, but it isn't as convenient for others, e.g. it starts
by not communicating which Cassandra branch was used as the basis for
your changes.

Talking about git, I'd also suggest learning more about creating a git
history for your code that is easy to review. E.g. you may want to
squash some of the "code clean up" style commits; see the sketch below.
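
For illustration, such a workflow could look like this (the user name and
intermediate steps are placeholders):

# work in a dedicated branch of your own fork of the Apache repository
git clone https://github.com/<your-user>/cassandra.git
cd cassandra
git remote add upstream https://github.com/apache/cassandra.git
git fetch upstream
git checkout -b CASSANDRA-12201 upstream/trunk  # makes the base branch explicit

# ...commit your work...

# before asking for review, clean up the history interactively,
# e.g. squash "code clean up" commits into their logical parents
git rebase -i upstream/trunk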

As mentioned, implementing a new compaction strategy is quite an effort,
and the theories and motivations behind it are at least as interesting
as the actual implementation. Therefore it could be a good idea to have a
design document describing your work on a different abstraction level.
It would also make it more likely to get other people involved in the
discussion, as not everyone will have to check the source code for the
details.

-Stefan


On 08.06.2017 09:31, Pedro Gordo wrote:
> Hi all
> 
> As part of my MSc project, I've done a new compaction strategy for
> Cassandra, called Burst Hour Compaction Strategy. You can find the JIRA
> ticket here: https://issues.apache.org/jira/browse/CASSANDRA-12201
> 
> In a nutshell, the background compaction for this strategy is only
> triggered during a predefined interval, freeing the resources during other
> times of the day. It also tries to make keys unique across all the
> SSTables, when these keys are present in more than a configurable
> number of tables. Please check the JIRA ticket for a full description.
> 
> The code can be found here: https://github.com/sedulam/CASSANDRA-12201
> 
> Please let me know what you think, or improvements that can be done (some
> ideas are in the ticket description). Since I'm new to Cassandra, I imagine
> that a lot of assumptions might not be the best, e.g. 100MB for the maximum
> table size.
> 
> I'm looking forward to working with this community!
> 
> All the best
> Pedro Gordo
> 




Re: Potential block issue for 3.0.13: schema version id mismatch while upgrading

2017-05-31 Thread Stefan Podkowinski

On 31.05.2017 09:34, Stefania Alborghetti wrote:
> There shouldn't be too many people with 3.0.13 in
> production however, since upgrading to 3.0.13 is broken in the first place.

Keep in mind that there are always people upgrading from 2.x, especially
since we had a couple of important bug fixes for 2.x -> 3.0 that people
might have been waiting for before taking the plunge. We should at least
be so kind as to let these users know about potential issues in NEWS.txt.

If I understand correctly, this will cause schema migration storms
across nodes for .13 -> .14. I'd rather see us pull 3.0.13 from
downloads now and take some more time to figure out what can be done for
users already running 3.0.13 in production, instead of rushing .14 and
risking throwing .13 users under the bus.

E.g. can we create a nodetool command to disable execution of schema
migration tasks, which can be run during cluster upgrades in cases like
this?






Status on new nodes for builds.apache.org

2017-06-02 Thread Stefan Podkowinski
Just a quick heads up for everyone interested in the job history at
builds.apache.org or who wants to run devbranch jobs there. A couple of
Jenkins nodes are not working correctly, which is causing jobs to abort
abnormally during start. You'll either have to rebuild until you hit a
working node, or wait until this issue has been resolved (follow
INFRA-14153 for that).

Btw, thanks to Instaclustr for donating much-needed resources! Those
nodes will be much appreciated once they are working :)




Re: Integrating vendor-specific code and developing plugins

2017-06-03 Thread Stefan Podkowinski
I'd suggest using the git docs for the new pages, so we can accept pull
requests for adding other plugins. [1]

We can also link there from the main pages. Maybe the community page
would be a good place for that.

[1] https://cassandra.apache.org/doc/latest/development/documentation.html

On 06/03/2017 02:28 AM, J. D. Jordan wrote:
> The site is in svn for the main pages.
> 
> https://svn.apache.org/repos/asf/cassandra/site/src/
> 
> And in git for the docs.
> https://github.com/apache/cassandra/tree/trunk/doc/source
> 
> For suggested changes make a JIRA with proposed changes.
> 
> -Jeremiah
> 
>> On Jun 2, 2017, at 5:36 PM, 大平怜  wrote:
>>
>> Hi all,
>>
>> As for our CAPI Flash enablement code, we are now working on the
>> plugin approach.  Once it is ready, we would like to propose changes
>> in some Web pages of http://cassandra.apache.org for better plugin
>> support.  I can't find any official process for proposing such changes,
>> but could anyone tell us who we should work with?
>>
>>
>> Thanks,
>> Rei Odaira
>>
>> 2017-05-19 16:56 GMT-05:00 大平怜 :
>>> Hi all,
>>>
>>> Everybody seems to agree with improving the plugin ecosystem (as well
>>> as the not-small amount of effort needed to do that), but about
>>> vendor-specific code integration, let me summarize the issues raised
>>> so far.
>>>
>>> 1) How to test it?  What if my code breaks the vendor-specific build?
>>> 2) How to maintain it?  Who is to maintain the code?
>>> 3) How does it affect the Cassandra release cycle?
>>> 4) How to remove it?  It might be hard to remove once integrated, from
>>> both technical and markting perspective.
>>>
>>> I think #3 and #4 are rather general issues for any newly proposed
>>> changes, while #1 and #2 are the most problematic for niche :-)
>>> platform specific code.  #1 is technically solvable, for example, as
>>> Jeff (thanks!) showed with the Jenkins slave at ASF and as we are
>>> trying to connect a ppc machine with a CAPI device to the CI.
>>>
>>> #2 must be socially solved, as a component/platform maintainer system
>>> should be introduced, like in some other Apache projects.  Is there any
>>> chance to have such a system in Cassandra?
>>>
>>>
>>> Thanks,
>>> Rei Odaira
>>>
>>> 2017-05-18 12:36 GMT-05:00 Jeff Jirsa :


> On Thu, May 18, 2017 at 10:28 AM, Jeff Jirsa  wrote:
>
>
>
> On Mon, May 15, 2017 at 5:25 PM, Jeremiah D Jordan  wrote:
>>
>>
>>
>> To me testable means that we can run the tests at the very least for
>> every release, but ideally they would be run more often than that.
>> Especially with the push to not release unless the test board is all
>> passing, we should not be releasing features that we don’t have a test 
>> board
>> for.  Ideally that means we have it in ASF CI.  If there is someone that 
>> can
>> commit to posting results of runs from an outside CI somewhere, then I 
>> think
>> that could work as well, but that gets pretty cumbersome if we have to 
>> check
>> 10 different CI dashboards at different locations before every release.
>
>
>
> It turns out there's a ppc64le jenkins slave @ asf, so I've setup
> https://builds.apache.org/view/A-D/view/Cassandra/job/cassandra-devbranch-ppc64le-testall/
> for testing.
>
> Like our other devbranch-testall builds, it takes a repo+branch as
> parameters, and runs unit tests. While the unit tests aren't passing, this
> platform should now be considered testable.
>

 (Platform != device, though, the CAPI device obviously isn't there, so the
 row cache implementation still doesn't have public testing)


> 




Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Stefan Podkowinski
Introducing feature flags for enabling or disabling different code paths
is not sustainable in the long run. It's hard enough to keep up with
integration testing with the couple of Jenkins jobs that we have.
Running jobs for all permutations of flags that we keep around would
turn out to be impractical. But if we don't, I'm pretty sure something will
fall off the radar, and it won't take long until someone reports that
enabling feature X after the latest upgrade simply doesn't work anymore.

There may also be some more subtle assumptions and cross-dependencies
between features, where disabling a feature (or parts of it) may cause
side effects, even if it's just e.g. a metric value that suddenly won't
get updated anymore but is used somewhere else. We'll also have to
consider migration paths for turning a feature on and off again without
causing any downtime. If I were to turn on e.g. MVs on a single node in
my cluster, this should not cause any issues on the other nodes
that still have MV code paths disabled. Again, this would need to be tested.

So to be clear, my point is that any flags should be implemented in a
really non-invasive way on the user-facing side only, e.g. by emitting a
log message or cqlsh error. At this point, I'm not really sure it
would be a good idea to add them to cassandra.yaml, as I'm pretty sure
that eventually they would be used to change the behaviour of our code,
besides printing a log message.


On 04.10.17 10:03, Mick Semb Wever wrote:
>>> CDC sounds like it is in the same basket, but it already has the
>>> `cdc_enabled` yaml flag which defaults false.
>> I went this route because I was incredibly wary of changing the CL
>> code and wanted to shield non-CDC users from any and all risk I
>> reasonably could.
>
> This approach so far is my favourite. (Thanks Josh.)
>
> The flag name `cdc_enabled` is simple and, without adjectives, does not
> imply "experimental" or "beta" or anything like that.
> It does make life easier for both operators and the C* developers.
>
> I'm also fond of how Apache projects often vote both on the release as well
> as its stability flag: Alpha|Beta|GA (General Availability).
> https://httpd.apache.org/dev/release.html
> http://www.apache.org/legal/release-policy.html#release-types
>
> Given the importance of The Database, I'd be keen to see attached such
> community-agreed quality references. And going further, not just to the
> releases but also to substantial new features (those yet to reach GA). Then
> the downloads page could provide a table something like
> https://paste.apache.org/FzrQ
>
> It's just one idea to throw out there, and while it hijacks the thread a
> bit, it could even with just the quality tag on releases go a long way with
> user trust. Especially if we really are humble about it and use GA
> appropriately. For example I'm perfectly happy using a beta in production
> if I see the community otherwise has good processes in place and there's
> strong testing and staging resources to take advantage of. And as Kurt has
> implied many users are indeed smart and wise enough to know how to safely
> test and cautiously use even alpha features in production.
>
> Anyway, with or without the above idea, yaml flag names that don't
> use adjectives could address Kurt's concerns about pulling the rug from
> under the feet of existing users. Such a flag is but a small improvement
> suitable for a minor release (you must read the NEWS.txt before even a
> patch upgrade), and the documentation is only making explicit what should
> have been all along. Users shouldn't feel that we're returning features
> into "alpha|beta" mode when what we're actually doing is improving the
> community's quality assurance documentation.
>
> Mick
>





CCM dependency in dtests

2017-11-27 Thread Stefan Podkowinski
Just wanted to bring a recent discussion about how to use ccm from
dtests to your attention:
https://github.com/apache/cassandra-dtest/pull/13

Basically the idea is to not depend on a released ccm artifact, but to
use a dedicated git branch in the ccm repo instead for executing dtests.
Motivation and details can be found in the PR, please feel free to comment.




Re: CCM dependency in dtests

2017-12-01 Thread Stefan Podkowinski
I did a test run on builds.apache.org yesterday and it looks like this
isn't causing any additional issues.

I've now merged the suggested change into dtest master.

Any update to a new ccm release, or any hot fix of the ccm dependency in
dtests, will now have to happen by updating the test branch in the ccm
repo (https://github.com/pcmanus/ccm/tree/cassandra-test). You can
simply open a PR for that. This will allow us to move ahead with ccm
fixes more quickly, without having to bug Philip each time to publish a
new release.
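
For anyone wondering how the pin works mechanically: pip can install a
dependency straight from a git branch, so a requirements entry along these
lines is all that's needed (a sketch; the actual dtest requirements file
may differ):

# install ccm from the dedicated test branch instead of a released artifact
git+https://github.com/pcmanus/ccm.git@cassandra-test#egg=ccm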


On 01.12.2017 00:06, Michael Kjellman wrote:
> Hey Stefan, any updates on this? Thanks.
> 
> best,
> kjellman
> 
>> On Nov 27, 2017, at 7:34 AM, Michael Kjellman <mkjell...@internalcircle.com> 
>> wrote:
>>
>> Thanks for driving this, Stefan. This is definitely an issue that I
>> recently saw too while trying to get all the dtests passing. Having
>> logic you need to fix in 3 repos isn't ideal at all.
>>
>>> On Nov 27, 2017, at 4:05 AM, Stefan Podkowinski <s...@apache.org> wrote:
>>>
>>> Just wanted to bring a recent discussion about how to use ccm from
>>> dtests to your attention:
>>> https://github.com/apache/cassandra-dtest/pull/13
>>>
>>> Basically the idea is to not depend on a released ccm artifact, but to
>>> use a dedicated git branch in the ccm repo instead for executing dtests.
>>> Motivation and details can be found in the PR, please feel free to comment.
>>>
>>>
>>
>>
> 
> 
> 




Re: Proposal: github pull requests for all code changes

2017-12-12 Thread Stefan Podkowinski
There's nothing that stops people from using github to discuss code
changes. Many jiras already link to gh branches that can be used to
review and comment on code. But it's not always the best place to do so.

The high-level discussion should always take place on Jira. Although I'd
have no problem with in-depth code review happening on gh, I'd hate
to see significant parts of the overall discussion spread across Jira
and different PRs related to the ticket.

The missing gh integration with Jira and the lack of administrative
permissions are another problem. If we close a Jira ticket, the
corresponding PRs will still stay open. We either have to ask the
contributor to close them or have an ever-growing number of open PRs.
There's also no way for us to label, assign or otherwise use PR-related
features, so I'm really wondering why it would make sense to use them
more heavily.


On 12.12.2017 09:02, Marcus Eriksson wrote:
> To be able to use the github code review UI and get closer CI integration,
> we should make it obligatory to submit github pull requests for all code
> changes.
> 
> The process would be:
> 1. Create or find a JIRA ticket
> 2. Submit GH pull request
>  - one PR per branch (one for 3.0, one for 3.11 etc)
>  - the PR title should mention the JIRA ticket so that the PR gets
> linked on the JIRA
> 3. Have it reviewed and approved on GH
> 4. Committer needs to do the manual work of getting the code committed to
> the apache repo like today
>- having "closes #123" in the commit message closes the pull request.
> Adding the same line to the merge commit messages should close the PRs for
> the different branches
> 
> Apache Spark does something similar and their guidelines are here:
> https://spark.apache.org/contributing.html
> 
> Anyone got any concerns or improvement suggestions to this?
> 
> /Marcus
> 




Re: [DISCUSS] Cassandra and future Java

2018-05-30 Thread Stefan Podkowinski
That's probably not far off what Robert suggested:

"The idea here is to default to Java 8, but the code also runs on 11"

"Initially, only the combination of C* 4.0 + Java 8 would be labeled as
"stable" and the combination of C* 4.0 + Java 11 as "experimental"."

If Robert wants to go ahead with making the code "also run on Java 11",
while we keep testing for and officially supporting Java 8, then I can't
really think of any argument against that, as long as we don't end up
with tons of version-based code toggles, or too risky changes regarding
stability in general. We would still release 4.0 for Java 8 and
afterwards officially switch to 11 in 4.1, based on the work done in
#9608 and already merged into 4.0. Downloads and packages for 4.0 would
be released for Java 8, while running 4.0 with Java 11 would produce a
warning message indicating that it's experimental.
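
To illustrate, such a startup check only takes a few lines. This is a
hypothetical sketch, not the actual #9608 patch:

    public final class JavaVersionWarning
    {
        public static void main(String[] args)
        {
            // "1.8" on Java 8; "9", "10", "11", ... since JEP 223 dropped the "1." prefix
            String spec = System.getProperty("java.specification.version");
            if (!spec.startsWith("1."))
                System.err.println("WARN: running Cassandra on Java " + spec
                                   + " is experimental; Java 8 remains the supported runtime");
        }
    }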


On 30.05.2018 08:54, kurt greaves wrote:
> So for anyone that missed it, Java 11 will be released in September 2018.
> 
> I'd prefer we target one Java release only. This is purely because we don't
> have the capacity or capability to test both releases. We hardly do a good
> enough job of testing as it is, and lumping another JVM into the mix is just
> going to complicate things a lot; all the effort we'd expend on testing
> both releases is probably better off just spent focusing on one.
> 
> At this point I don't think there is *much* value in supporting 11 for 4.0,
> seeing as we won't be able to fully utilise features in 11 as our feature
> freeze for 4.0 will occur before 11 is released. There is obviously the
> support problem but adoptOpenJDK are claiming they'll support Java 8 until
> September 2022 (https://adoptopenjdk.net/support.html) - which on top of
> all the existing releases is probably good enough for us, and 2022 is far
> enough away that hopefully 4.0 will be EOL'd by then. I don't think it's a
> big risk that support for Java 8 will stop anytime soon, it's pretty
> widespread and it's going to take people a *long* time to get off 8.
> 
> It would make much more sense to me to support 11 in 4.1 that way we can
> actually utilise any benefits of 11.
> 
> On 29 May 2018 at 12:22, Robert Stupp  wrote:
> 
>> Ideally, CI would run against both Java 8 and 11. I’ve no clue about b.a.o
>> though.
>>
>> There will definitely be a lot of smaller issues - both for OpenJDK 8 and
>> 11.
>> I think it’s sufficient to deal with the Linux distros' (RH/deb) openjdk
>> dependencies - just making sure that we’re using the right Java version -
>> and not letting the package manager just pull the newest available.
>> The version-string from adoptopenjdk for example is one of these “minor
>> issues"...
>>
>> —
>> Robert Stupp
>> @snazy
>>
>> On 28. May 2018, at 15:46, Stefan Podkowinski  wrote:
>>
>> The main issue that I see, for supporting both Java 8 + 11, is testing.
>> We should first decide how this would affect builds.apache.org, or how
>> we're going to do CI testing in general for that situation.
>>
>> There are probably also smaller issues that we're not aware of yet, such
>> as which Java dependency to use for our deb and rpm packages,
>> differences in Java distributions (Oracle, AdoptOpenJDK, Redhat,..) and
>> so on. I'd expect we could deal with this on the Java side, but the
>> infra, scripting and testing implications give me a greater headache
>> when thinking of it.
>>
>>
>> On 25.05.2018 15:33, J. D. Jordan wrote:
>>
>> +1 for “Option 3: both 8 + 11”; it shouldn’t be too hard to maintain
>> code-wise, and it leaves people’s options open.
>>
>> -Jeremiah
>>
>> On May 25, 2018, at 6:31 AM, Robert Stupp  wrote:
>>
>> I'd like to bring up the C*/Java discussion again. It's been a while since
>> we've discussed this.
>>
>> To me it sounds like there's still the question about which version(s) of
>> Java we want to support beginning with C* 4.0.
>>
>> I assume that it's legit (and probably very necessary) to assume that
>> OpenJDK is now (i.e. after Java 6) considered as "production ready" for C*.
>> The public (and legal and free) availability of Oracle's Java 8 will end in
>> January 2019 (unless you're using it privately on your desktop). Java 9 and
>> 10 are not a thing, as both will be EOL when the C* 4.0 branch is about to
>> be cut. The most recent available Java version will be 11, which is meant
>> to be publicly available from Oracle until March 2019 and should get LTS
>> support for OpenJDK 11 from major Linux distros (RHEL and derivates,
>> Ubuntu, Azul Zulu).
>>
>> (

Re: [DISCUSS] Cassandra and future Java

2018-05-28 Thread Stefan Podkowinski
The main issue that I see, for supporting both Java 8 + 11, is testing.
We should first decide how this would affect builds.apache.org, or how
we're going to do CI testing in general for that situation.

There are probably also smaller issues that we're not aware of yet, such
as which Java dependency to use for our deb and rpm packages,
differences in Java distributions (Oracle, AdoptOpenJDK, Redhat,..) and
so on. I'd expect we could deal with this on the Java side, but the
infra, scripting and testing implications give me a greater headache
when thinking of it.
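
To make the deb/rpm point a bit more concrete: the question is roughly
whether the package pins a specific JRE or accepts whatever the distro
considers the default. A hypothetical debian/control line (not our
actual packaging) that pins Java 8 could look like:

    Depends: openjdk-8-jre-headless | java8-runtime

whereas depending on default-jre-headless would let the package manager
pull in whatever Java version the distribution currently defaults to.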


On 25.05.2018 15:33, J. D. Jordan wrote:
> +1 for “Option 3: both 8 + 11”; it shouldn’t be too hard to maintain
> code-wise, and it leaves people’s options open.
> 
> -Jeremiah
> 
>> On May 25, 2018, at 6:31 AM, Robert Stupp  wrote:
>>
>> I'd like to bring up the C*/Java discussion again. It's been a while since 
>> we've discussed this.
>>
>> To me it sounds like there's still the question about which version(s) of 
>> Java we want to support beginning with C* 4.0.
>>
>> I assume that it's legit (and probably very necessary) to assume that 
>> OpenJDK is now (i.e. after Java 6) considered as "production ready" for C*. 
>> The public (and legal and free) availability of Oracle's Java 8 will end in 
>> January 2019 (unless you're using it privately on your desktop). Java 9 and 
>> 10 are not a thing, as both will be EOL when the C* 4.0 branch is about to 
>> be cut. The most recent available Java version will be 11, which is meant to 
>> be publicly available from Oracle until March 2019 and should get LTS 
>> support for OpenJDK 11 from major Linux distros (RHEL and derivates, Ubuntu, 
>> Azul Zulu).
>>
>> (Side note: adoptopenjdk is different here, because it does not include the 
>> patch version in the version banner (java.version=1.8.0-adoptopenjdk), so 
>> difficult to check the minimum patch version on startup of C*.)
>>
>> (Attn, rant: I'm not particularly happy with the new release and support 
>> model for Java, because developing something now that's about to release 
>> at the end of the year on a Java version that has not even reached feature-complete 
>> status, is, gently speaking, difficult. But sticking to an "antique" Java 
>> version (8) has its own risks as well.)
>>
>> I'm silently ignoring any Java release that's not aimed to get any 
>> LTS(-ish?) support from anybody - so only Java 8 + 11 remain.
>>
>> There are generally three (IMO legit) options here: only support Java 8, 
>> only support Java 11, support both Java 8 and Java 11. All three options 
>> have a bunch of pros and cons.
>>
>> Option 1, only Java 8: Probably the safest option. Considering the potential 
>> lifetimes of Java 8 and C* 4.0, even the most enthusiastic maintainers may 
>> stop backporting security or bug fixes to OpenJDK 8. It might not be an 
>> issue in practice, but if there's for example a severe issue in the SSL/TLS 
>> area and nobody fixes it in 8, well, good luck.
>>
>> Option 2, only Java 11: The option with the most risks IMO. Java 11 is not 
>> even feature complete, and there are a bunch of big projects that still may make 
>> it into 11 (think: Valhalla). There's no guarantee whether the C* code or 
>> any included library will actually work with Java 11 (think: if it works 
>> now, it may not work with the final Java version). However, it leaves the 
>> door wide open for all the neat and geeky things in Java 11.
>>
>> Option 3: both 8 + 11: The idea here is to default to Java 8, but the code 
>> also runs on 11. It leaves the option to benefit from optimizations that are 
>> only available on 11 while maintaining the known stability of 8. Initially, 
>> only the combination of C* 4.0 + Java 8 would be labeled as "stable" and the 
>> combination of C* 4.0 + Java 11 as "experimental". But it gives us time to 
>> "evaluate" 4.0 on 11. When we have enough experience with 11, C* on 11 can 
>> be labeled as "stable" as well. The downside of this "hybrid" is that it's 
>> a bit more difficult to introduce features that depend on 11.
>>
>> I think, 3) gives the best of both worlds: stability of 8 and an upgrade 
>> path to 11 in the future, that people can actually test with C* 4.0. Happy 
>> to make the patch for #9608 ready for option 3. But it would be great to get 
>> a consensus here for either option before we review #9608 and commit it.
>>
>> Another proposal, for both options 1+3: Raise the minimum supported version 
>> of 8 for C* 4.0 to something more recent than 8u40, which is quite 
>> stone-age. It could be 8u171 or whatever will be recent in autumn.
>>
>> Robert
>>
>> -- 
>> Robert Stupp
>> @snazy
>>
>>
>>
> 

Integrating cloud and 3rd party security solutions

2017-10-20 Thread Stefan Podkowinski
I've recently been looking into how we could improve security in
Cassandra by integrating external solutions. There are very interesting
projects out there, such as Vault[0], but also a growing list of
security-related APIs offered by cloud providers.

Today Cassandra can already be customized by using different
authenticators. We also have a really nice role-based access model. But
there are other parts of Cassandra that are simply painful to work with,
such as certificate management for SSL, or anything related to local
keystores. No one wants to deal with that. Wouldn't it be cool to have
automated, built-in certificate management instead? That's what got me
started working on CASSANDRA-13971.

Some cloud providers and solutions like Vault also offer key management
features that we could use for data-at-rest encryption. Same for
identity services and authentication.

I'm going to start working on some ideas[1] for how we could integrate
Vault for certificate management, data-at-rest encryption and
authentication. But I'd really like to see support for cloud platforms
as well. It would be great to hear some other opinions and suggestions
on that, especially from people who have already worked with e.g. AWS
KMS, the AWS certificate and identity managers, or related Google Cloud /
Azure services. Also, where can we improve to make Cassandra more secure
by default in general?

[0] https://www.vaultproject.io
[1]
https://docs.google.com/document/d/1D8Td_M9wG7_kD0za-AlM_e524cFj2VnbU3mSYpAkViQ/edit?usp=sharing
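
To give an idea of what "automated certificate management" could look
like against Vault: the sketch below fetches a freshly issued node
certificate over Vault's HTTP API. It assumes Vault's PKI secrets
engine is mounted at pki/ with a role named "cassandra" (both made up
for this example), and it leaves out JSON parsing and keystore handling:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public final class VaultCertIssueSketch
    {
        // Returns the raw JSON answer; data.certificate, data.private_key and
        // data.issuing_ca would have to be parsed out and turned into a keystore.
        public static String issueCert(String vaultAddr, String token, String commonName) throws IOException
        {
            URL url = new URL(vaultAddr + "/v1/pki/issue/cassandra");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("X-Vault-Token", token);
            conn.setDoOutput(true);
            try (OutputStream out = conn.getOutputStream())
            {
                out.write(("{\"common_name\": \"" + commonName + "\"}").getBytes("UTF-8"));
            }
            StringBuilder json = new StringBuilder();
            try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8")))
            {
                for (String line; (line = in.readLine()) != null; )
                    json.append(line);
            }
            return json.toString();
        }
    }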




Re: [Patch Available for Review!] CASSANDRA-14134: Migrate dtests to use pytest and python3

2018-01-02 Thread Stefan Podkowinski
I was giving this a try today with some mixed results. First of all,
running pytest locally would fail with a "ccmlib.common.ArgumentError:
Unknown log level NOTSET" error for each test, although I created a new
virtualenv for that as described in the readme (thanks for updating!)
and used both of your dtest and cassandra branches. But I haven't
patched ccm as described in the ticket, maybe that's why? Can you
publish a patched ccm branch to gh?

The updated circle.yml is now using docker, which seems to be a good
idea to reduce clutter in the yaml file and gives us more control over
the test environment. Can you add the Dockerfile to the .circleci
directory as well? I couldn't find it when I was trying to solve the
pytest error mentioned above.

Next thing I did was to push your trunk_circle branch to my gh repo to
start a circleCI run. Finishing all dtests in 15 minutes sounds
exciting, but requires a paid tier plan to get that kind of
parallelization. Looks like the dtests have even been deliberately
disabled for non-paid accounts, so I couldn't test this any further.

Running dtests from the pytest branch on builds.apache.org did not work
either. At least the run_dtests.py arguments will need to be updated in
cassandra-builds. We currently only use a single cassandra-dtest.sh
script for all builds. Maybe we should create a new job template that
would use an updated script with the wip-pytest dtest branch, to make
this work and be testable in parallel.



On 21.12.2017 11:13, Michael Kjellman wrote:
> I just created https://issues.apache.org/jira/browse/CASSANDRA-14134 which 
> includes tons of details (and a patch available for review) with my efforts 
> to migrate dtests from nosetest to pytest (which ultimately ended up also 
> including porting the code from python 2.7 to python 3).
> 
> I'd love it if people could pitch in in any way to help get this reviewed and 
> committed so we can reduce the natural drift that will occur with a huge 
> patch like this against the changes going into master. I apologize for 
> sending this so close to the holidays, but I really have been working 
> non-stop trying to get things into a completed and stable state.
> 
> The latest CircleCI runs I did took roughly 15 minutes to run all the dtests 
> with only 6 failures remaining (when run with vnodes) and 12 failures 
> remaining (when run without vnodes). For comparison the last ASF Jenkins 
> Dtest job to successfully complete took nearly 10 hours (9:51) and we had 36 
> test failures. Of note, while I was working on this and trying to determine a 
> baseline for the existing tests I found that the ASF Jenkins jobs were 
> incorrectly configured due to a typo. The no-vnodes job is actually running 
> with vnodes (meaning the no-vnodes job is identical to the with-vnodes ASF 
> Jenkins job). There are some bootstrap tests that will 100% reliably hang 
> both nosetest and pytest on test cleanup, however this test only runs in the 
> no-vnodes configuration. I've debugged and fixed a lot of these cases across 
> many test cases over the past few weeks and I no longer know of any tests 
> that can hang CI.
> 
> Thanks and I'm optimistic about making testing great for the project and most 
> importantly for the OSS C* community!
> 
> best,
> kjellman
> 
> Some highlights that I quickly thought of (in no particular order): {also 
> included in the JIRA}
> -Migrate dtests from executing using the nosetest framework to pytest
> -Port the entire code base from Python 2.7 to Python 3.6
> -Update run_dtests.py to work with pytest
> -Add --dtest-print-tests-only option to run_dtests.py to get easily parsable 
> list of all available collected tests
> -Update README.md for executing the dtests with pytest
> -Add new debugging tips section to README.md to help with some basics of 
> debugging python3 and pytest
> -Migrate all existing Environment Variable usage as a means to control dtest 
> operation modes to argparse command line options with documented help on each 
> toggle's intended usage
> -Migration of old unitTest and nose based test structure to modern pytest 
> fixture approach
> -Automatic detection of physical system resources to automatically determine 
> if @pytest.mark.resource_intensive annotated tests should be collected and 
> run on the system where they are being executed
> -new pytest fixture replacements for @since and @pytest.mark.upgrade_test 
> annotations
> -Migration to python logging framework
> -Upgrade thrift bindings to latest version with full python3 compatibility
> -Remove deprecated cql and pycassa dependencies and migrate any remaining 
> tests to fully remove those dependencies
> -Fixed dozens of tests that would hang the pytest framework forever when run 
> in CI environments
> -Ran code nearly 300 times in CircleCI during the migration and to find, 
> identify, and fix any tests capable of hanging CI
> -Upgrade Tests do not yet run in CI and still need additional migration work 
> (although all 

Re: [Patch Available for Review!] CASSANDRA-14134: Migrate dtests to use pytest and python3

2018-01-03 Thread Stefan Podkowinski
The latest updates to your branch fixed the logging issue, thanks! Tests
now seem to execute fine locally using pytest.

I was looking at the dockerfile and noticed that you explicitly use
python 3.6 there. Are you aware of any issues with older python3
versions, e.g. 3.5? Do I have to use 3.6 as well locally and do we have
to do the same for jenkins?


On 02.01.2018 22:42, Michael Kjellman wrote:
> I reproduced the NOTSET log issue locally... got a fix... I'll push a commit 
> up in a moment.
> 
>> On Jan 2, 2018, at 11:24 AM, Michael Kjellman <mkjell...@internalcircle.com> 
>> wrote:
>>
>> Comments Inline: Thanks for giving this a go!!
>>
>>> On Jan 2, 2018, at 6:10 AM, Stefan Podkowinski <s...@apache.org> wrote:
>>>
>>> I was giving this a try today with some mixed results. First of all,
>>> running pytest locally would fail with an "ccmlib.common.ArgumentError:
>>> Unknown log level NOTSET" error for each test. Although I created a new
>>> virtualenv for that as described in the readme (thanks for updating!)
>>> and use both of your dtest and cassandra branches. But I haven't patched
>>> ccm as described in the ticket, maybe that's why? Can you publish a
>>> patched ccm branch to gh?
>>
>> 99% sure this is an issue parsing the logging level passed from pytest to 
>> the python logger... could you paste the exact command you're using to 
>> invoke pytest? Should be a small change - I'm sure I just missed an 
>> invocation case.
>>
>>>
>>> The updated circle.yml is now using docker, which seems to be a good
>>> idea to reduce clutter in the yaml file and gives us more control over
>>> the test environment. Can you add the Dockerfile to the .circleci
>>> directory as well? I couldn't find it when I was trying to solve the
>>> pytest error mentioned above.
>>
>> This is already tracked in a separate repo: 
>> https://github.com/mkjellman/cassandra-test-docker/blob/master/Dockerfile
>>>
>>> Next thing I did was to push your trunk_circle branch to my gh repo to
>>> start a circleCI run. Finishing all dtests in 15 minutes sounds
>>> exciting, but requires a paid tier plan to get that kind of
>>> parallelization. Looks like the dtests have even been deliberately
>>> disabled for non-paid accounts, so I couldn't test this any further.
>>
>> The plan of action (I already mentioned this in previous emails) is 
>> to get dtests working for the free circleci oss accounts as well. Part of 
>> this work (already included in this pytest effort) is to have fixtures that 
>> look at the system resources and dynamically include tests as possible.
>>
>>>
>>> Running dtests from the pytest branch on builds.apache.org did not work
>>> either. At least the run_dtests.py arguments will need to be updated in
>>> cassandra-builds. We currently only use a single cassandra-dtest.sh
>>> script for all builds. Maybe we should create a new job template that
>>> would use an updated script with the wip-pytest dtest branch, to make
>>> this work and testable in parallel.
>>
>> Yes, I didn't touch cassandra-builds yet... focused on getting circleci and 
>> local runs working first. Once we're happy with that and it's stable we can 
>> make the changes to jenkins configs pretty easily...
>>
>>>
>>>
>>>
>>> On 21.12.2017 11:13, Michael Kjellman wrote:
>>>> I just created https://issues.apache.org/jira/browse/CASSANDRA-14134 which 
>>>> includes tons of details (and a patch available for review) with my 
>>>> efforts to migrate dtests from nosetest to pytest (which ultimately ended 
>>>> up also including porting the code from python 2.7 to python 3).
>>>>
>>>> I'd love if people could pitch in in any way to help get this reviewed and 
>>>> committed so we can reduce the natural drift that will occur with a huge 
>>>> patch like this against the changes going into master. I apologize for 
>>>> sending this so close to the holidays, but I really have been working 
>>>> non-stop trying to get things into a completed and stable state.
>>>>
>>>> The latest CircleCI runs I did took roughly 15 minutes to run all the 
>>>> dtests with only 6 failures remaining (when run with vnodes) and 12 
>>>> failures remaining (when run without vnodes). For comparison the last ASF 
>>>> Jenkins Dtest job to successfully complete took nearly 10 hours (9:51) and 
>>>> we had 36 test failures. Of note, while I was w

Re: Tracking Testing, Release Status, and Build Health

2018-07-30 Thread Stefan Podkowinski
On 30.07.2018 02:04, Scott Andreas wrote:

> I’m curious on the dev community’s thoughts on how best to organize 
> information like this. My thinking is that by having a space to share this, 
> the community can be more informed on each others’ work toward testing, build 
> health, and active projects.

Let's give it a try by maintaining a dedicated confluence page. But I'd
suggest starting small by focusing on failing tests and build issues,
and seeing if we can mature the process into more detailed release
planning as needed.

I was also wondering if we should have a dedicated testing@ mailing list
to encourage people to just drop a few lines in case they notice any
issues that may need some attention, but that they don't think are worth
posting on dev@. Maybe we could also get nightly build status updates
posted there from b.a.o or circle.




GitHub PR ticket spam

2018-07-30 Thread Stefan Podkowinski
Looks like we had some active PRs recently to discuss code changes in
detail on GitHub, which I think is something we agreed is perfectly
fine, in addition to the usual Jira ticket.

What bugs me a bit is that for some reason any comments on the PR will
be posted to the Jira ticket as well. I'm not sure what the exact
reason for this is; I guess it's because the PR is linked in the
ticket? I find this a bit annoying while subscribed to commits@,
especially since we created pr@ for this kind of message. Also I don't
really see any value in mirroring all github comments to the ticket.
#14556 is a good example of how you can end up with tons of unformatted
code in the ticket that will also mess up search in jira. Does anyone
think this is really useful, or can we stop linking the PR in the future
(at least for highly active PRs)?





Re: GitHub PR ticket spam

2018-08-06 Thread Stefan Podkowinski
+1 for worklog option

Here's an example ticket from Arrow, where they seem to be using the
same approach:
https://issues.apache.org/jira/browse/ARROW-2583


On 05.08.2018 09:56, Mick Semb Wever wrote:
>> I find this a bit annoying while subscribed to commits@,
>> especially since we created pr@ for these kind of messages. Also I don't
>> really see any value in mirroring all github comments to the ticket.
> 
> 
> I agree with you Stefan. It makes the jira tickets quite painful to read. And 
> I tend to make comments on the commits rather than the PRs so as to avoid 
> spamming back to the jira ticket.
> 
> But the linking to the PR is invaluable. And I can see Ariel's point about a 
> chronological historical archive.
> 
> 
>> Ponies would be for this to be mirrored to a tab 
>> separate from comments in JIRA.
> 
> 
> Ariel, that would be the the "worklog" option.
> https://reference.apache.org/pmc/github
> 
> If this works for you, and others, I can open a INFRA to switch to worklog.
> wdyt?
> 
> 
> Mick.
> 
> 




Re: Proposing an Apache Cassandra Management process

2018-08-18 Thread Stefan Podkowinski
I think we do have some consensus that 1) we should try having one or
many side-car processes for non-essential features, and that
2) they should be developed in a separate repo. I'm also open to the
idea of accepting the proposed implementation as a possible side-car
solution and really appreciate the effort. But my point is that creating a
new repo, just for the patch, seems to imply that it's also going to
become the de facto official side-car solution, which doesn't feel right
to me, given that the proposed patch isn't even reviewed and hasn't
received much feedback yet.


On 18.08.18 17:44, Sankalp Kohli wrote:
> The thread for side car is months old and no one has opposed it, and hence 
> someone developed it. I am not sure how else you get consensus. 
>
> Regarding separate repo, how do we get consensus? 
>
>> On Aug 18, 2018, at 05:19, Stefan Podkowinski  wrote:
>>
>> I don't see that we have reached sufficient consensus on this yet. We've
>> had a brief discussion about the pros and cons of in-tree cassandra vs
>> separate ASF repo here, but let's not frame it like it's either or. From
>> my perspective, there hasn't been any final decision yet whether the
>> proposed side-car solution should be further developed as part of the
>> Cassandra project, or not.
>>
>>
>>> On 18.08.18 03:12, Dinesh Joshi wrote:
>>> Thanks, Nate. I’ll create this request.
>>>
>>> Dinesh
>>>
>>> On Aug 17, 2018, at 5:09 PM, Nate McCall  wrote:
>>>
>>>>> I'm not sure logistically how we get a new repo created and licensing and
>>>>> such, but if someone helps make it we can cut the patch against it
>>>>>
>>>> This is pretty straight forward. For precedent, see:
>>>> https://issues.apache.org/jira/browse/CASSANDRA-13634
>>>>
>>>> We currently have three repositories:
>>>> https://git-wip-us.apache.org/repos/asf
>>>>
>>>> I'm +0 on what approach we take.
>>>>
>>>>
>>>
>>
>>
>





Re: Side Car New Repo vs not

2018-08-21 Thread Stefan Podkowinski
I'm also currently -1 on the in-tree option.

In addition to what Aleksey mentioned, I also don't see how we could
make this work with the current build and release process. Our scripts
[0] for creating releases (tarballs and native packages) would need
significant work to add support for an independent side-car. Our
ant-based build process is also not a great starting point for adding
new tasks, let alone integrating other tool chains for the web
components of a potential UI.

[0] https://git-wip-us.apache.org/repos/asf?p=cassandra-builds.git


On 21.08.18 19:20, Aleksey Yeshchenko wrote:
> Sure, allow me to elaborate - at least a little bit. But before I do, just 
> let me note that this wasn’t a veto -1, just a shorthand for “I don’t like 
> this option”.
>
> It would be nice to have sidecar and C* version and release cycles fully 
> decoupled. I know it *can* be done when in-tree, but the way we vote on 
> releases with tags off current branches would have to change somehow. 
> Probably painfully. It would be nice to be able to easily enforce freezes, 
> like the upcoming one, on the whole C* repo, while allowing feature 
> development on the sidecar. It would be nice to not have sidecar commits in 
> emails from commits@ mailing list. It would be nice to not have C* CI trigger 
> necessarily on sidecar commits. Groups of people working on the two repos 
> will mostly be different too, so what’s the point in sharing the repo?
>
> Having an extra repo with its own set of branches is cheap and easy - we 
> already do that with dtests. I like cleanly separated things when coupling is 
> avoidable. As such I would prefer the sidecar to live in a separate new repo, 
> while still being part of the C* project.
>
> —
> AY
>
> On 21 August 2018 at 17:06:39, sankalp kohli (kohlisank...@gmail.com) wrote:
>
> Hi Aleksey,  
> Can you please elaborate on the reasons for your -1? This  
> way we can make progress towards any one approach.  
> Thanks,  
> Sankalp  
>
> On Tue, Aug 21, 2018 at 8:39 AM Aleksey Yeshchenko   
> wrote:  
>
>> FWIW I’m strongly -1 on in-tree approach, and would much prefer a separate  
>> repo, dtest-style.  
>>  
>> —  
>> AY  
>>  
>> On 21 August 2018 at 16:36:02, Jeremiah D Jordan (  
>> jeremiah.jor...@gmail.com) wrote:  
>>  
>> I think the following is a very big plus of it being in tree:  
>>> * Faster iteration speed in general. For example when we need to add a new
>>> JMX endpoint that the sidecar needs, or change something from JMX to a
>>> virtual table (e.g. for repair, or monitoring) we can do all changes
>>> including tests as one commit within the main repository and don't have
>>> to commit to main repo, sidecar repo,
>>  
>> I also don’t see a reason why the sidecar being in tree means it would not  
>> work in a mixed version cluster. The nodes themselves must work in a mixed  
>> version cluster during a rolling upgrade, I would expect any management  
>> side car to operate in the same manor, in tree or not.  
>>  
>> This tool will be pretty tightly coupled with the server, and as someone  
>> with experience developing such tightly coupled tools, it is *much* easier  
>> to make sure you don’t accidentally break them if they are in tree. How  
>> many times has someone updated some JMX interface, updated nodetool, and  
>> then moved on? Breaking all the external tools not in tree, without  
>> realizing it. The above point about being able to modify interfaces and the  
>> side car in the same commit is huge in terms of making sure someone doesn’t  
>> inadvertently break the side car while fixing something else.  
>>  
>> -Jeremiah  
>>  
>>  
>>> On Aug 21, 2018, at 10:28 AM, Jonathan Haddad   
>> wrote:  
>>>  
>>> Strongly agree with Blake. In my mind supporting multiple versions is  
>>> mandatory. As I've stated before, we already do it with Reaper, I'd  
>>> consider it a major misstep if we couldn't support multiple with the  
>>> project - provided admin tool. It's the same reason dtests are separate  
>> -  
>>> they work with multiple versions.  
>>>  
>>> The number of repos does not affect distribution - if we want to ship  
>>> Cassandra with the admin / repair tool (we should, imo), that can be  
>> part  
>>> of the build process.  
>>>  
>>>  
>>>  
>>>  
>>> On Mon, Aug 20, 2018 at 9:21 PM Blake Eggleston   
>>> wrote:  
>>>  
>>>> If the sidecar is going to be on a different release cadence, or support
>>>> interacting with mixed mode clusters, then it should definitely be in a
>>>> separate repo. I don’t even know how branching and merging would work in a
>>>> repo that supports 2 separate release targets and/or mixed mode
>>>> compatibility, but I’m pretty sure it would be a mess.
>>>>
>>>> As a cluster management tool, mixed mode is probably going to be a goal at
>>>> some point. As a new project, it will benefit from not being tied to the C*
>>>> release cycle (which would

Re: Proposing an Apache Cassandra Management process

2018-08-18 Thread Stefan Podkowinski
I don't see that we have reached sufficient consensus on this yet. We've
had a brief discussion about the pros and cons of in-tree cassandra vs
separate ASF repo here, but let's not frame it like it's either or. From
my perspective, there hasn't been any final decision yet whether the
proposed side-car solution should be further developed as part of the
Cassandra project, or not.


On 18.08.18 03:12, Dinesh Joshi wrote:
> Thanks, Nate. I’ll create this request.
>
> Dinesh
>
> On Aug 17, 2018, at 5:09 PM, Nate McCall  wrote:
>
>>> I'm not sure logistically how we get a new repo created and licensing and
>>> such, but if someone helps make it we can cut the patch against it
>>>
>> This is pretty straight forward. For precedent, see:
>> https://issues.apache.org/jira/browse/CASSANDRA-13634
>>
>> We currently have three repositories:
>> https://git-wip-us.apache.org/repos/asf
>>
>> I'm +0 on what approach we take.
>>
>>
>
>





Proposing an Apache Cassandra Management process

2018-09-09 Thread Stefan Podkowinski
Does it have to be a single project with functionality provided by
multiple plugins? Designing a plugin API at this point seems to be a bit
early and comes with additional complexity around managing plugins in
general.

I was thinking more in the direction of: "what can we do to enable
people to create any kind of side car or tooling solution?". Things like:

Common cluster discovery and management API
* Detect local Cassandra processes
* Discover and receive events on cluster topology
* Get assigned tokens for nodes
* Read node configuration
* Health checks (as already proposed; see the sketch below)

Any side cars should be easy to install on nodes that already run Cassandra
* Scripts for packaging (tar, deb, rpm)
* Templates for systemd support, optionally with auto-startup dependency
on the Cassandra main process

Integration testing
* Provide basic testing framework for mocking cluster state and messages

Support for other languages / avoid having to use JMX
* JMX bridge (HTTP? gRPC?, already implemented in #14346?)
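
To make the health check item above a bit more concrete, here's a
minimal sketch of what such an endpoint could look like. Everything in
it (port numbers, path, the "native port accepts connections"
definition of healthy) is made up for illustration:

    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.net.Socket;

    public final class HealthCheckSidecar
    {
        public static void main(String[] args) throws Exception
        {
            HttpServer server = HttpServer.create(new InetSocketAddress(8778), 0);
            server.createContext("/health", exchange -> {
                // "healthy" here simply means the native transport port accepts connections
                boolean up = canConnect("127.0.0.1", 9042);
                byte[] body = (up ? "UP" : "DOWN").getBytes("UTF-8");
                exchange.sendResponseHeaders(up ? 200 : 503, body.length);
                try (OutputStream out = exchange.getResponseBody())
                {
                    out.write(body);
                }
            });
            server.start();
        }

        private static boolean canConnect(String host, int port)
        {
            try (Socket socket = new Socket(host, port))
            {
                return true;
            }
            catch (Exception e)
            {
                return false;
            }
        }
    }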

Obviously the whole side car discussion is not moving in a direction
everyone's happy with. Would it be an option to take a step back and
start implementing such a tooling framework, with scripts and libraries
for the features described above, as a small GitHub project, instead of
putting an existing side-car solution up for vote? If that works
and we get people collaborating on code shared between existing
side-cars, then we could take the next step and think about either
revisiting the "official Cassandra side-car" topic, or adding the
created client tooling framework as an official sub-project to the
Cassandra project (maybe via the Apache incubator).


On 08.09.18 02:49, Joseph Lynch wrote:
> On Fri, Sep 7, 2018 at 5:03 PM Jonathan Haddad  wrote:
>> We haven’t even defined any requirements for an admin tool. It’s hard to
>> make a case for anything without agreement on what we’re trying to build.
>>
> We were/are trying to sketch out scope/requirements in the #14395 and
> #14346 tickets as well as their associated design documents. I think
> the general proposed direction is a distributed 1:1 management sidecar
> process similar in architecture to Netflix's Priam except explicitly
> built to be general and pluggable by anyone rather than tightly
> coupled to AWS.
>
> Dinesh, Vinay and I were aiming for low amounts of scope at first and
> take things in an iterative approach with just enough upfront design
> but not so much we are unable to make any progress at all. For example
> maybe something like:
>
> 1. Get a super simple and non controversial sidecar process that ships
> with Cassandra and exposes a lightweight HTTP interface to e.g. some
> basic JMX endpoints
> 2a. Add a pluggable execution engine for cron/oneshot/scheduled jobs
> with the basic interfaces and state store and such
> 2b. Start scoping and implementing the full HTTP interface, e.g.
> backup status, cluster health status, etc ...
> 3a. Start integrating implementations of the jobs from 2a such as
> snapshot, backup, cluster restart, daemon + sstable upgrade, repair,
> etc
> 3b. Start integrating UI components that pair with the HTTP interface from 2b
> 4. ?? Perhaps start unlocking next generation operations like moving
> "background" activities like compaction, streaming, repair etc into
> one or more sidecar contained processes to ensure the main daemon only
> handles read+write requests
>
> There are going to be a lot of questions to answer, and I think trying
> to answer them all up front will mean that we get nowhere or make
> unfortunate compromises that cripple the project from the start. If
> people think we need to do more design and discussion than we have
> been doing then we can spend more time on the design, but personally
> I'd rather start iterating on code and prove value incrementally. If
> it doesn't work out we won't release it GA to the community ...
>
> -Joey
>
>





Re: Supporting multiple JDKs

2018-09-08 Thread Stefan Podkowinski
I really don't see any benefit at all in having any additional Java
1.7-specific build and testing changes for the 2.2 branch. The 2.2
version is reaching EOL and will only get critical patches until then
anyways. I also can't remember any reports on regressions in 2.2 bug
fix releases specific to 1.7. So what's the actual problem we want to
solve here?

As for 4.0, we're going to ship multi-release jars, which are targeted
against Java 8 but also contain Java 11 classes that will only be used
when executed under Java 11 (currently just a single class). I can
see two issues that need our attention with that:
 * We should make sure to use the "_build_multi_java" target for our CI
jobs, so we're really testing the same build that we would ship. It's
probably not going to make a real difference, but who knows..
 * It would also be nice to have the option to run tests on CI under
Java 11, although we only provide "experimental" support for that Java
version. It's just nice to have at this point, as there will be plenty
of bugs in 4.0 to fix before we should spend time looking more closely
for more subtle Java 11 issues. But if someone wants to contribute any
work to make this happen, I'd be glad to have the option of running
tests on Java 11, so don't get me wrong.
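
For anyone who hasn't looked at multi-release jars yet: the mechanism
(JEP 238) puts the regular Java 8 classes at the jar root and the Java
11 variants under META-INF/versions/11/, with "Multi-Release: true" set
in the manifest. The jar tool from JDK 9+ can assemble this directly;
the paths here are just for illustration and don't reflect our actual
build layout:

    jar --create --file apache-cassandra.jar -C build/classes/java8 . \
        --release 11 -C build/classes/java11 .

At runtime, JDK 8 only ever sees the root classes, while JDK 11 prefers
the versioned ones.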


On 07.09.18 04:10, Sumanth Pasupuleti wrote:
>> And I would suggest to go further and crash the build with JDK1.7 so we
> can take away the possibility for users to shoot their foot off this way.
>
> I like this suggestion. Either we should be on the side of NO support for
> JDK 1.7, or if we say we support JDK1.7, I believe we should be building
> against JDK1.7 to make sure we are compliant.
> I have a quick clarifying question here - I believe the origin of
> CASSANDRA-14563 is the introduction of an API in 2.2 that is
> incompatible with 1.7, which was then manually detected and fixed. Are
> you suggesting, going further, we would not support 1.7?
>
>> Currently I'm unclear on how we would make a stable release using only
> JDK8, maybe their are plans on the table i don't know about?
>
> From the current state of build.xml and from the past discussions, I do
> believe as well that we need both JDKs to make a 4.0 release using
> ‘_build_multi_java’. A bonus would be that the release would also be able to
> run against Java 11, but that would be an experimental release.
>
>> I'm not familiar with optional jobs or workflows in CircleCi, do you have
> an example of what you mean at hand?
>
> By optional, I was referring to having workflow definitions in place, but
> calls to those workflows commented out. Basically similar to what we have
> today.
> workflows:
> version: 2
> build_and_run_tests: *default_jobs
> #build_and_run_tests: *with_dtest_jobs_only
> #build_and_run_tests_java11: *with_dtest_jobs_java11
> Jason created CASSANDRA-14609 for this purpose I believe.
>
>> Off-topic, but what are your thoughts on this? Can we add `ant
> artifacts`, and the building of the docs, as a separate jobs into the
> existing default CircleCI workflow? I think we should also be looking into
> getting https://cassandra.apache.org/doc/latest/ automatically updated
> after each successful trunk build, and have
> https://cassandra.apache.org/doc/X.Y versions on the docs in place (which
> are only updated after each patch release).
>
> I like all these ideas! I believe we should be able to add a workflow to
> test out artifact generation. Will create a JIRA for this. Your suggestions
> around auto-update of docs provides a way to keep our website docs
> up-to-date. Not sure what it takes to do it though. Will be happy to
> explore (as part of separate JIRAs).
>
> Thanks,
> Sumanth
>




Scratch an itch (was: [VOTE] Branching Change for 4.0 Freeze)

2018-07-12 Thread Stefan Podkowinski
These are some valid concerns. But I don’t really see it that way after
thinking about it. We already have restrictions and consensus-based
practices in place that may discourage new contributors. E.g. if someone
submits a patch to enable a different GC by default in 2.1, that’s
probably not going to happen, even if carefully tested by the
contributor. We also don’t accept any patches at all for older 3.x
versions, although there may be people who are not able to update to
3.11 and would really like to get their 3.x version patched for something.

That’s not because we want to discourage people from contributing in a
“scratch an itch” way. It’s just what we agreed on how to coordinate our
efforts and what kind of patches to accept for individual releases. So
if it’s fine to tell people that we’re not able to accept patches for
any version they are already running, why should we not be able to do so
for an upcoming, unreleased version that isn’t even used by anyone at
this point? Also, if we tell someone that their contribution will be
reviewed and committed later, after 4.0-beta, how is that actually making
a difference for that person, compared to committing it now for a 4.x
version? It may be satisfying to get a patch committed, but what matters
more is when the code will actually be released, and deferring the
committing of contributions until after 4.0-beta doesn't necessarily mean
that there's any disadvantage when it comes to that.


On 12.07.18 15:23, Gary Dusbabek wrote:
> -0
>
> I'm not interested in sparking a discussion on this, because a) it has
> already happened and b) it seems I am in a minority. But thought I should
> at least include the rationale for my vote:
> * This proposal goes against the "scratch an itch" philosophy of making
> contributions to an Apache project and IMO will discourage contributions
> that are casual or new.
> * It feels dictatorial. IMO the right way to do this would be for
> impassioned committers to -1 any patch that goes against elements a, b, or
> c of what this vote is for.
>
> Gary.
>
>
> On Wed, Jul 11, 2018 at 4:46 PM sankalp kohli 
> wrote:
>
>> Hi,
>> As discussed in the thread[1], we are proposing that we will not branch
>> on 1st September but will only allow following merges into trunk.
>>
>> a. Bug and Perf fixes to 4.0.
>> b. Critical bugs in any version of C*.
>> c. Testing changes to help test 4.0
>>
>> If someone has a change which does not fall under these three, we can
>> always discuss it and have an exception.
>>
>> Vote will be open for 72 hours.
>>
>> Thanks,
>> Sankalp
>>
>> [1]
>>
>> https://lists.apache.org/thread.html/494c3ced9e83ceeb53fa127e44eec6e2588a01b769896b25867fd59f@%3Cdev.cassandra.apache.org%3E
>>





Re: [VOTE] Branching Change for 4.0 Freeze

2018-07-12 Thread Stefan Podkowinski
+1

(assuming merging patches on documentation will always be possible, as
it's not affecting the code base)


On 11.07.18 23:46, sankalp kohli wrote:
> Hi,
> As discussed in the thread[1], we are proposing that we will not branch
> on 1st September but will only allow following merges into trunk.
>
> a. Bug and Perf fixes to 4.0.
> b. Critical bugs in any version of C*.
> c. Testing changes to help test 4.0
>
> If someone has a change which does not fall under these three, we can
> always discuss it and have an exception.
>
> Vote will be open for 72 hours.
>
> Thanks,
> Sankalp
>
> [1]
> https://lists.apache.org/thread.html/494c3ced9e83ceeb53fa127e44eec6e2588a01b769896b25867fd59f@%3Cdev.cassandra.apache.org%3E
>





Re: Roadmap for 4.0

2018-04-12 Thread Stefan Podkowinski
Maybe people would have preferred to know early about potential
deadlines, before investing a lot of time into "pet ticket"
contributions? It's hard enough to make assumptions about if and when
contributions make it into a release, but with feature freeze deadlines
falling from the sky at any time, it becomes a pure gamble, and I
wouldn't be surprised to see companies in particular becoming more
reluctant to sponsor work on larger contributions.

But I do agree with your statement to "make it clear what kind of
contributions are "preferred" at any given time". But really "any given
time", not just when it's convenient for us to have people help fix
testing, before they "may continue working on their pet tickets" again.


On 12.04.2018 11:37, Sylvain Lebresne wrote:
> On Thu, Apr 12, 2018 at 11:21 AM Sankalp Kohli 
> wrote:
> 
>> We can fix tests after freezing if there are resources people are willing
>> to put in. We need to gather support to see who can help with the 3 points I
>> have mentioned and when.
>>
> 
> Again though, without disagreeing with your points, those don't play into
> when we freeze. If we freeze tomorrow, even if it take 3 months to gather
> sufficient support for testing, there will still be less to test than if we
> push the freeze in 3 months and more things are committed in that
> time-frame. And in fact, the sooner we freeze, the sooner the project is
> making the statement that people that are willing to contribute to the
> project should now do so helping testing rather than continuing working on
> their pet ticket. And don't get me wrong, it's an open source project, we
> can't force anyone to do anything, so people may continue working on their
> pet ticket even after freeze. But we can at least, as a project, make it
> clear what kind of contributions are "preferred" at any given time.
> 
> 
>>
>> On Apr 12, 2018, at 02:13, Sylvain Lebresne  wrote:
>>

>>>> I agree there's little point freezing if we can't even test the system
>>>> properly.

>>>
>>> I'll mention that I really don't follow the logic of such claim. Why
>> can't
>>> we
>>> fix the testing of the system after freezing? In fact, isn't the whole
>>> point of freezing agreeing that it's high time to fix that? Isn't it
>> easier
>>> to fix tests (and focus on the testing environment if needs be) when
>>> things are frozen and code isn't changing from under you?
>>>
>>> PS: all the questions of this email are rhetorical.
>>>
>>> --
>>> Sylvain
>>
>>
>>
> 




Re: Roadmap for 4.0

2018-04-05 Thread Stefan Podkowinski
June is too early.


On 05.04.18 19:32, Josh McKenzie wrote:
> Just as a matter of perspective, I'm personally mentally diffing from
> when 3.0 hit, not 3.10.
>
>> commit 96f407bce56b98cd824d18e32ee012dbb99a0286
>> Author: T Jake Luciani 
>> Date:   Fri Nov 6 14:38:34 2015 -0500
>>  3.0 release versions
> While June feels close to today relative to momentum for a release
> before this discussion, it's certainly long enough from when the
> previous traditional major released that it doesn't feel "too soon" to
> me.
>
> On Thu, Apr 5, 2018 at 12:46 PM, sankalp kohli  wrote:
>> We can take a look on 1st June how things are then decide if we want to
>> freeze it and whats in and whats out.
>>
>> On Thu, Apr 5, 2018 at 9:31 AM, Ariel Weisberg  wrote:
>>
>>> Hi,
>>>
>>> +1 to having a feature freeze date. June 1st is earlier than I would have
>>> picked.
>>>
>>> Ariel
>>>
>>> On Thu, Apr 5, 2018, at 10:57 AM, Josh McKenzie wrote:
 +1 here for June 1.

 On Thu, Apr 5, 2018 at 9:50 AM, Jason Brown 
>>> wrote:
> +1
>
> On Wed, Apr 4, 2018 at 8:31 PM, Blake Eggleston 
> wrote:
>
>> +1
>>
>> On 4/4/18, 5:48 PM, "Jeff Jirsa"  wrote:
>>
>> Earlier than I’d have personally picked, but I’m +1 too
>>
>>
>>
>> --
>> Jeff Jirsa
>>
>>
>> > On Apr 4, 2018, at 5:06 PM, Nate McCall 
> wrote:
>> >
>> > Top-posting as I think this summary is on point - thanks,
>>> Scott!
> (And
>> > great to have you back, btw).
>> >
>> > It feels to me like we are coalescing on two points:
>> > 1. June 1 as a freeze for alpha
>> > 2. "Stable" is the new "Exciting" (and the testing and
>>> dogfooding
>> > implied by such before a GA)
>> >
>> > How do folks feel about the above points?
>> >
>> >
>> >> Re-raising a point made earlier in the thread by Jeff and affirmed
>> >> by Josh:
>> >>
>> >> –––
>> >> Jeff:
>> >>> A hard date for a feature freeze makes sense, a hard date for a
>> >>> release does not.
>> >>
>> >> Josh:
>> >>> Strongly agree. We should also collectively define what "Done"
>> >>> looks like post freeze so we don't end up in bike-shedding hell
>> >>> like we have in the past.
>> >> –––
>> >>
>> >> Another way of saying this: ensuring that the 4.0 release is of
>> >> high quality is more important than cutting the release on a
>> >> specific date.
>> >>
>> >> If we adopt Sylvain's suggestion of freezing features on a "feature
>> >> complete" date (modulo a "definition of done" as Josh suggested),
>> >> that will help us align toward the polish, performance work, and
>> >> dog-fooding needed to feel great about shipping 4.0. It's a good
>> >> time to start thinking about the approaches to testing, profiling,
>> >> and dog-fooding various contributors will want to take on before
>> >> release.
>> >>
>> >> I love how Ben put it:
>> >>
>> >>> An "exciting" 4.0 release to me is one that is stable and usable
>> >>> with no perf regressions on day 1 and includes some of the big
>> >>> internal changes mentioned previously.
>> >>>
>> >>> This will set the community up well for some awesome and exciting
>> >>> stuff that will still be in the pipeline if it doesn't make it to
>> >>> 4.0.
>> >>
>> >> That sounds great to me, too.
>> >>
>> >> – Scott
>> >
>> > 

Re: Debug logging enabled by default since 2.2

2018-03-19 Thread Stefan Podkowinski
I'd agree that INFO should be the default. Turning on DEBUG logging
can cause notable performance issues, and I would not enable it on
production systems unless I really had to. That's why I created
CASSANDRA-12696 for 4.0, so you'll at least be able to enable DEBUG
only partially, based on what's relevant to look at, e.g. `nodetool
setlogginglevel bootstrap DEBUG`.
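
As a side note, the class-qualifier variant is already reachable today
over JMX, which is also what `nodetool setlogginglevel` uses under the
hood. A minimal sketch (host, port and logger name are just examples):

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public final class SetLoggingLevel
    {
        public static void main(String[] args) throws Exception
        {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url))
            {
                MBeanServerConnection mbs = connector.getMBeanServerConnection();
                ObjectName ss = new ObjectName("org.apache.cassandra.db:type=StorageService");
                // same operation nodetool invokes; flips a single logger to DEBUG at runtime
                mbs.invoke(ss, "setLoggingLevel",
                           new Object[]{ "org.apache.cassandra.gms.Gossiper", "DEBUG" },
                           new String[]{ "java.lang.String", "java.lang.String" });
            }
        }
    }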

But small improvements like that won't change the fact that log files
suck in general for more complex analysis, beyond trivial tailing and
grepping. You have to make sure that logging is enabled and that the
old records you're interested in will not be rotated out. Then you have
to gather log files from individual nodes somehow. Eventually I end up
with a local tarball of logs in that situation, and the fun starts of
creating hacky, regex-loaded Python scripts to parse them. As each log
message is limited to a single line of text, it's often missing
relevant details. You've also got to create different parsers for
different messages, of course. It's just inefficient and too
time-consuming to gather information that way, let alone implement
more advanced monitoring solutions on top of that.

That's exactly why I started working on the "diagnostic events"
(CASSANDRA-12944) idea more than a year ago. There's also support for
persistence (CASSANDRA-13460), which would store important but
infrequent events as rich json objects in a local keyspace and allow
retrieving them via CQL. I still like the idea and think it's worth
pursuing.

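Just to make the idea more tangible, retrieving persisted events could
then look something like the query below. The keyspace, table and column
names are purely made up for illustration; the actual data model is
being worked out in CASSANDRA-13460:

SELECT * FROM system_diagnostics.events
 WHERE class = 'BootstrapEvent'
 LIMIT 10;
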

On 19.03.18 09:53, Alain RODRIGUEZ wrote:
> Hello,
>
> I am not developing Cassandra, but I am using it actively and helping
> people to work with it. My perspective might be missing some code
> considerations and history as I did not go through the ticket where this
> 'debug' level was added by default. But here is a feedback after upgrading
> a few clusters to Cassandra 2.2:
>
> When upgrading a cluster to Cassandra 2.2, 'disable the debug logs' is in
> my runbook. I mean, very often, when some cluster is upgraded to Cassandra
> 2.2 and has performance problems, the 2 most frequent causes are:
>
> - DEBUG level being turned on
> - and / or dynamic snitching being enabled
>
> This is especially true for high percentiles (very clear at p99). Let's put
> the dynamic snitch aside as it is not our topic here.
>
> From an operational perspective, I prefer to set the log level to 'DEBUG'
> when I need it, rather than having, out of the box, something that is unexpected
> and impacts performance. Plus the log level can be changed without
> restarting the node, through 'JMX' or even using 'nodetool' now.
>
> Also in most cases, the 'INFO' level is good enough for me to detect most
> of the issues. I was even able to recreate a detailed history of events for
> a customer recently; 'INFO' logs are already very powerful and complete, I
> believe (nice work on this by the way). Then monitoring helps a lot
> too. I have not had to use debug logs for a long time. It might happen, but
> I will find my way to enabling them.
>
> Even though it feels great to be able to help people with that easily,
> because the cause is often the same and turning off the debug logs is
> low hanging fruit in C* 2.2 clusters, with very nice results and easy
> to achieve, I would prefer people not to fall into these performance traps
> in the first place. In my head, 'DEBUG' logs should be for debugging purposes
> (as opposed to 'always on'). It seems legit. I am surprised this brings
> so much discussion; I thought this was a common standard, widely accepted
> beyond Cassandra. That being said, it is good to see those exchanges
> happening, so the decision that will be taken will be a good one, I am
> sure. I hope this comment will help; I have no other goal, and I am certainly
> not willing to feed a conflict but a discussion, and I hope no one felt offended
> by this feedback. I believe this change was made with the aim of
> helping/improving things, but it turns out it is more of an annoyance than
> truly helpful (my personal perspective).
>
> I would +1 on making 'INFO' the default again. If some information is
> missing at the 'INFO' level, why not add it there directly and keep log
> levels meaningful? Making sure, as much as we can, that we do not
> bring performance-degrading logs from 'DEBUG' to 'INFO'.
>
> Hope this is useful,
>
> C*heers,
>
> ---
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2018-03-19 2:18 GMT+00:00 kurt greaves :
>
>> On the same page as Michael here. We disable debug logs in production due
>> to the performance impact. Personally I think if debug logging is necessary
>> for users to use the software we're doing something wrong. Also in my
>> 

Re: Debug logging enabled by default since 2.2

2018-03-20 Thread Stefan Podkowinski
Are you suggesting to move all messages currently logged via debug() to
info() with the additional marker set, or only particular messages?

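For illustration, logging via info() with a marker set would look
roughly like the sketch below. The marker name and the appender/filter
wiring are assumptions on my side, nothing that has been decided yet:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.Marker;
import org.slf4j.MarkerFactory;

public class MarkerLoggingSketch
{
    // Hypothetical marker name; whatever name we agree on would be
    // referenced by a logback filter (e.g. an EvaluatorFilter with an
    // OnMarkerEvaluator) to route these messages to the verbose log.
    private static final Marker VERBOSE = MarkerFactory.getMarker("VERBOSE");
    private static final Logger logger = LoggerFactory.getLogger(MarkerLoggingSketch.class);

    void onFlushCompleted(String table)
    {
        // Stays at INFO level, but is tagged so it can end up in
        // verbose_system.log instead of cluttering system.log.
        logger.info(VERBOSE, "Completed flushing {}", table);
    }
}
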

On 19.03.2018 19:51, Paulo Motta wrote:
> Thanks for the constructive input and feedback! From this discussion
> it seems like overloading the DEBUG level to signify
> async-verbose-INFO on CASSANDRA-10241 is leading to some confusion and
> we should fix this.
> 
> However, we cannot simply turn debug.log off, as during CASSANDRA-10241
> some verbose-but-useful-info-logs, such as flush information, were
> changed from INFO to DEBUG, and since the patch has been in for nearly
> 3 years it's probably non-revertible. Furthermore, the practice of
> using the DEBUG level for logging non-debug stuff has been in our
> Logging Guidelines
> (https://wiki.apache.org/cassandra/LoggingGuidelines) since then, so
> there is probably useful DEBUG stuff that would need to be turned into
> INFO if we get rid of debug.log.
> 
> For this reason I'm more in favor of converting the debug.log into
> async/verbose_system.log as suggested by Jeremiah and use a marker to
> direct these logs (former DEBUG level logs) to that log instead.
> Nevertheless, if the majority prefers to get back to a single
> system.log file and get rid of debug.log/verbose_system.log altogether
> then we would need to go through all log usages and readjust them to
> use the proper logging levels and update our logging guidelines to
> reflect whatever new policy is decided, not only disabling debug.log
> and call it a day.
> 
> 2018-03-19 12:02 GMT-03:00 Jeremiah D Jordan :
>> People seem hung up on DEBUG here.  The goal of CASSANDRA-10241 was
>> to clean up the system.log so that it has a very high “signal” in terms of what 
>> was logged
>> to it synchronously, but without reducing the ability of the logs to allow 
>> people to
>> solve problems and perform post mortem analysis of issues.  We have 
>> informational
>> log messages that are very useful to understanding the state of things, like 
>> compaction
>> status, repair status, flushing, or the state of gossip in the system that 
>> are very useful to
>> operators, but if they are all in the system.log make said log file harder 
>> to look over for
>> issues.  In 10241 the method chosen for how to keep these log messages 
>> around by
>> default, but get them out of the system.log was that these messages were 
>> changed from
>> INFO to DEBUG and the new debug.log was created.
>>
>> From the discussion here it seems that many would like to change how this 
>> works.  Rather
>> than just turning off the debug.log I would propose that we switch to using 
>> the SLF4J
>> MARKER[1] ability to move the messages back to INFO but tag them as 
>> belonging to
>> the asynchronous_system.log rather than the normal system.log.
>>
>> [1] https://logback.qos.ch/manual/layouts.html#marker 
>> 
>> https://www.slf4j.org/faq.html#fatal 
>>
>>

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Stefan Podkowinski
There's also another option, which I just want to mention here for the
sake of discussion.

Quoting the Oracle Support Roadmap:
"Instead of relying on a pre-installed standalone JRE, we encourage
application developers to deliver JREs with their applications."

I've played around with Java 9 a while ago and also tested creating a
self contained JRE using jlink, which you can bundle and ship with your
application. So there's a technical solution for that with Java 9. Of
course you'd have to clarify licensing issues (OpenJDK is GPLv2 +
Classpath exception) first.

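For reference, creating such a self-contained run-time image with jlink
looks roughly like the following. The module list is just a guess to
illustrate the idea; the actual set of modules Cassandra needs would
have to be worked out:

$ jlink --module-path $JAVA_HOME/jmods \
        --add-modules java.base,java.logging,java.management,java.sql,java.xml \
        --compress=2 \
        --output ./cassandra-jre
$ ./cassandra-jre/bin/java -version
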
Bundling a custom JRE along with Cassandra would be convenient, in that
we could do all the testing against the bundled Java version. We
could also switch to a new Java version whenever it fits us. Like e.g.
apache-cassandra-4.0.14_openjdk11u321 and two months later release
apache-cassandra-4.0.15_openjdk12u123. History has shown that planning
and timing new releases isn't always working out for us as expected. I'd
rather prefer not having to tightly coordinate our own releases together
with OpenJDK releases, if it can be avoided. At the same time I'd like
to avoid having users updating to incompatible JREs (think about
8u161/#14173), or have them constantly ask which JRE version to use for
which Cassandra version, always with the risk of automatic updates
causing unexpected issues. Bundling the JRE may help us with that, as it
would become more a matter of testing and getting CI to turn green, before
we're ready to bundle the next major JRE update, without getting the
user involved at all.

If you would prefer using a global system JRE, that should still be
possible by installing an unbundled Cassandra version, but you'd have to
pay attention to which Java version to use for which Cassandra release,
possibly having to provide patches and do some testing for more recent
Cassandra versions, in case of compatibility issues. If we update 3.11
to Java 13 in mid 2019, we'd have to provide release candidates that can
be used for testing for such incompatibilities by LTS users and have
them provide patches, which then have to fully work with Java 13 of
course. Otherwise I can't see how to make Oracle/Redhat/IBM/Azul LTS
releases work, except on this best effort basis without official support
guarantees by us.

I'm not too enthusiastic about this perspective. But I wouldn't
completely dismiss it either, without talking about all the other
options first.


On 20.03.2018 22:32, Ariel Weisberg wrote:
> Hi,
> 
> Synchronizing with Oracle LTS releases is kind of low value if it's a paid 
> offering. But if someone in the community doesn't want to upgrade and pays 
> Oracle we don't want to get in the way of that.
> 
> Which is how you end up with what Jordan and ElasticSearch suggest. I'm still 
> +1 on that although in my heart of hearts I want  to only support the latest 
> OpenJDK on trunk and after we cut a release only change the JDK if there is a 
> serious issue.
> 
> It's going to be annoying once we have a serious security or correctness 
> issue and we need to move to a later OpenJDK. The majority won't be paying 
> Oracle for LTS. I don't think that will happen that often though.
> 
> Regards,
> Ariel
> 
> On Tue, Mar 20, 2018, at 4:50 PM, Jason Brown wrote:
>> Thanks to Hannu and others pointing out that the OracleJDK is a
>> *commercial* LTS, and thus not an option. mea culpa for missing the
>> "commercial" and just focusing on the "LTS" bit. OpenJDK is is, then.
>>
>> Stefan's elastic search link is rather interesting. Looks like they are
>> compiling for both a LTS version as well as the current OpenJDK. They
>> assume some of their users will stick to a LTS version and some will run
>> the current version of OpenJDK.
>>
>> While it's extra work to add JDK version as yet another matrix variable in
>> addition to our branching, is that something we should consider? Or are we
>> going to burden maintainers even more? Do we have a choice? Note: I think
>> this is similar to what Jeremiah proposed.
>>
>> @Ariel: Going beyond 3 years could be tricky in the worst case because
>> bringing in up to 3 years of JDK changes to an older release might mean
>> some of our dependencies no longer function and now it's not just minor
>> fixes, it's bringing in who knows what in terms of updated dependencies.
>>
>> I'm not sure we have a choice anymore, as we're basically bound to what the
>> JDK developers choose to do (and we're bound to the JDK ...). However, if
>> we have the changes necessary for the JDK releases higher than the LTS (if
>> we follow the Elasticsearch model), perhaps it'll be a reasonably
>> smooth transition?
>>
>> On Tue, Mar 20, 2018 at 1:31 PM, Jason Brown  wrote:
>>
>>> copied directly from dev channel, just to keep with this ML conversation
>>>
>>> 08:08:26   Robert Stupp jasobrown: https://www.azul.com/java-
>>> 

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-23 Thread Stefan Podkowinski
I think it's pretty safe to assume that Java 8 will stay around much
longer than the end of the year, even after Oracle drops its official
maintainer role. I also think that we don't have to worry that much how
exactly Java 8 is going to be supported. It's a mature enough version
that I wouldn't expect significant incompatibilities between back-ports
or forks. In the best case, someone will even step up to take the
official maintainer role as part of the OpenJDK project. But I'm pretty
sure we'll manage to keep supporting Java 8 for Cassandra
throughout the next years, if we decide to do so.

At the beginning, we discussed the Elasticsearch plan of supporting
the newest Java release and the latest LTS release at the same time.
Maybe it's a good time to get back to thinking about this idea and
ask ourselves: do we really want to support the latest Java release, even if
it's a non-LTS release? Given the likely overlap of new major Java
releases and continued support for already released Cassandra branches,
I'd expect this to become a strain for developers and a possible source of
confusion for users. Do we as developers or any users really care that
much about non-LTS releases in general, that we want to commit to that?

Let's assume we're only going to support Java LTS releases for now. How
exactly would we want to go on from here? Keep in mind that Java 11 LTS
is already scheduled for September. Let's take a look at some LTS only
options:

1) Release 4.0 for Java 11 exclusively (3.11 for Java 8)

Start upgrading CI to initial Java 11 release candidate, merge Robert's
Java 9 patch and start fixing all incompatibility issues. Release 4.0 at
some point after Java 11. This is probably the most risky option, as we
can't yet see how the Java 11 adoption rate will turn out. In the
worst case, Java 8 will still dominate for times to come and depending
on Java 11 as hard requirement may hurt 4.0 adoption.

2) Release 4.0 for Java 8 + 11

Support both LTS versions for 4.0. I'd expect this to be non-trivial,
but maybe Robert can share some first hand experience what would have to
be done to make this possible. As described in the Elasticsearch blog,
what they plan to do is to create multi-release jars for code compiled
against different versions, which is only possible starting with Java 9.
We don't even have that and would still have to make sure the same code
runs on both 8 and 11.

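For context, a multi-release jar (JEP 238) keeps version-specific
classes in a separate directory and flags that via a manifest entry.
A rough layout sketch, not something we build today:

cassandra.jar
  META-INF/MANIFEST.MF              (contains "Multi-Release: true")
  org/apache/cassandra/...          (classes compiled for Java 8)
  META-INF/versions/11/
    org/apache/cassandra/...        (Java 11 specific overrides)
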
3) Release 4.0 for Java 8, branch 4.1 for Java 11 later

Don't do anything yet and release 4.0 for Java 8. Keep an eye on how the
situation unfolds during the next months and how fast Java 11 will be
adopted by Cassandra users. Branch 4.1 for Java 11, if there's public
demand and we agree that it makes sense at that point. This is basically
an incremental approach to 1), but we'll end up with another branch,
which we also would have to support in the future (4.0 for 8, 4.1 for 11).




On 22.03.2018 23:30, Michael Shuler wrote:
> As I mentioned in IRC and was pasted earlier in the thread, I believe
> the easiest path is to follow the major releases of OpenJDK in the
> long-term-support Linux OS releases. Currently, Debian Stable (Stretch),
> Ubuntu 18.04 (Bionic, near release), and Red Hat / CentOS 7 all have
> OpenJDK 8 as the default JDK. For long-term support, they all have build
> facilities in place for their supported architectures and developers
> that care about security updates for users through their documented EOL
> dates.
> 
> The current deb and rpm packages for Apache Cassandra all properly
> depend on OpenJDK 8, so there's really nothing to be done here, until
> the project decides to implicitly depend on a JDK version not easily
> installable on the major OS LTS releases. (Users of older OS versions
> may need to fiddle with yum and apt sources to get OpenJDK 8, but this
> is a relatively solved problem.)
> 
> Users have the ability to deviate and set a JAVA_HOME env var to use a
> custom-installed JDK of their liking, or go down the `alternatives` path
> of their favorite OS.
> 
> 1) I don't think we should be get into the business of distributing
> Java, even if licensing allowed it.
> 2) The OS vendors are in the business of keeping users updated with
> upstream releases of Java, so there's no reason not to utilize them.
> 
> Michael
> 
> On 03/22/2018 05:12 PM, Jason Brown wrote:
>> See the legal-discuss@ thread:
>> https://mail-archives.apache.org/mod_mbox/www-legal-discuss/201803.mbox/browser
>> .
>>
>> TL;DR jlink-based distributions are not gonna fly due to OpenJDK's license,
>> so let's focus on other paths forward.
>>
>>
>> On Thu, Mar 22, 2018 at 2:04 PM, Carl Mueller 
>> wrote:
>>
>>> Is OpenJDK really not addressing this at all? Is that because OpenJDK is
>>> beholden to Oracle somehow? This is a major disservice to Apache and the
>>> java ecosystem as a whole.
>>>
>>> When java was fully open sourced, it was supposed to free the ecosystem to
>>> a large degree 

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Stefan Podkowinski
On 21.03.2018 15:41, Ariel Weisberg wrote:
> I'm not clear on what building and bundling our own JRE/JDK accomplishes? 

If we talk about OpenJDK, there will be only a single Java version
supported at any time and that is the latest Java version (11, 12, ..).
There is no overlap between supported versions. Therefor it doesn't
really make a lot of sense for us to officially support "a few releases
of the JDK" when we talk about OpenJDK releases. What we'd have to do is
to keep up with new Java versions by testing them and updating our code
base if necessary. Keep in mind that branches like 4.0 and 3.11 will
span several Java versions.

We can do this by communicating a list of branches and corresponding
Java releases that are officially supported. But we can also just bundle
and ship the latest OpenJDK release that we know to be working for
any Cassandra branch right away, which would avoid any incompatibility
issues between our releases and JREs installed by the user and is
probably easier for everyone. That's pretty much the biggest selling
point of bundling the JRE, but it will probably not happen anyway due to
the licensing restrictions.


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Stefan Podkowinski
The idea was not to build a custom JDK and ship it along with
Cassandra, but rather to use the new modular run-time images feature [0]
introduced in Java 9. See also the link posted by Jason [1] for a
practical introduction.

[0] http://openjdk.java.net/jeps/220
[1]
https://steveperkins.com/using-java-9-modularization-to-ship-zero-dependency-native-apps/


On 21.03.18 17:26, Michael Burman wrote:
> On 03/21/2018 04:52 PM, Josh McKenzie wrote:
>
>> This would certainly mitigate a lot of the core problems with the new
>> release model. Has there been any public statements of plans/intent
>> with regards to distros doing this?
> Since the latest official LTS version is Java 8, that's the only one
> with publicly available information. For RHEL, OpenJDK8 will receive
> updates until October 2020.  "A major version of OpenJDK is supported
> for a period of six years from the time that it is first introduced in
> any version of RHEL, or until the retirement date of the underlying
> RHEL platform, whichever is earlier." [1]
>
> [1] https://access.redhat.com/articles/1299013
>
>> In terms of the burden of bugfixes and security fixes if we bundled a
>> JRE w/C*, cutting a patch release of C* with a new JRE distribution
>> would be a really low friction process (add to build, check CI, green,
>> done), so I don't think that would be a blocker for the concept.
>>
> And do we have someone actively monitoring CVEs for this? Would we
> ship a version of OpenJDK which ensures that it works with all the
> major distributions? Would we run tests against all the major
> distributions for each of the OpenJDK version we would ship after each
> CVE with each Cassandra version? Who compiles the OpenJDK distribution
> we would create (which wouldn't be the official one if we need to
> maintain support for each distribution we support) ? What if one build
> doesn't work for one distro? Would we not update that CVE? OpenJDK
> builds that are in the distros are not necessarily the pure ones from
> the upstream, they might include patches that provide better support
> for the distribution - or even fix bugs that are not yet in the
> upstream version.
>
> I guess we also need the Windows versions, maybe the PowerPC & ARM
> versions also at some point. I'm not sure if we plan to support J9 or
> other JVMs at some point.
>
> We would also need to create CVE reports after each Java CVE for
> Cassandra as well I would assume since it would affect us separately
> (and updating only the Java wouldn't help).
>
> To me this sounds like an understatement of the amount of work that
> would go to this. Not to mention the bad publicity if Java CVEs are
> not actually patched instantly in the Cassandra also (and then each
> user would have to validate that the shipped version actually works
> with their installation in their hardware since they won't get support
> for it from the vendors as it's unofficial package).
>
>   - Micke
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Proposed changes to CircleCI testing workflow

2018-10-26 Thread Stefan Podkowinski
I'd like to give you a quick update on the work that has been done
lately on running tests using CircleCI. Please let me know if you have
any objections or don't think this is going into the right direction, or
have any other feedback!

We've been using CircleCI for a while now and results are used on a
constant basis for new patches. Not only by committers, but also by
casual contributors to run unit tests. It looks like people find the
service valuable and we should keep using it. Therefore I'd like to make
some improvements that will make it easier to add new tests and to
continue making CircleCI an option for all contributors, both on paid
and free plans.

The general idea of the changes implemented in #14806 is to consolidate
the existing config to make it more modular and have smaller jobs that
can be scheduled ad-hoc by the developer, instead of running a few big
jobs on every commit. Reorganizing and breaking up the existing config
was done using the new 2.1 config features. Starting jobs on request,
instead of automatically, is done using the manual approval feature,
i.e. you now have to click on that job in the workflow page in order to
start it (see the sketch below). I'd like to see us having smaller, more
specialized groups of tests that we can run more selectively during
development, while still being able to run bigger tests before
committing, or firing up all of them during testing and releasing. Other
examples of smaller jobs would be test coverage (#14788) or cqlsh tests
(#14298), but also individual jobs for different ant targets, like burn,
stress or benchmarks.

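As a rough sketch of what the approval feature looks like in the 2.1
config format (the job names here are made up, see #14806 for the real
config):

workflows:
  build_and_test:
    jobs:
      - build
      # manual approval: shows up as a clickable job in the workflow UI
      - start_j8_dtests:
          type: approval
          requires:
            - build
      # only runs once start_j8_dtests has been approved
      - j8_dtests:
          requires:
            - start_j8_dtests
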
We'd now also be able to run tests using different docker images and
different JDKs. I've already updated the image we use to also include Java
11 and added unit and dtest jobs to the config for that. It's now really
easy to run tests on Java 11, although these won't pass yet. It seems
important to me to have this kind of flexibility, given the
increasingly diverse ecosystem of Java distributions. We can also add
jobs for packaging and doing smoke tests by installing and starting
packages on different docker images (Redhat, Debian, Ubuntu,..) at a
later point.

As for the paid vs free plans issue, I'd also like us to discuss how we
can make tests faster and less resource intensive in general. As a
desired consequence, we'd be able to move away from multi-node dtests
to something that can be run using the free plan. I'm looking forward to
seeing if #14821 can get us in that direction. Ideally we can add these
tests into a job that can be completed on the free plan and encourage
contributors to add new tests there, instead of having to write a dtest,
which they won't be able to run on CircleCI without a paid plan.

Whats changing for you as a CircleCI user?
* All tests, except unit tests, will need to be started manually and
will not run on every commit (this can be further discussed and changed
anytime if needed)
* Updating the config.yml file now requires using the CircleCI cli tool
and should not be done directly (see #14806 for technical details)
* High resource settings can be enabled using a script/patch, either run
manually or as commit hook (again see ticket for details)
* Both free and paid plan users now have more tests to run

As already mentioned, please let me know if you have any thoughts on
this, or if you think this is going into the wrong direction.

Thanks.


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [DISCUSS] changing default token behavior for 4.0

2018-09-22 Thread Stefan Podkowinski
There have already been some discussions on this here:
https://issues.apache.org/jira/browse/CASSANDRA-13701

The blocker mentioned there on token allocation shouldn't exist
anymore, although it would be good to get more feedback on it in case
we want to enable it by default, along with new defaults for the number
of tokens.

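To make the settings in question concrete, a cassandra.yaml sketch; the
keyspace name is just an example, and allocate_tokens_for_rf is only a
proposal from this thread, not an existing option:

num_tokens: 4
# existing option, requires the keyspace to already exist when a node joins:
allocate_tokens_for_keyspace: my_keyspace
# proposed on this thread, does not exist yet:
# allocate_tokens_for_rf: 3
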

On 22.09.18 06:30, Dinesh Joshi wrote:
> Jon, thanks for starting this thread!
>
> I have created CASSANDRA-14784 to track this. 
>
> Dinesh
>
>> On Sep 21, 2018, at 9:18 PM, Sankalp Kohli  wrote:
>>
>> Putting it on JIRA is to make sure someone is assigned to it and it is 
>> tracked. Changes should be discussed over ML like you are saying. 
>>
>> On Sep 21, 2018, at 21:02, Jonathan Haddad  wrote:
>>
>>>> We should create a JIRA to find what other defaults we need to revisit.
>>> Changing a default is a pretty big deal, I think we should discuss any
>>> changes to defaults here on the ML before moving it into JIRA.  It's nice
>>> to get a bit more discussion around the change than what happens in JIRA.
>>>
>>> We (TLP) did some testing on 4 tokens and found it to work surprisingly
>>> well.   It wasn't particularly formal, but we verified the load stays
>>> pretty even with only 4 tokens as we added nodes to the cluster.  Higher
>>> token count hurts availability by increasing the number of nodes any given
>>> node is a neighbor with, meaning any 2 nodes that fail have an increased
>>> chance of downtime when using QUORUM.  In addition, with the recent
>>> streaming optimization it seems the token counts will give a greater chance
>>> of a node streaming entire sstables (with LCS), meaning we'll do a better
>>> job with node density out of the box.
>>>
>>> Next week I can try to put together something a little more convincing.
>>> Weekend time.
>>>
>>> Jon
>>>
>>>
>>> On Fri, Sep 21, 2018 at 8:45 PM sankalp kohli 
>>> wrote:
>>>
>>>> +1 to lowering it.
>>>> Thanks Jon for starting this. We should create a JIRA to find what
>>>> other defaults we need to revisit. (Please keep this discussion to
>>>> "default token" only.)

> On Fri, Sep 21, 2018 at 8:26 PM Jeff Jirsa  wrote:
>
> Also agree it should be lowered, but definitely not to 1, and probably
> something closer to 32 than 4.
>
> --
> Jeff Jirsa
>
>
>> On Sep 21, 2018, at 8:24 PM, Jeremy Hanna  wrote:
>> I agree that it should be lowered. What I’ve seen debated a bit in the
>> past is the number but I don’t think anyone thinks that it should
>> remain 256.
>>> On Sep 21, 2018, at 7:05 PM, Jonathan Haddad  wrote:
>>> One thing that's really, really bothered me for a while is how we
>>> default to 256 tokens still.  There's no experienced operator that
>>> leaves it as is at this point, meaning the only people using 256 are
>>> the poor folks that just got started using C*.  I've worked with over
>>> a hundred clusters in the last couple years, and I think I only worked
>>> with one that had lowered it to something else.
>>>
>>> I think it's time we changed the default to 4 (or 8, up for debate).
>>>
>>> To improve the behavior, we need to change a couple other things.  The
>>> allocate_tokens_for_keyspace setting is... odd.  It requires you have a
>>> keyspace already created, which doesn't help on new clusters.  What I'd
>>> like to do is add a new setting, allocate_tokens_for_rf, and set it to
>>> 3 by default.
>>>
>>> To handle clusters that are already using 256 tokens, we could prevent
>>> the new node from joining unless a -D flag is set to explicitly allow
>>> imbalanced tokens.
>>>
>>> We've agreed to a trunk freeze, but I feel like this is important
>>> enough (and pretty trivial) to do now.  I'd also personally
>>> characterize this as a bug fix since 256 is horribly broken when the
>>> cluster gets to any reasonable size, but maybe I'm alone there.
>>>
>>> I honestly can't think of a use case where random tokens is a good
>>> choice anymore, so I'd be fine / ecstatic with removing it completely
>>> and requiring either allocate_tokens_for_keyspace (for existing
>>> clusters) or allocate_tokens_for_rf to be set.
>>>
>>> Thoughts?  Objections?
>>> --
>>> Jon Haddad
>>> http://www.rustyrazorblade.com
>>> twitter: rustyrazorblade

Re: Proposing an Apache Cassandra Management process

2018-11-18 Thread Stefan Podkowinski
My goal for a side car would be to enable more people to contribute to
the project, by making it more accessible for anyone who’s not familiar
with the Cassandra code base, or not familiar with Java development in
general. Although most of the functionality described in the proposal
sounds useful to have, I’d already be happy to have a solid REST API for
the existing nodetool and JMX functionality. If an official side car,
installed separately on each node, would provide that, I’m sure we’d see
lots of new tools created by the community (web UIs, cli tools, ..)
based on that. This would also be a good foundation for other existing
tool to converge upon, e.g. by calling the REST APIs for repair
scheduling and progress tracking instead of JMX, or by continually
integrating and sharing useful helper calls. This would also give
Cassandra devs more leeway to replace some of the existing tooling
related code in Cassandra, e.g. by migrating to virtual tables, while at
the same time keep providing a stable API through the side car.

What I’d also like to point out here is that implementing such a project
as an *official* side car also implies to me having the same standards
when it comes to release quality. I’d also really prefer having feature
sets matching between Cassandra and the side car, e.g. authentication
and SSL should also be supported in the side car from the beginning,
ideally without any additional configuration.


On 06.11.18 10:40, Dinesh Joshi wrote:
> Hi all,
>
> Joey, Vinay & I have fleshed out the Management process proposal as the very 
> first CIP document (with Jason’s inputs). It is available on the cwiki - 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652224
>
> Please comment on it and provide us any input that you may have. We want to 
> ideally time-box the period to 2 weeks so we avoid waiting indefinitely.
>
> Thanks,
>
> Dinesh
>
>> On Oct 22, 2018, at 7:30 AM, "dinesh.jo...@yahoo.com.INVALID" 
>>  wrote:
>>
>> Thanks for starting this, Mick. I will flesh it out.
>> Dinesh 
>>
>>On Sunday, October 21, 2018, 1:52:10 AM PDT, Mick Semb Wever 
>>  wrote:  
>>
>>
>>> But I'll try to put together a strawman proposal for the doc(s) over the 
>>> weekend. 
>>
>> I've thrown something quickly together here:
>> - https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652201
>> - 
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CIP-1%3A+Proposing+an+Apache+Cassandra+Management+process
>>
>> The former is a blatant rip-off from the Kafka and Spark design proposal 
>> pages that Dinesh previously mentioned. I'd hoped to do more of an analysis 
>> of the existing C* habits and precedence on design proposals (implicit in 
>> jira tickets), but in lieu of that this is a strawman to start the discussion.
>>
>> The latter still needs to be fleshed out. Dinesh, can you do this? I can add 
>> a subpage/section that describes the alternative/consuming third-party tools 
>> out there.
>>
>> regards,
>> Mick
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: JIRA Workflow Proposals

2018-12-05 Thread Stefan Podkowinski
Thanks Benedict and everyone involved in putting up the proposal! It really
deserves some more feedback and I realize that I'm a bit late for that
and probably missed a good deal of the conversation so far. I'd still
like to share some of my notes that I've taken while reading through it,
for the sake of discussion.


Priority:
Blocker and critical seem more useful to me, compared to "urgent",
which is not clear about what's being urgent to whom and would probably
be picked from a personal perspective. Blockers are useful for identifying
issues that need to be solved before creating an imminent release.
Critical should be used for patches important enough for releases that
are maintained on a "critical bug fixes only" basis. It's not strictly
used that way, but "urgent" doesn't address this.

Complexity:
Possibly inappropriate for some types of issues, simply not known yet, or
impossible to tell. I'd not add it and keep using labels for that if we
have to highlight outliers, like lhf.

Discovered by:
Neat idea, would be very interesting to analyze that.

Bug category:
More complex bugs will probably fall into multiple categories. Choices
are hard to answer without the description. Do we really need this as a
mandatory field for new bug reports, which may not even have been
analyzed yet? What would you pick, e.g., when reporting a broken test on CI?

Component:
I'd prefer not to have subcategories or multi-select values. It's
too inflexible (why would any hint issues necessarily be consistency
issues?).

Features:
The distinction between features and components isn't really clear to me.
At least crosscuts like observability should be there, instead of
components. It's still useful, as it's general enough, easy to answer and
insightful. I'd reduce the component list a bit and make this an
optional follow-up selection after features.

Remove 'Reproduced in':
We should have a field that allows the user to report the Cassandra
version used for an issue.


As for workflow changes, I don't have a real opinion on that yet and would
like to give this some more thought. But the suggested review status
changes are something I'd definitely like to see happening.


On 04.12.18 20:12, Benedict Elliott Smith wrote:
> Ok, so after an initial flurry everyone has lost interest :)
>
> I think we should take a quick poll (not a vote), on people’s positions on 
> the questions raised so far.  If people could try to take the time to stake a 
> +1/-1, or A/B, for each item, that would be really great.  This poll will not 
> be the end of discussions, but will (hopefully) at least draw a line under 
> the current open questions.
>
> I will start with some verbiage, then summarise with options for everyone to 
> respond to.  You can scroll to the summary immediately if you like.
>
> - 1. Component: Multi-select or Cascading-select (i.e. only one component 
> possible per ticket, but neater UX)
> - 2. Labels: rather than litigate people’s positions, I propose we do the 
> least controversial thing, which is to simply leave labels intact, and only 
> supplement them with the new schema information.  We can later revisit if we 
> decide it’s getting messy.
> - 3. "First review completed; second review ongoing": I don’t think we need 
> to complicate the process; if there are two reviews in flight, the first 
> reviewer can simply comment that they are done when ready, and the second 
> reviewer can move the status once they are done.  If the first reviewer wants 
> substantive changes, they can move the status to "Change Request” before the 
> other reviewer completes, if they like.  Everyone involved can probably 
> negotiate this fairly well, but we can introduce some specific guidance on 
> how to conduct yourself here in a follow-up.  
> - 4. Priorities: Option A: Wish, Low, Normal, High, Urgent; Option B: Wish, 
> Low, Normal, Urgent
> - 5. Mandatory Platform and Feature. Make mandatory by introducing new “All” 
> and “None” (respectively) options, so always possible to select an option.
> - 6. Environment field: Remove?
>
> I think this captures everything that has been brought up so far, except for 
> the suggestion to make "Since Version” a “Version” - but that needs more 
> discussion, as I don’t think there’s a clear alternative proposal yet.
>
> Summary:
>
> 1: Component. (A) Multi-select; (B) Cascading-select
> 2: Labels: leave alone +1/-1
> 3: No workflow changes for first/second review: +1/-1
> 4: Priorities: Including High +1/-1
> 5: Mandatory Platform and Feature: +1/-1
> 6: Remove Environment field: +1/-1
>
> I will begin.
>
> 1: A
> 2: +1
> 3: +1
> 4: +1
> 5: Don’t mind
> 6: +1
>
>
>
>
>> On 29 Nov 2018, at 22:04, Scott Andreas  wrote:
>>
>> If I read Josh’s reply right, I think the suggestion is to periodically 
>> review active labels and promote those that are demonstrably useful to 
>> components (cf. folksonomy -> 
>> taxonomy). 
>> I hadn’t read the reply as 

Re: JIRA Reports in Confluence

2018-11-22 Thread Stefan Podkowinski
Thanks for sorting out components across all these tickets. I really
like the idea of having predefined reports.

Looking at how tickets are grouped between 4.0, 4.0.x and 4.x, we should
probably do some cleanup for the "fix version" attribute as well. We
usually set the final version once a patch has been committed, e.g. "3.11.3"
should list only issues that have been addressed in that version, and all
such issues should be set to resolved by now. Any issue using "3.11.x"
indicates that it is a potential candidate for the next
3.11.4 release, if resolved in time. Following this approach, an
unresolved 4.0 issue should not exist. Those should be 4.x. Maybe the
author wanted to indicate that the issue should definitely be resolved
for 4.0, but isn't ready yet. But in that case, we should bump the
priority to blocker instead or create a label. Also, the "4.0.x" value
doesn't make any sense to me and should probably simply be set to "4.x".


On 19.11.18 01:51, Scott Andreas wrote:
> Hi everyone,
>
> I’ve created several new JIRA reports in Confluence organized under this 
> top-level page:
> https://cwiki.apache.org/confluence/display/CASSANDRA/Jira+reports
>
> These pages report open issues by Component and fix version. My aims in 
> creating them are to:
>
> – Represent what’s currently screened into each upcoming release milestone.
> – Break releases down by component to assess health/outstanding work in each.
> – Make it easier to identify what’s scoped where (and to gauge release size).
> – Create pages that can be used as queues for screening and for patches 
> awaiting review.
>
> They may also help structure discussions on release scope and what’s in / 
> what’s out. I’ve refrained from updating the “fix version” field on tickets 
> this weekend, but hope that these pages can become useful toward doing so as 
> a dev community. Additional JIRA grooming is needed before these can support 
> scope / timeline discussions on a per-release basis (esp. getting all active 
> and planned testing work represented) – but representing the current state of 
> things seemed like a prerequisite.
>
> The current reports are:
>
> – 4.0: Open Issues by Component
> https://cwiki.apache.org/confluence/display/CASSANDRA/4.0%3A+Open+Issues+by+Component
>
> – 4.0.x: Open Issues by Component
> https://cwiki.apache.org/confluence/display/CASSANDRA/4.0.x%3A+Open+Issues+by+Component
>
> – 4.x: Open Issues by Component
> https://cwiki.apache.org/confluence/display/CASSANDRA/4.x%3A+Open+Issues+by+Component
>
> – Open Issues by Component (Unscreened)
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=97550493
>
> – Patch Available
> https://cwiki.apache.org/confluence/display/CASSANDRA/Patch+Available
>
> If you’ve got cabin fever from poor air quality (or just really love 
> screening bugs), I’d love help adding components to tickets on the "Open 
> Issues by Component - Unscreened” page.
>
> Cheers,
>
> – Scott
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Who should be in our distribution KEYS file?

2019-01-07 Thread Stefan Podkowinski
I don't see any reason to have any keys in there, except those of release 
managers who sign releases. Additional keys from other developers 
may even harm security, by creating more opportunities for compromising 
keys.

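For context, the verification steps an operator runs against a release
artifact (per the links below) look like this; the file name is just an
example:

$ gpg --import KEYS
$ gpg --verify apache-cassandra-3.11.3-bin.tar.gz.asc apache-cassandra-3.11.3-bin.tar.gz
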

On 07.01.19 11:29, Mick Semb Wever wrote:

And when should it get updated?

Currently our KEYS file, which contains the public keys of those that can sign 
released binary artifacts, only contains a few of the PMC. My understanding is 
that we've avoided updating it because it causes headaches for operators, who have 
to validate the authenticity of a new key that's signed a binary when upgrading.

If this is accurate, how prevalent is this problem actually for operators? Do 
some operators download the KEYS fresh from apache.org every release? Are the 
keys of our PMCs already in the existing web of trust?

I'm not knowledgeable on the precedent here for operators, and curious 
what the community's stance is (and why)… And whether the time is right to 
add all/more of our PMC to the file? And whether we should always add new PMC to 
the file (if they're already in the web of trust)?

cheers,
Mick

https://www.apache.org/info/verification.html#Validating
https://www.apache.org/dyn/closer.cgi#verify
https://dist.apache.org/repos/dist/release/cassandra/KEYS

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Who should be in our distribution KEYS file?

2019-01-09 Thread Stefan Podkowinski
If creating native deb/rpm package signatures and repo metadata 
signatures is the only issue that stops us from releasing more often and 
sharing the burden in doing so, then I'd be happy to discuss dropping 
this altogether. We can always distribute binary packages along with 
checksums and a detached gpg signature by any KEYS person, just as we do 
with the tarballs.


Having an Apache hosted Cassandra yum/deb repo may be convenient, but I 
wonder how many users will still download the packages and host them in 
their own internal repo. Even if it's just to have the option to pin the 
package to a specific version, which we currently don't offer, as only 
the latest package is included in the repo metadata. It might also 
encourage distros to keep downstream repos updated as well, since that 
would be the ideal way of creating and distributing package, i.e. having 
a Debian/Fedora/Ubuntu maintainer taking care of that and only have 
users come back to us for the vanilla rpm in case their downstream 
version is outdated.



On 08.01.19 02:48, Michael Shuler wrote:

Yeah, I asked if someone made a request thinking I totally missed it!
Since the last couple tick-tock releases, which were time based, every
release has been initiated by someone commenting in IRC or dev@ that
"there are a lot of things in CHANGES.txt" or "important fix foo has
been committed, let's release". It looks like the interval on releases
has been about 4-6 months, so we're due :)

More packaging / GPG key tickets, if someone wants to take them on:
https://issues.apache.org/jira/browse/CASSANDRA-14966
https://issues.apache.org/jira/browse/CASSANDRA-14967
https://issues.apache.org/jira/browse/CASSANDRA-14968

I see other Debian and RPM type repositories configured in the Apache
bintray project, so perhaps signing of the package repositories can be
done using their signing feature(?).

I just really wish to cause users as little installation/upgrade
difficulty as possible. If we're going to change up to something that allows
more people to do maven and tarball releases, great, I don't care, since
there's no tar-install-client that cares. I absolutely don't want
deb/rpm package repository users to constantly need to update GPG keys
every release, that's not nice to our users. If there is a way to get
the repos set up correctly in bintray, and it's OK with INFRA to use the
bintray key signing feature, great. We should set up a new package
signing key and configure bintray to sign the metadata. One change for a
long-term solution.

Michael

On 1/7/19 4:34 PM, Jeff Jirsa wrote:

I dont think it's awkward, I think a lot of us know there are serious bugs
and we need a release, but we keep finding other bugs and it's super
tempting to say "one more fix"

We should probably just cut next 3.0.x and 3.11.x though, because there are
some nasty bugs hiding in there that the testing for 4.0 has uncovered.


On Mon, Jan 7, 2019 at 2:14 PM Jonathan Haddad  wrote:


I don't understand how adding keys changes release frequency. Did
someone request a release to be made or are we on some assumed date
interval?

I don't know if it would (especially by itself), I just know that if more
people are able to do releases that's more opportunity to do so.

I think getting more folks involved in the release process is a good idea
for other reasons.  People take vacations, there's job conflicts, there's
life stuff (kids usually take priority), etc.

The last release of 3.11 was almost half a year ago, and there's 30+ bug
fixes in the 3.11 branch.


Did someone request a release to be made or are we on some assumed date
interval?

I can't recall (and a search didn't find) anyone asking for a 3.11.4
release, but I think part of the point is that requesting a release from a
static release manager is a sign of a flaw in the release process.

On a human, note, it feels a little awkward asking for a release.  I might
be alone on this though.

Jon


On Mon, Jan 7, 2019 at 1:16 PM Michael Shuler 
wrote:


Mick and I have discussed this previously, but I don't recall if it was
email or irc. Apologies if I was unable to describe the problem to a
point of general understanding.

To reiterate the problem, changing gpg signature keys screws our debian
and redhat package repositories for all users. Tarballs are not
installed with a client that checks signatures in a known trust
database. When gpg key signer changes, users need to modify their trust
on every node, importing new key(s), in order for packages to
install/upgrade with apt or yum.

I don't understand how adding keys changes release frequency. Did
someone request a release to be made or are we on some assumed date
interval?

Michael

On 1/7/19 2:30 PM, Jonathan Haddad wrote:

That's a good point.  Looking at the ASF docs I had assumed the release
manager was per-project, but on closer inspection it appears to be
per-release.  You're right, it does say that it can be any committer.


Latest changes to CircleCI testing workflow

2019-03-15 Thread Stefan Podkowinski



tldr; make sure to read the new instructions in the .circleci directory, 
if you’re a circleci user:

https://github.com/apache/cassandra/tree/trunk/.circleci


It’s been a while since I last reached out on dev- regarding proposed 
changes to our CircleCI setup [0]. Meanwhile Marcus and I have been able 
to finish working out the details as part of CASSANDRA-14806 and the 
discussed changes are now live.


The config.yml file has been updated to the latest config format, 
cleaned up and should now allow us to run tests in a more selective way, 
by making use of the “approval” feature, which enables you to explicitly 
start tests in the UI, instead of running everything automatically or 
having to edit the config file manually to do that. We’ll still run unit 
and jvm dtests automatically, as they finish quickly, but we can do so 
for other tests, too, if that would make sense.


Some new tests have been made available as well, along with the option 
to run tests with Java 11 on trunk. You can find an example build of 
what this would look like here:
https://circleci.com/workflow-run/e1eec049-70f1-4a07-b3e9-1c519cb26888 
(ci login required)


I’ve been in contact with CircleCI support about possible options for 
not having to edit the config.yml to switch between high/low resource 
settings, but that doesn’t seem to be possible right now. As a 
compromise, we changed from having to edit config.yml to just copying an 
existing config version if you want to use the high resource settings, 
which should still be a bit less annoying to do than before. Please 
refer to the readme linked on top for details.


[0] 
https://lists.apache.org/thread.html/9519d69e5c1a94e67c376462ecbff00aee9f0bf016126ad928fb3283@%3Cdev.cassandra.apache.org%3E



-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: cqlsh tests and Python 3

2019-02-12 Thread Stefan Podkowinski

Previous discussion can be found here:

https://lists.apache.org/thread.html/cbc50f5ac085ac759b52eb7e87277a3b82e2773c6d507c4b525d@%3Cdev.cassandra.apache.org%3E


On 11.02.19 19:58, Ariel Weisberg wrote:

Hi,

Do you mean Python 2/3 compatibility?

This has been discussed earlier and I think that being compatible with both is 
an easier sell.

Ariel


On Feb 11, 2019, at 1:24 PM, dinesh.jo...@yahoo.com.INVALID 
 wrote:

Hey all,
We've gotten the cqlsh tests running in the Cassandra repo (these are distinct from 
the cqlsh tests in dtests repo). They're in Python 2.7 and use nosetests. 
We'd like to make them consistent with the rest of the tests which means moving 
them to Python 3 & Pytest framework. However this would involve migrating cqlsh 
to Python 3. Does anybody have any concerns if we move cqlsh to Python 3? Please 
note that Python 2 is EOL'd and will be unsupported in about 10 months.
So here are the options -
1. Leave cqlsh in Python 2.7 & nosetests. Just make sure they're running as part of the build process.
2. Move cqlsh to Python 3 & pytests.
3. Leave cqlsh in Python 2.7 but move to Pytests. This option doesn't really add much value though.
Thanks,
Dinesh


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [VOTE] Release Apache Cassandra 2.1.21

2019-02-03 Thread Stefan Podkowinski
What are we voting on here? Releasing the 2.1.21 candidate, or that 2.1 
would become EOL? Please let's have separate votes on that, if you want 
to propose making 2.1 EOL (which I'm strongly -1 on).



On 03.02.19 01:32, Michael Shuler wrote:

*EOL* release for the 2.1 series. There will be no new releases from the
'cassandra-2.1' branch after this release.



I propose the following artifacts for release as 2.1.21.

sha1: 9bb75358dfdf1b9824f9a454e70ee2c02bc64a45
Git:
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/2.1.21-tentative
Artifacts:
https://repository.apache.org/content/repositories/orgapachecassandra-1173/org/apache/cassandra/apache-cassandra/2.1.21/
Staging repository:
https://repository.apache.org/content/repositories/orgapachecassandra-1173/

The Debian and RPM packages are available here:
http://people.apache.org/~mshuler

The vote will be open for 72 hours (longer if needed).

[1]: CHANGES.txt:
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/2.1.21-tentative
[2]: NEWS.txt:
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/2.1.21-tentative



-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [VOTE] Release Apache Cassandra 2.1.21

2019-02-04 Thread Stefan Podkowinski
We currently intend to support 2.1 with critical fixes only, which 
leaves some room for interpretation. As usual, people have different 
views, in this case on what exactly a critical fix is. If there are some 
patches that are potential candidates for 2.1, but haven’t been 
committed, then we should revisit them and discuss back-porting. But EOL 
dates are not something that should be changed ad-hoc. If we do that, we 
might as well completely stop communicating support periods at all, 
since we may change our minds any time anyways, and I don’t think that’s 
acceptable.


On 04.02.19 01:31, Michael Shuler wrote:

My first couple sentences in the vote email were intended as a
statement, based on a lack of concerns voiced on EOL of 2.1.

I made a request for comment on EOL of 2.1 series a month ago, in
"Subject: EOL 2.1 series?":
https://lists.apache.org/thread.html/87ee2e3d13ea96744545ed8612496a93f8235747c4f60d0084b37bae@%3Cdev.cassandra.apache.org%3E

Yes, I'm aware our download page states we support 2.1 until 4.0, but we
do not really do so.

The reality is that developers have been actively ignoring the branch,
even when suggested to commit to the 2.1 branch. I can go dig up IRC
logs and commits, but I really don't feel like it adds any value to the
conversation. As Jonathan Haddad says, let's just be honest with users
about what has already been happening independently.

To continue stating we actively support 2.1 until 4.0 and actually
follow through, the project should audit fixed bugs in 2.2+ and see if
they still exist in 2.1, unfixed. I imagine there are a few. I know for
sure of one that was not committed. Alternatively, we sunset the branch,
make that change on the download page, and move on. I don't think it's
right to continue telling users we are doing something, if we aren't and
haven't been.

Michael

On 2/3/19 5:24 PM, Anthony Grasso wrote:

+1 non-binding, for the release of 2.1.21

Regarding EOL of 2.1.x, did we announce in the past that 2.1.21 would be
the final release?

According to the download <http://cassandra.apache.org/download/> page 2.1
is meant to be supported with critical fixes only until 4.0 is released. I
suspect that people may be relying on this, as I have seen a number 2.1.x
clusters still in production use.

On Mon, 4 Feb 2019 at 07:09, Jonathan Haddad  wrote:


I think having the discussion around EOL is pretty important, in order to
set the right expectations for the community.

Looking at the commits for 2.1, there's hardly any activity going on,
meaning it's effectively been EOL'ed for a long time now.  I think it's
better that we be honest with folks about it.

On Sun, Feb 3, 2019 at 9:34 AM Nate McCall  wrote:


+1 on the release of 2.1.21 (let's focus on that in the spirit of
these other votes we have up right now).

I don't feel the need to be absolutist about something being EOL.

On Sun, Feb 3, 2019 at 1:47 AM Stefan Podkowinski 

wrote:

What are we voting on here? Releasing the 2.1.21 candidate, or that 2.1
would become EOL? Please let's have separate votes on that, if you want
to propose putting 2.1 EOL (which I'm strongly -1).


On 03.02.19 01:32, Michael Shuler wrote:

*EOL* release for the 2.1 series. There will be no new releases from

the

'cassandra-2.1' branch after this release.



I propose the following artifacts for release as 2.1.21.

sha1: 9bb75358dfdf1b9824f9a454e70ee2c02bc64a45
Git:
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/2.1.21-tentative
Artifacts:
https://repository.apache.org/content/repositories/orgapachecassandra-1173/org/apache/cassandra/apache-cassandra/2.1.21/
Staging repository:
https://repository.apache.org/content/repositories/orgapachecassandra-1173/

The Debian and RPM packages are available here:
http://people.apache.org/~mshuler

The vote will be open for 72 hours (longer if needed).

[1]: CHANGES.txt:
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/2.1.21-tentative
[2]: NEWS.txt:
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/2.1.21-tentative
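
For voters who want to check the tag against the sha1 above, a minimal
sketch, assuming a local clone with the tentative tag fetched and that
the listed sha1 is the commit hash:

    # Check that the 2.1.21-tentative tag dereferences to the listed commit.
    import subprocess

    expected = "9bb75358dfdf1b9824f9a454e70ee2c02bc64a45"
    actual = subprocess.run(
        ["git", "rev-parse", "2.1.21-tentative^{commit}"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    print("OK" if actual == expected else f"MISMATCH: {actual}")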




--
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade




Re: [VOTE] Apache Cassandra Release Lifecycle

2019-10-01 Thread Stefan Podkowinski
What exactly would be the implication of the outcome of this vote, if 
the content is agreed upon? What is the proposal being voted on?


The document seems to be informative rather than formal. It's verbose on 
definitions that should be commonly understood or can only be broadly 
defined (what alpha/beta/RC means, what GA for production means, etc.), 
while at the same time being unclear and weasel-worded on our actual 
commitments and rules.



On 30.09.19 20:51, sankalp kohli wrote:

Hi,
 We have discussed the Apache Cassandra Release Lifecycle in the email
thread[1]. We came up with a doc[2] for it. Please vote in favour if you
agree with the content of the doc[2].

Thanks,
Sankalp

[1]
https://lists.apache.org/thread.html/c610b23f9002978636b66d09f0e0481ed3de9b78895050da22c91c6f@%3Cdev.cassandra.apache.org%3E
[2]
https://docs.google.com/document/d/1bS6sr-HSrHFjZb0welife6Qx7u3ZDgRiAoENMLYlfz8/edit#heading=h.633eppni91tw






Re: [VOTE-2] Apache Cassandra Release Lifecycle

2019-10-08 Thread Stefan Podkowinski

From the document:

General Availability (GA): “A new branch is created for the release with 
the new major version, limiting any new feature addition to the new 
release branch, while new feature development continues to happen 
only on trunk.”
Maintenance: “Missing features from newer generation releases are 
back-ported on a per-PMC-vote basis.”


We had a feature freeze before 4.0, which showed that people have 
different views on what actually qualifies as a feature. This doesn't work 
without defining “feature” in more detail. Although I'd rather avoid 
having this in the document at all, since I don't think it gets us 
anywhere without a clearer picture of the bigger context in 
which releases are going to happen in the future, starting with release 
cadence and support periods. How can we decide that *all* new features 
are supposed to go into trunk only, if we don't even have an idea of 
the upcoming release schedule?


“Bug fix releases have an associated new minor version.”

So the next bug fix version will be 4.1? Will there be no minor feature 
releases like we had with 3.x.0/2.x.0?


Deprecated:
"Through a dev community voting process, EOL date is determined for this 
release.”

“Users actively encouraged to move away from the offering.”

We should give users a way to plan, by having EOL dates that may be 
months or years in the future. We did this with 3.0 and 2.x, which 
would all have been “deprecated” a long time ago under the new proposal.


Deprecated: “Only security vulnerabilities and production-impacting bugs 
without workarounds are addressed.”


Devs will interpret “production-impacting bugs without workarounds” 
however they need to, so I don't think we should have this in the 
document. It's okay to use EOLed releases, and we should not prevent 
users from contributing smaller fixes, performance improvements, and 
useful enhancements to minor feature releases.


On 08.10.19 20:00, sankalp kohli wrote:

Hi,
 We have discussed the Apache Cassandra Release Lifecycle in the email
thread[1]. We came up with a doc[2] for it and have finalized it here[3].
Please vote in favour if you agree with the content of the doc[3].

We did not proceed with the previous vote, as we want to use Confluence for
it. Here is the link for that[4]; it also mentions why we are doing this
vote.

Vote will remain open for 72 hours.

Thanks,
Sankalp

[1]
https://lists.apache.org/thread.html/c610b23f9002978636b66d09f0e0481ed3de9b78895050da22c91c6f@%3Cdev.cassandra.apache.org%3E
[2]
https://docs.google.com/document/d/1bS6sr-HSrHFjZb0welife6Qx7u3ZDgRiAoENMLYlfz8/edit#heading=h.633eppni91tw
[3]https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle
[4]
https://lists.apache.org/thread.html/169b00f45dbad295e1aea1da70365fabc8452ef497f78ddfd28c311f@%3Cdev.cassandra.apache.org%3E







Re: [VOTE] Project governance wiki doc (take 2)

2020-06-22 Thread Stefan Podkowinski

+1

On 22.06.20 20:12, Blake Eggleston wrote:

+1


On Jun 20, 2020, at 8:12 AM, Joshua McKenzie  wrote:

Link to doc:
https://cwiki.apache.org/confluence/display/CASSANDRA/Apache+Cassandra+Project+Governance

Change since previous cancelled vote:
"A simple majority of this electorate becomes the low-watermark for votes
in favour necessary to pass a motion, with new PMC members added to the
calculation."

This previously read "super majority". We have lowered the low water mark
to "simple majority" to balance strong consensus against risk of stall due
to low participation.
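
To make the arithmetic concrete, a toy example (the electorate size
below is hypothetical):

    # Simple-majority low watermark: the smallest vote count strictly
    # greater than half the electorate.
    electorate = 30          # hypothetical PMC roll call count
    low_watermark = electorate // 2 + 1
    print(low_watermark)     # 16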


   - Vote will run through 6/24/20
   - pmc votes considered binding
   - simple majority of binding participants passes the vote
   - committer and community votes considered advisory

Lastly, I propose we take the count of pmc votes in this thread as our
initial roll call count for electorate numbers and low watermark
calculation on subsequent votes.

Thanks again everyone (and specifically Benedict and Jon) for the time and
collaboration on this.

~Josh

