It sounds to me (please correct me if I'm wrong) like Jeff is arguing that
releasing 4.0 in 2 months isn't worth the effort of evaluating it, because
it's a big task and there's not enough stuff in 4.0 to make it worthwhile.
If that is the case, I'm not quite sure how increasing the surface area of
changed code which needs to be vetted is going to make the process any
easier. Changing 10 things versus changing 100 is not a 10x difference in
the time it takes to figure out what's going on; it's significantly more
than that, due to the number of interactions between the 100 changes.
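
As a rough illustration of that scaling, here is a minimal sketch that assumes only pairwise interactions between changes matter. That is a simplification for the sake of the arithmetic, not a measurement of the actual 4.0 change list, and the function name is made up for this example.

```python
from math import comb


def pairwise_interactions(n_changes: int) -> int:
    """Count the distinct pairs among n_changes changes (n choose 2).

    Assumption: each pair of changes is one potential interaction to vet.
    Real changes can also interact in groups of three or more, so this is
    a lower bound on the vetting surface, not an estimate of real effort.
    """
    return comb(n_changes, 2)


small = pairwise_interactions(10)    # 45 pairs
large = pairwise_interactions(100)   # 4950 pairs
ratio = large / small                # 110.0, versus 10x more changes

print(f"10 changes:  {small} pairs")
print(f"100 changes: {large} pairs")
print(f"ratio:       {ratio:.0f}x")
```

Under that assumption, ten times as many changes means roughly 110 times as many pairs to think about.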

That said, I think everyone should look at the list of changes to
evaluate if they really think it's worthwhile to release 4.0 as is.

Consider the reasons people upgrade. Usually it's for a feature, or for
improved performance / density, which leads to lower costs. Major releases
are typically not thought of as stability upgrades, and no one with any
knowledge of the project's history will believe that anyway. Yes, we should
aim for a rock solid 4.0.0, but it's on us to be the first adopters to
prove it works, as Blake said previously.

As Jeff notes, the cost of upgrading Cassandra clusters is non-trivial, so
to make it worth the investment there has to be some real gain. Otherwise,
in 3 years people are still going to be running 2.1, and frankly, that's
not great for the project either. As someone who advises teams almost
daily, my main justification for pushing someone to upgrade to 4.0 would be
improved stability, especially around incremental repair. Is that enough
to get more teams using newer versions, going through the entire cycle of
dev time / QA / upgrading / fixing stuff? If something isn't on fire right
now, and they're not using anything that's a pain point, would upgrading be
a high priority?

Regarding testing, I've been thinking about this for a while, and there's
definitely a gap there. For non-committers & contributors, it might be
really hard, probably too hard, to get into this. I've been trying to wrap
my head around how to build something along the lines of a stress tool, but
with pre-defined loads that cover a lot of use cases, that can be run
across clusters of various sizes with the ability to inject uncertainty in
a manner similar to Jepsen. I don't have anything concrete to share but I
think having a common toolset that *anyone* can pick up to help test
clusters would enable more people to help act as QA. Assuming a team has a
staging environment, or the ability to spin one up, a tool like this could
help get some really fast feedback from a broader range of teams.
Just to be clear, I'm not advocating for anything other than everyone
taking a step back from their own opinion and considering the broader

On Thu, Apr 12, 2018 at 9:06 AM Michael Shuler <mich...@pbandjelly.org> wrote:
> On 04/12/2018 10:57 AM, Michael Shuler wrote:
> > Our current internal trunk test summary is attached. We're actually in a
> > pretty good state on the baseline test suites, thanks to
> > committers/reviewers.
> > Due to compute resource limitations and error noise on py2->py3 update,
> > the following test suites are not being run internally on our CI system,
> > so these should probably get first effort in the larger "fix all the
> > tests" theme:
> > novnode-dtest (repetitive, so disabled atm)
> > large-dtest (big instances, disabled on resources)
> > upgrade-dtest (error noise and long test, disabled on resources)
> > cqlsh-tests (py2->py3 work needed)
> > Just wanted to throw this out there for some context. There is certainly
> > some work to be done, but I don't think we're in such dire straits as
> > may be assumed from the conversation.
> It looks like attachments may get stripped from the list.