I don't accept at face value that early re-open is, in sum, a lot simpler than MVs, or that adding CQL and deprecating Thrift, or the 8099 refactor, etc., were a lot simpler. They involve different types of complexity, certainly, and MVs are arguably harder to prove correct because of their surface area of exposure to failure states. Definitions of complexity aside, I do agree with the general principle that MVs are very complex and that, as with many other things in the DB, their boundary conditions are insufficiently understood and tested at this time. There's also a recency bias at work: the defects and active work people are seeing with MVs reflect a recent focus on stabilizing them, as opposed to the long tail we've seen with other, more pervasive and foundational changes to the code-base over the past few years.
MVs aren't the only thing in the DB that I think qualifies for 'flagging as not-production-ready' by the criteria people are selectively applying to this feature. If we go the route of flagging one already-released feature as experimental because we lack confidence in it, there are other things we lack confidence in that should get the same treatment (incremental repair and SASI, to name two that immediately come to mind). I personally don't think changing the qualification and user experience of features post-release sends a good message to said users. If we all agreed unanimously that these features were this failure-prone and high-risk, it would be more appropriate to make that change; however, that's obviously not the case here.

On Wed, Oct 4, 2017 at 10:41 AM, Benedict Elliott Smith <_...@belliottsmith.com> wrote:

> So, as the author of one of the disasters you mention (early re-open), I
> would prefer to learn from the mistake and not repeat it. Unfortunately we
> seem to be in the habit of repeating it, and that feature was a lot *lot*
> simpler.
>
> Let’s not kid ourselves: MVs are by far and away the most complicated
> feature we have ever delivered. We do not fully understand it, even in
> theory, let alone can we be sure we have the implementation right.
>
> So, if we all agree our testing is ordinarily insufficient, can’t we agree
> it is probably *really* insufficient here?
>
> I don’t want to give the impression I’m shifting the goals. I’ve been
> against MV inclusion as they stand for some time, as were several others.
> I think in the new world order of project/community structure, they
> probably would have been rejected as they stand.
>
> I’ve consistently listed my own requirements for considering them
> production ready: extensive modelling and simulation of the algorithm’s
> properties (in lieu of formal proofs), *safe* default behaviour (rollback
> CASSANDRA-10230, or make it a per-table option, and default to fast only
> for existing tables to avoid surprise), tools for detecting and repairing
> inconsistencies, and more extensive testing.
>
> Many of these things were agreed as prerequisites for release of 3.0, but
> ultimately they were not delivered.
>
> I do, however, absolutely agree with Sylvain that we need to minimise
> surprise in a patch version.
>
> > On 4 Oct 2017, at 08:58, Josh McKenzie <jmcken...@apache.org> wrote:
> >
> >> and providing a feature we don't fully understand, have not fully
> >> documented the caveats of, let alone discovered all the problems with nor
> >> had that knowledge percolate fully into the wider community.
> > There appear to be varying levels of understanding of the implementation
> > details of MV's (that seem to directly correlate with faith in the
> > feature's correctness for the use-cases recommended) on this email thread
> > so while I respect a sense of general wariness about the state of
> > correctness testing with C*, I don't agree that the thoroughness of
> > testing of MV's is any different than any other feature we've added to
> > the code-base since the project's inception.
> >
> > That's not to say I think the current extent of our testing before GA on
> > features is adequate; I don't, but I don't think it makes sense to draw
> > an arbitrary line in the sand with already released features that are in
> > use in production clusters, flagging said features as experimental after
> > the fact, and thus eroding users' trust in our collective definition of
> > done. What's to stop us from flagging other, seemingly arbitrary features
> > people are relying on in production as experimental in the future?
> > What does that mean for their faith in the project and their job
> > security? SASI? LWT? Counters? Triggers? Repair and compaction due to
> > (still arising) edge-cases and defects in early re-open and incremental
> > repair? All of these features still have edge-cases due to the inherent
> > complexity of the code-base and problem domain in which we work.
> >
> > Right now there appear to be the two camps of 'I can't clearly articulate
> > what Good Enough is since it's Complicated, but I know we're not there'
> > and 'if people are relying on it in production without issue it's by
> > definition good enough for their use-case'. It's a compromise; nothing is
> > ever perfect (as we all know). I'm all for us saying 'We need better
> > testing of features going forward', 'We need better metrics for the
> > coverage and branch testing of things in C*', etc, and definitely in
> > favor of us spending some time to increase our coverage for existing
> > features.
> >
> > I don't think MV's are any different than anything else in this code-base
> > in terms of how well vetted the features are, for better or for worse.
> >
> > On Wed, Oct 4, 2017 at 5:21 AM, kurt greaves <k...@instaclustr.com> wrote:
> >
> >>>
> >>> The flag name `cdc_enabled` is simple and, without adjectives, does not
> >>> imply "experimental" or "beta" or anything like that.
> >>> It does make life easier for both operators and the C* developers.
> >>
> >> I would be all for a mv_enabled option, assuming it's enabled by default
> >> for all existing branches. I don't think saying that you are meant to
> >> read NEWS.txt before upgrading a patch is acceptable. Most people don't,
> >> and expecting them to is a bit insane. Also, assuming that if they read
> >> it they'd understand all implications is a bit questionable.
> >> If deemed suitable to turn it off that can be done in the next
> >> major/minor, but I think that would be unlikely, as we should really
> >> require sufficient evidence that it's dangerous which I just don't think
> >> we have. I'm still of the opinion that MVs in their current state are no
> >> worse off than a lot of other features, and marking them as experimental
> >> and disabling now would just be detrimental to their development and
> >> annoy users. Also if we give them that treatment then there are a whole
> >> load of other defaults we should change and disable which is just not
> >> acceptable in a patch release. It's not really necessary anyway, we
> >> don't have anyone crying bloody murder on the mailing list about how
> >> everything went to hell because they used feature x.
> >>
> >> No one has really provided any counter evidence yet that MV's are in
> >> some awful state and they are going to shoot users. There are a few
> >> existing issues that I've brought up already, but they are really quite
> >> minor, nothing comparable to "lol you can't repair if you use vnodes,
> >> sorry". I think we really need some real examples/evidence before making
> >> calls like "let's disable this feature in a patch release and mark it
> >> experimental"
> >>
> >>> I personally believe it is better to offer the feature as experimental
> >>> until we iron out all of the problems
> >>
> >> What problems are you referring to, and how exactly will we know when
> >> all of them have been sufficiently ironed? If we mark it as experimental
> >> how exactly are we going to get people to use said feature to find
> >> issues?