Re: Proposal to retroactively mark materialized views experimental

Benedict Elliott Smith Wed, 04 Oct 2017 08:57:06 -0700

Oh, come on. You're being disingenuous.

I invented both algorithms, so I get some say in which is more complex.  I 
fully understand the behaviour of early reopen and can explain it to a lay 
person in around five minutes.  Last time I posted an analysis of MVs it took 
me several days to get it straight in my head just enough to be sure the novel 
problems I was pointing out existed - and in no way did I have confidence I had 
established all the problems.  It wasn't until well after it was completed we 
realised it had some hugely fundamental limitations around primary keys.  I 
would NOT be able to explain the algorithm or its implications to a lay person 
AT ALL.


That said, I would absolutely be comfortable marking incremental repair and 
SASI experimental if this is required to cover MVs with the moniker.  The 
former is less complex than MVs, but it fits a similar category of complex 
distributed systems implications we hadn't properly modelled. It *has* now had 
extensive testing in the wild though. Conversely SASI has had very little burn 
test, but employs fairly well established approaches, and suffers from very 
little distributed systems complexity.

> On 4 Oct 2017, at 11:12, Josh McKenzie <jmcken...@apache.org> wrote:
> 
> I don't agree at face value that early re-open is in sum a lot simpler than
> MV, or that adding CQL and deprecating Thrift was a lot simpler, or the
> 8099 refactor, etc. Different types of complexity, certainly, and MV's are
> arguably harder to prove correct due to surface area of exposure to failure
> states. Definitions of complexity aside, I do agree with the general
> principle that MV's are very complex and, as with many other things in the
> DB, boundary conditions are insufficiently understood and tested at this
> time. There's also a recency bias to the defects and active work people are
> seeing with MV as there has been a recent focus on stabilizing that rather
> than with the long tail we've seen with other, more pervasive and
> foundational changes to the code-base over the course of the past few years.
> 
> MV's aren't the only thing in the DB that I think qualify for 'flagging as
> not-production-ready' by the criteria people are attempting to selectively
> apply to the feature here. If we go the route of flagging one already
> released feature experimental because we lack confidence in it, there are
> other things we similarly lack confidence in that should be treated
> similarly (incremental repair, SASI to name two that immediately come to
> mind). I personally don't think changing the qualification and user
> experience of features post-release sends a good message to said users; if
> we all agreed unanimously that these features were this failure-prone and
> high-risk, it would be more appropriate to make that change however that's
> obviously not the case here.
> 
> 
> On Wed, Oct 4, 2017 at 10:41 AM, Benedict Elliott Smith
> <_...@belliottsmith.com> wrote:
> 
>> So, as the author of one of the disasters you mention (early re-open), I
>> would prefer to learn from the mistake and not repeat it.  Unfortunately we
>> seem to be in the habit of repeating it, and that feature was a lot *lot*
>> simpler.
>> 
>> Let’s not kid ourselves: MVs are by far and away the most complicated
>> feature we have ever delivered.  We do not fully understand it, even in
>> theory, let alone can we be sure we have the implementation right.
>> 
>> So, if we all agree our testing is ordinarily insufficient, can’t we agree
>> it is probably *really* insufficient here?
>> 
>> I don’t want to give the impression I’m shifting the goals.  I’ve been
>> against MV inclusion as they stand for some time, as were several others.
>> I think in the new world order of project/community structure, they
>> probably would have been rejected as they stand.
>> 
>> I’ve consistently listed my own requirements for considering them
>> production ready:  extensive modelling and simulation of the algorithm’s
>> properties (in lieu of formal proofs), *safe* default behaviour (rollback
>> CASSANDRA-10230, or make it a per-table option, and default to fast only
>> for existing tables to avoid surprise), tools for detecting and repairing
>> inconsistencies, and more extensive testing.
>> 
>> Many of these things were agreed as prerequisites for release of 3.0, but
>> ultimately they were not delivered.
>> 
>> I do, however, absolutely agree with Sylvain that we need to minimise
>> surprise in a patch version.
>> 
>> 
>> On 4 Oct 2017, at 08:58, Josh McKenzie <jmcken...@apache.org> wrote:
>> 
>>>> and providing a feature we don't fully understand, have not fully
>>>> documented the caveats of, let alone discovered all the problems with nor
>>>> had that knowledge percolate fully into the wider community.
>>> There appear to be varying levels of understanding of the implementation
>>> details of MV's (that seem to directly correlate with faith in the
>>> feature's correctness for the use-cases recommended) on this email thread
>>> so while I respect a sense of general wariness about the state of
>>> correctness testing with C*, I don't agree that the thoroughness of testing
>>> of MV's is any different than any other feature we've added to the
>>> code-base since the project's inception.
>>> 
>>> That's not to say I think the current extent of our testing before GA on
>>> features is adequate; I don't, but I don't think it makes sense to draw an
>>> arbitrary line in the sand with already released features that are in use
>>> in production clusters, flagging said features as experimental after the
>>> fact, and thus eroding users' trust in our collective definition of done.
>>> What's to stop us from flagging other, seemingly arbitrary features people
>>> are relying on in production as experimental in the future? What does that
>>> mean for their faith in the project and their job security? SASI? LWT?
>>> Counters? Triggers? Repair and compaction due to (still arising) edge-cases
>>> and defects in early re-open and incremental repair? All of these features
>>> still have edge-cases due to the inherent complexity of the code-base and
>>> problem domain in which we work.
>>> 
>>> Right now there appear to be the two camps of 'I can't clearly articulate
>>> what Good Enough is since it's Complicated, but I know we're not there' and
>>> 'if people are relying on it in production without issue it's by definition
>>> good enough for their use-case'. It's a compromise; nothing is ever perfect
>>> (as we all know). I'm all for us saying 'We need better testing of features
>>> going forward', 'We need better metrics for the coverage and branch testing
>>> of things in C*', etc, and definitely in favor of us spending some time to
>>> increase our coverage for existing features.
>>> 
>>> I don't think MV's are any different than anything else in this code-base
>>> in terms of how well vetted the features are, for better or for worse.
>>> 
>>> On Wed, Oct 4, 2017 at 5:21 AM, kurt greaves <k...@instaclustr.com> wrote:
>>> 
>>>>> 
>>>>> The flag name `cdc_enabled` is simple and, without adjectives, does not
>>>>> imply "experimental" or "beta" or anything like that.
>>>>> It does make life easier for both operators and the C* developers.
>>>> 
>>>> I would be all for a mv_enabled option, assuming it's enabled by default
>>>> for all existing branches. I don't think saying that you are meant to read
>>>> NEWS.txt before upgrading a patch is acceptable. Most people don't, and
>>>> expecting them to is a bit insane. Also, assuming that if they read it
>>>> they'd understand all implications is a bit questionable. If deemed
>>>> suitable to turn it off that can be done in the next major/minor, but I
>>>> think that would be unlikely, as we should really require sufficient
>>>> evidence that it's dangerous which I just don't think we have. I'm still of
>>>> the opinion that MV in their current state are no worse off than a lot of
>>>> other features, and marking them as experimental and disabling now would
>>>> just be detrimental to their development and annoy users. Also if we give
>>>> them that treatment then there are a whole load of other defaults we should
>>>> change and disable which is just not acceptable in a patch release. It's
>>>> not really necessary anyway, we don't have anyone crying bloody murder on
>>>> the mailing list about how everything went to hell because they used
>>>> feature x.
>>>> 
>>>> No one has really provided any counter evidence yet that MV's are in some
>>>> awful state and they are going to shoot users. There are a few existing
>>>> issues that I've brought up already, but they are really quite minor,
>>>> nothing comparable to "lol you can't repair if you use vnodes, sorry". I
>>>> think we really need some real examples/evidence before making calls like
>>>> "let's disable this feature in a patch release and mark it experimental".
>>>> 
>>>>> I personally believe it is better to offer the feature as experimental
>>>>> until we iron out all of the problems
>>>> 
>>>> What problems are you referring to, and how exactly will we know when all
>>>> of them have been sufficiently ironed? If we mark it as experimental how
>>>> exactly are we going to get people to use said feature to find issues?
>>>> 
>>>> 
>> 


