Re: Proposal to retroactively mark materialized views experimental

Pavel Yaskevich Wed, 04 Oct 2017 12:22:13 -0700

On Wed, Oct 4, 2017 at 12:09 PM, Jon Haddad <j...@jonhaddad.com> wrote:


> MVs work fine for *some use cases*, not the general use case.  That’s why
> there should be a flag.  To opt into the feature when the behavior is only
> known to be correct under a certain set of circumstances.  Nobody is saying
> the flag should be “enable_terrible_feature_nobody_tested_and_we_all_hate”,
> or something ridiculous like that.  It’s not an attack against the work
> done by anyone, the level of effort put in, or minimizing the complexity of
> the problem.  “enable_materialized_views” would be just fine.
>
> We should be honest to people about what they’re getting into.  You may
> not be aware of this, but a lot of people still believe Cassandra isn’t a
> DB that you should put in prod.  It’s because features like SASI, MVs,  or
> incremental repair get merged in prematurely (or even made the default),
> without having been thoroughly tested, understood and vetted by trusted
> community members.  New users hit the snags because they deploy the
> bleeding edge code and hit the bugs.
>

I beg to differ in the case of SASI: it has been tested, vetted, and ported
to different versions. I'm pretty sure it still has better test coverage
than most of the project does. It's not a "default", and you actually have
to opt in to it by creating a custom index. How is that premature or
misleading to users?


>
> That’s not how the process should work.
>
> Ideally, we’d follow a process that looks a lot more like this:
>
> 1. New feature is built with an opt in flag.  Unknowns are documented, the
> risk of using the feature is known to the end user.
> 2. People who know what they’re doing test and use the feature.  They are
> able to read the code, submit patches, and help flush out the issues.  They
> do so in low risk environments.  In the case of MVs, they can afford to
> drop and rebuild the view over a week, or rebuild the cluster altogether.
> We may not even need to worry as much about backwards compatibility.
> 3. The feature matures.  More tests are written.  More people become aware
> of how to contribute to the feature’s stability.
> 4. After a while, we vote on removing the feature flag and declare it
> stable for general usage.
>
> If nobody actually cares about a feature (why was it written in the
> first place?), then it would never get to 2, 3, 4.  It would take a while
> for big features like MVs to be marked stable, and that’s fine, because it
> takes a long time to actually stabilize them.  I think we can all agree
> they are really, really hard problems to solve, and maybe it takes a while.
>
> Jon
>
>
>
> > On Oct 4, 2017, at 11:44 AM, Josh McKenzie <jmcken...@apache.org> wrote:
> >
> >>
> >> So you’d rather continue to lie to users about the stability of the
> >> feature rather than admitting it was merged in prematurely?
> >
> >
> > Much like w/SASI, this is something that's in the code-base that for
> >> certain use-cases apparently works just fine.
> >
> > I don't know of any outstanding issues with the feature,
> >
> > There appear to be varying levels of understanding of the implementation
> >> details of MV's (that seem to directly correlate with faith in the
> >> feature's correctness for the use-cases recommended)
> >
> > We have users in the wild relying on MV's with apparent success (same
> >> holds true of all the other punching bags that have come up in this thread)
> >
> > You're right, Jon. That's clearly exactly what I'm saying.
> >
> >
> > On Wed, Oct 4, 2017 at 2:39 PM, Jon Haddad <j...@jonhaddad.com> wrote:
> >
> >> So you’d rather continue to lie to users about the stability of the
> >> feature rather than admitting it was merged in prematurely?  I’d rather
> >> come clean and avoid future problems, and give people the opportunity to
> >> stop using MVs rather than let them keep taking risks they’re unaware of.
> >> This is incredibly irresponsible in my opinion.
> >>
> >>> On Oct 4, 2017, at 11:26 AM, Josh McKenzie <jmcken...@apache.org> wrote:
> >>>
> >>>>
> >>>> Oh, come on. You're being disingenuous.
> >>>
> >>> Not my intent. MV's (and SASI, for example) are fairly well isolated; we
> >>> have a history of other changes that are much more broadly and higher
> >>> impact risk-wise across the code-base.
> >>>
> >>> If I were an operator and built a critical part of my business on a
> >>> released feature that developers then decided to default-disable as
> >>> 'experimental' post-hoc, I'd think long and hard about using any new
> >>> features in that project in the future (and revisit my confidence in all
> >>> other features I relied on, and the software as a whole). We have users in
> >>> the wild relying on MV's with apparent success (same holds true of all the
> >>> other punching bags that have come up in this thread) and I'd hate to see
> >>> us alienate them by being over-aggressive in the way we handle this.
> >>>
> >>> I'd much rather we continue to aggressively improve and continue to analyze
> >>> MV's stability before a 4.0 release and then use the experimental flag in
> >>> the future, if at all possible.
> >>>
> >>> On Wed, Oct 4, 2017 at 2:01 PM, Benedict Elliott Smith <_@belliottsmith.com>
> >>> wrote:
> >>>
> >>>> Can't we promote these behavioural flags to keyspace properties (with
> >>>> suitable permissions to edit necessary)?
> >>>>
> >>>> I agree that enabling/disabling features shouldn't require a rolling
> >>>> restart, and nor should switching their consistency safety level.
> >>>>
> >>>> I think this would be the most suitable equivalent to ALLOW FILTERING for
> >>>> MVs.
> >>>>
> >>>>
> >>>>
> >>>>> On 4 Oct 2017, at 12:31, Jeremy Hanna <jeremy.hanna1...@gmail.com> wrote:
> >>>>>
> >>>>> Not to detract from the discussion about whether or not to classify X or
> >>>>> Y as experimental, but
> >>>>> https://issues.apache.org/jira/browse/CASSANDRA-8303 was originally
> >>>>> about operators preventing users from abusing features (e.g. allow
> >>>>> filtering).  Could that concept be extended to features like MVs or SASI
> >>>>> or anything else?  On the one hand it is nice to be able to set those
> >>>>> things dynamically without a rolling restart as well as by user.  On the
> >>>>> other it’s less clear about defaults.  In a property file or just in the
> >>>>> yaml, the operator could specify the default features that are enabled
> >>>>> for users, and those defaults could then be overridden within that
> >>>>> framework.
> >>>>>
> >>>>>> On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko <alek...@apple.com> wrote:
> >>>>>>
> >>>>>> We already have those for UDFs and CDC.
> >>>>>>
> >>>>>> We should have more: for triggers, SASI, and MVs, at least. Operators
> >>>>>> need a way to disable features they haven’t validated.
> >>>>>>
> >>>>>> We already have sufficient consensus to introduce the flags, and we
> >>>>>> should. There also seems to be sufficient consensus on emitting warnings.
> >>>>>>
> >>>>>> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I
> >>>>>> agree with Sylvain that flipping the default in a minor would be invasive.
> >>>>>> We shouldn’t do that.
> >>>>>>
> >>>>>> For trunk, though, I think we should default to off. When it comes to
> >>>>>> releasing 4.0 we can collectively decide if there is sufficient trust in
> >>>>>> MVs at the time to warrant flipping the default to true. Ultimately we can
> >>>>>> decide this in a PMC vote. If I misread the consensus regarding the default
> >>>>>> for 4.0, then we might as well vote on that. What I see is sufficient
> >>>>>> distrust coming from core committers, including the author of the v1
> >>>>>> design, to warrant opt-in for MVs.
> >>>>>>
> >>>>>> If we don’t trust in them as developers, we shouldn’t be cavalier with
> >>>>>> the users, either. Not until that trust is gained/regained.
> >>>>>>
> >>>>>> —
> >>>>>> AY
> >>>>>>
> >>>>>> On 4 October 2017 at 13:26:10, Stefan Podkowinski (s...@apache.org) wrote:
> >>>>>>
> >>>>>> Introducing feature flags for enabling or disabling different code paths
> >>>>>> is not sustainable in the long run. It's hard enough to keep up with
> >>>>>> integration testing with the couple of Jenkins jobs that we have.
> >>>>>> Running jobs for all permutations of flags that we keep around would
> >>>>>> turn out impractical. But if we don't, I'm pretty sure something will
> >>>>>> fall off the radar and it won't take long until someone reports that
> >>>>>> enabling feature X after the latest upgrade will simply not work anymore.
> >>>>>>
> >>>>>> There may also be some more subtle assumptions and cross dependencies
> >>>>>> between features that may cause side effects by disabling a feature (or
> >>>>>> parts of it), even if it's just e.g. a metric value that suddenly won't
> >>>>>> get updated anymore, but is used somewhere else. We'll also have to
> >>>>>> consider migration paths for turning a feature on and off again without
> >>>>>> causing any downtime. If I was to turn on e.g. MVs on a single node in
> >>>>>> my cluster, then this should not cause any issues on the other nodes
> >>>>>> that still have MV code paths disabled. Again, this would need to be
> >>>>>> tested.
> >>>>>>
> >>>>>> So to be clear, my point is that any flags should be implemented in a
> >>>>>> really non-invasive way on the user facing side only, e.g. by emitting a
> >>>>>> log message or cqlsh error. At this point, I'm not really sure if it
> >>>>>> would be a good idea to add them to cassandra.yaml, as I'm pretty sure
> >>>>>> that eventually they will be used to change the behaviour of our code,
> >>>>>> besides printing a log message.
> >>>>>>
> >>>>>>
> >>>>>> On 04.10.17 10:03, Mick Semb Wever wrote:
> >>>>>>>>> CDC sounds like it is in the same basket, but it already has the
> >>>>>>>>> `cdc_enabled` yaml flag which defaults false.
> >>>>>>>> I went this route because I was incredibly wary of changing the CL
> >>>>>>>> code and wanted to shield non-CDC users from any and all risk I
> >>>>>>>> reasonably could.
> >>>>>>>
> >>>>>>> This approach so far is my favourite. (Thanks Josh.)
> >>>>>>>
> >>>>>>> The flag name `cdc_enabled` is simple and, without adjectives, does not
> >>>>>>> imply "experimental" or "beta" or anything like that.
> >>>>>>> It does make life easier for both operators and the C* developers.
> >>>>>>>
> >>>>>>> I'm also fond of how Apache projects often vote both on the release as
> >>>>>>> well as its stability flag: Alpha|Beta|GA (General Availability).
> >>>>>>> https://httpd.apache.org/dev/release.html
> >>>>>>> http://www.apache.org/legal/release-policy.html#release-types
> >>>>>>>
> >>>>>>> Given the importance of The Database, I'd be keen to see such
> >>>>>>> community-agreed quality references attached, and going further, not
> >>>>>>> just to the releases but also to substantial new features (those yet to
> >>>>>>> reach GA). Then the downloads page could provide a table something like
> >>>>>>> https://paste.apache.org/FzrQ
> >>>>>>>
> >>>>>>> It's just one idea to throw out there, and while it hijacks the thread a
> >>>>>>> bit, it could even with just the quality tag on releases go a long way
> >>>>>>> with user trust. Especially if we really are humble about it and use GA
> >>>>>>> appropriately. For example I'm perfectly happy using a beta in production
> >>>>>>> if I see the community otherwise has good processes in place and there's
> >>>>>>> strong testing and staging resources to take advantage of. And as Kurt
> >>>>>>> has implied many users are indeed smart and wise enough to know how to
> >>>>>>> safely test and cautiously use even alpha features in production.
> >>>>>>>
> >>>>>>> Anyway, with or without the above idea, yaml flag names that don't
> >>>>>>> use adjectives could address Kurt's concerns about pulling the rug from
> >>>>>>> under the feet of existing users. Such a flag is but a small improvement
> >>>>>>> suitable for a minor release (you must read the NEWS.txt before even a
> >>>>>>> patch upgrade), and the documentation is only making explicit what should
> >>>>>>> have been all along. Users shouldn't feel that we're returning features
> >>>>>>> into "alpha|beta" mode when what we're actually doing is improving the
> >>>>>>> community's quality assurance documentation.
> >>>>>>>
> >>>>>>> Mick
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> ---------------------------------------------------------------------
> >>>>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>>>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>
> >>
> >>
> >>
>
>
>
>
