Re: [DISCUSS] Future of MVs

David Capwell Tue, 30 Jun 2020 15:28:26 -0700

If that is the case then shouldn't we add MV to "4.0 Quality: Components
and Test Plans" (CASSANDRA-15536)?  It is currently missing, so adding it
to the testing road map would be a clear sign that someone is planning to
champion and own this feature; if people feel that this is a broken
feature, shouldn't we have tests showing this?  Would be great to see
traction here.


On Tue, Jun 30, 2020 at 3:11 PM Joshua McKenzie <jmcken...@apache.org>
wrote:

> Let's forget I said anything about release cadence. That's another thread
> entirely and a good deep conversation to explore. Don't want to derail.
>
> If there's a question about "is anyone stepping forward to maintain MV's",
> I can say with certainty that at least one full time contributor I work
> with will engage and continue to work on and improve this feature going
> forward. Who precisely that ends up being remains to be seen; that's more
> fluid, but there are no plans to stop working on it going forward.
>
> On Tue, Jun 30, 2020 at 5:45 PM Benedict Elliott Smith <bened...@apache.org> wrote:
>
> > I don't think we can realistically expect majors, with the deprecation
> > cycle they entail, to come every six months.  If nothing else, we would
> > have too many versions to maintain at once.  I personally think all the
> > project needs on that front is clearer roadmapping at the start of a
> > release cycle, and we would be fine with 12-18mo release cycles.
> >
> > That's another whole discussion to distract us from 4.0, anyway - though I
> > think we can tolerate a few slow-burn conversations.
> >
> >
> > On 30/06/2020, 22:10, "Joshua McKenzie" <jmcken...@apache.org> wrote:
> >
> >     Seems like a reasonable point of view to me, Sankalp. I'd also suggest we
> >     try to find other sources of data than just the user ML, like searching on
> >     GitHub for instance. A collection of imperfect metrics beats just one in my
> >     experience.
> >
> >     Though I would ask why we're having this discussion this late in the
> >     release cycle when we have what, 4 tickets left until cutting beta 1? Seems
> >     like the kind of thing we could reasonably defer while we focus on getting
> >     4.0 out, though I'm sympathetic to the "release is cutoff for deprecation"
> >     argument.
> >
> >     If we cadence our majors to calendar (like every 6 months for example)
> >     instead of scope this would become significantly less of a big issue imo.
> >
> >     On Tue, Jun 30, 2020 at 5:01 PM sankalp kohli <kohlisank...@gmail.com> wrote:
> >
> >     > Hi,
> >     >     I think we should revisit all features which require a lot more work to
> >     > make them work. Here is what I think we should do for each of them:
> >     >
> >     > 1. Identify such features and some details of why they are deprecation
> >     > candidates.
> >     > 2. Ask the dev list if anyone is willing to work on improving them over the
> >     > next 1 or 2 major releases.
> >     > 3. We then move to the user list to find out who is using it and whether they
> >     > are opposed to removing/deprecating it. Assuming few will be using it, we
> >     > need to see the tradeoff of keeping it vs removing it on a case by case
> >     > basis.
> >     > 4. Deprecate it in the next major, or make it experimental if #2 and #3
> >     > remove it from deprecation.
> >     > 5. Remove it in next major
> >     >
> >     > For MV, I see this email as step #2. We should move to asking the user list
> >     > next.
> >     >
> >     > Thanks,
> >     > Sankalp
> >     >
> >     > On Tue, Jun 30, 2020 at 1:46 PM Joshua McKenzie <jmcken...@apache.org> wrote:
> >     >
> >     > > We're just short of 98 tickets on the component since its original merge
> >     > > so at least *some* work has been done to stabilize them. Not to say I'm
> >     > > endorsing running them at massive scale today without knowing what you're
> >     > > doing, to be clear. They are perhaps our largest loaded gun of a feature
> >     > > of self-foot-shooting atm. Zhao did a bunch of work on them internally and
> >     > > we've backported much of that to OSS; I've pinged him to chime in here.
> >     > >
> >     > > The "data is orphaned in your view when you lose all base replicas" issue
> >     > > is more or less "unsolvable", since a scan of a view to confirm data in the
> >     > > base table is so slow you're talking weeks to process and it totally
> >     > > trashes your page cache. I think Paulo landed on a "you have to rebuild the
> >     > > view if you lose all base data" reality. There's also, I believe, the
> >     > > unresolved issue of modeling how much data a base table with one to many
> >     > > views will end up taking up in its final form when denormalized. This could
> >     > > be vastly improved with something like an "EXPLAIN ANALYZE" for a table
> >     > > with views, if you'll excuse the mapping, to show "N bytes in base will
> >     > > become M with base + views" or something.
> >     > >
> >     > > Last but definitely not least in dumping the state in my head about this,
> >     > > there's a bunch of potential for guardrailing people away from self-harm
> >     > > with MV's if we decide to go the route of guardrails (link:
> >     > > https://cwiki.apache.org/confluence/display/CASSANDRA/%28DRAFT%29+-+CEP-3%3A+Guardrails
> >     > > ).
> >     > >
> >     > > So from my PoV, I'm against us just voting to deprecate and remove without
> >     > > going into more depth into the current state of things and what options are
> >     > > on the table, since people will continue to build MV's at the client level
> >     > > which, in theory, should have worse correctness and performance
> >     > > characteristics than having a clean and well stabilized implementation in
> >     > > the coordinator.
> >     > >
> >     > > Having them flagged as experimental for now as we stabilize 4.0 and get
> >     > > things out the door *seems* sufficient to me, but if people are widely
> >     > > using these out in the wild and ignoring that status and the corresponding
> >     > > warning, maybe we consider raising the volume on that warning for 4.0 while
> >     > > we figure this out.
> >     > >
> >     > > Just my .02.
> >     > >
> >     > > ~Josh
> >     > >
> >     > > On Tue, Jun 30, 2020 at 4:22 PM Dinesh Joshi <djo...@apache.org> wrote:
> >     > >
> >     > > > > On Jun 30, 2020, at 12:43 PM, Jon Haddad <j...@jonhaddad.com> wrote:
> >     > > > >
> >     > > > > As we move forward with the 4.0 release, we should consider this an
> >     > > > > opportunity to deprecate materialized views, and remove them in 5.0. We
> >     > > > > should take this opportunity to learn from the mistake and raise the bar
> >     > > > > for new features to undergo a much more thorough run through the wringer
> >     > > > > before merging.
> >     > > >
> >     > > > I'm in favor of marking them as deprecated and removing them in 5.0. If
> >     > > > someone steps up and can fix them in 5.0, then we always have the option of
> >     > > > accepting the fix.
> >     > > >
> >     > > > Dinesh
> >     > > >
> >     > > > ---------------------------------------------------------------------
> >     > > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >     > > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >     > > >
> >     > > >
> >     > >
> >     >
> >
