Re: [DISCUSS] Future of MVs

2020-06-30 Thread joshua . mckenzie
It would be incredibly helpful for us to have some empirical data and 
agreed-upon terms and benchmarks to help us navigate discussions like this:

  * How widely used is a feature in C* deployments worldwide?
  * What are the primary issues users face when deploying it? Scaling it?
During failure scenarios?
  * What does the engineering effort to bridge these gaps look like? Who will
do that? On what time horizon?
  * What does our current test coverage for this feature look like?
  * What kinds of defects are arising with the feature? Are they concentrated
in a specific subsection of the module or usage?
  * Do we have an agreed-upon set of standards for labeling a feature stable?
As experimental? If not, how do we get there?
  * What effort will it take to bridge from where we are to where we agree we
need to be? On what timeline is this acceptable?

I believe these are not only answerable questions, but fundamentally the
underlying themes our discussion alludes to. They’re also questions that apply
to a lot more than just MVs and tie into what you’re speaking to above,
Benedict.


> On Jun 30, 2020, at 8:32 PM, sankalp kohli  wrote:
> 
> I see this discussion as several decisions which can be made in small
> increments.
> 
> 1. In release cycles, when can we propose that a feature be deprecated or
> marked experimental? Ideally a new feature should come out as experimental if
> required, but we have several that are candidates now. We can work on
> integrating this into the release lifecycle doc we already have.
> 2. What is the process for making an existing feature experimental? How does
> it affect testing around major releases?
> 3. What is the process for deprecating/removing an experimental feature?
> (Assuming experimental features should be deprecated/removed.)
> 
> Coming to MVs, I think we need more data before we can say we
> should deprecate them. Here are some steps that should be part of the
> deprecation process:
> 1. Talk to customers who use them and understand what the impact is. Give
> them a forum to talk about it.
> 2. Do we have enough resources to bring this feature out of the
> experimental feature list in the next 1 or 2 major releases? We cannot have too
> many experimental features in the database. Marking a feature experimental
> should not be a parking place for a non-functioning feature but a place
> for it while we stabilize it.
> 
> 
> 
> 
>> On Tue, Jun 30, 2020 at 4:52 PM  wrote:
>> 
>> I followed up with the clarification about unit and dtests for that reason
>> Dinesh. We test experimental features now.
>> 
>> If we’re talking about adding experimental features to the 4.0 quality
>> testing effort, how does that differ from just saying “we won’t release
>> until we’ve tested and stabilized these features and they’re no longer
>> experimental”?
>> 
>> Maybe I’m just misunderstanding something here?
>> 
>>> On Jun 30, 2020, at 7:12 PM, Dinesh Joshi wrote:
>>> 
>>> 
>>>> 
>>>> On Jun 30, 2020, at 4:05 PM, Brandon Williams wrote:
>>>> 
>>>> Instead of ripping it out, we could instead disable them in the yaml
>>>> with big fat warning comments around it.  That way people already
>>>> using them can just enable them again, but it will raise the bar for
>>>> new users who ignore/miss the warnings in the logs and just use them.
>>> 
>>> Not a bad idea. Although, the real issue is that users enable MV on a 3
>>> node cluster with a few megs of data and conclude that MVs will
>>> horizontally scale with the size of data. This is what causes issues for
>>> users who naively roll it out in production and discover that MVs do not
>>> scale with their data growth. So whatever we do, the big fat warning should
>>> educate the unsuspecting operator.
>>> 
>>> Dinesh
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>> 
>> 
>> 
>> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [DISCUSS] Future of MVs

2020-06-30 Thread sankalp kohli
I see this discussion as several decisions which can be made in small
increments.

1. In release cycles, when can we propose that a feature be deprecated or
marked experimental? Ideally a new feature should come out as experimental if
required, but we have several that are candidates now. We can work on
integrating this into the release lifecycle doc we already have.
2. What is the process for making an existing feature experimental? How does
it affect testing around major releases?
3. What is the process for deprecating/removing an experimental feature?
(Assuming experimental features should be deprecated/removed.)

Coming to MVs, I think we need more data before we can say we
should deprecate them. Here are some steps that should be part of the
deprecation process:
1. Talk to customers who use them and understand what the impact is. Give
them a forum to talk about it.
2. Do we have enough resources to bring this feature out of the
experimental feature list in the next 1 or 2 major releases? We cannot have too
many experimental features in the database. Marking a feature experimental
should not be a parking place for a non-functioning feature but a place
for it while we stabilize it.




On Tue, Jun 30, 2020 at 4:52 PM  wrote:

> I followed up with the clarification about unit and dtests for that reason
> Dinesh. We test experimental features now.
>
> If we’re talking about adding experimental features to the 4.0 quality
> testing effort, how does that differ from just saying “we won’t release
> until we’ve tested and stabilized these features and they’re no longer
> experimental”?
>
> Maybe I’m just misunderstanding something here?
>
> > On Jun 30, 2020, at 7:12 PM, Dinesh Joshi  wrote:
> >
> > 
> >>
> >> On Jun 30, 2020, at 4:05 PM, Brandon Williams  wrote:
> >>
> >> Instead of ripping it out, we could instead disable them in the yaml
> >> with big fat warning comments around it.  That way people already
> >> using them can just enable them again, but it will raise the bar for
> >> new users who ignore/miss the warnings in the logs and just use them.
> >
> > Not a bad idea. Although, the real issue is that users enable MV on a 3
> > node cluster with a few megs of data and conclude that MVs will
> > horizontally scale with the size of data. This is what causes issues for
> > users who naively roll it out in production and discover that MVs do not
> > scale with their data growth. So whatever we do, the big fat warning should
> > educate the unsuspecting operator.
> >
> > Dinesh
> >
>
>
>


Re: [DISCUSS] Future of MVs

2020-06-30 Thread Dinesh Joshi
> On Jun 30, 2020, at 4:52 PM, joshua.mcken...@gmail.com wrote:
> 
> I followed up with the clarification about unit and dtests for that reason 
> Dinesh. We test experimental features now.

I hit send before seeing your clarification. I personally feel that unit tests and
dtests may not surface regressions. I'd prefer that the user community try out
the alpha, beta, and RC releases and report regressions as they find them.

Dinesh



Re: [DISCUSS] Future of MVs

2020-06-30 Thread J. D. Jordan
>>> Instead of ripping it out, we could instead disable them in the yaml
>>> with big fat warning comments around it. 


FYI we have already disabled use of materialized views, SASI, and transient 
replication by default in 4.0

https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L1393
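
For anyone who doesn't want to dig through the file, the relevant settings look roughly like the sketch below. The key names are as I recall them from the 4.0-era cassandra.yaml, so treat them as an assumption and check them against the linked file; names and line numbers on trunk can change.

```yaml
# Sketch of the experimental-feature switches (assumed key names; verify
# against the conf/cassandra.yaml shipped with your version).

# Materialized view creation is off by default.
enable_materialized_views: false

# SASI index creation is off by default.
enable_sasi_indexes: false

# Creation of transiently replicated keyspaces is off by default.
enable_transient_replication: false
```

Operators who already rely on one of these features would set the corresponding flag back to true on each node.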

> On Jun 30, 2020, at 6:53 PM, joshua.mcken...@gmail.com wrote:
> 
> I followed up with the clarification about unit and dtests for that reason 
> Dinesh. We test experimental features now.
> 
> If we’re talking about adding experimental features to the 4.0 quality testing 
> effort, how does that differ from just saying “we won’t release until we’ve 
> tested and stabilized these features and they’re no longer experimental”?
> 
> Maybe I’m just misunderstanding something here?
> 
>> On Jun 30, 2020, at 7:12 PM, Dinesh Joshi  wrote:
>> 
>> 
>>> 
>>> On Jun 30, 2020, at 4:05 PM, Brandon Williams wrote:
>>> 
>>> Instead of ripping it out, we could instead disable them in the yaml
>>> with big fat warning comments around it.  That way people already
>>> using them can just enable them again, but it will raise the bar for
>>> new users who ignore/miss the warnings in the logs and just use them.
>> 
>> Not a bad idea. Although, the real issue is that users enable MV on a 3 node 
>> cluster with a few megs of data and conclude that MVs will horizontally 
>> scale with the size of data. This is what causes issues for users who 
>> naively roll it out in production and discover that MVs do not scale with 
>> their data growth. So whatever we do, the big fat warning should educate the 
>> unsuspecting operator.
>> 
>> Dinesh
>> 
> 
> 


Re: [DISCUSS] Future of MVs

2020-06-30 Thread joshua . mckenzie
I followed up with the clarification about unit and dtests for that reason 
Dinesh. We test experimental features now.

If we’re talking about adding experimental features to the 4.0 quality testing 
effort, how does that differ from just saying “we won’t release until we’ve 
tested and stabilized these features and they’re no longer experimental”?

Maybe I’m just misunderstanding something here?

> On Jun 30, 2020, at 7:12 PM, Dinesh Joshi  wrote:
> 
> 
>> 
>> On Jun 30, 2020, at 4:05 PM, Brandon Williams  wrote:
>> 
>> Instead of ripping it out, we could instead disable them in the yaml
>> with big fat warning comments around it.  That way people already
>> using them can just enable them again, but it will raise the bar for
>> new users who ignore/miss the warnings in the logs and just use them.
> 
> Not a bad idea. Although, the real issue is that users enable MV on a 3 node 
> cluster with a few megs of data and conclude that MVs will horizontally scale 
> with the size of data. This is what causes issues for users who naively roll 
> it out in production and discover that MVs do not scale with their data 
> growth. So whatever we do, the big fat warning should educate the 
> unsuspecting operator.
> 
> Dinesh
> 




Re: [DISCUSS] Stabilizing 4.0

2020-06-30 Thread Dinesh Joshi
Thank you to all those who responded.

One potential way we could speed up sussing out issues is running regular "Bug 
Bashes" with the help of the user community. We could periodically post stats 
and recognize folks who contribute the most issues. This would help gain 
confidence in the builds we're putting out there. Thoughts?

Dinesh

> On Jun 30, 2020, at 7:21 AM, Benjamin Lerer  
> wrote:
> 
> It is a good catch, Mick. :-)
> 
> I will triage those tickets to be sure that our view of things is accurate.
> 
> 
> On Tue, Jun 30, 2020 at 11:38 AM Berenguer Blasi 
> wrote:
> 
>> That's a very good point. At the risk of saying something silly or being
>> Captain Obvious, as I am not familiar with the project dynamics, there
>> should be a periodic 'backlog triage' or similar. Otherwise we'll have
>> the impression we have just a handful of pending issues while another
>> batch ten times that size is hiding unnoticed.
>> 
>> On 30/6/20 11:18, Mick Semb Wever wrote:
 
>>>> Berenguer pointed out to me that we already have a graph to track those
>>>> things:
>>>> 
>>>> https://issues.apache.org/jira/secure/ConfigureReport.jspa?projectOrFilterId=filter-12347782&periodName=weekly&daysprevious=30&cumulative=true&versionLabels=none&selectedProjectId=12310865&reportKey=com.atlassian.jira.jira-core-reports-plugin%3Acreatedvsresolved-report&atl_token=A5KQ-2QAV-T4JA-FDED_fd75a3db98350d94229fbb4cf29cb50f3051d7ce_lin&Next=Next
>>> 
>>> 
>>> A lot of issues are also coming in without any fixVersion defined.
>>> For example (just in the past 4 weeks):
>>> 
>>> 
>> https://issues.apache.org/jira/issues/?filter=12347782&jql=project%20%3D%20cassandra%20AND%20((fixVersion%20is%20EMPTY%20AND%20created%20%20%3E%3D%20-4w))%20%20AND%20(resolution%20%3D%20unresolved%20OR%20status%20!%3D%20resolved%20OR%20resolved%20%3E%3D%20-4w)%20ORDER%20BY%20priority%20DESC%2C%20assignee
>>> 
>> 
>> 
>> 





Re: [DISCUSS] Future of MVs

2020-06-30 Thread Dinesh Joshi
> On Jun 30, 2020, at 4:05 PM, Brandon Williams  wrote:
> 
> Instead of ripping it out, we could instead disable them in the yaml
> with big fat warning comments around it.  That way people already
> using them can just enable them again, but it will raise the bar for
> new users who ignore/miss the warnings in the logs and just use them.

Not a bad idea. Although, the real issue is that users enable MVs on a 3-node
cluster with a few megs of data and conclude that MVs will scale horizontally
with the size of the data. This causes issues for users who naively roll them
out in production and discover that MVs do not scale with their data growth. So
whatever we do, the big fat warning should educate the unsuspecting operator.

Dinesh



Re: [DISCUSS] Future of MVs

2020-06-30 Thread Brandon Williams
On Tue, Jun 30, 2020 at 5:41 PM  wrote:
> Given we’re at a place where things like MV’s and sasi are backing production 
> cases (power users one would hope or smaller use cases) I don’t think ripping 
> those features out and further excluding users from the ecosystem is the 
> right move.

Instead of ripping them out, we could disable them in the yaml
with big fat warning comments around them.  That way people already
using them can just enable them again, but it will raise the bar for
new users who ignore/miss the warnings in the logs and just use them.




Re: [DISCUSS] Future of MVs

2020-06-30 Thread Dinesh Joshi
> On Jun 30, 2020, at 3:40 PM, joshua.mcken...@gmail.com wrote:
> 
> I don’t think we should hold up releases on testing experimental features. 
> Especially with how many of them we have.
> 
> Given we’re at a place where things like MV’s and sasi are backing production 
> cases (power users one would hope or smaller use cases)

Let's back up for a second here. MVs are backing production cases but we should
not spend time testing them for 4.0? That is an inherently contradictory
position.

Dinesh



Re: [DISCUSS] Future of MVs

2020-06-30 Thread joshua . mckenzie
Just to clarify one thing. I understand experimental features to be alpha /
beta quality, and as such their guarantees of correctness differ from those of
the other features in the database. We should likely articulate this in
the wiki and docs if we have not.

In the case of MVs, since they began as a regular feature, obviously we don’t
want a degradation in functionality, experimental or not. Our
guarantees and codification of feature APIs and functionality have historically
taken the form of unit tests and dtests, which, while limited in their ability
to explore and test a state space, do provide a minimal guarantee of API
consistency that should be sufficient to maintain our contracts of correctness
with experimental features.




> On Jun 30, 2020, at 6:40 PM, joshua.mcken...@gmail.com wrote:
> 




Re: [DISCUSS] Future of MVs

2020-06-30 Thread joshua . mckenzie
I don’t think we should hold up releases on testing experimental features. 
Especially with how many of them we have.

Agree re: needing a more quantitative bar for new additions, which we can also
apply retroactively to experimental features to bring them up to speed and
eventually graduate them. Probably worth separately defining criteria for
submission of a feature as experimental while we’re at it.

Given we’re at a place where things like MVs and SASI are backing production
cases (power users, one would hope, or smaller use cases), I don’t think ripping
those features out and further excluding users from the ecosystem is the right
move.

> On Jun 30, 2020, at 6:27 PM, David Capwell  wrote:
> 
> If that is the case then shouldn't we add MV to "4.0 Quality: Components
> and Test Plans" (CASSANDRA-15536)?  It is currently missing, so adding it
> to the testing road map would be a clear sign that someone is planning to
> champion and own this feature; if people feel that this is a broken
> feature, shouldn't we have tests showing this?  Would be great to see
> traction here.
> 
>> On Tue, Jun 30, 2020 at 3:11 PM Joshua McKenzie 
>> wrote:
>> 
>> Let's forget I said anything about release cadence. That's another thread
>> entirely and a good deep conversation to explore. Don't want to derail.
>> 
>> If there's a question about "is anyone stepping forward to maintain MV's",
>> I can say with certainty that at least one full time contributor I work
>> with will engage and continue to work on and improve this feature going
>> forward. Who precisely that ends up being stands to be seen; that's more
>> fluid, but there are no plans to stop working on it going forward.
>> 
>> On Tue, Jun 30, 2020 at 5:45 PM Benedict Elliott Smith <
>> bened...@apache.org>
>> wrote:
>> 
>>> I don't think we can realistically expect majors, with the deprecation
>>> cycle they entail, to come every six months.  If nothing else, we would
>>> have too many versions to maintain at once.  I personally think all the
>>> project needs on that front is clearer roadmapping at the start of a
>>> release cycle, and we would be fine with 12-18mo release cycles.
>>> 
>>> That's another whole discussion to distract us from 4.0, anyway - though
>> I
>>> think we can tolerate a few slow burn conversations.
>>> 
>>> 
>>> On 30/06/2020, 22:10, "Joshua McKenzie"  wrote:
>>> 
>>>Seems like a reasonable point of view to me Sankalp. I'd also suggest
>>> we
>>>try to find other sources of data than just the user ML, like
>>> searching on
>>>github for instance. A collection of imperfect metrics beats just one
>>> in my
>>>experience.
>>> 
>>>Though I would ask why we're having this discussion this late in the
>>>release cycle when we have what, 4 tickets left until cutting beta 1?
>>> Seems
>>>like the kind of thing we could reasonably defer while we focus on
>>> getting
>>>4.0 out, though I'm sympathetic to the "release is cutoff for
>>> deprecation"
>>>argument.
>>> 
>>>If we cadence our majors to calendar (like every 6 months for
>> example)
>>>instead of scope this would become significantly less of a big issue
>>> imo.
>>> 
>>>On Tue, Jun 30, 2020 at 5:01 PM sankalp kohli <
>> kohlisank...@gmail.com>
>>>wrote:
>>> 
>>>> Hi,
>>>> I think we should revisit all features which require a lot more work to
>>>> make them work. Here is how I think we should do for each one of them
>>>>
>>>> 1. Identify such features and some details of why they are deprecation
>>>> candidates.
>>>> 2. Ask the dev list if anyone is willing to work on improving them over the
>>>> next 1 or 2 major releases.
>>>> 3. We then move to the user list to find who all are using it and if they
>>>> are opposed to removing/deprecating it. Assuming few will be using it, we
>>>> need to see the tradeoff of keeping it vs removing it on a case by case
>>>> basis.
>>>> 4. Deprecate it in the next major or make it experimental if #2 and #3
>>>> removes them from deprecation.
>>>> 5. Remove it in next major
>>>>
>>>> For MV, I see this email as step #2. We should move to asking the user list
>>>> next.
>>>>
>>>> Thanks,
>>>> Sankalp
>>>>
>>>> On Tue, Jun 30, 2020 at 1:46 PM Joshua McKenzie <jmcken...@apache.org>
>>>> wrote:
>>>>
>>>>> We're just short of 98 tickets on the component since its original merge
>>>>> so at least *some* work has been done to stabilize them. Not to say I'm
>>>>> endorsing running them at massive scale today without knowing what you're
>>>>> doing, to be clear. They are perhaps our largest loaded gun of a feature of
>>>>> self-foot-shooting atm. Zhao did a bunch of work on them internally and
>>>>> we've backported much of that to OSS; I've pinged him to chime in here.
>>>>>
>>>>> The "data is orphaned in your view when you lose all base replicas" issue
>>>>> is more or less "unsolvable", since a scan of a view to confirm data

Re: [DISCUSS] Future of MVs

2020-06-30 Thread Dinesh Joshi
> On Jun 30, 2020, at 3:27 PM, David Capwell  wrote:
> 
> If that is the case then shouldn't we add MV to "4.0 Quality: Components
> and Test Plans" (CASSANDRA-15536)?  It is currently missing, so adding it
> to the testing road map would be a clear sign that someone is planning to
> champion and own this feature; if people feel that this is a broken
> feature, shouldn't we have tests showing this?  Would be great to see
> traction here.

Good point, we should definitely test it to ensure there are no regressions 
even though it is marked as experimental.

I'd also like to clarify that the feature works for a certain subset of 
use-cases when it is limited to a certain scale. It unfortunately does not 
scale well with the size of data. I think it is important to call out this 
distinction. For many users, it's acceptable. For others it is not.

Dinesh



Re: [DISCUSS] Future of MVs

2020-06-30 Thread Nate McCall
On Wed, Jul 1, 2020 at 10:27 AM David Capwell  wrote:

> If that is the case then shouldn't we add MV to "4.0 Quality: Components
> and Test Plans" (CASSANDRA-15536)?  It is currently missing, so adding it
> to the testing road map would be a clear sign that someone is planning to
> champion and own this feature; if people feel that this is a broken
> feature, shouldn't we have tests showing this?  Would be great to see
> traction here.
>

+1 - Surfacing it like that feels like a good next step to me.


Re: [DISCUSS] Future of MVs

2020-06-30 Thread David Capwell
If that is the case then shouldn't we add MV to "4.0 Quality: Components
and Test Plans" (CASSANDRA-15536)?  It is currently missing, so adding it
to the testing road map would be a clear sign that someone is planning to
champion and own this feature; if people feel that this is a broken
feature, shouldn't we have tests showing this?  Would be great to see
traction here.

On Tue, Jun 30, 2020 at 3:11 PM Joshua McKenzie 
wrote:

> Let's forget I said anything about release cadence. That's another thread
> entirely and a good deep conversation to explore. Don't want to derail.
>
> If there's a question about "is anyone stepping forward to maintain MV's",
> I can say with certainty that at least one full time contributor I work
> with will engage and continue to work on and improve this feature going
> forward. Who precisely that ends up being stands to be seen; that's more
> fluid, but there are no plans to stop working on it going forward.
>
> On Tue, Jun 30, 2020 at 5:45 PM Benedict Elliott Smith <
> bened...@apache.org>
> wrote:
>
> > I don't think we can realistically expect majors, with the deprecation
> > cycle they entail, to come every six months.  If nothing else, we would
> > have too many versions to maintain at once.  I personally think all the
> > project needs on that front is clearer roadmapping at the start of a
> > release cycle, and we would be fine with 12-18mo release cycles.
> >
> > That's another whole discussion to distract us from 4.0, anyway - though
> I
> > think we can tolerate a few slow burn conversations.
> >
> >
> > On 30/06/2020, 22:10, "Joshua McKenzie"  wrote:
> >
> > Seems like a reasonable point of view to me Sankalp. I'd also suggest
> > we
> > try to find other sources of data than just the user ML, like
> > searching on
> > github for instance. A collection of imperfect metrics beats just one
> > in my
> > experience.
> >
> > Though I would ask why we're having this discussion this late in the
> > release cycle when we have what, 4 tickets left until cutting beta 1?
> > Seems
> > like the kind of thing we could reasonably defer while we focus on
> > getting
> > 4.0 out, though I'm sympathetic to the "release is cutoff for
> > deprecation"
> > argument.
> >
> > If we cadence our majors to calendar (like every 6 months for
> example)
> > instead of scope this would become significantly less of a big issue
> > imo.
> >
> > On Tue, Jun 30, 2020 at 5:01 PM sankalp kohli <
> kohlisank...@gmail.com>
> > wrote:
> >
> > > Hi,
> > > I think we should revisit all features which require a lot more
> > work to
> > > make them work. Here is how I think we should do for each one of
> them
> > >
> > > 1. Identify such features and some details of why they are
> > deprecation
> > > candidates.
> > > 2. Ask the dev list if anyone is willing to work on improving them
> > over the
> > > next 1 or 2 major releases.
> > > 3. We then move to the user list to find who all are using it and
> if
> > they
> > > are opposed to removing/deprecating it. Assuming few will be using
> > it, we
> > > need to see the tradeoff of keeping it vs removing it on a case by
> > case
> > > basis.
> > > 4. Deprecate it in the next major or make it experimental if #2 and
> > #3
> > > removes them from deprecation.
> > > 5. Remove it in next major
> > >
> > > For MV, I see this email as step #2. We should move to asking the
> > user list
> > > next.
> > >
> > > Thanks,
> > > Sankalp
> > >
> > > On Tue, Jun 30, 2020 at 1:46 PM Joshua McKenzie <
> > jmcken...@apache.org>
> > > wrote:
> > >
> > > > > We're just short of 98 tickets on the component since its original merge
> > > > so at least *some* work has been done to stabilize them. Not to
> > say I'm
> > > > endorsing running them at massive scale today without knowing
> what
> > you're
> > > > doing, to be clear. They are perhaps our largest loaded gun of a
> > feature
> > > of
> > > > self-foot-shooting atm. Zhao did a bunch of work on them
> > internally and
> > > > we've backported much of that to OSS; I've pinged him to chime in
> > here.
> > > >
> > > > The "data is orphaned in your view when you lose all base
> > replicas" issue
> > > > is more or less "unsolvable", since a scan of a view to confirm
> > data in
> > > the
> > > > base table is so slow you're talking weeks to process and it
> > totally
> > > > trashes your page cache. I think Paulo landed on a "you have to
> > rebuild
> > > the
> > > > view if you lose all base data" reality. There's also, I believe,
> > the
> > > > unresolved issue of modeling how much data a base table with one
> > to many
> > > > views will end up taking up in its final form when denormalized.
> > This
> > > could
> > > > be vastly improved with something like an "EXPLAIN ANALYZE" for a
> > 

Re: [DISCUSS] Future of MVs

2020-06-30 Thread Benedict Elliott Smith
I think the point is that we need to have a clear plan of action to bring 
features up to an acceptable standard.  That also implies a need to agree how 
we determine if a feature has reached an acceptable standard - both going 
forwards and retrospectively.  For those that don't reach that standard today, 
we need something like a retrospective CEP to agree how to rectify that.  Then 
we can figure out if the necessary resources can be mustered, or if we need to 
consider obsolescence.

I'm not convinced this discussion has to be resolved immediately, but that's 
how I view the situation.


On 30/06/2020, 23:11, "Joshua McKenzie"  wrote:

Let's forget I said anything about release cadence. That's another thread
entirely and a good deep conversation to explore. Don't want to derail.

If there's a question about "is anyone stepping forward to maintain MV's",
I can say with certainty that at least one full time contributor I work
with will engage and continue to work on and improve this feature going
forward. Who precisely that ends up being stands to be seen; that's more
fluid, but there are no plans to stop working on it going forward.

On Tue, Jun 30, 2020 at 5:45 PM Benedict Elliott Smith 
wrote:

> I don't think we can realistically expect majors, with the deprecation
> cycle they entail, to come every six months.  If nothing else, we would
> have too many versions to maintain at once.  I personally think all the
> project needs on that front is clearer roadmapping at the start of a
> release cycle, and we would be fine with 12-18mo release cycles.
>
> That's another whole discussion to distract us from 4.0, anyway - though I
> think we can tolerate a few slow burn conversations.
>
>
> On 30/06/2020, 22:10, "Joshua McKenzie"  wrote:
>
> Seems like a reasonable point of view to me Sankalp. I'd also suggest
> we
> try to find other sources of data than just the user ML, like
> searching on
> github for instance. A collection of imperfect metrics beats just one
> in my
> experience.
>
> Though I would ask why we're having this discussion this late in the
> release cycle when we have what, 4 tickets left until cutting beta 1?
> Seems
> like the kind of thing we could reasonably defer while we focus on
> getting
> 4.0 out, though I'm sympathetic to the "release is cutoff for
> deprecation"
> argument.
>
> If we cadence our majors to calendar (like every 6 months for example)
> instead of scope this would become significantly less of a big issue
> imo.
>
> On Tue, Jun 30, 2020 at 5:01 PM sankalp kohli 
> wrote:
>
> > Hi,
> > I think we should revisit all features which require a lot more
> work to
> > make them work. Here is how I think we should do for each one of them
> >
> > 1. Identify such features and some details of why they are
> deprecation
> > candidates.
> > 2. Ask the dev list if anyone is willing to work on improving them
> over the
> > next 1 or 2 major releases.
> > 3. We then move to the user list to find who all are using it and if
> they
> > are opposed to removing/deprecating it. Assuming few will be using
> it, we
> > need to see the tradeoff of keeping it vs removing it on a case by
> case
> > basis.
> > 4. Deprecate it in the next major or make it experimental if #2 and
> #3
> > removes them from deprecation.
> > 5. Remove it in next major
> >
> > For MV, I see this email as step #2. We should move to asking the
> user list
> > next.
> >
> > Thanks,
> > Sankalp
> >
> > On Tue, Jun 30, 2020 at 1:46 PM Joshua McKenzie <
> jmcken...@apache.org>
> > wrote:
> >
> > > We're just short of 98 tickets on the component since its original merge
> > > so at least *some* work has been done to stabilize them. Not to
> say I'm
> > > endorsing running them at massive scale today without knowing what
> you're
> > > doing, to be clear. They are perhaps our largest loaded gun of a
> feature
> > of
> > > self-foot-shooting atm. Zhao did a bunch of work on them
> internally and
> > > we've backported much of that to OSS; I've pinged him to chime in
> here.
> > >
> > > The "data is orphaned in your view when you lose all base
> replicas" issue
> > > is more or less "unsolvable", since a scan of a view to confirm
> data in
> > the
> > > base table is so slow you're talking weeks to process and it
> totally
> > > trashes your page cache. I think Paulo landed on a "you have to
> rebuild
> > the

Re: [DISCUSS] Future of MVs

2020-06-30 Thread Joshua McKenzie
Let's forget I said anything about release cadence. That's another thread
entirely and a good deep conversation to explore. Don't want to derail.

If there's a question about "is anyone stepping forward to maintain MV's",
I can say with certainty that at least one full time contributor I work
with will engage and continue to work on and improve this feature going
forward. Who precisely that ends up being remains to be seen; that's more
fluid, but there are no plans to stop working on it going forward.

On Tue, Jun 30, 2020 at 5:45 PM Benedict Elliott Smith 
wrote:

> I don't think we can realistically expect majors, with the deprecation
> cycle they entail, to come every six months.  If nothing else, we would
> have too many versions to maintain at once.  I personally think all the
> project needs on that front is clearer roadmapping at the start of a
> release cycle, and we would be fine with 12-18mo release cycles.
>
> That's another whole discussion to distract us from 4.0, anyway - though I
> think we can tolerate a few slow burn conversations.
>
>
> On 30/06/2020, 22:10, "Joshua McKenzie"  wrote:
>
> Seems like a reasonable point of view to me Sankalp. I'd also suggest
> we
> try to find other sources of data than just the user ML, like
> searching on
> github for instance. A collection of imperfect metrics beats just one
> in my
> experience.
>
> Though I would ask why we're having this discussion this late in the
> release cycle when we have what, 4 tickets left until cutting beta 1?
> Seems
> like the kind of thing we could reasonably defer while we focus on
> getting
> 4.0 out, though I'm sympathetic to the "release is cutoff for
> deprecation"
> argument.
>
> If we cadence our majors to calendar (like every 6 months for example)
> instead of scope this would become significantly less of a big issue
> imo.
>
> On Tue, Jun 30, 2020 at 5:01 PM sankalp kohli 
> wrote:
>
> > Hi,
> > I think we should revisit all features which require a lot more
> work to
> > make them work. Here is how I think we should do for each one of them
> >
> > 1. Identify such features and some details of why they are
> deprecation
> > candidates.
> > 2. Ask the dev list if anyone is willing to work on improving them
> over the
> > next 1 or 2 major releases.
> > 3. We then move to the user list to find who all are using it and if
> they
> > are opposed to removing/deprecating it. Assuming few will be using
> it, we
> > need to see the tradeoff of keeping it vs removing it on a case by
> case
> > basis.
> > 4. Deprecate it in the next major or make it experimental if #2 and
> #3
> > removes them from deprecation.
> > 5. Remove it in next major
> >
> > For MV, I see this email as step #2. We should move to asking the
> user list
> > next.
> >
> > Thanks,
> > Sankalp
> >
> > On Tue, Jun 30, 2020 at 1:46 PM Joshua McKenzie <
> jmcken...@apache.org>
> > wrote:
> >
> > > We're just short of 98 tickets on the component since it's
> original merge
> > > so at least *some* work has been done to stabilize them. Not to
> say I'm
> > > endorsing running them at massive scale today without knowing what
> you're
> > > doing, to be clear. They are perhaps our largest loaded gun of a
> feature
> > of
> > > self-foot-shooting atm. Zhao did a bunch of work on them
> internally and
> > > we've backported much of that to OSS; I've pinged him to chime in
> here.
> > >
> > > The "data is orphaned in your view when you lose all base
> replicas" issue
> > > is more or less "unsolvable", since a scan of a view to confirm
> data in
> > the
> > > base table is so slow you're talking weeks to process and it
> totally
> > > trashes your page cache. I think Paulo landed on a "you have to
> rebuild
> > the
> > > view if you lose all base data" reality. There's also, I believe,
> the
> > > unresolved issue of modeling how much data a base table with one
> to many
> > > views will end up taking up in its final form when denormalized.
> This
> > could
> > > be vastly improved with something like an "EXPLAIN ANALYZE" for a
> table
> > > with views, if you'll excuse the mapping, to show "N bytes in base
> will
> > > become M with base + views" or something.
> > >
> > > Last but definitely not least in dumping the state in my head
> about this,
> > > there's a bunch of potential for guardrailing people away from
> self-harm
> > > with MV's if we decide to go the route of guardrails (link:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/CASSANDRA/%28DRAFT%29+-+CEP-3%3A+Guardrails
> > > ).
> > >
> > > So  from my PoV, I'm against us just voting to deprecate and remove
> > without
> > > going into more depth into the current state of things and what
> options
> > 

Re: [DISCUSS] Future of MVs

2020-06-30 Thread Benedict Elliott Smith
I don't think we can realistically expect majors, with the deprecation cycle 
they entail, to come every six months.  If nothing else, we would have too many 
versions to maintain at once.  I personally think all the project needs on that 
front is clearer roadmapping at the start of a release cycle, and we would be 
fine with 12-18mo release cycles.

That's another whole discussion to distract us from 4.0, anyway - though I 
think we can tolerate a few slow burn conversations.
 

On 30/06/2020, 22:10, "Joshua McKenzie"  wrote:

Seems like a reasonable point of view to me Sankalp. I'd also suggest we
try to find other sources of data than just the user ML, like searching on
github for instance. A collection of imperfect metrics beats just one in my
experience.

Though I would ask why we're having this discussion this late in the
release cycle when we have what, 4 tickets left until cutting beta 1? Seems
like the kind of thing we could reasonably defer while we focus on getting
4.0 out, though I'm sympathetic to the "release is cutoff for deprecation"
argument.

If we cadence our majors to calendar (like every 6 months for example)
instead of scope this would become significantly less of a big issue imo.

On Tue, Jun 30, 2020 at 5:01 PM sankalp kohli 
wrote:

> Hi,
> I think we should revisit all features which require a lot more work to
> make them work. Here is how I think we should do for each one of them
>
> 1. Identify such features and some details of why they are deprecation
> candidates.
> 2. Ask the dev list if anyone is willing to work on improving them over the
> next 1 or 2 major releases.
> 3. We then move to the user list to find who all are using it and if they
> are opposed to removing/deprecating it. Assuming few will be using it, we
> need to see the tradeoff of keeping it vs removing it on a case by case
> basis.
> 4. Deprecate it in the next major or make it experimental if #2 and #3
> removes them from deprecation.
> 5. Remove it in next major
>
> For MV, I see this email as step #2. We should move to asking the user list
> next.
>
> Thanks,
> Sankalp
>
> On Tue, Jun 30, 2020 at 1:46 PM Joshua McKenzie 
> wrote:
>
> > We're just short of 98 tickets on the component since it's original merge
> > so at least *some* work has been done to stabilize them. Not to say I'm
> > endorsing running them at massive scale today without knowing what you're
> > doing, to be clear. They are perhaps our largest loaded gun of a feature of
> > self-foot-shooting atm. Zhao did a bunch of work on them internally and
> > we've backported much of that to OSS; I've pinged him to chime in here.
> >
> > The "data is orphaned in your view when you lose all base replicas" issue
> > is more or less "unsolvable", since a scan of a view to confirm data in the
> > base table is so slow you're talking weeks to process and it totally
> > trashes your page cache. I think Paulo landed on a "you have to rebuild the
> > view if you lose all base data" reality. There's also, I believe, the
> > unresolved issue of modeling how much data a base table with one to many
> > views will end up taking up in its final form when denormalized. This could
> > be vastly improved with something like an "EXPLAIN ANALYZE" for a table
> > with views, if you'll excuse the mapping, to show "N bytes in base will
> > become M with base + views" or something.
> >
> > Last but definitely not least in dumping the state in my head about this,
> > there's a bunch of potential for guardrailing people away from self-harm
> > with MV's if we decide to go the route of guardrails (link:
> > https://cwiki.apache.org/confluence/display/CASSANDRA/%28DRAFT%29+-+CEP-3%3A+Guardrails
> > ).
> >
> > So from my PoV, I'm against us just voting to deprecate and remove without
> > going into more depth into the current state of things and what options are
> > on the table, since people will continue to build MV's at the client level
> > which, in theory, should have worse correctness and performance
> > characteristics than having a clean and well stabilized implementation in
> > the coordinator.
> >
> > Having them flagged as experimental for now as we stabilize 4.0 and get
> > things out the door *seems* sufficient to me, but if people are widely
> > using these out in the wild and ignoring that status and the corresponding
> > warning, maybe we consider raising the volume on that warning for 4.0 while
> > we figure this out.
> >
> > Just my .02.
> >
> > ~Josh
> >
> > On Tue, Jun 30, 2020 at 4:22 PM Dinesh Joshi  wrote:
> >
> > > > On Jun 30, 2020, at 12:43 PM, J

Re: [DISCUSS] Future of MVs

2020-06-30 Thread Jeff Jirsa
On Tue, Jun 30, 2020 at 1:46 PM Joshua McKenzie 
wrote:

> We're just short of 98 tickets on the component since it's original merge
> so at least *some* work has been done to stabilize them. Not to say I'm
> endorsing running them at massive scale today without knowing what you're
> doing, to be clear. They are perhaps our largest loaded gun of a feature of
> self-foot-shooting atm. Zhao did a bunch of work on them internally and
> we've backported much of that to OSS; I've pinged him to chime in here.
>

Probably true.


>
> The "data is orphaned in your view when you lose all base replicas" issue
> is more or less "unsolvable", since a scan of a view to confirm data in the
> base table is so slow you're talking weeks to process and it totally
> trashes your page cache.


"Make the scan faster"
"Make the scan incremental and automatic"
"Make it not blow up your page cache"
"Make losing your base replicas less likely".

There's a concrete, real opportunity with MVs to create integrity
assertions we're missing. A dangling record from an MV that would point to
missing base data is something that could raise alarm bells and signal
JIRAs so we can potentially find and fix more surprise edge cases.
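
To make the dangling-record idea concrete, here is a minimal sketch, assuming
the view rows and the base primary keys have already been read into memory.
The function name and the tuple layout are hypothetical and are not part of
any Cassandra API; a real check would have to page through the view and read
the base table, which is the slow scan quoted above.

```python
from typing import Hashable, Iterable, List, Tuple

# Hypothetical in-memory stand-ins: a real check would page through the view
# and read the base table through a driver; none of that is modeled here.
BaseKey = Hashable
ViewRow = Tuple[Hashable, BaseKey]  # (view partition key, base primary key)


def find_dangling_view_rows(
    base_keys: Iterable[BaseKey], view_rows: Iterable[ViewRow]
) -> List[ViewRow]:
    """Return view rows whose base primary key has no matching base row."""
    present = set(base_keys)
    return [row for row in view_rows if row[1] not in present]


# Base row 4 is missing, so the view row pointing at it is dangling.
base = [1, 2, 3]
view = [("alice", 1), ("bob", 2), ("carol", 4)]
dangling = find_dangling_view_rows(base, view)
```

Any non-empty result is the kind of signal that could raise an alarm or
prompt a JIRA.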


> So  from my PoV, I'm against us just voting to deprecate and remove without
> going into more depth into the current state of things and what options are
> on the table, since people will continue to build MV's at the client level
> which, in theory, should have worse correctness and performance
> characteristics than having a clean and well stabilized implementation in
> the coordinator.
>

Yanking features will definitely be painful for users. Leaving it
experimental seems much better for users as long as the
maintenance overhead is tolerable.


Re: [DISCUSS] Future of MVs

2020-06-30 Thread Benedict Elliott Smith
I think, just as importantly, we also need to grapple with what went wrong when 
features landed this way, since these were not isolated occurrences - 
suggesting structural issues were at play.

I'm not sure if a retrospective is viable with this organisational structure, 
but we can perhaps engage with it implicitly, in a positive way, by working to 
create a framework with clear expectations for how features should be delivered 
- to go hand-in-hand with CEP proposals.  

This framework can then also be applied to existing features considered to be 
inadequate, as we decide how to move forward with them.


On 30/06/2020, 22:01, "sankalp kohli"  wrote:

Hi,
I think we should revisit all features which require a lot more work to
make them work. Here is how I think we should do for each one of them

1. Identify such features and some details of why they are deprecation
candidates.
2. Ask the dev list if anyone is willing to work on improving them over the
next 1 or 2 major releases.
3. We then move to the user list to find who all are using it and if they
are opposed to removing/deprecating it. Assuming few will be using it, we
need to see the tradeoff of keeping it vs removing it on a case by case
basis.
4. Deprecate it in the next major or make it experimental if #2 and #3
removes them from deprecation.
5. Remove it in next major

For MV, I see this email as step #2. We should move to asking the user list
next.

Thanks,
Sankalp

On Tue, Jun 30, 2020 at 1:46 PM Joshua McKenzie 
wrote:

> We're just short of 98 tickets on the component since it's original merge
> so at least *some* work has been done to stabilize them. Not to say I'm
> endorsing running them at massive scale today without knowing what you're
> doing, to be clear. They are perhaps our largest loaded gun of a feature of
> self-foot-shooting atm. Zhao did a bunch of work on them internally and
> we've backported much of that to OSS; I've pinged him to chime in here.
>
> The "data is orphaned in your view when you lose all base replicas" issue
> is more or less "unsolvable", since a scan of a view to confirm data in the
> base table is so slow you're talking weeks to process and it totally
> trashes your page cache. I think Paulo landed on a "you have to rebuild the
> view if you lose all base data" reality. There's also, I believe, the
> unresolved issue of modeling how much data a base table with one to many
> views will end up taking up in its final form when denormalized. This could
> be vastly improved with something like an "EXPLAIN ANALYZE" for a table
> with views, if you'll excuse the mapping, to show "N bytes in base will
> become M with base + views" or something.
>
> Last but definitely not least in dumping the state in my head about this,
> there's a bunch of potential for guardrailing people away from self-harm
> with MV's if we decide to go the route of guardrails (link:
> https://cwiki.apache.org/confluence/display/CASSANDRA/%28DRAFT%29+-+CEP-3%3A+Guardrails
> ).
>
> So from my PoV, I'm against us just voting to deprecate and remove without
> going into more depth into the current state of things and what options are
> on the table, since people will continue to build MV's at the client level
> which, in theory, should have worse correctness and performance
> characteristics than having a clean and well stabilized implementation in
> the coordinator.
>
> Having them flagged as experimental for now as we stabilize 4.0 and get
> things out the door *seems* sufficient to me, but if people are widely
> using these out in the wild and ignoring that status and the corresponding
> warning, maybe we consider raising the volume on that warning for 4.0 while
> we figure this out.
>
> Just my .02.
>
> ~Josh
>
> On Tue, Jun 30, 2020 at 4:22 PM Dinesh Joshi  wrote:
>
> > > On Jun 30, 2020, at 12:43 PM, Jon Haddad  wrote:
> > >
> > > As we move forward with the 4.0 release, we should consider this an
> > > opportunity to deprecate materialized views, and remove them in 5.0.  We
> > > should take this opportunity to learn from the mistake and raise the bar
> > > for new features to undergo a much more thorough run the wringer before
> > > merging.
> >
> > I'm in favor of marking them as deprecated and removing them in 5.0. If
> > someone steps up and can fix them in 5.0, then we always have the option of
> > accepting the fix.
> >
> > Dinesh
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>



---

Re: [DISCUSS] Future of MVs

2020-06-30 Thread Joshua McKenzie
Seems like a reasonable point of view to me Sankalp. I'd also suggest we
try to find other sources of data than just the user ML, like searching on
GitHub for instance. A collection of imperfect metrics beats just one in my
experience.

Though I would ask why we're having this discussion this late in the
release cycle when we have what, 4 tickets left until cutting beta 1? Seems
like the kind of thing we could reasonably defer while we focus on getting
4.0 out, though I'm sympathetic to the "release is cutoff for deprecation"
argument.

If we tied our majors to the calendar (every 6 months, for example) instead
of to scope, this would become significantly less of an issue imo.

On Tue, Jun 30, 2020 at 5:01 PM sankalp kohli 
wrote:

> Hi,
> I think we should revisit all features which require a lot more work to
> make them work. Here is how I think we should do for each one of them
>
> 1. Identify such features and some details of why they are deprecation
> candidates.
> 2. Ask the dev list if anyone is willing to work on improving them over the
> next 1 or 2 major releases.
> 3. We then move to the user list to find who all are using it and if they
> are opposed to removing/deprecating it. Assuming few will be using it, we
> need to see the tradeoff of keeping it vs removing it on a case by case
> basis.
> 4. Deprecate it in the next major or make it experimental if #2 and #3
> removes them from deprecation.
> 5. Remove it in next major
>
> For MV, I see this email as step #2. We should move to asking the user list
> next.
>
> Thanks,
> Sankalp
>
> On Tue, Jun 30, 2020 at 1:46 PM Joshua McKenzie 
> wrote:
>
> > We're just short of 98 tickets on the component since it's original merge
> > so at least *some* work has been done to stabilize them. Not to say I'm
> > endorsing running them at massive scale today without knowing what you're
> > doing, to be clear. They are perhaps our largest loaded gun of a feature of
> > self-foot-shooting atm. Zhao did a bunch of work on them internally and
> > we've backported much of that to OSS; I've pinged him to chime in here.
> >
> > The "data is orphaned in your view when you lose all base replicas" issue
> > is more or less "unsolvable", since a scan of a view to confirm data in the
> > base table is so slow you're talking weeks to process and it totally
> > trashes your page cache. I think Paulo landed on a "you have to rebuild the
> > view if you lose all base data" reality. There's also, I believe, the
> > unresolved issue of modeling how much data a base table with one to many
> > views will end up taking up in its final form when denormalized. This could
> > be vastly improved with something like an "EXPLAIN ANALYZE" for a table
> > with views, if you'll excuse the mapping, to show "N bytes in base will
> > become M with base + views" or something.
> >
> > Last but definitely not least in dumping the state in my head about this,
> > there's a bunch of potential for guardrailing people away from self-harm
> > with MV's if we decide to go the route of guardrails (link:
> > https://cwiki.apache.org/confluence/display/CASSANDRA/%28DRAFT%29+-+CEP-3%3A+Guardrails
> > ).
> >
> > So from my PoV, I'm against us just voting to deprecate and remove without
> > going into more depth into the current state of things and what options are
> > on the table, since people will continue to build MV's at the client level
> > which, in theory, should have worse correctness and performance
> > characteristics than having a clean and well stabilized implementation in
> > the coordinator.
> >
> > Having them flagged as experimental for now as we stabilize 4.0 and get
> > things out the door *seems* sufficient to me, but if people are widely
> > using these out in the wild and ignoring that status and the corresponding
> > warning, maybe we consider raising the volume on that warning for 4.0 while
> > we figure this out.
> >
> > Just my .02.
> >
> > ~Josh
> >
> > On Tue, Jun 30, 2020 at 4:22 PM Dinesh Joshi  wrote:
> >
> > > > On Jun 30, 2020, at 12:43 PM, Jon Haddad  wrote:
> > > >
> > > > As we move forward with the 4.0 release, we should consider this an
> > > > opportunity to deprecate materialized views, and remove them in 5.0.  We
> > > > should take this opportunity to learn from the mistake and raise the bar
> > > > for new features to undergo a much more thorough run the wringer before
> > > > merging.
> > >
> > > I'm in favor of marking them as deprecated and removing them in 5.0. If
> > > someone steps up and can fix them in 5.0, then we always have the option of
> > > accepting the fix.
> > >
> > > Dinesh
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >
> > >
> >
>


Re: [DISCUSS] Future of MVs

2020-06-30 Thread Jeremiah D Jordan
> So  from my PoV, I'm against us just voting to deprecate and remove without
> going into more depth into the current state of things and what options are
> on the table, since people will continue to build MV's at the client level
> which, in theory, should have worse correctness and performance
> characteristics than having a clean and well stabilized implementation in
> the coordinator.

I agree with Josh here.  Multiple people have put in effort to improve the 
stability of MV’s since they were first put into the code base and the reasons 
for having them be in the DB have not changed.  Building MV-like tables at the 
client level is actually harder to get right than doing it in the server.

-Jeremiah


> On Jun 30, 2020, at 3:45 PM, Joshua McKenzie  wrote:
> 
> We're just short of 98 tickets on the component since it's original merge
> so at least *some* work has been done to stabilize them. Not to say I'm
> endorsing running them at massive scale today without knowing what you're
> doing, to be clear. They are perhaps our largest loaded gun of a feature of
> self-foot-shooting atm. Zhao did a bunch of work on them internally and
> we've backported much of that to OSS; I've pinged him to chime in here.
> 
> The "data is orphaned in your view when you lose all base replicas" issue
> is more or less "unsolvable", since a scan of a view to confirm data in the
> base table is so slow you're talking weeks to process and it totally
> trashes your page cache. I think Paulo landed on a "you have to rebuild the
> view if you lose all base data" reality. There's also, I believe, the
> unresolved issue of modeling how much data a base table with one to many
> views will end up taking up in its final form when denormalized. This could
> be vastly improved with something like an "EXPLAIN ANALYZE" for a table
> with views, if you'll excuse the mapping, to show "N bytes in base will
> become M with base + views" or something.
> 
> Last but definitely not least in dumping the state in my head about this,
> there's a bunch of potential for guardrailing people away from self-harm
> with MV's if we decide to go the route of guardrails (link:
> https://cwiki.apache.org/confluence/display/CASSANDRA/%28DRAFT%29+-+CEP-3%3A+Guardrails
> ).
> 
> So  from my PoV, I'm against us just voting to deprecate and remove without
> going into more depth into the current state of things and what options are
> on the table, since people will continue to build MV's at the client level
> which, in theory, should have worse correctness and performance
> characteristics than having a clean and well stabilized implementation in
> the coordinator.
> 
> Having them flagged as experimental for now as we stabilize 4.0 and get
> things out the door *seems* sufficient to me, but if people are widely
> using these out in the wild and ignoring that status and the corresponding
> warning, maybe we consider raising the volume on that warning for 4.0 while
> we figure this out.
> 
> Just my .02.
> 
> ~Josh
> 
> On Tue, Jun 30, 2020 at 4:22 PM Dinesh Joshi  wrote:
> 
>>> On Jun 30, 2020, at 12:43 PM, Jon Haddad  wrote:
>>> 
>>> As we move forward with the 4.0 release, we should consider this an
>>> opportunity to deprecate materialized views, and remove them in 5.0.  We
>>> should take this opportunity to learn from the mistake and raise the bar
>>> for new features to undergo a much more thorough run the wringer before
>>> merging.
>> 
>> I'm in favor of marking them as deprecated and removing them in 5.0. If
>> someone steps up and can fix them in 5.0, then we always have the option of
>> accepting the fix.
>> 
>> Dinesh
>> 
>> 





Re: [DISCUSS] Future of MVs

2020-06-30 Thread sankalp kohli
Hi,
I think we should revisit all features which require a lot more work to
make them work. Here is what I think we should do for each one of them:

1. Identify such features and some details of why they are deprecation
candidates.
2. Ask the dev list if anyone is willing to work on improving them over the
next 1 or 2 major releases.
3. We then move to the user list to find out who is using it and whether they
are opposed to removing/deprecating it. Assuming few will be using it, we
need to see the tradeoff of keeping it vs removing it on a case by case
basis.
4. Deprecate it in the next major, or make it experimental if #2 and #3
remove it from the deprecation candidates.
5. Remove it in the following major.

For MV, I see this email as step #2. We should move to asking the user list
next.

Thanks,
Sankalp

On Tue, Jun 30, 2020 at 1:46 PM Joshua McKenzie 
wrote:

> We're just short of 98 tickets on the component since it's original merge
> so at least *some* work has been done to stabilize them. Not to say I'm
> endorsing running them at massive scale today without knowing what you're
> doing, to be clear. They are perhaps our largest loaded gun of a feature of
> self-foot-shooting atm. Zhao did a bunch of work on them internally and
> we've backported much of that to OSS; I've pinged him to chime in here.
>
> The "data is orphaned in your view when you lose all base replicas" issue
> is more or less "unsolvable", since a scan of a view to confirm data in the
> base table is so slow you're talking weeks to process and it totally
> trashes your page cache. I think Paulo landed on a "you have to rebuild the
> view if you lose all base data" reality. There's also, I believe, the
> unresolved issue of modeling how much data a base table with one to many
> views will end up taking up in its final form when denormalized. This could
> be vastly improved with something like an "EXPLAIN ANALYZE" for a table
> with views, if you'll excuse the mapping, to show "N bytes in base will
> become M with base + views" or something.
>
> Last but definitely not least in dumping the state in my head about this,
> there's a bunch of potential for guardrailing people away from self-harm
> with MV's if we decide to go the route of guardrails (link:
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/%28DRAFT%29+-+CEP-3%3A+Guardrails
> ).
>
> So  from my PoV, I'm against us just voting to deprecate and remove without
> going into more depth into the current state of things and what options are
> on the table, since people will continue to build MV's at the client level
> which, in theory, should have worse correctness and performance
> characteristics than having a clean and well stabilized implementation in
> the coordinator.
>
> Having them flagged as experimental for now as we stabilize 4.0 and get
> things out the door *seems* sufficient to me, but if people are widely
> using these out in the wild and ignoring that status and the corresponding
> warning, maybe we consider raising the volume on that warning for 4.0 while
> we figure this out.
>
> Just my .02.
>
> ~Josh
>
> On Tue, Jun 30, 2020 at 4:22 PM Dinesh Joshi  wrote:
>
> > > On Jun 30, 2020, at 12:43 PM, Jon Haddad  wrote:
> > >
> > > As we move forward with the 4.0 release, we should consider this an
> > > opportunity to deprecate materialized views, and remove them in 5.0.  We
> > > should take this opportunity to learn from the mistake and raise the bar
> > > for new features to undergo a much more thorough run the wringer before
> > > merging.
> >
> > I'm in favor of marking them as deprecated and removing them in 5.0. If
> > someone steps up and can fix them in 5.0, then we always have the option of
> > accepting the fix.
> >
> > Dinesh
> >
> >
>


Re: [DISCUSS] Future of MVs

2020-06-30 Thread Jasonstack Zhao Yang
> While at TLP, I helped numerous customers move off of MVs, mostly because
> they affected stability of clusters in a horrific way.  The most telling
> project involved helping someone create new tables to manage 1GB of data
> because the views performed so poorly they made the cluster unresponsive
> and unusable.

The documented way to report bugs:
https://cassandra.apache.org/doc/latest/bugs.html#

Reports go through JIRA and should include the version and environment.


> As we move forward with the 4.0 release, we should consider this an
> opportunity to deprecate materialized views, and remove them in 5.0.

While the community is focused on 4.0 and unable to review CEPs/improvements,
should we defer this discussion until the community is ready to discuss
CEPs/improvements?


> We should take this opportunity to learn from the mistake and raise the bar
> for new features to undergo a much more thorough run the wringer before
> merging.

Agreed that we should learn from mistakes, but there are still users on MVs.
I think it's more responsible to work with those users to improve MVs for
their use cases.


>  Am I missing a JIRA
> that can magically fix the issues with performance, availability &
> correctness?

Is there any formal discussion/analysis about things being impossible to
fix/improve?

On Wed, 1 Jul 2020 at 04:23, Dinesh Joshi  wrote:

> > On Jun 30, 2020, at 12:43 PM, Jon Haddad  wrote:
> >
> > As we move forward with the 4.0 release, we should consider this an
> > opportunity to deprecate materialized views, and remove them in 5.0.  We
> > should take this opportunity to learn from the mistake and raise the bar
> > for new features to undergo a much more thorough run the wringer before
> > merging.
>
> I'm in favor of marking them as deprecated and removing them in 5.0. If
> someone steps up and can fix them in 5.0, then we always have the option of
> accepting the fix.
>
> Dinesh
>
>


Re: [DISCUSS] Future of MVs

2020-06-30 Thread Joshua McKenzie
We're just short of 98 tickets on the component since its original merge
so at least *some* work has been done to stabilize them. Not to say I'm
endorsing running them at massive scale today without knowing what you're
doing, to be clear. They are perhaps our largest loaded gun of a feature of
self-foot-shooting atm. Zhao did a bunch of work on them internally and
we've backported much of that to OSS; I've pinged him to chime in here.

The "data is orphaned in your view when you lose all base replicas" issue
is more or less "unsolvable", since a scan of a view to confirm data in the
base table is so slow you're talking weeks to process and it totally
trashes your page cache. I think Paulo landed on a "you have to rebuild the
view if you lose all base data" reality. There's also, I believe, the
unresolved issue of modeling how much data a base table with one to many
views will end up taking up in its final form when denormalized. This could
be vastly improved with something like an "EXPLAIN ANALYZE" for a table
with views, if you'll excuse the mapping, to show "N bytes in base will
become M with base + views" or something.
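
As a rough illustration of the "N bytes in base will become M with base +
views" idea, here is a back-of-envelope sketch. It is not a Cassandra API;
the View class, its two fractions, and estimate_total_bytes are hypothetical
names, and the model assumes each view stores a copy of some fraction of each
base row. It ignores replication, compression, compaction and tombstones, so
treat the result as an order-of-magnitude guess only.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class View:
    """Hypothetical description of one materialized view on a base table."""
    name: str
    # Assumed fraction of each base row's bytes that the view copies.
    column_fraction: float
    # Assumed fraction of base rows that pass the view's filter.
    row_fraction: float = 1.0


def estimate_total_bytes(base_bytes: int, views: list[View]) -> int:
    """Rough M for 'N bytes in base will become M with base + views'."""
    total = base_bytes
    for view in views:
        total += round(base_bytes * view.column_fraction * view.row_fraction)
    return total


if __name__ == "__main__":
    views = [View("by_email", 1.0), View("by_country", 0.5, 0.5)]
    n = 1_000_000_000
    m = estimate_total_bytes(n, views)
    print(f"{n} bytes in base -> ~{m} bytes with base + {len(views)} views")
```

Even a crude model like this makes the point that every full-width view
roughly doubles the footprint of the base data before replication.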

Last but definitely not least in dumping the state in my head about this,
there's a bunch of potential for guardrailing people away from self-harm
with MV's if we decide to go the route of guardrails (link:
https://cwiki.apache.org/confluence/display/CASSANDRA/%28DRAFT%29+-+CEP-3%3A+Guardrails
).

So from my PoV, I'm against us just voting to deprecate and remove without
going into more depth on the current state of things and what options are
on the table, since people will continue to build MV's at the client level
which, in theory, should have worse correctness and performance
characteristics than having a clean and well stabilized implementation in
the coordinator.

Having them flagged as experimental for now as we stabilize 4.0 and get
things out the door *seems* sufficient to me, but if people are widely
using these out in the wild and ignoring that status and the corresponding
warning, maybe we consider raising the volume on that warning for 4.0 while
we figure this out.

Just my .02.

~Josh

On Tue, Jun 30, 2020 at 4:22 PM Dinesh Joshi  wrote:

> > On Jun 30, 2020, at 12:43 PM, Jon Haddad  wrote:
> >
> > As we move forward with the 4.0 release, we should consider this an
> > opportunity to deprecate materialized views, and remove them in 5.0.  We
> > should take this opportunity to learn from the mistake and raise the bar
> > for new features to undergo a much more thorough run the wringer before
> > merging.
>
> I'm in favor of marking them as deprecated and removing them in 5.0. If
> someone steps up and can fix them in 5.0, then we always have the option of
> accepting the fix.
>
> Dinesh
>
>


Re: [DISCUSS] Future of MVs

2020-06-30 Thread Brandon Williams
+1

On Tue, Jun 30, 2020 at 2:44 PM Jon Haddad  wrote:
>
> A couple days ago when writing a separate email I came across this DataStax
> blog post discussing MVs [1].  Imagine my surprise when I noticed the date
> was five years ago...
>
> While at TLP, I helped numerous customers move off of MVs, mostly because
> they affected stability of clusters in a horrific way.  The most telling
> project involved helping someone create new tables to manage 1GB of data
> because the views performed so poorly they made the cluster unresponsive
> and unusable.  Despite being around for five years, they've seen very
> little improvement that makes them usable for non trivial, non laptop
> workloads.
>
> Since the original commits, it doesn't look like there's been much work to
> improve them, and they're yet another feature I ended up saying "just don't
> use".  I haven't heard any plans to improve them in any meaningful way -
> either to address their issues with performance or the inability to repair
> them.
>
> The original contributor of MVs (Carl Yeksigian) seems to have disappeared
> from the project, meaning we have a broken feature without a maintainer,
> and no plans to fix it.
>
> As we move forward with the 4.0 release, we should consider this an
> opportunity to deprecate materialized views, and remove them in 5.0.  We
> should take this opportunity to learn from the mistake and raise the bar
> for new features to undergo a much more thorough run the wringer before
> merging.
>
> I'm curious what folks think - am I way off base here?  Am I missing a JIRA
> that can magically fix the issues with performance, availability &
> correctness?
>
> [1]
> https://www.datastax.com/blog/2015/06/new-cassandra-30-materialized-views
> [2] https://issues.apache.org/jira/browse/CASSANDRA-6477




Re: [DISCUSS] Future of MVs

2020-06-30 Thread Dinesh Joshi
> On Jun 30, 2020, at 12:43 PM, Jon Haddad  wrote:
> 
> As we move forward with the 4.0 release, we should consider this an
> opportunity to deprecate materialized views, and remove them in 5.0.  We
> should take this opportunity to learn from the mistake and raise the bar
> for new features to undergo a much more thorough run the wringer before
> merging.

I'm in favor of marking them as deprecated and removing them in 5.0. If someone 
steps up and can fix them in 5.0, then we always have the option of accepting 
the fix.

Dinesh



Re: [DISCUSS] Future of MVs

2020-06-30 Thread Blake Eggleston
+1 for deprecation and removal (assuming a credible plan to fix them doesn't 
materialize)

> On Jun 30, 2020, at 12:43 PM, Jon Haddad  wrote:
> 
> A couple days ago when writing a separate email I came across this DataStax
> blog post discussing MVs [1].  Imagine my surprise when I noticed the date
> was five years ago...
> 
> While at TLP, I helped numerous customers move off of MVs, mostly because
> they affected stability of clusters in a horrific way.  The most telling
> project involved helping someone create new tables to manage 1GB of data
> because the views performed so poorly they made the cluster unresponsive
> and unusable.  Despite being around for five years, they've seen very
> little improvement that makes them usable for non trivial, non laptop
> workloads.
> 
> Since the original commits, it doesn't look like there's been much work to
> improve them, and they're yet another feature I ended up saying "just don't
> use".  I haven't heard any plans to improve them in any meaningful way -
> either to address their issues with performance or the inability to repair
> them.
> 
> The original contributor of MVs (Carl Yeksigian) seems to have disappeared
> from the project, meaning we have a broken feature without a maintainer,
> and no plans to fix it.
> 
> As we move forward with the 4.0 release, we should consider this an
> opportunity to deprecate materialized views, and remove them in 5.0.  We
> should take this opportunity to learn from the mistake and raise the bar
> for new features to undergo a much more thorough run the wringer before
> merging.
> 
> I'm curious what folks think - am I way off base here?  Am I missing a JIRA
> that can magically fix the issues with performance, availability &
> correctness?
> 
> [1]
> https://www.datastax.com/blog/2015/06/new-cassandra-30-materialized-views
> [2] https://issues.apache.org/jira/browse/CASSANDRA-6477





[DISCUSS] Future of MVs

2020-06-30 Thread Jon Haddad
A couple days ago when writing a separate email I came across this DataStax
blog post discussing MVs [1].  Imagine my surprise when I noticed the date
was five years ago...

While at TLP, I helped numerous customers move off of MVs, mostly because
they affected stability of clusters in a horrific way.  The most telling
project involved helping someone create new tables to manage 1GB of data
because the views performed so poorly they made the cluster unresponsive
and unusable.  Despite being around for five years, they've seen very
little improvement that makes them usable for non trivial, non laptop
workloads.

Since the original commits, it doesn't look like there's been much work to
improve them, and they're yet another feature I ended up saying "just don't
use".  I haven't heard any plans to improve them in any meaningful way -
either to address their issues with performance or the inability to repair
them.

The original contributor of MVs (Carl Yeksigian) seems to have disappeared
from the project, meaning we have a broken feature without a maintainer,
and no plans to fix it.

As we move forward with the 4.0 release, we should consider this an
opportunity to deprecate materialized views, and remove them in 5.0.  We
should take this opportunity to learn from the mistake and raise the bar
for new features to undergo a much more thorough run through the wringer
before merging.

I'm curious what folks think - am I way off base here?  Am I missing a JIRA
that can magically fix the issues with performance, availability &
correctness?

[1]
https://www.datastax.com/blog/2015/06/new-cassandra-30-materialized-views
[2] https://issues.apache.org/jira/browse/CASSANDRA-6477


Re: [DISCUSS] Stabilizing 4.0

2020-06-30 Thread Benjamin Lerer
It is a good catch, Mick. :-)

I will triage those tickets to be sure that our view of things is accurate.


On Tue, Jun 30, 2020 at 11:38 AM Berenguer Blasi wrote:

> That's a very good point. At the risk of saying something silly or being
> captain obvious, as I am not familiar with the project dynamics, there
> should be a periodic 'backlog triage' or similar. Otherwise we'll have
> the impression we have just a handful of pending issues while another
> 10x packet is hiding but we didn't notice yet.
>
> On 30/6/20 11:18, Mick Semb Wever wrote:
> >>
> >> Berenguer pointed out to me that we already have a graph to track those
> >> things:
> >>
> >>
> >>
> >> https://issues.apache.org/jira/secure/ConfigureReport.jspa?projectOrFilterId=filter-12347782&periodName=weekly&daysprevious=30&cumulative=true&versionLabels=none&selectedProjectId=12310865&reportKey=com.atlassian.jira.jira-core-reports-plugin%3Acreatedvsresolved-report&atl_token=A5KQ-2QAV-T4JA-FDED_fd75a3db98350d94229fbb4cf29cb50f3051d7ce_lin&Next=Next
> >
> >
> > A lot of issues are also coming in without any fixVersion defined.
> > For example (just in the past 4 weeks):
> >
> >
> > https://issues.apache.org/jira/issues/?filter=12347782&jql=project%20%3D%20cassandra%20AND%20((fixVersion%20is%20EMPTY%20AND%20created%20%20%3E%3D%20-4w))%20%20AND%20(resolution%20%3D%20unresolved%20OR%20status%20!%3D%20resolved%20OR%20resolved%20%3E%3D%20-4w)%20ORDER%20BY%20priority%20DESC%2C%20assignee
> >
>
>
>


Re: [DISCUSS] Stabilizing 4.0

2020-06-30 Thread Berenguer Blasi
That's a very good point. At the risk of saying something silly or being
captain obvious, as I am not familiar with the project dynamics, there
should be a periodic 'backlog triage' or similar. Otherwise we'll have
the impression we have just a handful of pending issues while another
10x batch is hiding that we haven't noticed yet.

On 30/6/20 11:18, Mick Semb Wever wrote:
>>
>> Berenguer pointed out to me that we already have a graph to track those
>> things:
>>
>>
>> https://issues.apache.org/jira/secure/ConfigureReport.jspa?projectOrFilterId=filter-12347782&periodName=weekly&daysprevious=30&cumulative=true&versionLabels=none&selectedProjectId=12310865&reportKey=com.atlassian.jira.jira-core-reports-plugin%3Acreatedvsresolved-report&atl_token=A5KQ-2QAV-T4JA-FDED_fd75a3db98350d94229fbb4cf29cb50f3051d7ce_lin&Next=Next
>
>
> A lot of issues are also coming in without any fixVersion defined.
> For example (just in the past 4 weeks):
>
> https://issues.apache.org/jira/issues/?filter=12347782&jql=project%20%3D%20cassandra%20AND%20((fixVersion%20is%20EMPTY%20AND%20created%20%20%3E%3D%20-4w))%20%20AND%20(resolution%20%3D%20unresolved%20OR%20status%20!%3D%20resolved%20OR%20resolved%20%3E%3D%20-4w)%20ORDER%20BY%20priority%20DESC%2C%20assignee
>




Re: [DISCUSS] Stabilizing 4.0

2020-06-30 Thread Berenguer Blasi
That is a good finger-in-the-air starting point imo. We'd have to adjust
the backing filter to reflect exactly what we want. But we already have the
data and a graph report at hand, which is good :-)

On 30/6/20 11:09, Benjamin Lerer wrote:
>> It would be nice to have a graph on our weekly status of the number of
>> issues reported on 4.0. I feel like having a visual representation of the
>> number of bugs on 4.0 over time would be really helpful to give us a
>> feeling of the progress on its stability.
>>
> Berenguer pointed out to me that we already have a graph to track those
> things:
>
> https://issues.apache.org/jira/secure/ConfigureReport.jspa?projectOrFilterId=filter-12347782&periodName=weekly&daysprevious=30&cumulative=true&versionLabels=none&selectedProjectId=12310865&reportKey=com.atlassian.jira.jira-core-reports-plugin%3Acreatedvsresolved-report&atl_token=A5KQ-2QAV-T4JA-FDED_fd75a3db98350d94229fbb4cf29cb50f3051d7ce_lin&Next=Next
>
>
>
> On Tue, Jun 30, 2020 at 10:20 AM Benjamin Lerer wrote:
>
>> Thanks a lot for starting this thread Dinesh.
>>
>>> As a baseline expectation, we thought big users of Cassandra should be
>>> running the latest trunk internally and testing it out for their particular
>>> use cases. We wanted them to file as many jiras as possible based on their
>>> experience. Operations such as host replacement, expansions, shrinks, etc.
>>> and obviously any issues with durability, performance, availability. This
>>> was thought to generate a body of work (jiras), when fixed, over time would
>>> stabilize trunk. When we see the trickle of new jiras coming to a halt or
>>> at least nothing serious shows up, thats when the big users of Cassandra
>>> would feel comfortable running the build in prod. This would be a good time
>>> to cut the final stable release.
>>>
>> It would be nice to have a graph on our weekly status of the number of
>> issues reported on 4.0. I feel like having a visual representation of the
>> number of bugs on 4.0 over time would be really helpful to give us a
>> feeling of the progress on its stability.
>> It might also be interesting to see which components are the most affected
>> to help us to determine where we should increase the testing.
>>
>>> We also created a confluence doc for a test plan with major areas that
>>> require testing. There were shepherds that were tentatively assigned[1].
>>> The rationale for this doc was that these areas have significantly changed
>>> and we need more eye balls on it to ensure stability. The shepherds would
>>> help guide the testing for these areas.
>>
>> I had a quick look at the JIRAs associated with the different areas of the
>> plan and a lot of them appear to be blocked. I believe that most people are
>> unsure of what or how to test things and want to get some feedback before
>> starting to add tests.
>> It would be great if in the coming weeks we can all help to unblock those
>> tickets by clarifying what needs to be done on each of them. I guess that
>> none of us have a clear picture but sharing ideas would definitely help.
>> :-)
>>
>>> The final concern was around some people felt that the lack of visible
>>> activity signals that the project is dead. While I don't fully agree with
>>> this assessment, I suspect sending a periodic update on new issues or test
>>> runs that people are running to the mailing list would definitely help
>>> keeping everyone engaged. It also helps bring visibility to the community.
>>> I am not 100% sure whether it is feasible for everyone to share what
>>> they're doing internally but I think if you're working on something,
>>> summarizing on a weekly or biweekly basis can help the community. This is
>>> just a thought and if there are other suggestions, lets discuss them
>>> without shooting down new ideas (assume positive intent).
>>>
>> Your suggestion makes sense to me. Hopefully releasing 4.0-beta will also
>> be a strong sign that the project is still active.
>>
>> On Mon, Jun 29, 2020 at 10:48 PM Dinesh Joshi  wrote:
>>
>>> Hi all,
>>>
>>> I am starting a separate thread as the other thread has veered off in a
>>> very different direction. The ground rules for this thread are that we are
>>> not discussing branching models or release strategy here.
>>>
>>> Some folks in the community had the following questions and concerns:
>>>
>>> 1. Lack of clarity on how stability and quality are being measured.
>>> 2. Lack of visibility on the progress to stabilizing 4.0.
>>> 3. Lack of clarity on what is remaining to get 4.0 to a stable state.
>>>
>>> My 2 cents on these 3 questions are as follows:
>>>
>>> As a baseline expectation, we thought big users of Cassandra should be
>>> running the latest trunk internally and testing it out for their particular
>>> use cases. We wanted them to file as many jiras as possible based on their
>>> experience. Operations such as host replacement, expansions, shrinks, etc.
>>> and obviously any issues with durability, performance, av

Re: [DISCUSS] Stabilizing 4.0

2020-06-30 Thread Mick Semb Wever
>
>
> Berenguer pointed out to me that we already have a graph to track those
> things:
>
>
> https://issues.apache.org/jira/secure/ConfigureReport.jspa?projectOrFilterId=filter-12347782&periodName=weekly&daysprevious=30&cumulative=true&versionLabels=none&selectedProjectId=12310865&reportKey=com.atlassian.jira.jira-core-reports-plugin%3Acreatedvsresolved-report&atl_token=A5KQ-2QAV-T4JA-FDED_fd75a3db98350d94229fbb4cf29cb50f3051d7ce_lin&Next=Next



A lot of issues are also coming in without any fixVersion defined.
For example (just in the past 4 weeks):

https://issues.apache.org/jira/issues/?filter=12347782&jql=project%20%3D%20cassandra%20AND%20((fixVersion%20is%20EMPTY%20AND%20created%20%20%3E%3D%20-4w))%20%20AND%20(resolution%20%3D%20unresolved%20OR%20status%20!%3D%20resolved%20OR%20resolved%20%3E%3D%20-4w)%20ORDER%20BY%20priority%20DESC%2C%20assignee
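
The percent-encoded link above is hard to read, so here is a minimal sketch
that decodes its jql parameter into the plain JQL text using only the Python
standard library. The URL is copied from the link above; the decode_jql
helper name is just for illustration.

```python
from urllib.parse import parse_qs, urlsplit

# URL copied verbatim from the message, split across lines for readability.
URL = (
    "https://issues.apache.org/jira/issues/?filter=12347782"
    "&jql=project%20%3D%20cassandra%20AND%20((fixVersion%20is%20EMPTY"
    "%20AND%20created%20%20%3E%3D%20-4w))%20%20AND%20(resolution%20%3D%20unresolved"
    "%20OR%20status%20!%3D%20resolved%20OR%20resolved%20%3E%3D%20-4w)"
    "%20ORDER%20BY%20priority%20DESC%2C%20assignee"
)


def decode_jql(url: str) -> str:
    """Return the decoded JQL query from a Jira issue-search URL."""
    params = parse_qs(urlsplit(url).query)
    # Collapse the doubled spaces left over from the original encoding.
    return " ".join(params["jql"][0].split())


if __name__ == "__main__":
    print(decode_jql(URL))
```

Decoded, the filter selects Cassandra issues created in the last four weeks
with an empty fixVersion that are unresolved or were resolved within the
same four weeks, ordered by priority and then assignee.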


Re: [DISCUSS] Stabilizing 4.0

2020-06-30 Thread Benjamin Lerer
>
> It would be nice to have a graph on our weekly status of the number of
> issues reported on 4.0. I feel like having a visual representation of the
> number of bugs on 4.0 over time would be really helpful to give us a
> feeling of the progress on its stability.
>

Berenguer pointed out to me that we already have a graph to track those
things:

https://issues.apache.org/jira/secure/ConfigureReport.jspa?projectOrFilterId=filter-12347782&periodName=weekly&daysprevious=30&cumulative=true&versionLabels=none&selectedProjectId=12310865&reportKey=com.atlassian.jira.jira-core-reports-plugin%3Acreatedvsresolved-report&atl_token=A5KQ-2QAV-T4JA-FDED_fd75a3db98350d94229fbb4cf29cb50f3051d7ce_lin&Next=Next



On Tue, Jun 30, 2020 at 10:20 AM Benjamin Lerer wrote:

> Thanks a lot for starting this thread Dinesh.
>
>> As a baseline expectation, we thought big users of Cassandra should be
>> running the latest trunk internally and testing it out for their particular
>> use cases. We wanted them to file as many jiras as possible based on their
>> experience. Operations such as host replacement, expansions, shrinks, etc.
>> and obviously any issues with durability, performance, availability. This
>> was thought to generate a body of work (jiras), when fixed, over time would
>> stabilize trunk. When we see the trickle of new jiras coming to a halt or
>> at least nothing serious shows up, thats when the big users of Cassandra
>> would feel comfortable running the build in prod. This would be a good time
>> to cut the final stable release.
>>
>
> It would be nice to have a graph on our weekly status of the number of
> issues reported on 4.0. I feel like having a visual representation of the
> number of bugs on 4.0 over time would be really helpful to give us a
> feeling of the progress on its stability.
> It might also be interesting to see which components are the most affected
> to help us to determine where we should increase the testing.
>
>> We also created a confluence doc for a test plan with major areas that
>> require testing. There were shepherds that were tentatively assigned[1].
>> The rationale for this doc was that these areas have significantly changed
>> and we need more eye balls on it to ensure stability. The shepherds would
>> help guide the testing for these areas.
>
>
> I had a quick look at the JIRAs associated with the different areas of the
> plan and a lot of them appear to be blocked. I believe that most people are
> unsure of what or how to test things and want to get some feedback before
> starting to add tests.
> It would be great if in the coming weeks we can all help to unblock those
> tickets by clarifying what needs to be done on each of them. I guess that
> none of us have a clear picture but sharing ideas would definitely help.
> :-)
>
>> The final concern was around some people felt that the lack of visible
>> activity signals that the project is dead. While I don't fully agree with
>> this assessment, I suspect sending a periodic update on new issues or test
>> runs that people are running to the mailing list would definitely help
>> keeping everyone engaged. It also helps bring visibility to the community.
>> I am not 100% sure whether it is feasible for everyone to share what
>> they're doing internally but I think if you're working on something,
>> summarizing on a weekly or biweekly basis can help the community. This is
>> just a thought and if there are other suggestions, lets discuss them
>> without shooting down new ideas (assume positive intent).
>>
>
> Your suggestion makes sense to me. Hopefully releasing 4.0-beta will also
> be a strong sign that the project is still active.
>
> On Mon, Jun 29, 2020 at 10:48 PM Dinesh Joshi  wrote:
>
>> Hi all,
>>
>> I am starting a separate thread as the other thread has veered off in a
>> very different direction. The ground rules for this thread are that we are
>> not discussing branching models or release strategy here.
>>
>> Some folks in the community had the following questions and concerns:
>>
>> 1. Lack of clarity on how stability and quality are being measured.
>> 2. Lack of visibility on the progress to stabilizing 4.0.
>> 3. Lack of clarity on what is remaining to get 4.0 to a stable state.
>>
>> My 2 cents on these 3 questions are as follows:
>>
>> As a baseline expectation, we thought big users of Cassandra should be
>> running the latest trunk internally and testing it out for their particular
>> use cases. We wanted them to file as many jiras as possible based on their
>> experience. Operations such as host replacement, expansions, shrinks, etc.
>> and obviously any issues with durability, performance, availability. This
>> was thought to generate a body of work (jiras), when fixed, over time would
>> stabilize trunk. When we see the trickle of new jiras coming to a halt or
>> at least nothing serious shows up, thats when the big users of Cassandra
>> would feel comfortable running the build in prod. This would be a good time
>> 

Re: [DISCUSS] Stabilizing 4.0

2020-06-30 Thread Benjamin Lerer
Thanks a lot for starting this thread Dinesh.

> As a baseline expectation, we thought big users of Cassandra should be
> running the latest trunk internally and testing it out for their particular
> use cases. We wanted them to file as many jiras as possible based on their
> experience. Operations such as host replacement, expansions, shrinks, etc.
> and obviously any issues with durability, performance, availability. This
> was thought to generate a body of work (jiras) that, when fixed, would over
> time stabilize trunk. When we see the trickle of new jiras coming to a halt or
> at least nothing serious shows up, that's when the big users of Cassandra
> would feel comfortable running the build in prod. This would be a good time
> to cut the final stable release.
>

It would be nice to have a graph on our weekly status of the number of
issues reported on 4.0. I feel like having a visual representation of the
number of bugs on 4.0 over time would be really helpful to give us a
feeling of the progress on its stability.
It might also be interesting to see which components are the most affected
to help us to determine where we should increase the testing.
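
To make the idea of a weekly graph concrete, here is a minimal sketch of the
numbers behind a cumulative created-vs-resolved chart. It does not query
Jira; the weekly_open_counts helper and the sample dates are made up for
illustration, and it assumes each issue is given as a (created, resolved)
pair of dates with None for unresolved issues.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def week_start(d: date) -> date:
    """Return the Monday of the week containing d."""
    return date.fromordinal(d.toordinal() - d.weekday())


def weekly_open_counts(issues):
    """issues: list of (created, resolved_or_None) date pairs.

    Returns one (week, cumulative_created, cumulative_resolved, open) tuple
    per week in which at least one issue was created or resolved.
    """
    created = Counter(week_start(c) for c, _ in issues)
    resolved = Counter(week_start(r) for _, r in issues if r is not None)
    weeks = sorted(set(created) | set(resolved))
    cum_created = list(accumulate(created[w] for w in weeks))
    cum_resolved = list(accumulate(resolved[w] for w in weeks))
    return [(w, c, r, c - r) for w, c, r in zip(weeks, cum_created, cum_resolved)]


if __name__ == "__main__":
    # Made-up sample data, not real ticket dates.
    sample = [
        (date(2020, 6, 1), date(2020, 6, 3)),
        (date(2020, 6, 2), None),
        (date(2020, 6, 9), date(2020, 6, 16)),
        (date(2020, 6, 10), None),
    ]
    for week, c, r, still_open in weekly_open_counts(sample):
        print(week, "created:", c, "resolved:", r, "open:", still_open)
```

The gap between the two cumulative series is the open backlog; grouping the
same counts by component would show where testing needs the most attention.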

> We also created a confluence doc for a test plan with major areas that
> require testing. There were shepherds that were tentatively assigned[1].
> The rationale for this doc was that these areas have significantly changed
> and we need more eyeballs on them to ensure stability. The shepherds would
> help guide the testing for these areas.


I had a quick look at the JIRAs associated with the different areas of the
plan and a lot of them appear to be blocked. I believe that most people are
unsure of what or how to test things and want to get some feedback before
starting to add tests.
It would be great if in the coming weeks we can all help to unblock those
tickets by clarifying what needs to be done on each of them. I guess that
none of us have a clear picture but sharing ideas would definitely help.
:-)

> The final concern was that some people felt that the lack of visible
> activity signals that the project is dead. While I don't fully agree with
> this assessment, I suspect sending a periodic update on new issues or test
> runs that people are running to the mailing list would definitely help
> keep everyone engaged. It also helps bring visibility to the community.
> I am not 100% sure whether it is feasible for everyone to share what
> they're doing internally but I think if you're working on something,
> summarizing on a weekly or biweekly basis can help the community. This is
> just a thought and if there are other suggestions, let's discuss them
> without shooting down new ideas (assume positive intent).
>

Your suggestion makes sense to me. Hopefully releasing 4.0-beta will also be
a strong sign that the project is still active.

On Mon, Jun 29, 2020 at 10:48 PM Dinesh Joshi  wrote:

> Hi all,
>
> I am starting a separate thread as the other thread has veered off in a
> very different direction. The ground rules for this thread are that we are
> not discussing branching models or release strategy here.
>
> Some folks in the community had the following questions and concerns:
>
> 1. Lack of clarity on how stability and quality are being measured.
> 2. Lack of visibility on the progress to stabilizing 4.0.
> 3. Lack of clarity on what is remaining to get 4.0 to a stable state.
>
> My 2 cents on these 3 questions are as follows:
>
> As a baseline expectation, we thought big users of Cassandra should be
> running the latest trunk internally and testing it out for their particular
> use cases. We wanted them to file as many jiras as possible based on their
> experience. Operations such as host replacement, expansions, shrinks, etc.
> and obviously any issues with durability, performance, availability. This
> was thought to generate a body of work (jiras), when fixed, over time would
> stabilize trunk. When we see the trickle of new jiras coming to a halt or
> at least nothing serious shows up, that's when the big users of Cassandra
> would feel comfortable running the build in prod. This would be a good time
> to cut the final stable release.
>
> We also created a confluence doc for a test plan with major areas that
> require testing. There were shepherds that were tentatively assigned[1].
> The rationale for this doc was that these areas have significantly changed
> and we need more eyeballs on them to ensure stability. The shepherds would
> help guide the testing for these areas.
>
> I think the big missing piece is that we don't know who is actively
> running trunk internally and how aggressive their timelines are in getting
> to a stable 4.0. However, we can see new jiras being reported every day.
> There are also a lot of open jiras that require attention and they are
> being reported by a diverse set of Cassandra users which is great. I think
> everyone would like to see a stable release in ~6 months from now. The
> quality of this release will be depende