IMO the alpha / beta / GA terminology makes sense, and makes things clearer
to users, which is good.

Some thoughts on the specifics of your proposal:

- You're suggesting we commit to a specific number of releases that a GA
feature will be forward / backward compatible for. IMO, our current
commitment (one major release) is okay, but it would be good to strive to
break compatibility as infrequently as possible. In the future, we may decide to do
major releases less often, which will naturally lengthen the commitment
times.

- I like the idea of phasing in the testing bar as features move from alpha
-> beta -> GA. I think it'd be good to point to examples of features where
the testing is done "right" for each stage. It should help contributors
know what to shoot for.
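To make the phased bar a bit more concrete, here is a rough sketch of the
kind of "happy path only" check that might gate alpha -> beta (plain Java, no
test framework; the `ingest` method and its behavior are invented stand-ins
for a feature under test, not real Druid code):

```java
// Hypothetical sketch of a happy-path integration-style test, the kind of
// bar that might gate alpha -> beta. Nothing here is a real Druid class.
import java.util.List;

public class HappyPathSketch {
    // Stand-in for the feature under test: "ingests" rows, reports a count.
    static int ingest(List<String> rows) {
        return rows.size();
    }

    public static void main(String[] args) {
        // Beta bar: one well-formed input, one expected output. No failure
        // injection, restarts, or compatibility checks yet; those would be
        // part of the fuller suite expected at GA.
        int ingested = ingest(List.of("row1", "row2", "row3"));
        if (ingested != 3) {
            throw new AssertionError("expected 3 rows, got " + ingested);
        }
        System.out.println("happy path ok");
    }
}
```

The point of the sketch is only the shape: the test exercises the one
well-formed case, and the GA-level suite grows from there.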

- Plenty of GA features today do not meet the testing bar you've mentioned,
including some "day 1" features. This is fine — it is a natural consequence
of raising the testing bar over time — but we should have an idea of what
we want to do about this. One possible approach is to require that tests be
added to meet the bar when fixes or changes are made to the feature. But
this leads to situations where a small change can't be made without adding
a mountain of tests. IMO it'd be good to do an amount of new testing
commensurate with the scope of the change. A big refactor to a feature that
doesn't have much testing should involve adding a mountain of tests to it.
But we don't necessarily need to require that for a small bug fix or
enhancement (but it would be great, of course!).

- For "beta" the definition you suggest is all negative ("not battle
tested", "may change", "may not be compatible"). We should include
something positive as well, to illustrate what makes beta better than
alpha. How about "no major known issues" or "no major API changes planned"?

- I would suggest moving the "appropriate user-facing documentation"
requirement to beta rather than GA. In order to have a useful beta testing
period, we need to have good user-facing docs so people can try the feature
out.

- I think we might want to leave some alpha features undocumented, if their
quality or stability level is so low that they won't be useful to people
who aren't developers. The goal would be to avoid clogging up the
user-facing docs with a bunch of half-baked stuff. Too much of that lowers
the perceived quality level of the project.
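Related to keeping half-baked alpha features out of users' way: the "disabled
by default" point in your proposal could be made mechanical with an explicit
opt-in flag. A minimal sketch (the property name and class below are invented
for illustration, not a real Druid configuration key):

```java
// Hypothetical sketch: gating an alpha feature behind an opt-in flag.
// The property "druid.alpha.exampleFeature.enabled" is invented for this
// example; it is not a real Druid runtime property.
import java.util.Properties;

public class AlphaFeatureGate {
    private final boolean enabled;

    public AlphaFeatureGate(Properties runtimeProperties) {
        // Alpha features default to disabled: users must opt in explicitly.
        this.enabled = Boolean.parseBoolean(
            runtimeProperties.getProperty(
                "druid.alpha.exampleFeature.enabled", "false"));
    }

    public boolean isEnabled() {
        return enabled;
    }

    public static void main(String[] args) {
        // Disabled unless the operator opts in.
        System.out.println(new AlphaFeatureGate(new Properties()).isEnabled()); // false

        Properties optIn = new Properties();
        optIn.setProperty("druid.alpha.exampleFeature.enabled", "true");
        System.out.println(new AlphaFeatureGate(optIn).isEnabled()); // true
    }
}
```

A gate like this also gives the docs a natural hook: an alpha feature's page
(if it has one) can simply say which flag turns it on.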

Now, thinking about specific features, I suggest we classify the current
experimental features in the following way:

- Java 11 support: Beta or GA (depending on how good the test coverage is)
- HTTP remote task runner: Alpha (there aren't integration tests yet)
- Router process: GA
- Indexer process: Alpha or Beta (also depending on how good the test
coverage is)
- Segment locking / minor compaction: Alpha
- Approximate histograms: GA, but deprecated (they are stable and have
plenty of tests, but users should consider switching to DataSketches
quantiles)
- Lookups: Beta
- Kinesis ingestion: GA (now that there are integration tests:
https://github.com/apache/druid/pull/9724)
- Materialized view extension: Alpha
- Moments sketch extension: Alpha

On Mon, Jun 8, 2020 at 1:49 PM Suneet Saldanha <suneet.salda...@imply.io>
wrote:

> Hi Druid devs!
>
> I've been thinking about our release process and would love to get your
> thoughts on how we manage new features.
>
> When a new feature is added is it first marked as experimental?
> How do users know which features are experimental?
> How do we ensure that features do not break with each new release?
> Should the release manager manually check each feature works as part of the
> release process?
>     This doesn't seem like it can scale.
> Should integration tests always be required if the feature is being added
> to core?
>
> To address these issues, I'd like to propose we introduce a feature
> lifecycle for all features so that we can set expectations for users
> appropriately - either in the docs, product or both. I'd like to propose
> something like this:
> * Alpha - Known major bugs / performance issues. Incomplete functionality.
> Disabled by default.
> * Beta - Feature is not yet battle tested in production. API and
> compatibility may change in the future. May not be forward / backward
> compatible.
> * GA - Feature has appropriate user facing documentation and testing so
> that it won't regress with a version upgrade. Will be forward / backward
> compatible for x releases (maybe 4? ~ 1 year)
>
> I think a model like this will allow us to continue to ship features
> quickly while keeping the release quality bar high so that our users can
> continue to rely on Druid without worrying about upgrade issues.
> I understand that adding integration tests may not always make sense for
> early / experimental features when we're uncertain of the API or the
> broader use case we're trying to solve. This model would make it clear to
> our users which features are still work in progress, and which ones they
> can expect to remain stable for a longer time.
>
> Below is an example of how I think this model can be applied to a new
> feature:
>
> This PR adds support for a new feature -
> https://github.com/apache/druid/pull/9449
>
> While it has been tested locally, there may be changes that enter Druid
> before the 0.19 release that break this feature, or more likely - a
> refactoring after 0.19 that breaks something in this feature. In this
> example, I think the feature should be marked as alpha, since there are
> future changes expected to the functionality. At this stage integration
> tests are not expected. Once the feature is complete, there should be happy
> path integration tests for the feature and it can graduate to Beta. After
> it has been running in production for a while, the feature can graduate to
> GA once we've added enough integration tests that we feel confident that
> the feature will continue to work if the integration tests run
> successfully.
>
> I know this is a very long email, but I look forward to hearing your
> thoughts on this.
> Suneet
>
