Hi Kudu developers,

Back in mid-October I proposed a plan for the 1.1 and 1.2 releases[1]. 1.1
is now behind us, so I wanted to start a thread regarding 1.2.

The original proposal was to aim for a 1.2 release in mid-January. While
it was always planned to be a time-based release, we also discussed that
we’d likely be able to get some of the security features into it, as well
as a bunch of other work completed that’s been in flight the last month or
two.

Now that we’re getting a bit closer, I want to re-evaluate the timing and
scope a bit. Although we made some good progress on security features,
things have slowed a bit in recent weeks as people have shifted focus onto
other critical improvements and issues. Given that, and the fact that a lot
of people will be out for multiple weeks in December due to holidays, I
don’t think it’s realistic to expect that we’ll complete (and fully
test/stabilize) the security work in time for a mid-Jan release date.

On the other hand, there has been a lot of other good work in master
recently, both features and bug fixes. Here’s an incomplete list of
features and improvements from my skim of the git log since 1.1.0 that
are either complete or nearly complete:

- Improved consistency of reads (snapshot consistency / read-your-writes)
- File descriptor cache and container size limiting to prevent file system
  corruption on RHEL 6
- Ability to bound memory usage for errors in C++ client (new API)
- Ability to list range partitions in Java API (new API)
- Performance improvements (AVX2) for BITSHUFFLE and DICT encoding
- Validation and “guard rails” for table metadata, column names, etc.

There are also a bunch of bug fixes, some of which might be a bit risky to
cherry-pick into a 1.1.x release due to the complexity of the patches. I’ve
spent a good amount of my last several weeks working on stress testing the
master branch, and can tell you that it’s significantly better than 1.1.

So, even without the security work complete, I think we’ve already got
enough juicy stuff lined up for 1.2 to make it enticing for users to
upgrade. We can then push the security work out to 1.3 some time in early
2017.

So, the next question is timing. I’d like to propose that we cut a
branch-1.2 from master this week, and give it a couple weeks of soak time
before release. This differs from previous releases in which we’ve cut a
branch only a day or two before the release candidate. My reasoning is as
follows:

1) Some of the changes since 1.1 have been fairly invasive. In particular,
the consistency work has done surgery on where and how timestamps are
assigned to writes, and a bug here could result in replicas diverging,
crashes, deadlocks, etc. Having a bit of time to soak, stress test, and
correctness-test this work before releasing it to the community would be
prudent.

2) Now that we are post-1.0, we have more and more users depending on Kudu
for production (or almost-production) workloads. As we enter a stage of
greater maturity, it’s probably smart to give each release a bit more
“soak” in a branch before giving it to users.

3) Given that a bunch of the above-mentioned features are currently
wrapping up, we’ll probably get back to working on authentication features,
which will cause quite a bit of churn. It would be better to avoid exposing
the upcoming 1.2 branch to this churn, and branching early is a good way to
insulate it.

Thoughts?

-Todd


[1] http://www.mail-archive.com/dev@kudu.apache.org/msg00237.html
