Xuefu, thanks, I think this is a good start. I've inlined a few responses.
Xuefu Zhang <mailto:[email protected]>
November 24, 2015 at 7:26
Hi all,
At the last contributor meetup, the topic of data correctness and data
corruption was rather concerning. The concern is not only the number of
such issues reported recently, but also the way the Hive community is
handling them. The latter is the topic of this discussion. I
think everyone agrees that the current practice is problematic and that
the Hive community should treat data correctness more seriously.
Therefore, I'd
like to find a "standard" procedure that we should follow. Here are my
initial thoughts:
1. JIRA should be correctly labeled and the title should reflect data
correctness.
I think we should have data corruption (which is now possible with ACID)
as a separate category that is treated in the same way (i.e., we would
have separate labels for correctness and corruption issues).
2. JIRA should carry an adequate description of the issue, including the
affected versions, the JIRA that introduced the issue, any workaround, etc.
3. Once confirmed, an advisory message should be sent to user@ and dev@
regarding the problem.
4. Once the JIRA is closed, a message should be sent again to the lists
announcing the availability of the fix.
I don't think this is necessary. The people who are concerned about the
JIRA can follow it. That way they will know when it's committed and
when it's released and in what version, without polluting the mailing
list with it. The email to the user and dev lists when we discover the
issue should include instructions on how to follow the JIRA.
A couple of other thoughts:
5) We should make it easy for users to discover known correctness and
corruption issues in released versions of Hive. This can be as simple
as a wiki page with a link to a JIRA search that will return all these
issues for each released Hive version.
6) We need to think about how we quickly produce maintenance releases to
address these issues. Finding a correctness issue and then not
releasing a fix for 6 months will not be good. I think we should set a
goal of releasing a maintenance release with at least the correctness or
corruption issue repaired within 6 weeks of discovering the issue.
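To make point 5 a bit more concrete, here is a rough sketch of how the per-version search links for such a wiki page could be generated. It assumes we settle on two labels, here called "correctness" and "corruption" (hypothetical names, nothing is agreed yet), and it uses the standard JQL search URL on the Apache JIRA instance. The helper name known_issues_url and the version "1.2.1" are only for illustration.

```python
from urllib.parse import urlencode

# Issue navigator on the Apache JIRA instance; it accepts a JQL query
# through the "jql" query parameter.
JIRA_SEARCH_BASE = "https://issues.apache.org/jira/issues/"


def known_issues_url(version, labels=("correctness", "corruption")):
    """Build a search link for labeled issues affecting one Hive release.

    The default label names are assumptions for this sketch; substitute
    whatever labels the community agrees on.
    """
    label_list = ", ".join(labels)
    jql = (
        f"project = HIVE AND labels in ({label_list}) "
        f'AND affectedVersion = "{version}"'
    )
    # urlencode escapes spaces, quotes, and parentheses in the JQL string.
    return JIRA_SEARCH_BASE + "?" + urlencode({"jql": jql})


if __name__ == "__main__":
    # One link per released version could be listed on the wiki page.
    print(known_issues_url("1.2.1"))
```

A saved JIRA filter per release would work just as well. The point is only that the query is simple enough to maintain once the labels are applied consistently.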
Alan.
I know these may not all be clear or actionable, but I hope we can have
concrete steps to follow at the end of this discussion.
Please share your thoughts.
Thanks,
Xuefu