[
https://issues.apache.org/jira/browse/HADOOP-13714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995399#comment-15995399
]
Steve Loughran commented on HADOOP-13714:
-----------------------------------------
I care about compatibility; I think everyone does. It's just really hard to
achieve.
Regarding semantic versioning, the problem is that "all changes may break things".
Even with something as minor as changing a string in an exception message,
something else could be looking for it and break. Or there are subtle
differences in performance and concurrency which look like they work, but have
adverse consequences downstream. I am very bleak about semantic versioning
being viable.
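A minimal Python sketch of the exception-message problem. Every name here
(FileSystemError, open_v1, open_v2, is_missing_file) is hypothetical and
invented for illustration; none of it is Hadoop code. It assumes a downstream
caller that classifies failures by matching on the message text:

```python
class FileSystemError(Exception):
    """Hypothetical error type, standing in for any library exception."""


def open_v1(path):
    # "Release 1" of an imaginary library: the original message wording.
    raise FileSystemError("File does not exist: " + path)


def open_v2(path):
    # "Release 2": same exception type, same meaning, reworded message.
    raise FileSystemError("No such file: " + path)


def is_missing_file(opener, path):
    # Downstream code that (unwisely) depends on the message string.
    try:
        opener(path)
    except FileSystemError as e:
        return "does not exist" in str(e)
    return False
```

Against open_v1 the check returns True; against open_v2 it silently returns
False, although no signature or type changed. A version number alone would
not have warned the caller.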
Now, regarding the specs: the FS spec was written precisely because too much of
the FS behaviour was hidden in the HDFS code and nobody had written down all
that was happening, including what the exceptions were. There was nothing
clear as to which features were deliberate versus accidental (example: mkdirs -p
/a/b/c being atomic. Deliberate? Or an accidental side effect of a locking
optimisation? And what happens if it is now changed?). It's incomplete
(HADOOP-13327) and there's a tendency for new features in HDFS to treat it as
something unimportant, leading to issues like HADOOP-14365. I understand why
(timetable pressure, test-centric development doesn't focus on the specs), but it's
frustrating.
To be fair, though: it does get read and it is broadly understood by people, which
shows how Python makes a good syntax for specification.
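To give a flavour of what a Python-style statement of mkdirs can look like,
here is a rough sketch. It is not text from the FS spec: the model (a set of
directory paths plus a set of file paths) and the helper names are assumptions
made for this example only.

```python
def ancestors(path):
    # All proper ancestors of an absolute path, including the root.
    parts = [p for p in path.split("/") if p]
    return {"/" + "/".join(parts[:i]) for i in range(1, len(parts))} | {"/"}


def mkdirs(directories, files, path):
    """Return the directory set after mkdirs -p path (hypothetical model)."""
    # Precondition: neither the path nor any ancestor is an existing file.
    if any(p in files for p in ancestors(path) | {path}):
        raise FileExistsError(path)
    # Postcondition: the path and all of its ancestors are directories.
    return frozenset(directories) | ancestors(path) | {path}
```

Usage: mkdirs({"/"}, set(), "/a/b/c") yields a set containing /, /a, /a/b and
/a/b/c. Note what a pure before/after function like this leaves unsaid: it
describes the end state only, so whether another client can observe /a
existing before /a/b/c does (the atomicity question) has to be stated
separately.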
Where the spec is limited programmatically is that it isn't something you can use
in theorem provers, the way TLA+ can be. I can't use it in a specification
of what a committer does, or to prove that the MR committer v1 and v2
algorithms work, etc. I'm playing with the more rigorous approach in
HADOOP-13786, but I know that once I have a TLA+ spec for an object store and its
commit algorithm, *nobody* is going to review it. We just don't have enough
people who play in that area to do the reviewing. Which means it wouldn't get
the maintenance either. Python it is, then.
> Tighten up our compatibility guidelines for Hadoop 3
> ----------------------------------------------------
>
> Key: HADOOP-13714
> URL: https://issues.apache.org/jira/browse/HADOOP-13714
> Project: Hadoop Common
> Issue Type: Improvement
> Components: documentation
> Affects Versions: 2.7.3
> Reporter: Karthik Kambatla
> Assignee: Daniel Templeton
> Priority: Blocker
> Attachments: HADOOP-13714.WIP-001.patch
>
>
> Our current compatibility guidelines are incomplete and loose. For many
> categories, we do not have a policy. It would be nice to actually define
> those policies so our users know what to expect and the developers know what
> releases to target their changes.