[ https://issues.apache.org/jira/browse/HDFS-9016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254255#comment-15254255 ]
Allen Wittenauer commented on HDFS-9016:
----------------------------------------
This is basically Hadoop's operability problems coming to the forefront:
* The compatibility guidelines don't offer any real out for CLI output that
actually needs to change based upon the implementation. So no, technically, a
special flag like '-replicadetails' would not be magically immune. Once the
output is in a released version, it's fixed. If the output changes based upon
how the system is configured, there are no hints anywhere visible that this is
going to occur. The compatibility guidelines are the ONLY thread by which
operation teams are holding on and every time we ignore them, all hell breaks
loose. (Of course, a lot of the people who work on the code don't realize this
because they have no direct lines of communication, or don't really pay much
attention when an ops person does point out that the world broke. "Feature
expediency" wins out over common sense way too often. HDFS rolling
upgrade is a great example--it actually caused data loss in certain instances
because someone thought it was a great idea to turn a heavily depended-upon NN
flag into a no-op with a success exit code.)
* We don't build that many interfaces that can actually be used by the
scripting languages (perl, python, ruby, etc.), leaving stdout as the only way
the vast majority of ops people are going to be able to process information.
While the JMX->REST hook was a great help, it's read-only and still doesn't
expose vital information (fsck being the worst offender, because frankly, it's
doing way too much. Why does it have to be literally the only source for
block-level information?).
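For anyone who hasn't used the JMX->REST hook, here is a rough sketch of what
consuming it from a scripting language looks like. Assumptions on my part: the
NN web UI serves the /jmx servlet with a qry parameter, the FSNamesystem bean
carries the attributes shown, and the host/port in the usage note is a
placeholder. The sample payload and its values are made up for illustration,
and fetch_jmx/find_bean are hypothetical helper names, not anything Hadoop
ships.

```python
import json
import urllib.parse
import urllib.request

# Bean name as exposed by the NameNode's JMX servlet (assumed here).
FSNAMESYSTEM = "Hadoop:service=NameNode,name=FSNamesystem"


def fetch_jmx(base_url, query, timeout=10):
    """GET <base_url>/jmx?qry=<query> and return the decoded JSON document."""
    url = base_url + "/jmx?" + urllib.parse.urlencode({"qry": query})
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def find_bean(doc, name):
    """Return the bean dict whose "name" matches, or None if it is absent."""
    for bean in doc.get("beans", []):
        if bean.get("name") == name:
            return bean
    return None


# Made-up sample payload in the {"beans": [...]} shape the servlet returns.
SAMPLE = json.loads("""
{"beans": [{"name": "Hadoop:service=NameNode,name=FSNamesystem",
            "MissingBlocks": 0,
            "UnderReplicatedBlocks": 3}]}
""")

bean = find_bean(SAMPLE, FSNAMESYSTEM)
print(bean["MissingBlocks"], bean["UnderReplicatedBlocks"])
```

Against a live cluster, something like
fetch_jmx("http://nn.example.com:50070", FSNAMESYSTEM) would replace SAMPLE.
Note that this only gets you aggregate counters; nothing here returns the
per-block detail that fsck prints.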
To me, the people behind things like the storagepolicy code should have taken
on the PMC and tried to revamp the compatibility guidelines to specifically
spell out that command line arguments that generate output also need to specify
stability in their accompanying documentation. Buried in a javadoc is useless.
Unless people are writing code, they don't see that information. See: metrics,
rack awareness, and a host of other bits that have had real documentation
written over the past 2 years. All of that information was previously passed
along by word of mouth.
That said, I know what the outcome of this JIRA will be: another cranny where
the rules don't apply, which will come back and bite someone hard in the future.
> Display upgrade domain information in fsck
> ------------------------------------------
>
> Key: HDFS-9016
> URL: https://issues.apache.org/jira/browse/HDFS-9016
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ming Ma
> Assignee: Ming Ma
> Attachments: HDFS-9016.patch
>
>
> This will make it easy for people to use fsck to check block placement when
> upgrade domain is enabled.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)