[ https://issues.apache.org/jira/browse/HDFS-13501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457763#comment-16457763 ]
Allen Wittenauer commented on HDFS-13501:
-----------------------------------------
Some important background:
One of my key goals with the rewrite was to reduce the amount of stuff that
printed to the screen. With a few exceptions, output broke down into three
buckets:
* stdout: vitally important information that the user either requested or can't
act on but needs to know
* stderr: vitally important information that requires the user to take an
action
* --debug: non-vital information that is only interesting when debugging
As a result, there are lots of places where branch-2 has output that 3.x+ does
not. There's not a whole lot in the bash code where 'stdout' is appropriate.
On the flip side, there is a lot more 'stderr' output because of significantly
better error handling.
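As a rough illustration of the three buckets, here is a minimal sketch. The
function names and the DEBUG_ENABLED variable are hypothetical, chosen for this
example only; this is not the actual Hadoop shell code.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the three output buckets.
# DEBUG_ENABLED stands in for whatever a --debug flag would set.
DEBUG_ENABLED="${DEBUG_ENABLED:-false}"

# stdout: information the user asked for, or needs to know but can't act on.
say_info() {
  echo "$*"
}

# stderr: information the user must act on.
say_error() {
  echo "ERROR: $*" 1>&2
}

# --debug: non-vital detail, printed to stderr only when debugging is enabled.
say_debug() {
  if [[ "${DEBUG_ENABLED}" == "true" ]]; then
    echo "DEBUG: $*" 1>&2
  fi
}
```

With helpers like these, a caller that pipes stdout elsewhere still sees
errors on the terminal, and debug chatter stays hidden unless requested.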
That said...
The problem of the missing pid file is one of the things that caused me the most
problems. It's an error from a logical program sense, but what is the user
action? If the daemon is still running, but the pid file is missing, then
something likely catastrophic happened, including a very screwed up directory
structure/config or multiple invocations of the --daemon flag. Both of those
are things that are really beyond the bash code to fix. Then there is the
opposite situation:
{code}
$ hdfs --daemon stop namenode
$ hdfs --daemon stop namenode
{code}
The daemon isn't running, and so the pid file should be gone. Is that an error
worth disturbing the user? Also, how common is that? (Morgan Freeman voice:
It is very common.) Then there is the old ops habit of running ps even after
issuing stop commands because no one trusts the system...
By comparison, branch-2 does
{code}
echo no $command to stop
{code}
... which is mostly useless but does confirm the thinking that a missing pid
file is primarily interpreted as "daemon is already down; no action required."
OK, fine. All of that was a bit of a dead end. So then I thought about it from
"what is the pid file anyway?". Ultimately it's a file system lock for the
bash code. Nothing else that ships with Hadoop cares about it. And with the
introduction of '--daemon status,' there isn't much of a reason for anything
else to be looking at them either. That mostly makes them private.
In the end, I opted to not print a message at all because I couldn't answer the
"action" question. There isn't anything for a user to do when the pid file is
missing.
FWIW: this also highlights the problem of what to do with the exit status.
IIRC, it currently exits with 0 when the pid file isn't found because, again, it
is assumed that the daemon was stopped successfully, the same as branch-2. In
one sense that feels wrong, but I felt it was better to stay compatible in this
instance.
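The behavior described above can be sketched roughly as follows. This is a
simplified illustration under stated assumptions, not the real implementation:
stop_daemon is a hypothetical function name, and the sketch does not wait for
the process to exit or escalate to a stronger signal.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of "stop" treating the pid file as a private lock.
# Usage: stop_daemon /path/to/daemon.pid
stop_daemon() {
  local pidfile="$1"
  local pid

  # Missing pid file: assume the daemon is already down. Print nothing,
  # because there is no action for the user to take, and return 0.
  if [[ ! -f "${pidfile}" ]]; then
    return 0
  fi

  pid="$(cat "${pidfile}")"

  # Only send the signal if the process still exists.
  if kill -0 "${pid}" 2>/dev/null; then
    kill "${pid}" 2>/dev/null
  fi

  # Release the "lock" so a second stop is a silent no-op.
  rm -f "${pidfile}"
  return 0
}
```

Calling stop_daemon twice in a row on the same pid file returns 0 both times
and prints nothing the second time, which mirrors the double "stop" case.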
> Secure Datanode stop/start from cli does not throw a valid error if
> HDFS_DATANODE_SECURE_USER is not set
> --------------------------------------------------------------------------------------------------------
>
> Key: HDFS-13501
> URL: https://issues.apache.org/jira/browse/HDFS-13501
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Ajay Kumar
> Assignee: Ajay Kumar
> Priority: Major
>
> Secure Datanode start/stop from cli does not throw a valid error if
> HADOOP_SECURE_DN_USER/HDFS_DATANODE_SECURE_USER is not set. If
> HDFS_DATANODE_SECURE_USER and JSVC_HOME are not set, start/stop is expected to
> fail (when privileged ports are used), but it should show a valid message.