[ https://issues.apache.org/jira/browse/HADOOP-12105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612873#comment-14612873 ]
Allen Wittenauer commented on HADOOP-12105:
-------------------------------------------
(I'm going to assume we're talking about trunk. branch-2's shell code is so
screwed up that it isn't worth discussing; e.g., there is no guarantee that
you'll even get pid files...)
bq. Can we check status of process as: if ps -fp pid | grep
process_classname > /dev/null 2>&1; then
Nope. ps is one of the *worst* commands to work with because it is specifically
not portable between SVID and BSD. 'ps -p' is known to be portable to both.
ps -f, being SVID, will not work on many BSDs. To add insult to injury, grep
here forks another executable when a bash regex match would be significantly
faster.
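For illustration only, here is a minimal sketch of what a check without grep could look like. It is not the code in trunk: the function names and the pattern argument are hypothetical, and it assumes the local ps accepts '-o args=' (specified by POSIX, but given how badly ps ports, verify it on every platform you care about).

```shell
#!/usr/bin/env bash
# Sketch only: function names and arguments are hypothetical,
# not taken from Hadoop's shell code.

# Return 0 if the command line string matches the extended regex.
# Uses bash's built-in =~ operator, so no grep process is forked.
cmdline_matches() {
  local cmdline=$1
  local pattern=$2
  [[ -n "${cmdline}" && "${cmdline}" =~ ${pattern} ]]
}

# Return 0 only if ${pid} is alive AND its command line matches ${pattern}.
# ASSUMPTION: the local ps supports "-p" together with "-o args=".
pid_is_process() {
  local pid=$1
  local pattern=$2
  local cmdline
  # ps exits non-zero when no process with that pid exists.
  cmdline=$(ps -p "${pid}" -o args= 2>/dev/null) || return 1
  cmdline_matches "${cmdline}" "${pattern}"
}
```

A caller would pass the pid read from the pid file plus some string expected in the daemon's command line, e.g. pid_is_process "${pid}" 'NameNode'. Even this small version shows the problem: the whole thing hinges on the output format of ps being the same everywhere.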
Do I think this is worth fixing? No, not really. I had the chance when I
rewrote this code, but with ps being non-portable, one ends up doing a ton of
gymnastics to make it work. Especially considering, as [~walter.k.su] pointed
out:
bq. It's very rare(because of PIDs wrap around)
... and even on those occasions when it does happen, the vast majority of
operations staff know how to handle it because we certainly aren't the only
ones that suffer from this issue.
> Avoid returning 0 , while fetching the status of a process ,which is not
> running.
> ---------------------------------------------------------------------------------
>
> Key: HADOOP-12105
> URL: https://issues.apache.org/jira/browse/HADOOP-12105
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: J.Andreina
> Assignee: J.Andreina
>
> If the process (NameNode) crashes, a stale pid file is left behind.
> Any other JVM process might then be allocated the same pid as the one
> recorded in the stale pid file.
> The current implementation for fetching the status checks whether any
> process is running with that pid (read from the corresponding pid file)
> and, if so, returns 0:
> {code}
> if ps -p "${pid}" > /dev/null 2>&1; then
> return 0
> fi
> {code}
> *So when fetching the status of the namenode, the return code will be 0 even
> if the namenode process is not running (because some other process has been
> assigned the same pid).*
> Can we check the status of the process as below?
> {code}
> if ps -fp pid | grep process_classname > /dev/null 2>&1; then
> return 0
> fi
> {code}
> Please provide your feedback.