Andrew Wang updated HADOOP-13632:
    Attachment: HADOOP-13632.001.patch

Here's a patch which moves us over to {{hadoop_status_daemon}}. Tested manually 
with an empty config that causes the NN to abort quickly. I left out the error 
message, but I can add it if you think it doesn't hurt.

The timing window is quite narrow, though. If I instead use a valid config but 
an unformatted namedir, so that the NN dies later during initialization, it doesn't 

Since this is a pretty common error, we could try and catch this by extending 
the timer loop. I remember talking to a Cloudera Manager engineer who maintains 
a similar startup script, and CM waits for longer than 5s (I think 30s?) to 
confirm that the process is still alive.
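
A minimal sketch of what an extended timer loop could look like, under stated assumptions. {{wait_for_liveness}} is a hypothetical name, not a function in the patch or in hadoop-functions.sh, and the 30s default only mirrors the unconfirmed figure recalled above for CM. It polls the pid once per second and reports failure as soon as the process disappears.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of hadoop-functions.sh or the attached patch).
# Polls a pid once per second for up to <timeout> seconds.
# Returns 0 if the process is still alive when the window ends,
# 1 if it exited at any point before that.
wait_for_liveness() {
  local pid=$1
  local timeout=${2:-30}   # assumed default; the 30s value is not confirmed
  local elapsed=0

  while [ "${elapsed}" -lt "${timeout}" ]; do
    # kill -0 delivers no signal; it only tests whether the pid still exists
    if ! kill -0 "${pid}" 2>/dev/null; then
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done

  # Final check at the end of the window
  kill -0 "${pid}" 2>/dev/null
}
```

A caller could run this right after forking the daemon and only attempt the renice when it returns 0, printing a "could not be started" style error otherwise. The trade-off is that a successful start now blocks for the full window before the script returns.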


> Daemonization does not check process liveness before renicing
> -------------------------------------------------------------
>                 Key: HADOOP-13632
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13632
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: scripts
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Andrew Wang
>         Attachments: HADOOP-13632.001.patch
> If you try to daemonize a process that is incorrectly configured, it will die 
> quite quickly. However, the daemonization function will still try to renice 
> it even if it's down, leading to something like this for my namenode:
> {noformat}
> -> % bin/hdfs --daemon start namenode
> ERROR: Cannot set priority of namenode process 12036
> {noformat}
> It'd be more user-friendly if, instead of this renice error, we said that 
> the process couldn't be started.
