[ 
https://issues.apache.org/jira/browse/BIGTOP-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246741#comment-14246741
 ] 

Konstantin Boudnik commented on BIGTOP-1573:
--------------------------------------------

Will this change work with the current init.d bootstrap sequence?

> rpm init scripts do not wait for network
> ----------------------------------------
>
>                 Key: BIGTOP-1573
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1573
>             Project: Bigtop
>          Issue Type: Bug
>          Components: rpm
>    Affects Versions: 0.8.0
>         Environment: CentOS 7
>            Reporter: Alexander van der Meij
>              Labels: build
>
> I have used Bigtop to generate a set of RPMs for the purpose of deploying 
> multi-node Hadoop clusters. All the components work well, save for one 
> network issue. 
> It seems that the Hadoop daemons, when started at boot through their init 
> scripts, do not wait for network initialisation to complete before they 
> themselves are processed. As a result, when I reboot for example one of my 
> datanodes, the hadoop-hdfs-datanode process is started using 
> "localhost.localdomain" as its hostname - and it also advertises itself as 
> such to the ResourceManager, leading to all sorts of connectivity problems in 
> a multi-node environment.
> I first noticed this problem when, after a reboot, I saw log files being 
> created of the form /var/log/hadoop-hdfs-datanode-localhost.localdomain.log. 
> When I restart the hdfs-datanode process using the same init scripts, 
> the correct /var/log/hadoop-hdfs-datanode-{fqdn}.log is created. 
> I believe the problem is caused by the introduction of systemd in CentOS 7; 
> init scripts are run in parallel and there are no constraints present in the 
> Hadoop init scripts that instruct them to wait until network initialisation is 
> complete. 
> Now for the good news: adding $network to the Required-Start/Stop list for 
> all Hadoop daemons solves the issue for me:
> /etc/init.d/hadoop-hdfs-datanode:
> # Required-Start:    $syslog $remote_fs $network
> # Required-Stop:     $syslog $remote_fs $network
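
To see which init scripts still lack the dependency described above, something like the following could be used. This is only a sketch and not part of the original report: the function name missing_network_dep, the default directory /etc/init.d and the hadoop-* file name pattern are all assumptions, and it only inspects the Required-Start header.

```shell
# Hypothetical helper: print the names of init scripts whose LSB
# "Required-Start" header does not mention $network.
# Assumptions: scripts live in the given directory (default /etc/init.d)
# and are named hadoop-*.
missing_network_dep() {
    dir="${1:-/etc/init.d}"
    for script in "$dir"/hadoop-*; do
        # Skip the unexpanded glob and anything that is not a regular file.
        [ -f "$script" ] || continue
        # \$ matches a literal dollar sign in the header line.
        if ! grep -Eq '^#[[:space:]]*Required-Start:.*\$network' "$script"; then
            basename "$script"
        fi
    done
}
```

Running missing_network_dep /etc/init.d prints one script name per line for every script that would still need $network added, and prints nothing once all of them declare it.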



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
