Attila Kanto created KNOX-864:
---------------------------------

             Summary: Knox init scripts are not Upstart compatible
                 Key: KNOX-864
                 URL: https://issues.apache.org/jira/browse/KNOX-864
             Project: Apache Knox
          Issue Type: Improvement
          Components: Server
    Affects Versions: 0.11.0
            Reporter: Attila Kanto


It is critical that we have a service that can auto-restart after crashes and 
reboots. On Amazon Linux these tasks are handled by Upstart.

By default Upstart tracks the life cycle of the first PID that it executes 
in the exec or script stanzas (defined in the Upstart config file). However, 
most Unix services "daemonize", meaning that they create a new 
process (using fork(2)) which is a child of the initial process. This is also 
what happens when gateway.sh or ldap.sh is invoked.
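To illustrate the mismatch described above, here is a minimal sketch (not taken from gateway.sh; the function name show_pids is made up for this example). A launcher shell starts a background child and then exits, so the PID that a supervisor saw first is not the PID of the process that keeps running:

```shell
# Hypothetical helper: prints the PID of the launcher shell and the PID of
# the background child it leaves behind.
show_pids() {
    # $$ is the PID of the "sh -c" launcher, $! is the PID of the child
    # started with "&". The child's output is redirected so that callers
    # capturing stdout do not have to wait for it to finish.
    sh -c 'sleep 1 >/dev/null 2>&1 & echo "launcher=$$ daemon=$!"'
}

show_pids
```

The two numbers printed always differ. A supervisor that only remembers the launcher PID ends up watching a process that has already exited.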

In order to track the right PID, Upstart must determine the final process ID 
for a job, and in the case of daemonized processes it needs to know how many 
times that process will call fork(2).

Upstart supports the following:
* *expect fork*: Upstart will expect the process executed to call fork(2) 
exactly once.
* *expect daemon*: Upstart will expect the process executed to call fork(2) 
exactly twice.
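For reference, a rough sketch of where the expect stanza sits in a job file. Everything here is hypothetical: the function name write_example_job, the job name exampled and the binary /usr/sbin/exampled are placeholders for a daemon that forks exactly once, not anything shipped with Knox:

```shell
# Hypothetical helper: writes an example Upstart job file into the
# directory passed as $1 (on a real system this would be /etc/init).
write_example_job() {
    cat > "$1/exampled.conf" <<'EOF'
description "hypothetical daemon that forks exactly once"
start on runlevel [2345]
stop on runlevel [016]
respawn
expect fork
exec /usr/sbin/exampled
EOF
}

# Example usage (assumed target directory):
#   write_example_job /etc/init
```

For a daemon that forks twice, "expect fork" would be replaced by "expect daemon".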

Unfortunately neither of these fits gateway.sh or ldap.sh: they call fork 
many times, so Upstart always tracks the wrong PID.

According to the Upstart cookbook 
(http://upstart.ubuntu.com/cookbook/#how-to-establish-fork-count), if the 
application you are attempting to create a Job Configuration File for does not 
document how many times it forks, you can run it with a tool such as strace(1) 
which will allow you to count the number of forks:

{code}
[root@ip-10-0-4-107 ~]# strace -o /tmp/strace.log -fFv su -c "/usr/hdp/current/knox-server/bin/gateway.sh start" knox
Starting Gateway succeeded with PID 25528.
[root@ip-10-0-4-107 ~]# sudo egrep "\<(fork|clone)\>\(" /tmp/strace.log | wc | awk '{print $1}'
86
{code}
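The counting step can also be wrapped in a small helper. This is only a sketch: the function name count_forks is made up, and it assumes a log written by strace -f -o, where each line starts with a PID followed by the system call:

```shell
# Hypothetical helper: counts fork(2)/clone(2) calls in an strace log.
# $1 is the path to the log file.
count_forks() {
    # Match "fork(" or "clone(" as whole words. grep -c prints the number
    # of matching lines, so no wc/awk post-processing is needed.
    grep -cE '(^|[^[:alnum:]_])(fork|clone)\(' "$1"
}

# Example usage (path taken from the session above):
#   count_forks /tmp/strace.log
```

"<... clone resumed>" lines are not matched because they contain no opening parenthesis after the call name, so a call that strace splits across two lines is counted once.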

Ambari had similar issues in the past: 
https://issues.apache.org/jira/browse/AMBARI-14842



