[
https://issues.apache.org/jira/browse/DAEMON-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17717125#comment-17717125
]
Mark Thomas commented on DAEMON-459:
------------------------------------
I'm testing this with Tomcat and I am unable to reproduce the problem described.
`kill -HUP <child-pid>` works as expected for repeated invocations. Note that
if you request a restart within 60s of the last start, there will be a delay of
60s between the service stopping and it restarting (to prevent looping).
Please provide the steps to reproduce this issue using the current source code.
Running with -debug is useful to see exactly what is going on during a restart.
> Restart only works once (regression)
> ------------------------------------
>
> Key: DAEMON-459
> URL: https://issues.apache.org/jira/browse/DAEMON-459
> Project: Commons Daemon
> Issue Type: Bug
> Components: Jsvc
> Affects Versions: 1.3.3
> Reporter: Klaus Malorny
> Priority: Major
>
> For certain functions, especially code updates, we rely on the ability to
> restart the child process. This seems to work only once. On the subsequent
> attempt, the child process hangs.
> I tracked down the problem and found that it lies within the
> {{jsvc-unix.c}} file. The {{main_reload}} function is called to send the
> signal to itself, but this does not happen. In the first restart, the
> {{controlled}} variable holds the value of 0. This works by chance, as the
> signal is sent to the parent, which sends it back to the child. In the second
> attempt, the variable holds the PID of the previous child, thus the signal is
> sent to a no longer existing process.
> The {{controlled}} variable is used both by the parent and the child process.
> In earlier versions of the file, the child process determined its own PID by
> using the {{getpid}} system function. This call has been – likely
> accidentally – removed in version 1.3.3 or earlier. Thus, the variable
> contains the parent's value before the fork which has created the child.
> The solution is simple: in the function {{child}}, add
> {{controlled = getpid();}}
> between the {{sigaction}} calls and the {{log_debug("Waiting for a signal to
> be delivered")}} call (line 913 in my copy of the file), i.e.
>     ...
>     memset(&act, '\0', sizeof(act));
>     act.sa_handler = handler;
>     sigemptyset(&act.sa_mask);
>     act.sa_flags = SA_RESTART | SA_NOCLDSTOP;
>     sigaction(SIGHUP, &act, NULL);
>     sigaction(SIGUSR1, &act, NULL);
>     sigaction(SIGUSR2, &act, NULL);
>     sigaction(SIGTERM, &act, NULL);
>     sigaction(SIGINT, &act, NULL);
>     controlled = getpid();    /* proposed addition */
>     log_debug("Waiting for a signal to be delivered");
>     create_tmp_file(args);
>     while (!stopping) {
>     ...
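
For anyone trying to reproduce the mechanism described in the report, the
following is a minimal standalone sketch, not the jsvc source. It assumes
{{controlled}} can be modelled as a plain file-scope pid_t; the function
child_targets_itself and its parameters inherited and refresh are
hypothetical names introduced only for this illustration. It shows that a
forked child inherits whatever value the parent stored before fork(), and
that the value only matches the child's own PID once the child assigns
getpid() itself.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Simplified stand-in for the file-scope `controlled` variable. */
static pid_t controlled = 0;

/*
 * Fork a child and report whether `controlled` equals the child's own PID.
 *
 * inherited: value the parent holds in `controlled` before fork()
 *            (0 on a first start, an older child's PID later on).
 * refresh:   if nonzero, the child runs `controlled = getpid();` first,
 *            which is the fix proposed in the report.
 *
 * Returns 1 if the child would signal itself, 0 if the value is stale,
 * -1 on error.
 */
static int child_targets_itself(pid_t inherited, int refresh)
{
    int status;
    pid_t pid;

    controlled = inherited;
    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: it received a copy of the parent's variable. */
        if (refresh)
            controlled = getpid();
        /* Report the comparison through the exit status. */
        _exit(controlled == getpid() ? 0 : 1);
    }

    /* Parent: collect the child's verdict. */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Under these assumptions, child_targets_itself(0, 0) and
child_targets_itself(getpid(), 0) both return 0, while either call with
refresh set to 1 returns 1. Note that kill(0, sig) signals the caller's
whole process group, which would be consistent with the report's
observation that the first restart works "by chance".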