Agents and Collectors do not respond to HUP signals
---------------------------------------------------

                 Key: CHUKWA-631
                 URL: https://issues.apache.org/jira/browse/CHUKWA-631
             Project: Chukwa
          Issue Type: Bug
          Components: data collection
    Affects Versions: 0.4.0
         Environment: Red Hat Enterprise Linux Server release 5.4, 64 bit

{noformat}
$ java -version
java version "1.6.0_23"
Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode)
{noformat}

            Reporter: Noel Duffy


On RHEL 5.4, agents and collectors do not respond to HUP signals sent to the 
process PID.

I start the agent and then send it a HUP signal as follows:

{noformat}
$ pwd
/usr/local/chukwa/bin

$ ./start-agents.sh

$ pgrep -lf Agent
23148 /usr/java/latest/bin/java 
-Djava.library.path=/usr/lib/hadoop/lib/native/Linux-X86-64 [...] 
org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent

$ cat /var/run/Agent.pid 
23148

$ kill -hup 23148

$ ps -fp 23148
UID        PID  PPID  C STIME TTY          TIME CMD
root     23148     1  0 Jan23 pts/0    00:00:19 /usr/java/latest/bin/java 
-Djava.library.path=/usr/lib/hadoop/lib/native/

$ pgrep -lf Agent
23148 /usr/java/latest/bin/java 
-Djava.library.path=/usr/lib/hadoop/lib/native/Linux-X86-64  [...] : 
org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent
{noformat}

The agent is still running, and /var/log/agent.log shows no indication that the 
signal was seen. 
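
One check that may help narrow this down, offered as a sketch rather than a confirmed diagnosis: on Linux, /proc/<pid>/status contains a SigIgn line, a hexadecimal mask of ignored signals in which the lowest bit corresponds to SIGHUP (signal 1). If that bit is set, for instance because the process was launched under nohup (an assumption; the start scripts were not inspected for this), the signal would be discarded silently, which would be consistent with the behaviour above. The helper name sighup_ignored is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: given a SigIgn hex mask as printed in
# /proc/<pid>/status, report whether the SIGHUP bit is set.
# SIGHUP is signal 1, so it maps to the lowest bit of the mask,
# which means only the last hex digit needs to be inspected.
sighup_ignored() {
    mask=$1
    # Strip everything except the final character of the mask.
    last=${mask#"${mask%?}"}
    case $last in
        1|3|5|7|9|b|d|f|B|D|F) echo yes ;;   # odd digit: bit 0 is set
        *) echo no ;;
    esac
}

# Usage against a live process (PID taken from the transcript above):
#   sighup_ignored "$(awk '/^SigIgn:/ {print $2}' /proc/23148/status)"
```

A "yes" here would only show that the signal is being ignored at the process level; it would not by itself say where that disposition was set.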

The agent is running one adaptor:

{noformat}
$ echo "list" | nc localhost 9093
adaptor_b5382452cfdc2ae81bd45d8126804843)  
org.apache.hadoop.chukwa.datacollection.adaptor.ExecAdaptor Iostat 60 
/usr/bin/iostat -x -k 55 2 11215696
{noformat}

We can see this child process running too:

{noformat}
$ ps -f --ppid 23148
UID        PID  PPID  C STIME TTY          TIME CMD
root      7116 23148  0 12:40 pts/0    00:00:00 /usr/bin/iostat -x -k 55 2
{noformat}

In the case of the collector, I start it like so:

{noformat}
$ ./start-collectors.sh 
localhost: starting collector, logging to 
/var/log/chukwa-chukwa-collector-<hostname>.out
localhost: 2012-01-24 04:24:11.04::INFO:  Logging to STDERR via 
org.mortbay.log.StdErrLog
localhost: 2012-01-24 04:24:11.104::INFO:  jetty-6.1.11

$ pgrep -lf collector
10220 /usr/java/latest/bin/java 
-Djava.library.path=/usr/lib/hadoop/lib/native/Linux-X86-64 
-DCHUKWA_HOME=/usr/local/chukwa/bin/.. [...] : 
org.apache.hadoop.chukwa.datacollection.collector.CollectorStub

$ kill -HUP 10220

$ pgrep -lf collector
10220 /usr/java/latest/bin/java 
-Djava.library.path=/usr/lib/hadoop/lib/native/Linux-X86-64 
-DCHUKWA_HOME=/usr/local/chukwa/bin/ [...]: 
org.apache.hadoop.chukwa.datacollection.collector.CollectorStub
{noformat}

The logs on the collector show no indication that any signal was received.
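
The manual steps used for both the agent and the collector (send the signal, then look for the process again) can be wrapped in a small script for repeated testing. This is a minimal sketch; the function name survives_signal and the default two-second wait are assumptions made for illustration, not part of Chukwa.

```shell
#!/bin/sh
# Hypothetical helper: send a signal to a PID, wait briefly, and report
# whether the process is still present.
#   $1 = PID, $2 = signal name (e.g. HUP), $3 = seconds to wait (default 2)
survives_signal() {
    pid=$1
    sig=$2
    wait_secs=${3:-2}

    # If the signal cannot be delivered, the process is gone or not ours.
    if ! kill -s "$sig" "$pid" 2>/dev/null; then
        echo "not-running"
        return
    fi

    sleep "$wait_secs"

    # kill -0 delivers nothing; it only tests that the PID still exists.
    # Note: an unreaped child (zombie) of the calling shell also passes
    # this test, so this is most reliable for daemons whose parent is init.
    if kill -0 "$pid" 2>/dev/null; then
        echo "survived"
    else
        echo "exited"
    fi
}

# Usage with the PIDs from the transcripts above:
#   survives_signal 23148 HUP
#   survives_signal 10220 HUP
```

With the behaviour reported here, both calls would be expected to print "survived".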


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
