Lately I have been seeing a strange problem with zombie worker processes.
When I restart a topology, for example, the worker process on one machine
does not start.
I found that a process which should have died is still alive, even though
worker port 6700 is down and nothing is coming up in the corresponding worker
log file. Meanwhile, the supervisor log keeps reporting that a new worker has
not been started.
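
One way to confirm the port state on the affected machine is a small TCP check like the sketch below. It assumes Python is available on that host and that the worker listens on localhost; the helper name is_port_open is made up for illustration and is not part of Storm.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise,
        # so a refused or timed-out connection is reported as closed.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 6700 is the worker port from this report; pass any other port to
    # check it the same way.
    port = 6700
    state = "open" if is_port_open("127.0.0.1", port) else "closed"
    print(f"port {port} is {state}")
```

This only tells whether something accepts connections on the port. It does not identify which process holds it.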
One thing to note is that in storm.yaml I configured a JMX port to be opened
when the process is up, using the options below. It turns out that port 16700
stays open even after the topology dies. This does not always happen, and I
suspect there are other related issues that I am not sure about at the moment.
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.local.only=false
-Dcom.sun.management.jmxremote.port=1%ID%
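
To spell out how the last option leads to port 16700: assuming %ID% is replaced with the worker's port (which matches what I see for the worker on port 6700), the substitution works like this sketch. The function name expand_childopts is hypothetical and only illustrates the string replacement; it is not Storm's own code.

```python
def expand_childopts(opts: str, worker_port: int) -> str:
    """Replace every %ID% placeholder with the worker port (illustrative only)."""
    return opts.replace("%ID%", str(worker_port))


jmx_opt = "-Dcom.sun.management.jmxremote.port=1%ID%"
expanded = expand_childopts(jmx_opt, 6700)
print(expanded)  # -Dcom.sun.management.jmxremote.port=16700
```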
Could you suggest anything that would help me investigate this further?
I am using Storm 0.9.5.
Regards,
Adrian Seungjin Lee