stack commented on HBASE-5844:

Looking at this w/ j-d: we no longer use nohup, so the parent process can 
stick around to watch for a server crash. As a result, there are now 
two hbase processes listed per launched daemon. This is kinda ugly.

When we have this bash script watching the running java process, we verge into 
the territory normally occupied by babysitters like supervise. Our parent 
bash script will always be less than a real babysitter -- supervise, god, etc. 
-- so maybe we should just ship the znode removal as an optional script w/ a 
prescription for how to set it up -- e.g. run the znode remover on daemon crash 
before starting a new one (if we want supervise to start a new one).
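
As a rough sketch of that optional setup, assuming a hypothetical foreground
daemon command and a hypothetical znode-removal command (neither is an actual
HBase script name), a wrapper run under the babysitter could look like the
following. It treats any non-zero exit as a crash and runs the cleanup before
returning, so the babysitter can then start a fresh daemon.

```shell
#!/usr/bin/env bash
# Sketch only: run a daemon in the foreground and, if it exits abnormally,
# run a cleanup command before the babysitter restarts it.
# Both arguments are hypothetical placeholders, not real HBase commands.
# They are word-split on whitespace, so pass simple commands only.

run_with_cleanup() {
  local daemon_cmd="$1"   # command that runs the server in the foreground
  local cleanup_cmd="$2"  # command that removes the stale znode
  local status=0

  # Run the daemon and capture its exit status without aborting the script.
  $daemon_cmd || status=$?

  # A non-zero status is treated as a crash; a clean stop needs no cleanup.
  if [ "$status" -ne 0 ]; then
    $cleanup_cmd
  fi

  # Propagate the daemon's status so the babysitter can decide what to do.
  return "$status"
}
```

Under supervise, the run script would call run_with_cleanup with the two
site-specific commands; the function name and both arguments are placeholders
for illustration.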

I'm thinking we should back this out since there are still open questions.

> Delete the region servers znode after a regions server crash
> ------------------------------------------------------------
>                 Key: HBASE-5844
>                 URL: https://issues.apache.org/jira/browse/HBASE-5844
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver, scripts
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>             Fix For: 0.96.0
>         Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
> 5844.v3.patch, 5844.v4.patch
> Today, if the region server crashes, its znode is not deleted in ZooKeeper. 
> So the recovery process will start only after a timeout, usually 30s.
> By deleting the znode in the start script, we remove this delay and the 
> recovery starts immediately.

