[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

stack (JIRA) Tue, 18 Dec 2012 23:31:17 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535744#comment-13535744
 ]


stack commented on HBASE-7386:
------------------------------

I'd say just repurpose 'autorestart', especially if it is now broken.  What was 
there previously was mickey mouse.  This is the real deal.

bq. ... and does not respect any other environment settings (e.g. 
HBASE_CONF_DIR).

Would this be fixed if we "...through all the config files in hbase-daemon and 
do something appropriate."?

On questions:

1. Yes, this is a valid direction.  Perhaps we could extract the stuff you 
hacked out into a 'wrapper' script, a poor man's supervise, so that it was there 
as an option: you could run it if you wanted poor man's supervise, but otherwise 
the scripts would run as they used to.  But this would likely be wasted effort; 
the effort would be better spent making it so that, optionally, if supervisord 
was installed, you could just run with it.
2. I agree with nkeywal that templates/samples inevitably rot.  Unused software 
also rots, so if we provide supervisord scripts and they are not used, they will 
go bad.  How much work is involved in making it so we could do 
./bin/start-supervisord-hbase.sh?  Would be coolio if you could do 
./bin/start-hbase.sh and ./bin/start-supervisord-hbase.sh if supervisor is 
available (likely on most systems I'd say), and then in the doc we encourage 
folks to do the latter.
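
For illustration only, the 'wrapper' idea in item 1 could look roughly like the 
sketch below.  It is written in Python rather than as the actual HBase shell 
scripts, and the names run_supervised and on_exit are hypothetical; on_exit 
stands in for whatever cleanup (for example, the znode deletion) a real wrapper 
would perform.

```python
import subprocess
import sys
from typing import Callable, Sequence


def run_supervised(cmd: Sequence[str], on_exit: Callable[[int], None]) -> int:
    """Run cmd in the foreground, wait for it to exit, then run a cleanup hook.

    Hypothetical helper for illustration; not part of HBase.
    """
    # The child inherits this process's stdout/stderr, so its output stays visible.
    returncode = subprocess.call(list(cmd))
    # Cleanup runs only after the child has exited, whatever its exit status.
    on_exit(returncode)
    return returncode


if __name__ == "__main__":
    # Stand-in child process that exits with status 3.
    child = [sys.executable, "-c", "import sys; sys.exit(3)"]
    run_supervised(child, lambda rc: print(f"child exited with {rc}; cleanup here"))
```

A real supervisor does much more than this (restarts, backoff, logging), which 
is the argument above for spending the effort on supervisord instead.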

What to do for the case where a shop has chosen something other than supervisord 
to monitor their processes?  I suppose we could let them do the conversion from 
'supervise' to 'god', etc.?

This is great stuff G.





                
> Investigate providing some supervisor support for znode deletion
> ----------------------------------------------------------------
>
>                 Key: HBASE-7386
>                 URL: https://issues.apache.org/jira/browse/HBASE-7386
>             Project: HBase
>          Issue Type: Task
>          Components: master, regionserver, scripts
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>             Fix For: 0.96.0
>
>         Attachments: HBASE-7386-v0.patch, supervisordconfigs-v0.patch
>
>
> There are a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 6) Hides stdout/stderr
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
> I suspect running it via something like supervisord can solve these issues 
> if we provide the right support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
