[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084730#comment-16084730
 ] 

Samir Ahmic commented on HBASE-7386:
------------------------------------

[~stack] I have done some testing with the latest patches against the master 
branch, and the good news is that most of the code (with small changes) and 
functionality works fine. So the original idea, improving MTTR by removing 
stale master and RS znodes plus a watchdog that restarts the process after an 
unexpected failure, is still valid.
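
For illustration only, the watchdog idea can be sketched roughly as below. This 
is not the code from the attached patches and not how supervisor itself is 
implemented; the names supervise, on_exit, max_restarts and backoff_seconds 
are hypothetical, and the actual znode deletion is left to the caller-supplied 
hook.

```python
import subprocess
import sys
import time


def supervise(cmd, on_exit, max_restarts=3, backoff_seconds=1.0):
    """Run cmd, call on_exit(returncode) after every exit, restart on failure.

    on_exit is a hypothetical cleanup hook; in the HBase case it is where
    the stale master or RS znode would be removed. Returns the exit codes seen.
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        proc = subprocess.Popen(cmd)
        returncode = proc.wait()  # block until the child process exits
        exit_codes.append(returncode)
        on_exit(returncode)  # cleanup runs before any restart
        if returncode == 0:
            break  # clean shutdown: do not restart
        if attempt < max_restarts:
            time.sleep(backoff_seconds)  # short pause before restarting
    return exit_codes


if __name__ == "__main__":
    # Stand-in child process that always fails with exit code 3.
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    codes = supervise(
        failing,
        lambda rc: print("cleanup hook called, exit code", rc),
        max_restarts=2,
        backoff_seconds=0.1,
    )
    print(codes)
```

A real supervisor would also need signal handling, logging and a restart 
policy, which is part of why reusing an existing one is attractive.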
My original scripts here were written as an optional route for managing 
HBase processes using supervisor, and that approach raises a couple of 
questions that I would like to discuss:
# Amount of code added and options to reduce it (I will try to reduce it to a 
minimum anyway); some code can probably be integrated into the existing 
scripts to avoid copying.
# Where to add the new scripts: a supervisord folder inside the bin dir was 
my original idea, and likewise a supervisord folder in the conf dir for the 
config files.
# Testing: I will cover supervisor 3.3.2 (the latest stable version) and some 
older versions that are installed through system package managers.
# Finally, would it be better to implement our own Java supervisor that does 
something similar to the Python implementation?

Based on what we decide I will continue the work here; if we go with the 
Python supervisor, I can have a patch ready for testing in a couple of days.

> Investigate providing some supervisor support for znode deletion
> ----------------------------------------------------------------
>
>                 Key: HBASE-7386
>                 URL: https://issues.apache.org/jira/browse/HBASE-7386
>             Project: HBase
>          Issue Type: Task
>          Components: master, regionserver, scripts
>            Reporter: Gregory Chanan
>            Assignee: stack
>            Priority: Blocker
>         Attachments: HBASE-7386-bin.patch, HBASE-7386-bin-v2.patch, 
> HBASE-7386-bin-v3.patch, HBASE-7386-conf.patch, HBASE-7386-conf-v2.patch, 
> HBASE-7386-conf-v3.patch, HBASE-7386-src.patch, HBASE-7386-v0.patch, 
> supervisordconfigs-v0.patch
>
>
> There are a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 6) Hides stdout/stderr[6]
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
> I suspect running it via something like supervisor.d can solve these issues 
> if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
