[ https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535517#comment-13535517 ]
Gregory Chanan commented on HBASE-7386:
---------------------------------------

I'm going to first tackle the Master case, HBASE-5926.

The proposal is to rip out the script modification of HBASE-5926 and replace it with supervisor.d support. The Java code from HBASE-5926 would of course stay; only the restart/cleanup code in the scripts would go.

The goal is to solve the above-listed issues:
#2 is solved by running via supervisor (jps reports only one HMaster process when HMaster is started via supervisor.d).
#3 is self-evidently solved by running on a real supervisor.
#5 is clearly not relevant, as it relates only to the RS.

So, we are left with #1, #4, and #6.
#1 and #6 are solved because the previous script behavior is restored when not running under supervisor, and supervisor can redirect stdout/stderr when it is used.
#4 is solved because the old script behavior is restored when not running under supervisor, and I don't think it makes any sense to run a standalone HBase via supervisor.

Example patch coming soon.

> Investigate providing some supervisor support for znode deletion
> ----------------------------------------------------------------
>
>                 Key: HBASE-7386
>                 URL: https://issues.apache.org/jira/browse/HBASE-7386
>             Project: HBase
>          Issue Type: Task
>          Components: master, regionserver, scripts
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>             Fix For: 0.96.0
>
>
> There are a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) Two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 6) Hides stdout/stderr[6]
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
> I suspect running it via something like supervisor.d can solve these issues
> if we provide the right support.
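
For readers unfamiliar with the pattern under discussion, here is a rough sketch of the "wait for the process to exit, then clean up" behavior that the description attributes to the startup scripts. It is not the HBase script code and not the proposed patch. The function name `run_then_cleanup` and the `cleanup` callback are hypothetical, and the actual znode deletion is left as a placeholder supplied by the caller.

```python
import subprocess
from typing import Callable, Sequence


def run_then_cleanup(command: Sequence[str], cleanup: Callable[[int], None]) -> int:
    """Run `command` in the foreground, then invoke `cleanup` with its exit code.

    Hypothetical helper for illustration only. `cleanup` stands in for
    whatever would delete the process's znode; it is an assumption, not
    an HBase API.
    """
    # subprocess.run blocks until the child exits. The child inherits this
    # process's stdout/stderr, so its output is not hidden by the wrapper.
    completed = subprocess.run(command)
    # On POSIX, a child killed by a signal reports a negative return code
    # (for example -9 after `kill -9`).
    cleanup(completed.returncode)
    return completed.returncode
```

A wrapper like this stays alive as a second process next to the daemon, which matches issue 2) in the description; delegating the waiting to an external supervisor is what the comment proposes instead.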