[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2021-06-01 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17354963#comment-17354963
 ] 

Duo Zhang commented on HBASE-7386:
--

Any updates here?

Thanks.

> Investigate providing some supervisor support for znode deletion
> 
>
> Key: HBASE-7386
> URL: https://issues.apache.org/jira/browse/HBASE-7386
> Project: HBase
>  Issue Type: Task
>  Components: master, regionserver, scripts
>Reporter: Gregory Chanan
>Assignee: Michael Stack
>Priority: Blocker
> Fix For: 3.0.0-alpha-1
>
> Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin-v3.patch, 
> HBASE-7386-bin.patch, HBASE-7386-conf-v2.patch, HBASE-7386-conf-v3.patch, 
> HBASE-7386-conf.patch, HBASE-7386-master-00.patch, 
> HBASE-7386-master-01.patch, HBASE-7386-src.patch, HBASE-7386-v0.patch, 
> supervisordconfigs-v0.patch
>
>
> There a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 6) Hides stdout/stderr[6]
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
> I suspect running in via something like supervisor.d can solve these issues 
> if we provide the right support.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2018-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446039#comment-16446039
 ] 

Hadoop QA commented on HBASE-7386:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue}  0m  
3s{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} pylint {color} | {color:green}  0m  
1s{color} | {color:green} There were no new pylint issues. {color} |
| {color:red}-1{color} | {color:red} shellcheck {color} | {color:red}  0m  
1s{color} | {color:red} The patch generated 44 new + 59 unchanged - 17 fixed = 
103 total (was 76) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 8s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  0m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-7386 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12878542/HBASE-7386-master-01.patch
 |
| Optional Tests |  asflicense  shellcheck  shelldocs  pylint  |
| uname | Linux 054cbf364276 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 8219ec7493 |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| shellcheck | v0.4.4 |
| pylint | v1.6.5 |
| shellcheck | 
https://builds.apache.org/job/PreCommit-HBASE-Build/12575/artifact/patchprocess/diff-patch-shellcheck.txt
 |
| Max. process+thread count | 47 (vs. ulimit of 1) |
| modules | C: . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/12575/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Investigate providing some supervisor support for znode deletion
> 
>
> Key: HBASE-7386
> URL: https://issues.apache.org/jira/browse/HBASE-7386
> Project: HBase
>  Issue Type: Task
>  Components: master, regionserver, scripts
>Reporter: Gregory Chanan
>Assignee: stack
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin-v3.patch, 
> HBASE-7386-bin.patch, HBASE-7386-conf-v2.patch, HBASE-7386-conf-v3.patch, 
> HBASE-7386-conf.patch, HBASE-7386-master-00.patch, 
> HBASE-7386-master-01.patch, HBASE-7386-src.patch, HBASE-7386-v0.patch, 
> supervisordconfigs-v0.patch
>
>
> There a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill 

[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2018-04-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446030#comment-16446030
 ] 

Hadoop QA commented on HBASE-7386:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue}  0m  
4s{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} pylint {color} | {color:green}  0m  
1s{color} | {color:green} There were no new pylint issues. {color} |
| {color:red}-1{color} | {color:red} shellcheck {color} | {color:red}  0m  
2s{color} | {color:red} The patch generated 44 new + 59 unchanged - 17 fixed = 
103 total (was 76) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 8s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  0m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-7386 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12878542/HBASE-7386-master-01.patch
 |
| Optional Tests |  asflicense  shellcheck  shelldocs  pylint  |
| uname | Linux fb37c345af50 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 8219ec7493 |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| shellcheck | v0.4.4 |
| pylint | v1.6.5 |
| shellcheck | 
https://builds.apache.org/job/PreCommit-HBASE-Build/12574/artifact/patchprocess/diff-patch-shellcheck.txt
 |
| Max. process+thread count | 48 (vs. ulimit of 1) |
| modules | C: . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/12574/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Investigate providing some supervisor support for znode deletion
> 
>
> Key: HBASE-7386
> URL: https://issues.apache.org/jira/browse/HBASE-7386
> Project: HBase
>  Issue Type: Task
>  Components: master, regionserver, scripts
>Reporter: Gregory Chanan
>Assignee: stack
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin-v3.patch, 
> HBASE-7386-bin.patch, HBASE-7386-conf-v2.patch, HBASE-7386-conf-v3.patch, 
> HBASE-7386-conf.patch, HBASE-7386-master-00.patch, 
> HBASE-7386-master-01.patch, HBASE-7386-src.patch, HBASE-7386-v0.patch, 
> supervisordconfigs-v0.patch
>
>
> There a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill 

[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2017-07-29 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106090#comment-16106090
 ] 

Samir Ahmic commented on HBASE-7386:


[~stack] did you maybe have chance to take scripts on test drive ? Any 
suggestions how we can improve this  chief ? 

> Investigate providing some supervisor support for znode deletion
> 
>
> Key: HBASE-7386
> URL: https://issues.apache.org/jira/browse/HBASE-7386
> Project: HBase
>  Issue Type: Task
>  Components: master, regionserver, scripts
>Reporter: Gregory Chanan
>Assignee: stack
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HBASE-7386-bin.patch, HBASE-7386-bin-v2.patch, 
> HBASE-7386-bin-v3.patch, HBASE-7386-conf.patch, HBASE-7386-conf-v2.patch, 
> HBASE-7386-conf-v3.patch, HBASE-7386-master-00.patch, 
> HBASE-7386-master-01.patch, HBASE-7386-src.patch, HBASE-7386-v0.patch, 
> supervisordconfigs-v0.patch
>
>
> There a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 6) Hides stdout/stderr[6]
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
> I suspect running in via something like supervisor.d can solve these issues 
> if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2017-07-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16097618#comment-16097618
 ] 

Hadoop QA commented on HBASE-7386:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue}  0m  
4s{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} pylint {color} | {color:green}  0m  
3s{color} | {color:green} There were no new pylint issues. {color} |
| {color:red}-1{color} | {color:red} shellcheck {color} | {color:red}  0m  
4s{color} | {color:red} The patch generated 46 new + 480 unchanged - 18 fixed = 
526 total (was 498) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
29m 25s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 20s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:bdc94b1 |
| JIRA Issue | HBASE-7386 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12878542/HBASE-7386-master-01.patch
 |
| Optional Tests |  asflicense  shellcheck  shelldocs  pylint  |
| uname | Linux 24d9842af88e 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / e9d8a7b |
| shellcheck | v0.4.6 |
| pylint | v1.7.2 |
| shellcheck | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7763/artifact/patchprocess/diff-patch-shellcheck.txt
 |
| modules | C: . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7763/console |
| Powered by | Apache Yetus 0.4.0   http://yetus.apache.org |


This message was automatically generated.



> Investigate providing some supervisor support for znode deletion
> 
>
> Key: HBASE-7386
> URL: https://issues.apache.org/jira/browse/HBASE-7386
> Project: HBase
>  Issue Type: Task
>  Components: master, regionserver, scripts
>Reporter: Gregory Chanan
>Assignee: stack
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HBASE-7386-bin.patch, HBASE-7386-bin-v2.patch, 
> HBASE-7386-bin-v3.patch, HBASE-7386-conf.patch, HBASE-7386-conf-v2.patch, 
> HBASE-7386-conf-v3.patch, HBASE-7386-master-00.patch, 
> HBASE-7386-master-01.patch, HBASE-7386-src.patch, HBASE-7386-v0.patch, 
> supervisordconfigs-v0.patch
>
>
> There a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> 

[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2017-07-21 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096776#comment-16096776
 ] 

Samir Ahmic commented on HBASE-7386:


Thanks [~stack].  Why python supervisor? Well we originally started this story 
around it, and after some time testing it, at least for me,  choosing mature 
and well proven process control system instead of writing custom bash scripts 
has multiple advantages. 
To be honest work here extends original issue of just removing stale znodes to 
creating watchdog over hbase processes and making alternative option for 
managing cluster but when we started tackling supervisor approach why not offer 
folks chance to 
less worry when rs process dies because it will be automatically restarted :) 
Also python supervisor has set of very cool futures like, auto-restart, event 
listeners (that may execute arbitrary code based on process state) an so on, 
and folks may start creating  they own  listeners for different proposes.
Btw i will address shellcheck and pylint issues in next patch. 

> Investigate providing some supervisor support for znode deletion
> 
>
> Key: HBASE-7386
> URL: https://issues.apache.org/jira/browse/HBASE-7386
> Project: HBase
>  Issue Type: Task
>  Components: master, regionserver, scripts
>Reporter: Gregory Chanan
>Assignee: stack
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HBASE-7386-bin.patch, HBASE-7386-bin-v2.patch, 
> HBASE-7386-bin-v3.patch, HBASE-7386-conf.patch, HBASE-7386-conf-v2.patch, 
> HBASE-7386-conf-v3.patch, HBASE-7386-master-00.patch, HBASE-7386-src.patch, 
> HBASE-7386-v0.patch, supervisordconfigs-v0.patch
>
>
> There a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 6) Hides stdout/stderr[6]
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
> I suspect running in via something like supervisor.d can solve these issues 
> if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2017-07-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096438#comment-16096438
 ] 

stack commented on HBASE-7386:
--

This looks really nice [~asamir] Let me give it a closer review Thank you. 
Why python supervisor scripts and not say bash? Thanks boss.

> Investigate providing some supervisor support for znode deletion
> 
>
> Key: HBASE-7386
> URL: https://issues.apache.org/jira/browse/HBASE-7386
> Project: HBase
>  Issue Type: Task
>  Components: master, regionserver, scripts
>Reporter: Gregory Chanan
>Assignee: stack
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HBASE-7386-bin.patch, HBASE-7386-bin-v2.patch, 
> HBASE-7386-bin-v3.patch, HBASE-7386-conf.patch, HBASE-7386-conf-v2.patch, 
> HBASE-7386-conf-v3.patch, HBASE-7386-master-00.patch, HBASE-7386-src.patch, 
> HBASE-7386-v0.patch, supervisordconfigs-v0.patch
>
>
> There a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 6) Hides stdout/stderr[6]
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
> I suspect running in via something like supervisor.d can solve these issues 
> if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2017-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095330#comment-16095330
 ] 

Hadoop QA commented on HBASE-7386:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue}  0m  
4s{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} pylint {color} | {color:red}  0m  
1s{color} | {color:red} The patch generated 6 new + 0 unchanged - 0 fixed = 6 
total (was 0) {color} |
| {color:red}-1{color} | {color:red} shellcheck {color} | {color:red}  0m  
7s{color} | {color:red} The patch generated 153 new + 489 unchanged - 9 fixed = 
642 total (was 498) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 3 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
31m 40s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
12s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 32s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:757bf37 |
| JIRA Issue | HBASE-7386 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12878238/HBASE-7386-master-00.patch
 |
| Optional Tests |  asflicense  shellcheck  shelldocs  pylint  |
| uname | Linux 1c2618903bc3 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / bc93b66 |
| shellcheck | v0.4.6 |
| pylint | v1.7.2 |
| pylint | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7737/artifact/patchprocess/diff-patch-pylint.txt
 |
| shellcheck | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7737/artifact/patchprocess/diff-patch-shellcheck.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7737/artifact/patchprocess/whitespace-tabs.txt
 |
| asflicense | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7737/artifact/patchprocess/patch-asflicense-problems.txt
 |
| modules | C: . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7737/console |
| Powered by | Apache Yetus 0.4.0   http://yetus.apache.org |


This message was automatically generated.



> Investigate providing some supervisor support for znode deletion
> 
>
> Key: HBASE-7386
> URL: https://issues.apache.org/jira/browse/HBASE-7386
> Project: HBase
>  Issue Type: Task
>  Components: master, regionserver, scripts
>Reporter: Gregory Chanan
>Assignee: stack
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HBASE-7386-bin.patch, HBASE-7386-bin-v2.patch, 
> HBASE-7386-bin-v3.patch, HBASE-7386-conf.patch, HBASE-7386-conf-v2.patch, 
> HBASE-7386-conf-v3.patch, HBASE-7386-master-00.patch, HBASE-7386-src.patch, 
> HBASE-7386-v0.patch, supervisordconfigs-v0.patch
>
>
> There a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in 

[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2017-07-12 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16084730#comment-16084730
 ] 

Samir Ahmic commented on HBASE-7386:


[~stack] i have done some testing with last patches against master branch and 
good news is that most of code(with small changes) and functionality works 
fine.  So original idea to improve MTTR by removing stale master and rs znodes 
plus watchdog which will restart process in case of unexpected failure is still 
valid.
My original scripts here are written with idea to be optional route in managing 
hbase processes using supervisor, and that approach opens couple of questions 
which i would like to discuss:
# Amount of code added and options to reduce it (i will anyway try to reduce it 
to minimum) probably some code can be integrated in exiting scripts to avoid 
copying
# Where are we going to add new scripts supervisord folder inside bin dir was 
may original idea and same thing goes for config files supervisord folder in 
conf dir
# Testing: i will cover supervisor 3.3.2 version(last stable) and some older 
version that are installed trough system packet manages
# And finally would it be better to implement our own Java supervisor which 
would do similar thing as python implementation 

Based on what we decide i will continue work here, if we go with python 
supervisor i can have patch ready for testing in couple of days. 

> Investigate providing some supervisor support for znode deletion
> 
>
> Key: HBASE-7386
> URL: https://issues.apache.org/jira/browse/HBASE-7386
> Project: HBase
>  Issue Type: Task
>  Components: master, regionserver, scripts
>Reporter: Gregory Chanan
>Assignee: stack
>Priority: Blocker
> Attachments: HBASE-7386-bin.patch, HBASE-7386-bin-v2.patch, 
> HBASE-7386-bin-v3.patch, HBASE-7386-conf.patch, HBASE-7386-conf-v2.patch, 
> HBASE-7386-conf-v3.patch, HBASE-7386-src.patch, HBASE-7386-v0.patch, 
> supervisordconfigs-v0.patch
>
>
> There a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 6) Hides stdout/stderr[6]
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
> I suspect running in via something like supervisor.d can solve these issues 
> if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2017-07-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073079#comment-16073079
 ] 

stack commented on HBASE-7386:
--

Thank you [~asamir]

> Investigate providing some supervisor support for znode deletion
> 
>
> Key: HBASE-7386
> URL: https://issues.apache.org/jira/browse/HBASE-7386
> Project: HBase
>  Issue Type: Task
>  Components: master, regionserver, scripts
>Reporter: Gregory Chanan
>Assignee: stack
>Priority: Blocker
> Attachments: HBASE-7386-bin.patch, HBASE-7386-bin-v2.patch, 
> HBASE-7386-bin-v3.patch, HBASE-7386-conf.patch, HBASE-7386-conf-v2.patch, 
> HBASE-7386-conf-v3.patch, HBASE-7386-src.patch, HBASE-7386-v0.patch, 
> supervisordconfigs-v0.patch
>
>
> There a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 6) Hides stdout/stderr[6]
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
> I suspect running in via something like supervisor.d can solve these issues 
> if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2017-07-03 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072789#comment-16072789
 ] 

Samir Ahmic commented on HBASE-7386:


[~stack] anything improving MTTR and cluster operability deserves decent retry 
:), i will take look at work done here and check if same or similar concept can 
be applied at current hbase clusters. 

> Investigate providing some supervisor support for znode deletion
> 
>
> Key: HBASE-7386
> URL: https://issues.apache.org/jira/browse/HBASE-7386
> Project: HBase
>  Issue Type: Task
>  Components: master, regionserver, scripts
>Reporter: Gregory Chanan
>Assignee: stack
>Priority: Blocker
> Attachments: HBASE-7386-bin.patch, HBASE-7386-bin-v2.patch, 
> HBASE-7386-bin-v3.patch, HBASE-7386-conf.patch, HBASE-7386-conf-v2.patch, 
> HBASE-7386-conf-v3.patch, HBASE-7386-src.patch, HBASE-7386-v0.patch, 
> supervisordconfigs-v0.patch
>
>
> There a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 6) Hides stdout/stderr[6]
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
> I suspect running in via something like supervisor.d can solve these issues 
> if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2017-07-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071439#comment-16071439
 ] 

stack commented on HBASE-7386:
--

What you think of this [~asamir] Is it salvageable?

> Investigate providing some supervisor support for znode deletion
> 
>
> Key: HBASE-7386
> URL: https://issues.apache.org/jira/browse/HBASE-7386
> Project: HBase
>  Issue Type: Task
>  Components: master, regionserver, scripts
>Reporter: Gregory Chanan
>Assignee: stack
>Priority: Blocker
> Attachments: HBASE-7386-bin.patch, HBASE-7386-bin-v2.patch, 
> HBASE-7386-bin-v3.patch, HBASE-7386-conf.patch, HBASE-7386-conf-v2.patch, 
> HBASE-7386-conf-v3.patch, HBASE-7386-src.patch, HBASE-7386-v0.patch, 
> supervisordconfigs-v0.patch
>
>
> There a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 6) Hides stdout/stderr[6]
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
> I suspect running in via something like supervisor.d can solve these issues 
> if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2014-05-21 Thread Shengzhe Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005054#comment-14005054
 ] 

Shengzhe Yao commented on HBASE-7386:
-

Any updates ? Are we going to merge the latest patch ? This is a pretty cool 
feature and please let me know if there are things I can help :) 

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin-v3.patch, 
 HBASE-7386-bin.patch, HBASE-7386-conf-v2.patch, HBASE-7386-conf-v3.patch, 
 HBASE-7386-conf.patch, HBASE-7386-src.patch, HBASE-7386-v0.patch, 
 supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2014-01-13 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13869578#comment-13869578
 ] 

Samir Ahmic commented on HBASE-7386:


Update, with [HBASE-10310] fixed  master---backupmaster failover time is 4s 
when cluster is controlled with supervisor and 3s when is controlled with 
standard scripts.

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin.patch, 
 HBASE-7386-conf-v2.patch, HBASE-7386-conf.patch, HBASE-7386-src.patch, 
 HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2014-01-13 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13869629#comment-13869629
 ] 

Nicolas Liochon commented on HBASE-7386:


Thanks a lot for the fix of  HBASE-10310, Samir. I went through your patch. 
It's a difficult read when you don't know supervisor ;-). The definition of 
'PROCESS_STATE_UNKNOWN' is a little scary (as we kill the region server when we 
reach this state).

There are some typos ('Test is supevisored installed' instead of supevisord).

I'm not sure about stuff like 'subprocess.call('/bin/mail -s 
HBASE_PROCESS_EVENT %s  %s'%(email, tmp_file), shell=True)': seems machine 
dependent, there is no /bin/mail on my ubuntu desktop.

Do we have to use python?

It would be good to have a review from someone who knows supervisor... As well, 
this should be documented in the hbase reference guide imho.

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin.patch, 
 HBASE-7386-conf-v2.patch, HBASE-7386-conf.patch, HBASE-7386-src.patch, 
 HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2014-01-13 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13869915#comment-13869915
 ] 

Samir Ahmic commented on HBASE-7386:


Thanks for review [~nkeywal]. 
I agree about 'PROCESS_STATE_UNKNOWN',  i checked it in supervisor source code 
and it is look like that  is used for actions if supervisor is unable to 
determine state of process. I will remove it from event listener since it can 
cause issues. 

I was planing to make mail notification optional even to create separate event 
listener that will handle email notifications. '/bin/mail' is most simple 
solution and following that example folks could develop there own solution. 
What do you think how this should be handled ?

bq. Do we have to use python?
According to documentation: Event listener can be written in any language 
supported by the platform you’re using to run supervisor. There is special 
library support for Python in the form of a supervisor.childutils module, which 
makes creating event listeners in Python slightly easier than in other 
languages. Any suggestions what should we use instead of python ? Java ?
When we complete this work it should be documented probably under 15. Apache 
HBase Operational Management ?






 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin.patch, 
 HBASE-7386-conf-v2.patch, HBASE-7386-conf.patch, HBASE-7386-src.patch, 
 HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2014-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868220#comment-13868220
 ] 

stack commented on HBASE-7386:
--

Related, from our boys at Xiaomi  https://github.com/XiaoMi/minos

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin.patch, 
 HBASE-7386-conf-v2.patch, HBASE-7386-conf.patch, HBASE-7386-src.patch, 
 HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2014-01-08 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865272#comment-13865272
 ] 

Nicolas Liochon commented on HBASE-7386:


bq. i have verify that supervisor approach improves master failover in my 
testing this time is ~7s when using supervisor and when using standard scripts 
it is ~40s

This is strange. Do you know why? What is the test scenario?

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin.patch, 
 HBASE-7386-conf-v2.patch, HBASE-7386-conf.patch, HBASE-7386-src.patch, 
 HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2014-01-08 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865479#comment-13865479
 ] 

Samir Ahmic commented on HBASE-7386:


From what i could see it is all about removing master znode in zookeeper. In 
supervisor scenario master znode is deleted by autorestart and in standard 
scripts we don't delete master znode. Is master znode ephemeral ?  It should 
be gone when master dies. 
Test scenario is very simple:
* distrubuted cluster 0.96
* start master and backup master on different machines 
* date; kill -9 master and watch logs on backup master to see when it become 
active
* i have also used python based script that watches '/hbase/master' znode and 
detect changes



 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin.patch, 
 HBASE-7386-conf-v2.patch, HBASE-7386-conf.patch, HBASE-7386-src.patch, 
 HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2014-01-08 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865513#comment-13865513
 ] 

Nicolas Liochon commented on HBASE-7386:


bq. standard scripts we don't delete master znode
We should, that's what HBASE-5926 is about.It used to work for sure.
It's better to delete it just after the server death, as the restart may never 
happen...


 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin.patch, 
 HBASE-7386-conf-v2.patch, HBASE-7386-conf.patch, HBASE-7386-src.patch, 
 HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2014-01-08 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865571#comment-13865571
 ] 

Samir Ahmic commented on HBASE-7386:


bq. It's better to delete it just after the server death, as the restart may 
never happen...

Are you suggesting that i modify 'zk_cleaner.py' listener script to delete 
master znode when detects that master is in one of this states 
('PROCESS_STATE_STOPPING', 'PROCESS_STATE_EXITED', 'PROCESS_STATE_UNKNOWN) ? 
I'm already doing this for regionservers so it should few lines of code.

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin.patch, 
 HBASE-7386-conf-v2.patch, HBASE-7386-conf.patch, HBASE-7386-src.patch, 
 HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2014-01-08 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865594#comment-13865594
 ] 

Nicolas Liochon commented on HBASE-7386:


may be ;-). Just that if the process exit, we can clean the ZK node 
immediately, ideally w/o relying on a separate watchdog. 
What's the PROCESS_STATE_UNKNOWN? 

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin.patch, 
 HBASE-7386-conf-v2.patch, HBASE-7386-conf.patch, HBASE-7386-src.patch, 
 HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2014-01-08 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865644#comment-13865644
 ] 

Samir Ahmic commented on HBASE-7386:


bq. What's the PROCESS_STATE_UNKNOWN?
According to supervisor documentation [http://supervisord.org/subprocess.html] 
The process is in an unknown state (supervisord programming error).

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin.patch, 
 HBASE-7386-conf-v2.patch, HBASE-7386-conf.patch, HBASE-7386-src.patch, 
 HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2013-12-28 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857994#comment-13857994
 ] 

Samir Ahmic commented on HBASE-7386:


Thanks for review add comments [~stack]]
In respect of usage and documentation this scripts are following same logic as 
scripts in bin directory. For example here is output of 
start-supervisord-hbase.sh command:
{code}
$ $HBASE_HOME/bin/supervisord/start-supervisord-hbase.sh
localhost: hbase-ZK: started
hbase-MASTER: started
localhost: hbase-RS: started
{code}
so considering usage this relations are true:
start-hbase.sh ~= start-supervisord-hbase.sh
stop-hbase.sh ~= stop-supervisord-hbase.sh
hbase-daemon.sh ~= hbase-supervisord.sh

I agree there is danger that scripts 'rot'  but also i believe that this 
approach can solve number of issues for ops people and generally improve hbase 
MTTR .  What is your suggestion how to address 'rot' scripts issue ?

graceful-stop.sh from bin dir can be modified to avoid copy/paste. I will also 
check rest of scripts to try to reduce amount of copy/paste.

migrate_to_supervisord.sh will switch running cluster that was started with 
scripts from bin directory  to use supervisor. It will stop hbase daemons on 
nodes using hbase-daemon.sh and then it will start then using 
hbase-supervisord.sh script (revert_to_scripts.sh will do opposite).

For master znode is removed by autostart method (patch in 
HMasterCommandLine.java ) in moment of starting. We have supervisor config 
autorestart=true so if master process dies unexpectedly supervisor will kick 
off autorestart and in that moment znode will be removed giving enough time for 
backup master to become active.  Alternative is to craft listener script 
similar to mail_notification.py that will remove master znode when detects that 
process is exiting.

Regarding RS znodes scripts does not  remove them yet. I was thinking about 
listener script (similar to  mail_notification.py) calling 'hbase zkcli rmr  
RSznode'  or we can modify HRegionServerCommadLine.java and add 'autorestart' 
like in  HMasterCommandLine.java. What is your suggestion how to address this ?

Basically all this scripts are wrappers around 'supervisord'  and 
'supervisorctl' commands which are python based, 
I hope i have clarify some details. 

Cheers





 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin.patch, HBASE-7386-conf.patch, 
 HBASE-7386-src.patch, HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2013-12-27 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857589#comment-13857589
 ] 

stack commented on HBASE-7386:
--

[~asamir]  Excellent!  Thank you for working on this.  If you add a little 
detail on how it looks when in place working -- I can add a section to the 
reference guide that points at these scripts (I can commit the autorestart 
changes no problem).

We 'could' commit these scripts.  It might encourage folks to go the supervised 
route.  Only issue is they may 'rot'.

Do you have to 'copy' the graceful_stop.sh script, can you not point at the 
original (any needed changes we can make no problem) -- ditto for any other 
changes you'd need to you could reduce the amount of copy/paste.

FYI, these lines are unnecessary '+# * Copyright 2007 The Apache Software 
Foundation'

You probably don't want this in the scripts: +email=sa...@personal.com  (call 
out the necessary config in README?)

How do these migrate scripts work?  They look nice if they are doing what I 
think they are doing.

I see mail notification but do you actually do the removal of a znode?

When you say python supervise, you just mean the script that supervise has as 
callback is in python?

Nice job [~asamir]


 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin.patch, HBASE-7386-conf.patch, 
 HBASE-7386-src.patch, HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2013-12-26 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856860#comment-13856860
 ] 

Samir Ahmic commented on HBASE-7386:


Based on what was discussed here i have created set of scripts and configs that 
support running hbase processes under python supervisor control. I did not 
touch any scrips from bin directory as i see this as optional solution for 
managing hbase processes. Only change in hbase source is adding 'autorestart' 
option in HMasterCommandLine.java.  Scripts does not require any config changes 
or additional env variables. 
All testing was done on 0.96 branch and partially on 0.94 using supervisor 3.0 
version, OS was Centos 5.8 and 5.10.
All suggestions and critics are welcome:)
Cheers

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin.patch, HBASE-7386-conf.patch, 
 HBASE-7386-src.patch, HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2013-12-02 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836520#comment-13836520
 ] 

Samir Ahmic commented on HBASE-7386:


Hi [~stack], 

Is someone is working on this currently? This looks like great idea. I would 
like to continue this work if we  still think that this is good direction ?

Cheers

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2012-12-19 Thread Gregory Chanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13536656#comment-13536656
 ] 

Gregory Chanan commented on HBASE-7386:
---

Thanks for the feedback, I agree with what both of you are saying.  I'm working 
on an a start-hbase like script.

bq. What to do for the case where a shop has chosen other than supervisord to 
monitor their processes? I suppose we could let them do the convertion from 
'supervise' to 'god', etc.?
I don't think we should worry about it for the first pass.  I figure if a shop 
is familiar with a different tool they should be able to write their own 
scripts/configs with the supervisord ones we provide as a guide.

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Fix For: 0.96.0

 Attachments: HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2012-12-18 Thread Gregory Chanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13535517#comment-13535517
 ] 

Gregory Chanan commented on HBASE-7386:
---

I'm going to first tackle the Master case, HBASE-5926.  The proposal is to rip 
out the script modification of HBASE-5926 and replace with supervisor.d 
support.  The java code from HBASE-5926 would of course stay, only the 
restart/cleanup code in the scripts would go.  The goal is to solve the above 
listed issues:
#2 is solved by running via supervisor (jps reports only one HMaster process 
when HMaster started via supervisor.d)
#3 is self-evidently solved by running on a real supervisor
#5 is clearly not relevant, as it is only related to the RS.

So, we are left with #1, #4, and #6.
#1 and #6 are solved because the previous script behavior returns when not run 
in supervisor and supervisor can redirect stdout/stderr when run.
#4 is solved because the old script behavior returns when not run in supervisor 
and I don't think it makes any sense to run a standalone HBase via supervisor.

Example patch coming soon.

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Fix For: 0.96.0


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2012-12-18 Thread Gregory Chanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13535562#comment-13535562
 ] 

Gregory Chanan commented on HBASE-7386:
---

Thanks for pointing that out, Enis.  What I'm doing here is compatible with 
that, I think.  Rather than deleting the znode we just expire the session.  
Should be a small change.

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Fix For: 0.96.0


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2012-12-18 Thread Gregory Chanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13535587#comment-13535587
 ] 

Gregory Chanan commented on HBASE-7386:
---

Here is how I verified this all works (apply both patches and set HBASE_HOME, 
HBASE_LOG_DIR in your environment but nothing else).  Also make sure 
supervisord is installed.

{noformat}
1. $HBASE_HOME/bin/hbase-daemons.sh --config  start zookeeper
2. cd $HBASE_HOME/conf/supervisord
3. supervisord -c supervisord.conf (call jps and record the pid of the 
HMaster)
4. $HBASE_HOME/bin/hbase-daemons.sh --config  --hosts localhost.txt start 
regionserver (localhost.txt is a text file with the line localhost)
5. $HBASE_HOME/bin/./local-master-backup.sh start 1
6. kill -9 [PID_OF_HMASTER_RECORDED_AT_STEP_3]
{noformat}

and notice the backup master takes over right away.  Also use start-hbase.sh 
and notice it has the same behavior as before (does not swallow output).

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Fix For: 0.96.0

 Attachments: HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2012-12-18 Thread Gregory Chanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13535590#comment-13535590
 ] 

Gregory Chanan commented on HBASE-7386:
---

Open Questions:
1) Do we think this is a useful direction?
2) If so, what level of supervisor scripts do we provide?
None?
Samples?
Actual scripts that can be run [i.e. mimicking HBASE-daemon.sh -- probably want 
to move all the config setup to a common file]?
Actual scripts + ssh scripting to actually run with a start-hbase.sh, i.e. 
start-supervised-hbase.sh

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Fix For: 0.96.0

 Attachments: HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2012-12-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13535721#comment-13535721
 ] 

nkeywal commented on HBASE-7386:


bq.  1) Do we think this is a useful direction?
Personally, I do. We just need to have a default way to start/stop and ideally 
monitor HBase. Scripts are the simplest way. Supervisor is better.

bq.  2) If so, what level of supervisor scripts do we provide?
Samples are doomed to be broken. So I would go for  Actual scripts + ssh 
scripting to actually run with a start-hbase.sh, i.e. 
start-supervised-hbase.sh.

Is there any reason not to use supervisor by default (licenses, supported 
platforms?)

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Fix For: 0.96.0

 Attachments: HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2012-12-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13535744#comment-13535744
 ] 

stack commented on HBASE-7386:
--

I'd say just repurpose 'autorestart', especially if now broke.  What was there 
previous was mickey mouse.  This is real deal.

bq. ... and does not respect any other environment settings (e.g. 
HBASE_CONF_DIR).

Would this be fixed if we ...through all the config files in hbase-daemon and 
do something appropriate.?

On questions:

1. Yes this is valid direction.  Perhaps we could extract the stuff you hacked 
out into a 'wrapper' script, a poor-mans' supervise such that it was there as 
an option... you could run it if you wanted poor-mans' supervise but otherwise, 
scripts ran as they used to.  But this would likely be wasted effort... effort 
better spent getting it so optionally, if supervisord was installed, you could 
just run with it.
2. I agree with nkeywal that templates/samples inevitably rot.  Unused software 
also rots so providing supervisord scripts unless they are used, they will go 
bad.  How much work involved making it so could do 
./bin/start-supervisord-hbase.sh?  Would be coolio if you could do 
./bin/start-hbase.sh and ./bin/start-supervisord-hbase.sh if supervisor 
available (likely on most systems I'd say) and then in doc. we encourage folks 
to do the latter.

What to do for the case where a shop has chosen other than supervisord to 
monitor their processes?  I suppose we could let them do the convertion from 
'supervise' to 'god', etc.?

This is great stuff G.






 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Fix For: 0.96.0

 Attachments: HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira