[
https://issues.apache.org/jira/browse/TRAFODION-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408186#comment-15408186
]
ASF GitHub Bot commented on TRAFODION-2142:
-------------------------------------------
GitHub user DaveBirdsall opened a pull request:
https://github.com/apache/incubator-trafodion/pull/640
[TRAFODION-2142] Add script to restart HBase for developer regressions
This check-in adds a script, keepHBaseUp.py, that can be used in tandem
with developer regressions to keep regressions from getting hung when the
local_hadoop HMaster goes away (as it is wont to do on busy workstations). The
script periodically checks whether the HMaster is up. If it isn't, the script
attempts to start it, retrying at geometrically longer intervals until more
than an hour has passed without success, at which point it gives up.
One way to use this script is to start it running in one shell window, then
start the developer regressions in another.
The script logs the time of each check, so the developer can
correlate HBase outages with regression test failures.
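The script's source is not reproduced in this message, so what follows is only a minimal Python sketch of the behavior described above, not the contents of keepHBaseUp.py. The function name keep_up, its parameters, the 60-second check interval, the 30-second first retry, and the doubling factor are all assumptions made for illustration. Only the geometric growth of the retry interval, the give-up threshold of more than an hour, and the timestamped logging come from the description. The HMaster check and start actions are passed in as callables because the commands the real script uses are not stated here.

```python
import time
from datetime import datetime


def _stamp(message):
    """Prefix a message with the current local time, for later correlation."""
    return f"{datetime.now().isoformat(timespec='seconds')} {message}"


def keep_up(is_up, start, check_interval=60.0, first_retry=30.0, factor=2.0,
            give_up_after=3600.0, sleep=time.sleep, clock=time.monotonic,
            log=print, max_checks=None):
    """Poll is_up(); when it reports down, call start() and retry.

    All names and default values here are hypothetical. Retry waits grow
    geometrically (first_retry, first_retry * factor, ...). If more than
    give_up_after seconds pass without a successful start, return False.
    Return True only when max_checks periodic checks have been made
    (max_checks=None means run until giving up).
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if is_up():
            log(_stamp("HMaster is up"))
            sleep(check_interval)
            continue

        # HMaster is down: start it, then wait longer after each failure.
        down_since = clock()
        wait = first_retry
        while True:
            log(_stamp(f"HMaster is down, attempting start (next wait {wait:g}s)"))
            start()
            sleep(wait)
            if is_up():
                log(_stamp("HMaster is back up"))
                break
            if clock() - down_since > give_up_after:
                log(_stamp("HMaster did not come back, giving up"))
                return False
            wait *= factor
    return True
```

With these assumed defaults, a persistently failing start is retried after waits of 30, 60, 120, 240, 480, 960 and 1920 seconds before the loop gives up. In real use, is_up and start would wrap whatever process check and HBase start command the local_hadoop environment provides; those are deliberately left out here because the message does not name them.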
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/DaveBirdsall/incubator-trafodion Trafodion2142
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-trafodion/pull/640.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #640
----
commit 498dd2f1b1e8588e4bf56556a951164a4154ddec
Author: Dave Birdsall
Date: 2016-08-04T17:31:06Z
[TRAFODION-2142] Add script to restart HBase for developer regressions
----
> Test script to restart HBase automatically in local_hadoop test settings
> ------------------------------------------------------------------------
>
> Key: TRAFODION-2142
> URL: https://issues.apache.org/jira/browse/TRAFODION-2142
> Project: Apache Trafodion
> Issue Type: Improvement
> Components: foundation
> Affects Versions: any
> Reporter: David Wayne Birdsall
> Assignee: David Wayne Birdsall
> Priority: Minor
> Fix For: 2.1-incubating
>
>
> In development environments, developers often use local_hadoop for unit and
> developer regression testing. Often these test environments are on
> workstations shared between many developers. When running regressions
> overnight, quite frequently the HMaster process will die due to timeouts if
> the workstation is particularly busy. This sometimes causes HBase errors
> during the tests but more often causes hangs. It would be nice to have a tool
> that monitors HMaster and, if it goes away, tries to restart it. It has been
> observed that restarting it often resolves the hangs, allowing the regression
> run to continue.