[
https://issues.apache.org/jira/browse/HBASE-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170501#comment-13170501
]
Andrew Purtell commented on HBASE-4047:
---------------------------------------
Start with test cases, the first being a test that confirms a child process
stuck via SIGSTOP doesn't take down the regionserver. I'm thinking there should
be three selectable strategies for handling a stuck child:
1. Close and reopen the region, triggering forced termination of the stuck
child on close and fork/initialization of a new child on open, along with
reinit of all region-related resources, other coprocessors, etc.
2. Unload/reload the malfunctioning coprocessor. Will require some work in the
coprocessor framework to actually support unloading in a reasonable way. The
JVM may make this complicated for integrated CPs, so perhaps just for those
hosted in external processes.
3. Unload/terminate the malfunctioning coprocessor and continue operation.
Consider changes in the CP framework for temporary blacklisting; we will need
that to avoid reloading the suspect CP after a split.
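
A rough sketch of how the three strategies might be made selectable, in case it
helps the discussion. Everything here is hypothetical and not part of the HBase
coprocessor framework: the class, the enum constants, and the method names are
made up for illustration. The stuck child is modeled as a future that never
completes rather than a real process stopped with SIGSTOP, and the recovery
step only returns a description of what a real host would do.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class StuckChildSketch {

    /** Hypothetical names for the three strategies listed above. */
    public enum Strategy { REOPEN_REGION, RELOAD_COPROCESSOR, UNLOAD_AND_BLACKLIST }

    /**
     * Returns true if the child's reply arrived within the timeout, and false
     * if the wait timed out (child appears stuck) or the reply failed.
     */
    public static boolean repliedInTime(CompletableFuture<String> reply, long timeoutMillis)
            throws InterruptedException {
        try {
            reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            return false;
        }
    }

    /** Describes the recovery action; a real host would perform it instead. */
    public static String recover(Strategy strategy) {
        switch (strategy) {
            case REOPEN_REGION:
                return "close region, terminate child, reopen region";
            case RELOAD_COPROCESSOR:
                return "unload coprocessor, terminate child, reload coprocessor";
            case UNLOAD_AND_BLACKLIST:
                return "unload coprocessor, terminate child, blacklist coprocessor";
            default:
                throw new IllegalArgumentException("unknown strategy: " + strategy);
        }
    }

    /** Bounded wait on the child, then dispatch to the configured strategy. */
    public static String handle(CompletableFuture<String> reply, long timeoutMillis,
                                Strategy strategy) throws InterruptedException {
        return repliedInTime(reply, timeoutMillis) ? "child healthy" : recover(strategy);
    }

    public static void main(String[] args) throws InterruptedException {
        // Never completed, standing in for a child stopped with SIGSTOP.
        CompletableFuture<String> stuck = new CompletableFuture<>();
        System.out.println(handle(stuck, 100, Strategy.UNLOAD_AND_BLACKLIST));
    }
}
```

The point of the bounded wait is that the calling thread gets control back
after the timeout no matter what the child is doing, which is the property the
first test case would need to confirm against a real stopped process.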
> [Coprocessors] Generic external process host
> --------------------------------------------
>
> Key: HBASE-4047
> URL: https://issues.apache.org/jira/browse/HBASE-4047
> Project: HBase
> Issue Type: New Feature
> Components: coprocessors
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
>
> Where HBase coprocessors deviate substantially from the design (as I
> understand it) of Google's BigTable coprocessors is that we've reimagined
> them as a framework for internal extension. In contrast, BigTable
> coprocessors run as separate processes colocated with tablet servers. The
> essential trade-off is between performance, flexibility and possibility, and
> the ability to control and enforce resource usage.
> Since the initial design of HBase coprocessors some additional considerations
> are in play:
> - Developing computational frameworks sitting directly on top of HBase hosted
> in coprocessor(s);
> - Introduction of the map reduce next generation (mrng) resource management
> model, and the probability that limits will be enforced via cgroups at the OS
> level after this is generally available, e.g. when RHEL 6 deployments are
> common;
> - The possibility of deployment of HBase onto mrng-enabled Hadoop clusters
> via the mrng resource manager and an HBase-specific application controller.
> Therefore we should consider developing a coprocessor that is a generic host
> for another coprocessor: one that forks a child process, loads the target
> coprocessor into the child, establishes a bidirectional pipe, and uses an
> eventing model and umbilical protocol to give the coprocessor loaded into the
> child the same semantics as if it were loaded internally to the parent. It
> should (eventually) use available resource management capabilities on the
> platform -- perhaps via the mrng resource controller or directly with
> cgroups -- to limit the child as desired by system administrators or the
> application designer.