[
https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084524#comment-13084524
]
[email protected] commented on HBASE-4014:
------------------------------------------------------
bq. On 2011-08-12 23:46:30, Gary Helmling wrote:
bq. > Nice work, Eugene. I think we're getting close. Just two suggested
improvements below.
bq. >
bq. > The main question still open to debate, I think, is whether or not
aborting the server on unhandled exceptions is appropriate.
bq. >
bq. > On the one hand, aborting takes the fail-fast approach and makes buggy
coprocessors much more visible. It's a lot more likely that a bug will be
noticed and fixed if it brings down a region server!
bq. >
bq. > On the other hand, I think coprocessors already pose enough of a
stability risk to a cluster. I think we should be working to minimize that by
containing the impact that a buggy coprocessor can have. If the coprocessor
really wants or needs to trigger an abort, it can already do so, since
(Master|RegionServer)Services extend Server, which extends Abortable.
bq. >
bq. > I think I'd be more in favor of removing the coprocessor from the active
set (we should make this as visible as possible so it's clear the coprocessor
is no longer "active"), or at least wrapping the exception in a
DoNotRetryIOException and communicating it back to the client? Maybe both?
bq. >
bq. > I guess I'd be okay with a configuration option to abort on error (I
think a single config option is sufficient), as long as it's disabled by
default. But that would still imply we need some other handling when the
option is disabled.
I like Gary's reasoning here.
- Michael
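
The alternative Gary describes above (an abort-on-error option that is off by default, with the failing coprocessor otherwise removed from the active set and the error wrapped so the client does not retry) could look roughly like the sketch below. Everything in it is a hypothetical stand-in, not code from the patch: `ContainingHostSketch`, `handleThrowable`, the `abortOnError` flag, and `DoNotRetryIOExceptionSketch` (a local substitute for HBase's `DoNotRetryIOException`, so the example compiles on its own).

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

// Local stand-in for HBase's DoNotRetryIOException (assumption: used here
// only so the sketch compiles without HBase on the classpath).
class DoNotRetryIOExceptionSketch extends IOException {
    DoNotRetryIOExceptionSketch(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical host that contains coprocessor failures instead of aborting.
final class ContainingHostSketch {
    private final Set<String> active = new CopyOnWriteArraySet<>();
    private final boolean abortOnError; // hypothetical config option, false by default
    private boolean aborted = false;    // stands in for a real server abort

    ContainingHostSketch(boolean abortOnError, String... coprocessors) {
        this.abortOnError = abortOnError;
        for (String name : coprocessors) {
            active.add(name);
        }
    }

    Set<String> activeCoprocessors() { return active; }

    boolean isAborted() { return aborted; }

    void handleThrowable(String coprocessor, Throwable e) throws IOException {
        if (e instanceof IOException) {
            throw (IOException) e; // expected failures go back to the client as-is
        }
        if (abortOnError) {
            aborted = true; // a real host would call abort() on the server here
            return;
        }
        // Contain the failure: deactivate the coprocessor, make that visible,
        // and tell the client not to retry.
        active.remove(coprocessor);
        System.err.println("Coprocessor " + coprocessor
            + " removed from active set after: " + e);
        throw new DoNotRetryIOExceptionSketch(
            "Coprocessor " + coprocessor + " failed and was disabled", e);
    }

    public static void main(String[] args) {
        ContainingHostSketch host = new ContainingHostSketch(false, "A", "B");
        try {
            host.handleThrowable("A", new NullPointerException("bug"));
        } catch (IOException e) {
            System.out.println("client sees: " + e.getMessage());
        }
        System.out.println("still active: " + host.activeCoprocessors());
    }
}
```

With the flag off, a non-IOException failure disables only the offending coprocessor; with it on, the sketch records an abort and leaves the active set untouched.
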
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/#review1433
-----------------------------------------------------------
On 2011-08-10 22:48:08, Eugene Koontz wrote:
bq.
bq. (Updated 2011-08-10 22:48:08)
bq.
bq.
bq. Review request for hbase, Gary Helmling and Mingjie Lai.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the
presence of coprocessors in logged exceptions
bq.
bq. The general gist here is to wrap each of
{Master,RegionServer}CoprocessorHost's coprocessor call inside a
bq.
bq. "try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }"
bq.
bq. block.
bq.
bq. handleCoprocessorThrowable() is responsible for either passing 'e' along
to the client (if 'e' is an IOException) or, otherwise, aborting the service
(Regionserver or Master).
bq.
bq. The abort message contains a list of the loaded coprocessors for crash
analysis.
bq.
bq.
bq. This addresses bug HBASE-4014.
bq. https://issues.apache.org/jira/browse/HBASE-4014
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 18ba6e7
bq. src/main/java/org/apache/hadoop/hbase/master/HMaster.java 8beeb68
bq. src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java aa930f5
bq. src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 23225d7
bq. src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java c44da73
bq. src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java PRE-CREATION
bq.
bq. Diff: https://reviews.apache.org/r/969/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. patch includes two tests:
bq.
bq. TestMasterCoprocessorException.java
bq. TestRegionServerCoprocessorException.java
bq.
bq. both tests pass in my build environment.
bq.
bq.
bq. Thanks,
bq.
bq. Eugene
bq.
bq.
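
The wrapping described in the review summary above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in the patch: `AbortableSketch` and `HookSketch` are hypothetical stand-ins for the server's abort interface and for a single coprocessor callback, and `CoprocessorGuardSketch` and `invoke` are invented names. Only `handleCoprocessorThrowable` and its described behavior (pass an IOException along to the client, otherwise abort with the list of loaded coprocessors in the message) come from the summary.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the server's abort interface.
interface AbortableSketch {
    void abort(String why, Throwable cause);
}

// Hypothetical type for one coprocessor callback.
interface HookSketch {
    void run() throws IOException;
}

final class CoprocessorGuardSketch {
    private final AbortableSketch server;
    private final List<String> loadedCoprocessors;

    CoprocessorGuardSketch(AbortableSketch server, List<String> loadedCoprocessors) {
        this.server = server;
        this.loadedCoprocessors = new ArrayList<>(loadedCoprocessors);
    }

    // Every coprocessor call goes through the same try/catch wrapper.
    void invoke(HookSketch hook) throws IOException {
        try {
            hook.run();
        } catch (Throwable e) {
            handleCoprocessorThrowable(e);
        }
    }

    // IOExceptions are passed along to the caller; anything else aborts the
    // service, and the abort message lists the loaded coprocessors.
    void handleCoprocessorThrowable(Throwable e) throws IOException {
        if (e instanceof IOException) {
            throw (IOException) e;
        }
        server.abort("Unhandled coprocessor exception; loaded coprocessors: "
            + loadedCoprocessors, e);
    }

    public static void main(String[] args) throws IOException {
        CoprocessorGuardSketch guard = new CoprocessorGuardSketch(
            (why, cause) -> System.err.println("ABORT: " + why),
            Arrays.asList("ExampleObserver"));
        guard.invoke(() -> { throw new IllegalStateException("bug in coprocessor"); });
    }
}
```

Keeping the decision in one handler means the policy debated in the review (abort versus contain) can change without touching each call site.
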
> Coprocessors: Flag the presence of coprocessors in logged exceptions
> --------------------------------------------------------------------
>
> Key: HBASE-4014
> URL: https://issues.apache.org/jira/browse/HBASE-4014
> Project: HBase
> Issue Type: Improvement
> Components: coprocessors
> Reporter: Andrew Purtell
> Assignee: Eugene Koontz
> Fix For: 0.92.0
>
> Attachments: HBASE-4014.patch, HBASE-4014.patch, HBASE-4014.patch,
> HBASE-4014.patch, HBASE-4014.patch
>
>
> For initial triage of bug reports against core versus against deployments with
> loaded coprocessors, we need something like the Linux kernel's taint flag and
> list of linked-in modules that show up in the output of every OOPS, to
> appear above or below exceptions that appear in the logs.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira