[ 
https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076045#comment-13076045
 ] 

[email protected] commented on HBASE-4014:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/#review1261
-----------------------------------------------------------


Sorry for the belated review.

The additional explicit error handling around coprocessor invocations here is 
good, but I think this is still missing part of the intent of the original bug 
description.  I'm concerned about aborts that happen not just resulting 
directly from unhandled exceptions in coprocessor code, but from bad 
coprocessor behavior that eventually triggers an abort from within core _HBase_ 
code.  This could be a memory leak in the coprocessor that eventually triggers 
an OOME that only shows up in reading in the next RPC request, or something 
that corrupts internal state or data structures.  As a result, I think it's 
equally important that the loaded coprocessor set be logged _within_ 
HMaster.abort() and HRegionServer.abort(), in all cases.  So I don't think 
logging from the CoprocessorHost implementations is sufficient.

For HRegionServer at least, since RegionCoprocessorHost is associated at the 
HRegion level, this would need a distinct set of all coprocessor classes that 
have been loaded in that region server's lifetime, accessible at the top 
(server) level.  (I thought we discussed this at some point, but can't find 
those comments -- maybe I'm hearing voices?)  I think a simple singleton would 
work, with something like an internal HashSet and an add(String 
coprocessorClass) method that's called on coprocessor load.  The same would 
work on HMaster, though it's not strictly necessary since only a single 
MasterCoprocessorHost instance is created at the HMaster level.  But 
consistency in implementation here is probably good.  The number of 
CoprocessorHost types we have may increase if new Observer types are added.

I think what's here is good, with only some minor nits on naming. But I think 
the above enhancement is pretty critical to add.


src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
<https://reviews.apache.org/r/969/#comment2892>

    I saw in previous comments that this is needed because SortedCopyOnWriteSet 
doesn't implement toString().  So why not make SortedCopyOnWriteSet implement 
toString()?  Seems cleaner to me and more generic/reusable.
    
    Wherever this is implemented, use a StringBuilder to create the string to 
return, not repeated string concatenation.



src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
<https://reviews.apache.org/r/969/#comment2894>

    Awfully long name, maybe just abortServer()?



src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java
<https://reviews.apache.org/r/969/#comment2895>

    This and the following method names are awfully long.  It's just personal 
preference, but I like to keep things shorter.



src/test/java/org/apache/hadoop/hbase/coprocessor/BuggyRegionObserver.java
<https://reviews.apache.org/r/969/#comment2893>

    Minor nit, but why have this as a separate top-level class?  Seems like it 
could just be an inner class in TestRegionServerCoprocessorException same way 
that the BuggyMasterObserver is an inner class in 
TestMasterCoprocessorException.


- Gary


On 2011-08-01 22:08:27, Eugene Koontz wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/969/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-08-01 22:08:27)
bq.  
bq.  
bq.  Review request for hbase, Gary Helmling and Mingjie Lai.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the 
presence of coprocessors in logged exceptions
bq.  
bq.  The general gist here is to wrap each of 
{Master,RegionServer}CoprocessorHost's coprocessor call inside a 
bq.  
bq.  "try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }"
bq.  
bq.  block. 
bq.  
bq.  handleCoprocessorThrowable() is responsible for either passing 'e' along 
to the client (if 'e' is an IOException) or, otherwise, aborting the service 
(Regionserver or Master).
bq.  
bq.  The abort message contains a list of the loaded coprocessors for crash 
analysis.
bq.  
bq.  
bq.  This addresses bug HBASE-4014.
bq.      https://issues.apache.org/jira/browse/HBASE-4014
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 
18ba6e7 
bq.    src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java 
aa930f5 
bq.    
src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java 
c2b3558 
bq.    
src/test/java/org/apache/hadoop/hbase/coprocessor/BuggyRegionObserver.java 
PRE-CREATION 
bq.    
src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java
 PRE-CREATION 
bq.    
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java
 PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/969/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  patch includes two tests:
bq.  
bq.  TestMasterCoprocessorException.java
bq.  TestRegionServerCoprocessorException.java
bq.  
bq.  both tests pass in my build environment.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Eugene
bq.  
bq.



> Coprocessors: Flag the presence of coprocessors in logged exceptions
> --------------------------------------------------------------------
>
>                 Key: HBASE-4014
>                 URL: https://issues.apache.org/jira/browse/HBASE-4014
>             Project: HBase
>          Issue Type: Improvement
>          Components: coprocessors
>            Reporter: Andrew Purtell
>            Assignee: Eugene Koontz
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4014.patch, HBASE-4014.patch, HBASE-4014.patch, 
> HBASE-4014.patch
>
>
> For some initial triage of bug reports for core versus for deployments with 
> loaded coprocessors, we need something like the Linux kernel's taint flag, 
> and list of linked in modules that show up in the output of every OOPS, to 
> appear above or below exceptions that appear in the logs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to