[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-11-19 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934033#action_12934033
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Mingjie Lai mjla...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/
---

(Updated 2010-11-19 14:39:18.378849)


Review request for hbase, stack, Andrew Purtell, and Jonathan Gray.


Changes
---

Final patch, ready to be checked in:
- Trimmed trailing whitespace at line ends
- Rebuilt the patch after HBASE-2002 was checked in. 


Summary
---

The diff actually contains two separate patches: HBASE-2001 and the one for 
(HBASE-2002 + HBASE-2321). The reason is that HBASE-2001's CommandTarget relies 
on HBASE-2002 + HBASE-2321, whose patches are still under review. I have to 
include Gary's HBASE-2002 and HBASE-2321 in this diff, since Review Board is so 
powerful :) that it does not allow my diff to be based on a patch that has not 
been checked in. 

Eventually the patch here should be committed after 2002 and 2321. I will make 
another patch after they are checked in. 

Both HBASE-2001 and the dynamic RPC work are quite big patches; together they 
total more than 7k lines. I went back and forth but still don't have a good 
way to split the patch to reduce the review pain, so for now I'm posting the 
whole patch for all three issues. Here is the list of files that relate only 
to coprocessors:

src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java
src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java
src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java
src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorException.java
src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java
src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java
src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationEndpoint.java
src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationProtocol.java
src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java


==

(Here is a brief description. Much more detail is in package-info.java in the 
diff. I have also posted package-info.html to 
https://issues.apache.org/jira/browse/HBASE-2001 as an attachment.)


Coprocessors are code that runs in-process on each region server. Regions 
contain references to the coprocessor implementation classes associated with 
them. Coprocessor classes will be loaded either from local jars on the region 
server's classpath or via the HDFS classloader.

Several types of coprocessor are provided to give enough flexibility 
for potential use cases. Right now there are:

* Coprocessor: provides region lifecycle management hooks, e.g., region 
open/close/split/flush/compact operations.
* RegionObserver: provides hooks for monitoring table operations from the 
client side, such as table get/put/scan/delete, etc.
* Endpoint: provides on-demand triggers for arbitrary functions executed at 
a region. One use case is column aggregation at the region server.

Coprocessor:
A coprocessor is required to implement the Coprocessor interface so that the 
coprocessor framework can manage it internally.

Another design goal of this interface is to provide simple features for making 
coprocessors useful, while exposing no more internal state or control actions 
of the region server than necessary and not exposing them directly. 

RegionObserver:
If the coprocessor implements the RegionObserver interface, it can observe and 
mediate client actions on the region. 
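
As a rough illustration of what "observe and mediate" can mean, here is a minimal, self-contained sketch. It does not use the real RegionObserver signatures from the patch: SketchObserver, SketchRegion and ReadOnlyPrefixObserver are hypothetical names invented for this example, and the toy region is only an in-memory map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical, simplified observer interface; NOT the real RegionObserver signatures. */
interface SketchObserver {
  /** Called before a put is applied; returning false vetoes the put. */
  boolean prePut(String row, String value);

  /** Called after a get; may replace the value handed back to the client. */
  String postGet(String row, String value);
}

/** Toy stand-in for a region that calls its observers around client operations. */
class SketchRegion {
  private final Map<String, String> store = new TreeMap<>();
  private final List<SketchObserver> observers = new ArrayList<>();

  void register(SketchObserver observer) {
    observers.add(observer);
  }

  boolean put(String row, String value) {
    for (SketchObserver o : observers) {
      if (!o.prePut(row, value)) {
        return false; // an observer vetoed the put, nothing is stored
      }
    }
    store.put(row, value);
    return true;
  }

  String get(String row) {
    String value = store.get(row);
    for (SketchObserver o : observers) {
      value = o.postGet(row, value); // observers run in registration order
    }
    return value;
  }
}

/** Example observer: rejects writes to rows starting with "ro-" and masks missing values. */
class ReadOnlyPrefixObserver implements SketchObserver {
  public boolean prePut(String row, String value) {
    return !row.startsWith("ro-");
  }

  public String postGet(String row, String value) {
    return value == null ? "<missing>" : value;
  }
}

class ObserverSketchDemo {
  public static void main(String[] args) {
    SketchRegion region = new SketchRegion();
    region.register(new ReadOnlyPrefixObserver());
    System.out.println(region.put("row1", "v1"));
    System.out.println(region.put("ro-row", "v2"));
    System.out.println(region.get("row1"));
    System.out.println(region.get("ro-row"));
  }
}
```

Because observers are kept in a list and called in order, the same sketch also shows how several observers could be stacked on one region.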

Endpoint:
Coprocessor and RegionObserver provide hooks for injecting user code 
that runs at each region. This code is triggered by existing HTable and 
HBaseAdmin operations at certain hook points.

Through Endpoint and the dynamic RPC protocol, you can define your own 
interface between client and region server, i.e., you can create a new 
method and specify its parameters and return type. The new Endpoint methods 
are triggered by calling the client-side dynamic RPC functions, 
i.e. HTable.exec(...). 
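
The idea of a user-defined protocol that is invoked per region can be sketched without any HBase classes. Everything below is an assumption made for illustration: SumProtocol, SumEndpoint and SketchTable are hypothetical names, and SketchTable.exec only mimics the shape of a per-region call; it is not the signature of HTable.exec in the patch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical user-defined protocol shared by the client and the region-side code. */
interface SumProtocol {
  long sum(String column);
}

/** Hypothetical region-side implementation; each instance sees only its own region's values. */
class SumEndpoint implements SumProtocol {
  private final Map<String, List<Long>> columns;

  SumEndpoint(Map<String, List<Long>> columns) {
    this.columns = columns;
  }

  public long sum(String column) {
    long total = 0;
    for (long v : columns.getOrDefault(column, Collections.<Long>emptyList())) {
      total += v;
    }
    return total;
  }
}

/** Toy client: applies one call to every region's endpoint and returns per-region results. */
class SketchTable {
  private final Map<String, SumProtocol> endpointsByRegion = new LinkedHashMap<>();

  void addRegion(String name, SumProtocol endpoint) {
    endpointsByRegion.put(name, endpoint);
  }

  <R> Map<String, R> exec(Function<SumProtocol, R> call) {
    Map<String, R> results = new LinkedHashMap<>();
    for (Map.Entry<String, SumProtocol> e : endpointsByRegion.entrySet()) {
      results.put(e.getKey(), call.apply(e.getValue()));
    }
    return results;
  }
}

class EndpointSketchDemo {
  /** Column aggregation: each region sums locally, the client adds the partial sums. */
  static long totalAcrossRegions() {
    Map<String, List<Long>> region1 = new LinkedHashMap<>();
    region1.put("c", Arrays.asList(1L, 2L));
    Map<String, List<Long>> region2 = new LinkedHashMap<>();
    region2.put("c", Arrays.asList(10L));

    SketchTable table = new SketchTable();
    table.addRegion("r1", new SumEndpoint(region1));
    table.addRegion("r2", new SumEndpoint(region2));

    Map<String, Long> partials = table.exec(p -> p.sum("c"));
    long total = 0;
    for (long partial : partials.values()) {
      total += partial;
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(totalAcrossRegions());
  }
}
```

The design point is that only the small partial results cross the client/server boundary, which is why column aggregation is a natural use case.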

Coprocessor loading
A customized coprocessor can be loaded in two different ways: by configuration, 
or by 

[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-11-19 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934039#action_12934039
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Andrew Purtell apurt...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/#review1961
---

Ship it!


Will commit after running unit tests and verifying all pass.

- Andrew





 Coprocessors: Colocate user code with regions
 -

 Key: HBASE-2001
 URL: https://issues.apache.org/jira/browse/HBASE-2001
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Mingjie Lai
 Fix For: 0.92.0

 Attachments: asm-transformations.pdf, HBase-2001-final.patch, 
 HBASE-2001-RegionObserver-2.patch, HBASE-2001-RegionObserver.patch, 
 HBASE-2001.patch.gz, packge-info.html, packge-info.html, packge-info.html


 Support user code that runs next to each region in a table. As regions 
 split and move, coprocessor code should automatically move also.
 Use classloader which looks on HDFS.
 Associate a list of classes to load with each table. Put this in HRI so it 
 inherits from table but can be changed on a per region basis (so then those 
 region specific changes can be inherited by daughters). 
 Not completely arbitrary code, should require implementation of an interface 
 with callbacks for:
 * Open
 * Close
 * Split
 * Compact
 * (Multi)get and scanner next()
 * (Multi)put
 * (Multi)delete
 Add method to HTableInterface for invoking coprocessor methods and retrieving 
 results.  
 Add methods in o.a.h.h.regionserver or subpackage which implement convenience 
 functions for coprocessor methods and consistent/controlled access to 
 internals: store access, threading, persistent and ephemeral state, scratch 
 storage, etc. 
 GitHub: https://github.com/trendmicro/hbase/tree/coprocessor
 Please see the latest attached package-info.html for updated description.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-11-17 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933068#action_12933068
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Andrew Purtell apurt...@apache.org


bq.  On 2010-11-15 16:51:18, stack wrote:
bq.   +1 on commit to TRUNK.  I think all below can be cleaned up on commit (Andrew, you going to commit?)

Stack, Yes I plan to commit the patches for HBASE-2001/HBASE-2002/HBASE-2321 
onto trunk this week. The dynamic RPC and coprocessor framework changes are 
largely independent and will go in separately to make the change history in the 
commit log more informative. We will address your comments before doing so.


- Andrew


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/#review1930
---








[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-11-15 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932285#action_12932285
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: st...@duboce.net


bq.  On 2010-10-05 23:10:58, stack wrote:
bq.   src/main/java/org/apache/hadoop/hbase/client/Action.java, line 30
bq.   http://review.cloudera.org/r/876/diff/7/?file=14158#file14158line30
bq.  
bq.   I took a look at the package-info.html.  Very nice doc.  One thought though was that the batch methods do not seem to be instrumented.  Are they?  The bulk of inserts are done by multiput now.
bq.   
bq.   Maybe link to the wiki page when you say this in package-info.html: 'implement role-based access control for HBase'
bq.   
bq.   Fix this: 'These code will be triggerd with existing...'
bq.   
bq.   BaseRegionObserver as the name of the class that implements BOTH Coprocessor and RegionObserver with sensible defaults seems off... it'd make sense as the name of an implementation of RegionObserver but not of both.  Is there a better name to give it, even BaseRegionObserverCoprocessor?  Unless BaseObserver already implements Coprocessor?
bq.   
bq.   Should this also say that methods can be new also?  '...i.e., you can specify new passed parameters and return types for a method. '
bq.   
bq.   CommandTarget is a strange name for a host of arbitrary user-designed methods.  Can we come up w/ something more telling?   Notions that come to mind are Substrate, Platform, i.e. stuff you build up on.
bq.   
bq.   Minor.. fix '...the actually implemention class running...'
bq.   
bq.   Fix this: '...How is the client side example of calling...'
bq.   
bq.   The example is missing a bit of code that would help along its illustration; a few comments would help too, but this is a minor criticism.  Not important.  I get the gist (Folks interested in CP need to start with this page; it makes grokking the code easier).
bq.   
bq.   This page would seem to indicate CPs can be chained.  Am I reading that wrong?  (See 'Load from configuration')  Over in Gary's review, he was saying one CP per region only.
bq.   
bq.   
bq.   Usually attribute names are upper-cased.  Here we have 'Coprocessor$1' (that $1 is intentional, right?)
bq.   
bq.   This functionality, if it's working, is amazing.
bq.   
bq.  
bq.  
bq.  Mingjie Lai wrote:
bq.  @stack:
bq.  I didn't realize you had posted a comment until last week, since your comments here didn't get pushed to jira, nor were emails sent to d...@hbase. 
bq.  
bq.  Thanks for your comments. I will address them very soon. But before that I'd like to finalize the name of ``CommandTarget'':
bq.  
bq.  You said, ``CommandTarget is a strange name for an host of arbitrary user-designed methods.  Can we come up w/ something more telling?   Notions that come to mind are Substrate, Platform -- i.e. stuff you build up on.''
bq.  
bq.  Some of us suggested using ``Endpoint'' instead of CommandTarget. Do you like it better? (I'm not really good at naming stuff)
bq.  
bq.  After finalizing the name, I will make the changes to both source code 
and package-info. And post a patch here.
bq.  
bq.  Thanks,
bq.  Mingjie
bq. 

I'm not good at naming either.  Endpoint seems more 'generic', less loaded 
than 'CommandTarget'.  If you fellas working with this stuff think that's a 
better name, then that's good by me.


- stack


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/#review1438
---






[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-11-15 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932292#action_12932292
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/#review1930
---

Ship it!


+1 on commit to TRUNK.  I think all below can be cleaned up on commit (Andrew, 
you going to commit?)


src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
http://review.cloudera.org/r/876/#comment6139

Check in here.  Looks like tabs?  review board reporting it as whitespace.



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
http://review.cloudera.org/r/876/#comment6140

Usually in hbase code base there are spaces around operations; e.g. around 
'+'.



src/main/java/org/apache/hadoop/hbase/client/coprocessor/ExecResult.java
http://review.cloudera.org/r/876/#comment6142

Be careful.  In hbase lines are 80 characters long normally.  Fix on commit?



src/main/java/org/apache/hadoop/hbase/client/coprocessor/ExecResult.java
http://review.cloudera.org/r/876/#comment6143

I think it's ok if these lines > 80 characters



src/main/java/org/apache/hadoop/hbase/client/coprocessor/package-info.java
http://review.cloudera.org/r/876/#comment6144

Excellent



src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java
http://review.cloudera.org/r/876/#comment6146

Lots of white space in here.


- stack








[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-11-08 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929784#action_12929784
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Mingjie Lai mjla...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/
---

(Updated 2010-11-08 14:41:41.575886)


Review request for hbase, stack, Andrew Purtell, and Jonathan Gray.


Changes
---

Changes:
- Addressed Stack's comments.
  - Renamed CommandTarget to Endpoint (still to be decided whether it's better 
or not)
  - Improved package-info. 

- Refined some of the RegionObserver method signatures, e.g., preGet(). 
- Added hooks for HTable.increment(Increment).


Summary (updated)
---

The diff actually contains two separate patches: HBASE-2001 and the one for 
(HBASE-2002 + HBASE-2321). The reason is that HBASE-2001's CommandTarget relies 
on HBASE-2002 + HBASE-2321, whose patches are still under review. I have to 
include Gary's HBASE-2002 and HBASE-2321 in this diff, since Review Board is so 
powerful :) that it does not allow my diff to be based on a patch that has not 
been checked in. 

Eventually the patch here should be committed after 2002 and 2321. I will make 
another patch after they are checked in. 

Both HBASE-2001 and the dynamic RPC work are quite big patches; together they 
total more than 7k lines. I went back and forth but still don't have a good 
way to split the patch to reduce the review pain, so for now I'm posting the 
whole patch for all three issues. Here is the list of files that relate only 
to coprocessors:

src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java
src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java
src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java
src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorException.java
src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java
src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java
src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationEndpoint.java
src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationProtocol.java
src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java


==

(Here is a brief description. Much more detail is in package-info.java in the 
diff. I have also posted package-info.html to 
https://issues.apache.org/jira/browse/HBASE-2001 as an attachment.)


Coprocessors are code that runs in-process on each region server. Regions 
contain references to the coprocessor implementation classes associated with 
them. Coprocessor classes will be loaded either from local jars on the region 
server's classpath or via the HDFS classloader.

Several types of coprocessor are provided to give enough flexibility 
for potential use cases. Right now there are:

* Coprocessor: provides region lifecycle management hooks, e.g., region 
open/close/split/flush/compact operations.
* RegionObserver: provides hooks for monitoring table operations from the 
client side, such as table get/put/scan/delete, etc.
* Endpoint: provides on-demand triggers for arbitrary functions executed at 
a region. One use case is column aggregation at the region server.

Coprocessor:
A coprocessor is required to implement the Coprocessor interface so that the 
coprocessor framework can manage it internally.

Another design goal of this interface is to provide simple features for making 
coprocessors useful, while exposing no more internal state or control actions 
of the region server than necessary and not exposing them directly. 

RegionObserver:
If the coprocessor implements the RegionObserver interface, it can observe and 
mediate client actions on the region. 

Endpoint:
Coprocessor and RegionObserver provide hooks for injecting user code 
that runs at each region. This code is triggered by existing HTable and 
HBaseAdmin operations at certain hook points.

Through Endpoint and the dynamic RPC protocol, you can define your own 
interface between client and region server, i.e., you can create a new 
method and specify its parameters and return type. The new 
Endpoint methods can be triggered by calling client-side 

[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-10-25 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924653#action_12924653
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Andrew Purtell apurt...@apache.org


bq.  On 2010-10-25 06:49:15, Himanshu Vashishtha wrote:
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java, line 343
bq.   http://review.cloudera.org/r/876/diff/7/?file=14190#file14190line343
bq.  
bq.   What is its purpose here? I couldn't see it being used as of now. Is it for some future functionality?

The access controller coprocessor (HBASE-3025) needs a CatalogTracker. Other 
future functionality is also considered.


- Andrew


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/#review1646
---








[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-10-24 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924421#action_12924421
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Himanshu Vashishtha vashishth...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/#review1639
---



src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java
http://review.cloudera.org/r/876/#comment5505

Is this list for scalability? I am not able to visualise a scenario 
where a Cp has a reference to more than one table. A region has only one 
CpHost instance, and though it can hold multiple CpImpls, each of them will 
have its own Environment instance (and thus its own HTable instance). (I might 
be missing something, though :) )


- Himanshu








[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-10-03 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917301#action_12917301
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Lars Francke lars.fran...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/#review1380
---


Sorry for all the whitespace comments :)
There are a bunch more in the test classes.


src/main/java/org/apache/hadoop/hbase/HServerInfo.java
http://review.cloudera.org/r/876/#comment4581

The ternary operator does not need braces.



src/main/java/org/apache/hadoop/hbase/client/Action.java
http://review.cloudera.org/r/876/#comment4582

The ternary operator does not need braces.



src/main/java/org/apache/hadoop/hbase/client/Batch.java
http://review.cloudera.org/r/876/#comment4583

Remove extra character(s)



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
http://review.cloudera.org/r/876/#comment4586

Should be of type List<R>, not ArrayList<R>
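
For readers less familiar with the convention behind this comment, a minimal sketch follows: the variable and return type are declared by the interface, and the concrete class appears only at construction. ListTypeDemo and emptyResults are hypothetical names, not code from the patch.

```java
import java.util.ArrayList;
import java.util.List;

class ListTypeDemo {
  /** Returns an empty, growable list whose backing array is presized for expectedSize items. */
  static <R> List<R> emptyResults(int expectedSize) {
    // Declared as List<R>; ArrayList is named only here, so callers never depend on it.
    List<R> results = new ArrayList<R>(expectedSize);
    return results;
  }

  public static void main(String[] args) {
    List<String> results = emptyResults(4);
    results.add("a");
    System.out.println(results.size());
  }
}
```

Note that the int constructor argument sets the initial capacity only; the list still starts with size 0.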



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
http://review.cloudera.org/r/876/#comment4585

Why is this necessary? You already set the size by using the correct 
constructor.



src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java
http://review.cloudera.org/r/876/#comment4588

Remove the public, interfaces don't need that.
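
The language rule behind this comment can be checked with a few lines of reflection. This is only a sketch: SketchTableInterface and its exec method are hypothetical stand-ins, not the real HTableInterface.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/** Hypothetical interface; the method is declared without any modifier. */
interface SketchTableInterface {
  void exec(String row);
}

class ImplicitPublicDemo {
  /** True if the unmodified interface method is nevertheless public and abstract. */
  static boolean execIsPublicAbstract() {
    try {
      Method m = SketchTableInterface.class.getDeclaredMethod("exec", String.class);
      int mods = m.getModifiers();
      return Modifier.isPublic(mods) && Modifier.isAbstract(mods);
    } catch (NoSuchMethodException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(execIsPublicAbstract());
  }
}
```

Since the compiler adds the modifiers anyway, writing `public` on interface methods is redundant, which is all the review comment asks to clean up.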



src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java
http://review.cloudera.org/r/876/#comment4589

Remove the public, interfaces don't need that.

Also, the Map has a byte[] key, so every implementor has to make sure to use 
a Map that handles this correctly.
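
The pitfall is that Java arrays use identity-based equals() and hashCode(), so a HashMap cannot find an entry through a second array with the same contents. A minimal sketch of the problem and of one fix, a sorted map with a content-based comparator, is shown below. The comparator is written out by hand so the example has no dependencies; in HBase code one would normally use the comparator from the Bytes utility class (Bytes.BYTES_COMPARATOR) instead. ByteArrayKeyDemo is a hypothetical name.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class ByteArrayKeyDemo {
  /** Unsigned lexicographic comparison of byte[] contents. */
  static final Comparator<byte[]> LEXICOGRAPHIC = new Comparator<byte[]>() {
    @Override
    public int compare(byte[] a, byte[] b) {
      int n = Math.min(a.length, b.length);
      for (int i = 0; i < n; i++) {
        int cmp = (a[i] & 0xff) - (b[i] & 0xff);
        if (cmp != 0) {
          return cmp;
        }
      }
      return a.length - b.length; // a shorter array sorts before its extensions
    }
  };

  /** Looks up an equal-content key in a HashMap; fails because arrays hash by identity. */
  static boolean hashMapFinds() {
    Map<byte[], String> m = new HashMap<>();
    m.put(new byte[] {1, 2}, "v");
    return m.containsKey(new byte[] {1, 2});
  }

  /** Same lookup in a TreeMap with a content-based comparator; succeeds. */
  static boolean treeMapFinds() {
    Map<byte[], String> m = new TreeMap<>(LEXICOGRAPHIC);
    m.put(new byte[] {1, 2}, "v");
    return m.containsKey(new byte[] {1, 2});
  }

  public static void main(String[] args) {
    System.out.println("HashMap finds key: " + hashMapFinds());
    System.out.println("TreeMap finds key: " + treeMapFinds());
  }
}
```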



src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java
http://review.cloudera.org/r/876/#comment4590

Remove the public, interfaces don't need that.



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseCommandTarget.java
http://review.cloudera.org/r/876/#comment4591

Whitespace stuff



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseCommandTarget.java
http://review.cloudera.org/r/876/#comment4592

Whitespace stuff



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseCommandTarget.java
http://review.cloudera.org/r/876/#comment4593

Whitespace stuff



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
http://review.cloudera.org/r/876/#comment4594

Whitespace stuff



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
http://review.cloudera.org/r/876/#comment4595

Whitespace stuff



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
http://review.cloudera.org/r/876/#comment4596

Inconsistent formatting



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
http://review.cloudera.org/r/876/#comment4597

Inconsistent formatting



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
http://review.cloudera.org/r/876/#comment4598

Whitespace stuff



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
http://review.cloudera.org/r/876/#comment4599

Whitespace stuff



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
http://review.cloudera.org/r/876/#comment4600

Whitespace stuff



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
http://review.cloudera.org/r/876/#comment4601

Whitespace stuff



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
http://review.cloudera.org/r/876/#comment4602

Whitespace stuff



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
http://review.cloudera.org/r/876/#comment4603

Whitespace stuff



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
http://review.cloudera.org/r/876/#comment4604

Whitespace stuff



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
http://review.cloudera.org/r/876/#comment4605

Whitespace stuff



src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java
http://review.cloudera.org/r/876/#comment4612

Remove public static final
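
Fields declared in an interface are implicitly public, static and final, which is why the explicit modifiers can go. A small reflection check, using a hypothetical SketchCoprocessor interface and a made-up PRIORITY_USER constant (neither is taken from the patch), illustrates this:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

/** Hypothetical interface; the constant is declared without any modifier. */
interface SketchCoprocessor {
  int PRIORITY_USER = 1;
}

class ImplicitConstantDemo {
  /** True if the unmodified interface field is nevertheless public static final. */
  static boolean priorityIsPublicStaticFinal() {
    try {
      Field f = SketchCoprocessor.class.getDeclaredField("PRIORITY_USER");
      int mods = f.getModifiers();
      return Modifier.isPublic(mods) && Modifier.isStatic(mods) && Modifier.isFinal(mods);
    } catch (NoSuchFieldException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(priorityIsPublicStaticFinal());
  }
}
```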



src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java
http://review.cloudera.org/r/876/#comment4606

Whitespace stuff



src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java
http://review.cloudera.org/r/876/#comment4613

Remove public



src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java
http://review.cloudera.org/r/876/#comment4614

Remove public



src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java
http://review.cloudera.org/r/876/#comment4615

Remove public



src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java

[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-09-30 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12916596#action_12916596
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Jonathan Gray jg...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/#review1366
---



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
http://review.cloudera.org/r/876/#comment4522

I would actually be -1 on having this call the get coprocessor hooks.  
Expected behavior (for me) would be that coprocessors hook into the client-side 
calls, not internal calls.  For checkAndPut, are those Gets also wrapped by 
coprocessors?

I suppose you could make an argument either way, but I'd err on the side of 
coprocessors attaching on client operations not internal ones.
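
To make the distinction concrete, here is a minimal Java sketch of hooks that attach to client operations only. Everything in it (HookBoundarySketch, internalGet, getHookCalls) is a hypothetical stand-in and is not code from the patch; it only assumes that checkAndPut performs an internal read before writing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a region; not the HRegion code under review.
class HookBoundarySketch {
  private final Map<String, String> data = new HashMap<>();
  int getHookCalls = 0;   // counts how often the (hypothetical) get hook fired

  // Client-facing entry point: the hook fires here.
  String get(String row) {
    getHookCalls++;
    return internalGet(row);
  }

  // Internal read path, shared by get() and checkAndPut(); never fires the hook.
  private String internalGet(String row) {
    return data.get(row);
  }

  void put(String row, String value) { data.put(row, value); }

  // The read done by checkAndPut is internal, so it bypasses the get hook.
  boolean checkAndPut(String row, String expected, String value) {
    String current = internalGet(row);
    boolean matches = (expected == null) ? current == null : expected.equals(current);
    if (matches) data.put(row, value);
    return matches;
  }

  public static void main(String[] args) {
    HookBoundarySketch region = new HookBoundarySketch();
    region.put("r1", "a");
    region.checkAndPut("r1", "a", "b");
    System.out.println(region.get("r1") + " " + region.getHookCalls); // b 1
  }
}
```

With this layout a coprocessor sees exactly one get per client Get, regardless of how many internal reads the region performs on its own behalf.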


- Jonathan





 Coprocessors: Colocate user code with regions
 -

 Key: HBASE-2001
 URL: https://issues.apache.org/jira/browse/HBASE-2001
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Attachments: asm-transformations.pdf, 
 HBASE-2001-RegionObserver-2.patch, HBASE-2001-RegionObserver.patch, 
 HBASE-2001.patch.gz, packge-info.html


 Support user code that runs next to each region in a table. As regions 
 split and move, coprocessor code should automatically move also.
 Use classloader which looks on HDFS.
 Associate a list of classes to load with each table. Put this in HRI so it 
 inherits from table but can be changed on a per region basis (so then those 
 region specific changes can be inherited by daughters). 
 Not completely arbitrary code, should require implementation of an interface 
 with callbacks for:
 * Open
 * Close
 * Split
 * Compact
 * (Multi)get and scanner next()
 * (Multi)put
 * (Multi)delete
 Add method to HRegionInterface for invoking coprocessor methods and 
 retrieving results.  
 Add methods in o.a.h.h.regionserver or subpackage which implement convenience 
 functions for coprocessor methods and consistent/controlled access to 
 internals: store access, threading, persistent and ephemeral state, scratch 
 storage, etc. 
 GitHub: http://github.com/apurtell/hbase-coprocessor

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-09-30 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12916760#action_12916760
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Mingjie Lai mjla...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/
---

(Updated 2010-09-30 19:18:18.297399)


Review request for hbase.


Changes
---

Fixes by review board comments


Summary
---


The diff actually contains 2 separate patches: HBase-2001 and the one for 
(HBASE-2002+HBASE-2321). The reason is that HBase-2001's CommandTarget relies 
on HBASE-2002 + HBASE-2321, whose patches are still under review. I have to 
include Gary's HBASE-2002 and HBASE-2321 with this diff, since reviewboard is 
so powerful :) that it does not allow my diff to be based on a patch that is 
not yet checked in. 

Both HBase-2001 and the dynamic RPC stuff are quite big patches. The total is 
more than 7k lines. I went back and forth, but still don't have a good idea 
for how to split the patch to reduce the review pain. So for now I'm putting 
up the whole patch for all 3 issues. Here is the list of files which are 
related only to coprocessors:

src/main/java/org/apache/hadoop/hbase/coprocessor/BaseCommandTarget.java
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java
src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java
src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorException.java
src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java
src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
src/main/resources/hbase-default.xml
src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassloading.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCommandTarget.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java


==

(Here is a brief description. Much more detail is in package-info.java in the 
diff. I also posted package-info.html to 
https://issues.apache.org/jira/browse/HBASE-2001 as an attachment.)


Coprocessors are code that runs in-process on each region server. Regions 
contain references to the coprocessor implementation classes associated with 
them. Coprocessor classes will be loaded either from local jars on the region 
server's classpath or via the HDFS classloader.

Several types of coprocessors are provided to give sufficient flexibility 
for potential use cases. Right now there are:

* Coprocessor: provides region lifecycle management hooks, e.g., region 
open/close/split/flush/compact operations.
* RegionObserver: provides hooks for monitoring table operations from the 
client side, such as table get/put/scan/delete, etc.
* CommandTarget: provides on-demand triggers for any arbitrary function 
executed at a region. One use case is column aggregation at the region server.

Coprocessor:
A coprocessor is required to implement the Coprocessor interface so that the 
coprocessor framework can manage it internally.

Another design goal of this interface is to provide simple features for making 
coprocessors useful, while exposing no more internal state or control actions 
of the region server than necessary and not exposing them directly. 

RegionObserver:
If the coprocessor implements the RegionObserver interface it can observe and 
mediate client actions on the region. 

CommandTarget:
Coprocessor and RegionObserver provide certain hooks for injecting user code 
running at each region. This code is triggered by existing HTable and 
HBaseAdmin operations at certain hook points.

Through CommandTarget and the dynamic RPC protocol, you can define your own 
interface between client and region server, i.e., you can specify 
new parameters and return types for a method. The new CommandTarget 
methods can be triggered by calling client-side dynamic RPC functions -- 
HTable.exec(...). 
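
As a rough illustration of the column aggregation use case, the Java sketch below models a user-defined protocol that is executed once per region, with the client combining the partial results. SumProtocol, RegionSum and execOnRegions are hypothetical names chosen for this example; they are not the CommandTarget or HTable.exec(...) signatures from the patch, and no RPC is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical user-defined protocol that client and region side agree on.
interface SumProtocol {
  long sum();
}

// Hypothetical region-side implementation: aggregates only this region's values.
class RegionSum implements SumProtocol {
  private final List<Long> values;
  RegionSum(List<Long> values) { this.values = values; }

  @Override public long sum() {
    long total = 0;
    for (long v : values) total += v;
    return total;
  }
}

class ExecSketch {
  // Stand-in for a client-side exec call: invokes the call once per region
  // and collects the partial results in region order.
  static <P, R> List<R> execOnRegions(List<P> regions, Function<P, R> call) {
    List<R> results = new ArrayList<>();
    for (P region : regions) results.add(call.apply(region));
    return results;
  }

  public static void main(String[] args) {
    List<SumProtocol> regions = Arrays.asList(
        new RegionSum(Arrays.asList(1L, 2L)),
        new RegionSum(Arrays.asList(10L)));
    long total = 0;
    for (long partial : execOnRegions(regions, SumProtocol::sum)) total += partial;
    System.out.println(total); // 13
  }
}
```

The point of the design is that only one number per region crosses the wire, rather than every cell in the column.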

Coprocessor loading:
A customized coprocessor can be loaded in two different ways: by configuration, 
or by HTableDescriptor for a newly created table.

(Currently we don't really have an on-demand coprocessor loading mechanism for 
opened regions. However, it should be easy to create a dedicated CommandTarget 
for coprocessor loading.) 


This addresses bug HBase-2001.

[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-09-01 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12905385#action_12905385
 ] 

Jonathan Gray commented on HBASE-2001:
--

+1 on having pre and post for everything.  More clear naming too.
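
A minimal Java sketch of what a pre/post pair for a single operation could look like. The names PrePostObserver, preGet, postGet and HookedStore are hypothetical and are not taken from the RegionObserver patch; the veto-by-returning-false convention is likewise only an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical observer interface: one pre and one post hook per operation.
interface PrePostObserver {
  // Called before the read; returning false vetoes the operation.
  default boolean preGet(String row) { return true; }
  // Called after the read; may replace the value handed back to the caller.
  default String postGet(String row, String value) { return value; }
}

// Hypothetical stand-in for a region: a map guarded by a chain of observers.
class HookedStore {
  private final Map<String, String> data = new HashMap<>();
  private final List<PrePostObserver> observers = new ArrayList<>();

  void register(PrePostObserver o) { observers.add(o); }
  void put(String row, String value) { data.put(row, value); }

  String get(String row) {
    for (PrePostObserver o : observers) {
      if (!o.preGet(row)) return null;   // vetoed before the read happens
    }
    String value = data.get(row);
    for (PrePostObserver o : observers) {
      value = o.postGet(row, value);
    }
    return value;
  }
}

class PrePostDemo {
  public static void main(String[] args) {
    HookedStore store = new HookedStore();
    store.put("r1", "v1");
    store.put("secret", "x");
    store.register(new PrePostObserver() {
      @Override public boolean preGet(String row) { return !row.equals("secret"); }
      @Override public String postGet(String row, String value) {
        return value == null ? null : value.toUpperCase();
      }
    });
    System.out.println(store.get("r1"));      // V1
    System.out.println(store.get("secret"));  // null
  }
}
```

Having both hooks for every operation lets an observer either short-circuit work before it happens or adjust the result afterwards, and the pre/post prefix says which one a method is.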

 Coprocessors: Colocate user code with regions
 -

 Key: HBASE-2001
 URL: https://issues.apache.org/jira/browse/HBASE-2001
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Attachments: asm-3.2-bin.zip, asm-transformations.pdf, 
 HBASE-2001-RegionObserver-2.patch, HBASE-2001-RegionObserver.patch, 
 HBASE-2001.patch.gz





[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-06-03 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12875228#action_12875228
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Jonathan Gray jg...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/96/#review124
---



src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java
http://review.hbase.org/r/96/#comment698

What is purpose of this?



src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java
http://review.hbase.org/r/96/#comment694

After HBASE-2375 goes in, split decisions will no longer be made 
post-compaction as they are now.  That decision will actually be made at flush 
time, most likely post-flush, though we'll know at the start whether it will 
end up needing to split.



src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
http://review.hbase.org/r/96/#comment695

So a coprocessor implementation would potentially implement Coprocessor and 
RegionObserver?  Notifications of higher level events happen through 
Coprocessor, this is for lower level hooks?  Maybe a bit more detail in class 
comment to describe difference between the two interfaces.



src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
http://review.hbase.org/r/96/#comment701

This makes sense now reading the rest of the code.  But it seems that the 
Coprocessor is in fact the observer that just gets notified of actions while 
this observer is actually the processor that can manipulate stuff?



src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
http://review.hbase.org/r/96/#comment696

And descending timestamp



src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
http://review.hbase.org/r/96/#comment697

This is great javadoc



src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
http://review.hbase.org/r/96/#comment699

Gets are called after the Get is performed, Puts and Deletes are called 
before, correct?

Would there be a use case for pre-Get hook?  Just wondering.



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
http://review.hbase.org/r/96/#comment700

Javadoc says it's called after the split happens but before report to 
master.  Seems that this happens once we create the new HRegions but before we 
actually do the swap.  What exactly would/could a coprocessor be doing in this 
window?

One thing to be aware of is the master changes coming are going to make a 
split run entirely on the RS including the edits to META, closing of the 
parent, and opening of the children.  Where in that process would this hook 
make sense?


- Jonathan





 Coprocessors: Colocate user code with regions
 -

 Key: HBASE-2001
 URL: https://issues.apache.org/jira/browse/HBASE-2001
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Attachments: asm-3.2-bin.zip, asm-transformations.pdf, 
 HBASE-2001-RegionObserver.patch, HBASE-2001.patch.gz





[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-06-03 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12875474#action_12875474
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Andrew Purtell apurt...@apache.org


bq.  On 2010-06-02 17:24:34, Todd Lipcon wrote:
bq.   this seems like a reasonable framework, but I'd rather see this stay 
around as a branch until there is at least one or two actual things using it 
for a real purpose. Otherwise I think we'll end up shipping an API that we 
later realize doesn't work for real apps. What do you think?
bq.  
bq.  stack wrote:
bq.  This seems like a good idea when dev'ing an Interface.  After the 3rd 
implementation you'll have some confidence in your Interface.

Ok, I'll take all the comments below and incorporate the feedback in a new 
version and commit it on a feature branch that will track trunk. I can use git 
to manage the merging and just push snapshots into SVN. Will set up a project 
on our Hudson to crunch tests for it.


- Andrew


---
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/96/#review121
---





 Coprocessors: Colocate user code with regions
 -

 Key: HBASE-2001
 URL: https://issues.apache.org/jira/browse/HBASE-2001
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Attachments: asm-3.2-bin.zip, asm-transformations.pdf, 
 HBASE-2001-RegionObserver.patch, HBASE-2001.patch.gz





[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-06-03 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12875475#action_12875475
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Andrew Purtell apurt...@apache.org


bq.  On 2010-06-02 17:24:34, Todd Lipcon wrote:
bq.   src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassloading.java, 
line 40
bq.   http://review.hbase.org/r/96/diff/4/?file=898#file898line40
bq.  
bq.   woah, can we add something to the build to build this jar as a test 
resource, or something?

Yes.


bq.  On 2010-06-02 17:24:34, Todd Lipcon wrote:
bq.   
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java,
 line 43
bq.   http://review.hbase.org/r/96/diff/4/?file=899#file899line43
bq.  
bq.   you could use mockito verification to do this, probably would be 
simpler

Thanks, this would be a good opportunity to learn mockito.


bq.  On 2010-06-02 17:24:34, Todd Lipcon wrote:
bq.   src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java, line 
149
bq.   http://review.hbase.org/r/96/diff/4/?file=892#file892line149
bq.  
bq.   this map-like interface is somewhat confusing - what's the purpose 
of it?

This is an environment space, like unix process env vars, shared among all 
threads of the coprocessor (which get a reference to the environment). Useful 
for the mapreduce stuff not included in this patch. For example, rather than 
sum or average using intermediates, update AtomicLongs instead.
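
A small Java sketch of that idea: an environment shared by all threads of a coprocessor, with a running sum kept in an AtomicLong instead of per-thread intermediates. SharedEnv and its methods are hypothetical and are not the map-like interface from Coprocessor.java; only ConcurrentHashMap and AtomicLong are real JDK classes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical environment: a shared key/value space, one per coprocessor.
class SharedEnv {
  private final ConcurrentMap<String, Object> vars = new ConcurrentHashMap<>();

  Object get(String key) { return vars.get(key); }

  // Returns the value already bound to the key, or binds and returns the new one.
  Object putIfAbsent(String key, Object value) {
    Object prev = vars.putIfAbsent(key, value);
    return prev != null ? prev : value;
  }
}

class SharedEnvDemo {
  // Each worker adds 1..perThread to the shared counter found in the environment.
  static long sumInParallel(SharedEnv env, int threads, int perThread) {
    AtomicLong total = (AtomicLong) env.putIfAbsent("sum", new AtomicLong());
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      workers[t] = new Thread(() -> {
        AtomicLong shared = (AtomicLong) env.get("sum");
        for (int i = 1; i <= perThread; i++) shared.addAndGet(i);
      });
      workers[t].start();
    }
    for (Thread w : workers) {
      try {
        w.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for workers", e);
      }
    }
    return total.get();
  }

  public static void main(String[] args) {
    System.out.println(sumInParallel(new SharedEnv(), 4, 100)); // 4 * 5050 = 20200
  }
}
```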


- Andrew


---
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/96/#review121
---





 Coprocessors: Colocate user code with regions
 -

 Key: HBASE-2001
 URL: https://issues.apache.org/jira/browse/HBASE-2001
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Attachments: asm-3.2-bin.zip, asm-transformations.pdf, 
 HBASE-2001-RegionObserver.patch, HBASE-2001.patch.gz





[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-06-03 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12875476#action_12875476
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Andrew Purtell apurt...@apache.org


bq.  On 2010-06-02 23:28:50, stack wrote:
bq.   src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java, line 
140
bq.   http://review.hbase.org/r/96/diff/4/?file=892#file892line140
bq.  
bq.   Which table is this?

Any table. The idea is the coprocessor can create any private tables it needs 
to implement its functionality.


bq.  On 2010-06-02 23:28:50, stack wrote:
bq.   src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java, line 
27
bq.   http://review.hbase.org/r/96/diff/4/?file=892#file892line27
bq.  
bq.   What about unloading?  You remember that conversation up on irc of 
how loading is one thing but unloading w/o breakage is hard prob.

I'm not trying to tackle unloading yet.

However, classes are strongly bound to their classloader. We instantiate a 
new classloader each time we load a coprocessor and we don't hold a reference 
to the classloader. It is my understanding that when there are no more 
references to the classes (no live objects), they and the classloader will be 
garbage collected; the JVM spec does not guarantee this, but the Sun JVM will 
do it. Creating a new classloader and asking for the class again, presumably 
from an updated jar, should load the new class -- a unit test can verify. 

To help ensure old classes don't leak via live objects hanging around, we could 
consider a cooperative lifecycle management scheme like that used by OSGi: 
http://en.wikipedia.org/wiki/OSGi. 


bq.  On 2010-06-02 23:28:50, stack wrote:
bq.   src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java, line 
150
bq.   http://review.hbase.org/r/96/diff/4/?file=892#file892line150
bq.  
bq.   Needs to be Writable?

No.


bq.  On 2010-06-02 23:28:50, stack wrote:
bq.   src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java, 
line 29
bq.   http://review.hbase.org/r/96/diff/4/?file=893#file893line29
bq.  
bq.   This is a mixin you'd use if you want to be notified about 
compactions, etc.?

This is for translating values found in a flush file into new values in the new 
storefile being built by the compaction, or for dropping values. 
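
Roughly, in Java, and assuming a hook that sees one value at a time and signals "drop" by returning null (an assumption made for this sketch, with plain strings standing in for KeyValues; CompactionSketch is a hypothetical name, not code from the patch):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

class CompactionSketch {
  // Applies the hook to each value read from the input files.
  // A null result means "drop this value"; anything else goes to the output.
  static List<String> compact(List<String> input, UnaryOperator<String> hook) {
    List<String> output = new ArrayList<>();
    for (String value : input) {
      String rewritten = hook.apply(value);
      if (rewritten != null) output.add(rewritten);
    }
    return output;
  }

  public static void main(String[] args) {
    List<String> flushed = Arrays.asList("a", "tmp:b", "c");
    // Drop values marked as temporary, upper-case the rest.
    List<String> result = compact(flushed, v -> v.startsWith("tmp:") ? null : v.toUpperCase());
    System.out.println(result); // [A, C]
  }
}
```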


bq.  On 2010-06-02 23:28:50, stack wrote:
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java, 
line 85
bq.   http://review.hbase.org/r/96/diff/4/?file=894#file894line85
bq.  
bq.   What would this be used for?  For CP to call out elsewhere on a 
table?

The idea is the coprocessor can create any private tables it needs to implement 
its functionality. But we want to mediate that, add access control, clean up 
references when/if the cp is terminated (and perhaps unloaded).


bq.  On 2010-06-02 23:28:50, stack wrote:
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java, 
line 237
bq.   http://review.hbase.org/r/96/diff/4/?file=894#file894line237
bq.  
bq.   Is this needed?

Why not.


bq.  On 2010-06-02 23:28:50, stack wrote:
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 245
bq.   http://review.hbase.org/r/96/diff/4/?file=895#file895line245
bq.  
bq.   Its always on then?

Yes, otherwise I have to wrap all calls to the cp host in HRegion with if 
(coprocessorHost != null) then, including the inner loops of the major and 
minor compactors. 
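
The trade-off can be sketched in Java as follows. AlwaysOnHost, PutListener and SketchRegion are hypothetical stand-ins, not CoprocessorHost or HRegion; the sketch only assumes that a host with nothing loaded iterates over an empty list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback type; not the interface from the patch.
interface PutListener {
  void onPut(String row);
}

// Always constructed, even when nothing is loaded: with an empty list
// every call is a no-op.
class AlwaysOnHost {
  private final List<PutListener> listeners = new ArrayList<>();

  void load(PutListener l) { listeners.add(l); }

  void firePut(String row) {
    for (PutListener l : listeners) l.onPut(row);
  }
}

class SketchRegion {
  // Never null, so call sites need no guard.
  private final AlwaysOnHost host = new AlwaysOnHost();
  private int puts = 0;

  AlwaysOnHost host() { return host; }
  int puts() { return puts; }

  void put(String row) {
    host.firePut(row);   // no "if (host != null)" needed
    puts++;
  }
}

class AlwaysOnDemo {
  public static void main(String[] args) {
    SketchRegion region = new SketchRegion();
    region.put("r1");                 // nothing loaded: the hook call does nothing
    List<String> seen = new ArrayList<>();
    region.host().load(seen::add);
    region.put("r2");
    System.out.println(seen + " " + region.puts()); // [r2] 2
  }
}
```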


bq.  On 2010-06-02 23:28:50, stack wrote:
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 
2885
bq.   http://review.hbase.org/r/96/diff/4/?file=895#file895line2885
bq.  
bq.   Who would want to get at this?

Tests. So probably this does not have to be public.


- Andrew


---
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/96/#review123
---





 Coprocessors: Colocate user code with regions
 -

 Key: HBASE-2001
 URL: https://issues.apache.org/jira/browse/HBASE-2001
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Attachments: asm-3.2-bin.zip, asm-transformations.pdf, 
 HBASE-2001-RegionObserver.patch, HBASE-2001.patch.gz



[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-05-31 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12873885#action_12873885
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Andrew Purtell apurt...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/96/
---

(Updated 2010-05-31 22:30:41.618893)


Review request for hbase.


Summary
---

This patch contains the parts of the HBASE-2001 patch which implement support 
for the RegionObserver interface. This enables extension of the regionserver 
through stacking dynamically loaded classes, i.e. from jars on HDFS, onto 
upcalls from HRegion. I made some improvements over the other patch and added 
a test case. There are other parts of 2001 which need some thought and some 
work and would not be useful without client side support. This is the part 
which could be immediately useful. 

Submitted for feedback. 

Incorporates a user suggestion and Stack +1 about hooking compaction.


This addresses bug HBASE-2001.
http://issues.apache.org/jira/browse/HBASE-2001


Diffs
-

  src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 2413e98 
  src/main/java/org/apache/hadoop/hbase/regionserver/MinorCompactingStoreScanner.java 71f738e 
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 515b42f 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassloading.java PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java PRE-CREATION 

Diff: http://review.hbase.org/r/96/diff


Testing
---

All the new unit tests plus TestHRegion pass locally.


Thanks,

Andrew




 Coprocessors: Colocate user code with regions
 -

 Key: HBASE-2001
 URL: https://issues.apache.org/jira/browse/HBASE-2001
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Attachments: asm-3.2-bin.zip, asm-transformations.pdf, 
 HBASE-2001-RegionObserver.patch, HBASE-2001.patch.gz





[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-05-31 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12873887#action_12873887
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Andrew Purtell apurt...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/96/
---

(Updated 2010-05-31 22:47:25.650165)


Review request for hbase.


Summary
---

This patch contains the parts of the HBASE-2001 patch which implement support 
for the RegionObserver interface. This enables extension of the regionserver 
through stacking dynamically loaded classes, i.e. from jars on HDFS, onto 
upcalls from HRegion. I made some improvements over the other patch and added 
a test case. There are other parts of 2001 which need some thought and some 
work and would not be useful without client side support. This is the part 
which could be immediately useful. 

Submitted for feedback. 

Incorporates a user suggestion and Stack +1 about hooking compaction.


This addresses bug HBASE-2001.
http://issues.apache.org/jira/browse/HBASE-2001


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 2413e98 
  src/main/java/org/apache/hadoop/hbase/regionserver/MinorCompactingStoreScanner.java 71f738e 
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 515b42f 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassloading.java PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java PRE-CREATION 

Diff: http://review.hbase.org/r/96/diff


Testing
---

All the new unit tests plus TestHRegion pass locally.


Thanks,

Andrew




 Coprocessors: Colocate user code with regions
 -

 Key: HBASE-2001
 URL: https://issues.apache.org/jira/browse/HBASE-2001
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Attachments: asm-3.2-bin.zip, asm-transformations.pdf, 
 HBASE-2001-RegionObserver.patch, HBASE-2001.patch.gz

