> On 2010-09-25 16:18:25, Andrew Purtell wrote:
> > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1400
> > <http://review.cloudera.org/r/816/diff/5/?file=12533#file12533line1400>
> >
> >     Maybe for sake of clarity call this getStartKeysInRange?
> 
> Gary Helmling wrote:
>     Makes sense as well, will rename.
> 
> Himanshu Vashishtha wrote:
>     Gary: I meant this use case (might be irrelevant, considering scope of 
> cp):
>     If I want to know count of number of rows in a range (Row A, Row B): with 
> the help of this method, I can get the starting row of all regions that are 
> in this range but for the first (where you take the starting row of the 
> range: Row A in this case).
>     So, when the processing is done in the last region, one shd be aware of 
> the last row in the range, Row B. This processing will be done in the cp 
> impl, but that impl will be same on all region servers, so that check will be 
> there for all regions (no?). 
>     It is entirely possible  that this use case is not the one to be 
> supported by cp; or as I haven't really looked at mlai's code thoroughly yet, 
> might be missing something obvious.

A key point to understand in the HTable.exec() calls is that List<Row> and 
RowRange arguments are _only_ used to locate the regions against which we'll 
invoke the CoprocessorProtocol method.  Since coprocessors run in place 
per-region, we need to somehow indicate the region/coprocessor instances where 
the method should be invoked.

Note that the List<Row> or RowRange that were passed to HTable.exec() are _not_ 
made directly available to the CoprocessorProtocol method invoked, and couldn't 
easily be passed without a different approach to the framework.

Of course, if the CoprocessorProtocol method needs a certain row restriction to 
operate, you could just make it a parameter to the method -- sum(RowRange) for 
example.  But that is up to the CoprocessorProtocol implementor.

But I think this raises a good question: should the HTable interface use rows 
to identify the regions (the current methods)
   exec(Class protocol, List<Row> rows, Call method)
   exec(Class protocol, RowRange range, Call method)

or would it be better to identify the regions directly using region name, or 
HRegionInfo, etc
   exec(Class protocol, List<byte[]> regionNames, Call method)


Anyone have strong opinions here?  My thought was that using rows was a bit 
more consistent with other client calls, but maybe it raises the wrong 
expectations.  For the moment I'll re-examine the javadoc to see if I can make 
this clearer, but I'd appreciate other thoughts.


- Gary


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/816/#review1336
-----------------------------------------------------------


On 2010-09-30 17:13:36, Gary Helmling wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://review.cloudera.org/r/816/
> -----------------------------------------------------------
> 
> (Updated 2010-09-30 17:13:36)
> 
> 
> Review request for hbase, Andrew Purtell and Jonathan Gray.
> 
> 
> Summary
> -------
> 
> This is really two separate patches in one, though with some overlapping 
> changes.  If necessary I can split them apart for separate review.  Please 
> let me know if that would make review easier.
> 
> Part 1:
> ==============
> Port over of HADOOP-6422 to the HBase RPC code.  The goal of this change is 
> to allow alternate RPC client/server implementations to be enabled through a 
> simple configuration change.  Ultimately I would like to use this to allow 
> secure RPC to be enabled through configuration, while not blocking normal 
> (current) RPC operation on non-secure Hadoop versions.
> 
> This portion of the patch abstracts out two interfaces from the RPC code:
> 
> RpcEngine: HBaseRPC uses this to obtain proxy instances for client calls and 
> server instances for HMaster and HRegionServer
> RpcServer: this allows differing RPC server implementations, breaking the 
> dependency on HBaseServer
> 
> The bulk of the current code from HBaseRPC is moved into WritableRpcEngine 
> and is unchanged other than the interface requirements.  So the current call 
> path remains the same, other than the HBaseRPC.getProtocolEngine() 
> abstraction.
> 
> 
> Part 2:
> ===============
> The remaining changes provide server-side hooks for registering new RPC 
> protocols/handlers (per-region to support coprocessors), and client side 
> hooks to support dynamic execution of the registered protocols.
> 
> The new RPC protocol actions are constrained to 
> org.apache.hadoop.hbase.ipc.CoprocessorProtocol implementations (which 
> extends VersionedProtocol) to prevent arbitrary execution of methods against 
> HMasterInterface, HRegionInterface, etc.
> 
> For protocol handler registration, HRegionServer provides a new method:
> 
>   public <T extends CoprocessorProtocol> boolean registerProtocol(
>       byte[] region, Class<T> protocol, T handler)
> 
> which builds a Map of region name to protocol instances for dispatching 
> client calls.
> 
> 
> Client invocations are performed through HTable, which adds the following 
> methods:
> 
> 
>   public <T extends CoprocessorProtocol> T proxy(Class<T> protocol, Row row)
> 
> This directly returns a proxy instance to the CoprocessorProtocol 
> implementation registered for the region serving row "row".  Any method calls 
> will be proxied to the region's server and invoked using the map of 
> registered region name -> handler instances.
> 
> Calls directed against multiple rows are a bit more complicated.  They are 
> supported with the methods:
> 
>   public <T extends CoprocessorProtocol, R> void exec(
>       Class<T> protocol, List<? extends Row> rows,
>       BatchCall<T,R> callable, BatchCallback<R> callback)
> 
>   public <T extends CoprocessorProtocol, R> void exec(
>       Class<T> protocol, RowRange range,
>       BatchCall<T,R> callable, BatchCallback<R> callback)
> 
> where BatchCall and BatchCallback are simple interfaces defining the methods 
> to be called and a callback instance to be invoked for each result.
> 
> For the sample CoprocessorProtocol interface:
> 
>   interface PingProtocol extends CoprocessorProtocol {
>     public String ping();
>     public String hello(String name);
>   }
> 
> a client invocation might look like:
> 
>     final Map<byte[],R> results = new TreeMap<byte[],R>(...)
>     List<Row> rows = ...
>     table.exec(PingProtocol.class, rows,
>         new HTable.BatchCall<PingProtocol,String>() {
>           public String call(PingProtocol instance) {
>             return instance.ping();
>           }
>         },
>         new BatchCallback<R>(){
>           public void update(byte[] region, byte[] row, R value) {
>             results.put(region, value);
>           }
>         });
> 
> The BatchCall.call() method will be invoked for each row in the passed in 
> list, and the BatchCallback.update() method will be invoked for each return 
> value.  However, currently the PingProtocol.ping() invocation will result in 
> a separate RPC call per row, which is less that ideal.
> 
> Support is in place to make use of the HRegionServer.multi() invocations for 
> batched RPC (see the org.apache.hadoop.hbase.client.Exec class), but this 
> does not mesh well with the current client-side interface.
> 
> In addition to standard code review, I'd appreciate any thoughts on the 
> client interactions in particular, and whether they would meet some of the 
> anticipated uses of coprocessors.
> 
> 
> This addresses bugs HBASE-2002 and HBASE-2321.
>     http://issues.apache.org/jira/browse/HBASE-2002
>     http://issues.apache.org/jira/browse/HBASE-2321
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/hadoop/hbase/client/Action.java 556ea81 
>   src/main/java/org/apache/hadoop/hbase/client/Batch.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/client/Exec.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/client/ExecResult.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/client/HConnection.java 65f7618 
>   src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 
> fbdec0b 
>   src/main/java/org/apache/hadoop/hbase/client/HTable.java 0dbf263 
>   src/main/java/org/apache/hadoop/hbase/client/MultiAction.java c6ea838 
>   src/main/java/org/apache/hadoop/hbase/client/MultiResponse.java 91bd04b 
>   src/main/java/org/apache/hadoop/hbase/client/RowRange.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/client/Scan.java 29b3cb0 
>   src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 83f623d 
>   src/main/java/org/apache/hadoop/hbase/ipc/ConnectionHeader.java 
> PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java 
> PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/ExecRPCInvoker.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 2b5eeb6 
>   src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java e23a629 
>   src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java e4c356d 
>   src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 27f9cc0 
>   src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/RpcEngine.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java 
> PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/master/HMaster.java fb1e834 
>   src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 0a4fbce 
>   src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
> 595cf2e 
>   src/main/resources/hbase-default.xml 5fafe65 
>   
> src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java
>  PRE-CREATION 
> 
> Diff: http://review.cloudera.org/r/816/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gary
> 
>

Reply via email to