> On 2010-10-04 23:09:49, stack wrote: > > src/main/java/org/apache/hadoop/hbase/client/Action.java, line 35 > > <http://review.cloudera.org/r/816/diff/9/?file=13679#file13679line35> > > > > Action is a new class so we are not breaking any pre-exisiting API here > > (Even so, erasure would reduce this API change to the old I believe > > anyways?). > > > > Gary Helmling wrote: > Action and MultiAction seem to just be internal implementation classes, > so I thought this would be a safe refactoring. > > You mean for something like a rolling restart? I believe the type > erasure + ignoring return types on method lookup (getResult return type was > parameterized so would be Object) should make this continue to work, but I > may have introduced breakage elsewhere...
Don't worry about it. Rolling restart is only guaranteed to work between patch releases. We're talking major release here. All rpc versions have been upped and 0.20.xs or even 0.89s I'd say won't work w/ 0.90.x > On 2010-10-04 23:09:49, stack wrote: > > src/main/java/org/apache/hadoop/hbase/client/Batch.java, line 50 > > <http://review.cloudera.org/r/816/diff/9/?file=13680#file13680line50> > > > > cool > > > > But, this method's name is 'returning'? > > > > So, it executes the 'method' of 'protocol' and returns the hosting > > 'Call' whose invocation has already run? > > > > Should it be 'execute' or 'executeCall' or 'invokeCall', etc. > > Gary Helmling wrote: > This just returns a Batch.Call instance, whose call() method will invoke > the specified CoprocessorProtocol method. So it returns a Batch.Call that > returns the method result. At this point the remote invocation has not yet > happened. That won't occur until down in > HConnectionManager.HConnectionImplementation.processExecs(). Reached through > passing the Batch.Call instance to HTable.exec(...). > > Yeah, "returning()" is a little generic too. I could rename to > forMethod() or callingMethod()? forExec? I don't remember if the javadoc ties this to HTable.exec... it probably does. If it don't, it should. Maybe even an 'example'? Good naming and where it is located in the hierarchy will help folks grok how to use it all. > On 2010-10-04 23:09:49, stack wrote: > > src/main/java/org/apache/hadoop/hbase/client/Batch.java, line 80 > > <http://review.cloudera.org/r/816/diff/9/?file=13680#file13680line80> > > > > Do you need the 'Batch.' prefix here? > > Gary Helmling wrote: > No, it's extraneous and inconsistent with most HBase code. I'll drop it. > I guess I was having a "static method invocations should be referenced by > class name" moment... Not important. > On 2010-10-04 23:09:49, stack wrote: > > src/main/java/org/apache/hadoop/hbase/client/Exec.java, line 35 > > <http://review.cloudera.org/r/816/diff/9/?file=13681#file13681line35> > > > > ... against a Coprocessor? > > > > Maybe add above? > > > > Gary Helmling wrote: > Yes, could rename CoprocessorExec for clarity. It's fairly generic but > there's no other usage. > > I guess in the naming here and elsewhere, I was envisioning Coprocessors > as the sort of stored procedures of HBase. A basic functionality -- execute > this user code -- with coprocessors as the implementation. So I took a > general approach to naming the client interface. It seemed to fit in with > the basic operations: Get, Scan, Put, Delete, Exec. > > But if this is overly general and confusing, I have no problem renaming > this and the other client bits with a "Coprocessor" prefix. > > Will definitely improve javadoc here as well. Ok. That explains a bunch (regards naming). Can you think of a place where Exec would be used anywhere else? It doesn't seem tied to coprocessors if I recall. You could say, in javadoc, "used for example by coprocessors" Batch though I think doesn't fit your Get, Scan, Put, Delete, Exec set because it strikes me as a Batch of Gets/Scans/Puts... which it is not. I'm bad w/ names or would suggest something. Would be a shame to give it all Coprocessor prefix if later it could be used for other things. > On 2010-10-04 23:09:49, stack wrote: > > src/main/java/org/apache/hadoop/hbase/client/Exec.java, line 65 > > <http://review.cloudera.org/r/816/diff/9/?file=13681#file13681line65> > > > > This is fair I suppose if only one coprocessor per region (Is that > > true)? > > Gary Helmling wrote: > Correct, only a single handler can be registered per CoprocessorProtocol > subclass per region. Is that how we want it to always be? Chaining processors would just be a nightmare? > On 2010-10-04 23:09:49, stack wrote: > > src/main/java/org/apache/hadoop/hbase/client/HConnection.java, line 241 > > <http://review.cloudera.org/r/816/diff/9/?file=13683#file13683line241> > > > > Want to call this out as a CP method? > > > > Gary Helmling wrote: > This is actually a refactoring of the previous processBatch() method to > accommodate Exec instances as well. > > The processExecs() method below doesn't make use of it yet, but I'd like > to incorporate that as an immediate next step for better RPC batching. Fair enough. > On 2010-10-04 23:09:49, stack wrote: > > src/main/java/org/apache/hadoop/hbase/client/Batch.java, line 37 > > <http://review.cloudera.org/r/816/diff/9/?file=13680#file13680line37> > > > > I think the fact that this class if of coprocessors needs to be > > highlighted better. Batch is a super generic name yet this Batch is only > > for CPs. A subpackage for these CP classes would be a pain would it? Any > > other way of grouping these CP classes? A prefix? Just throwing it out > > there (I'm sure you thought about it and had a reason for not going these > > routes). > > Gary Helmling wrote: > The name is I guess a remnant of these changes starting out as > overly-generic, then scaled back to coprocessors specific implementation. I > don't see any general applicability beyond that at the moment, so clearing up > the naming would probably help. Don't want to get too wordy though... > CoprocessorBatch? CoprocessorBatch would work with this prefix affixed to all related packages. Should these classes go into a client subpackage of coprocessor altogether? (Thinking on it, a subpackage of client is kinda ugly -- but maybe that'd be the way to go?) > On 2010-10-04 23:09:49, stack wrote: > > src/main/java/org/apache/hadoop/hbase/client/HConnection.java, line 252 > > <http://review.cloudera.org/r/816/diff/9/?file=13683#file13683line252> > > > > Cool > > > > So this would be for a single cooprocessor implementation. > > > > You say above that we do an rpc per row but are the rpcs run serially > > or in parallell? If parallel, thats sweet. > > Gary Helmling wrote: > They're parallelized using the existing HTable ExecutorService. > > But as an immediate next step, I would like to get them also batching > into a single RPC call per CoprocessorProtocol method invocation, per region > server. The scaffolding is already there in the (Multi)Action > parameterization and HConnection.processBatchCallback(), I just need to > coordinate the per RS batching through ExecRPCInvoker. Yeah, but when I was thinking on it, this is not nearly as critical as batching raw rows....given as how the row is just a 'region-chooser" in your CP case.... I suppose it could help in case where hundreds of regions on a single server and all have same CP hooked in. > On 2010-10-04 23:09:49, stack wrote: > > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line > > 1087 > > <http://review.cloudera.org/r/816/diff/9/?file=13684#file13684line1087> > > > > Should this be public? Its not in HConnection, is it? Or, rather, why > > is it not in HConnection? > > Gary Helmling wrote: > Should be in HConnection as well. ok > On 2010-10-04 23:09:49, stack wrote: > > src/main/java/org/apache/hadoop/hbase/client/RowRange.java, line 4 > > <http://review.cloudera.org/r/816/diff/9/?file=13689#file13689line4> > > > > You have to say something about 'inclusive'? > > Gary Helmling wrote: > I'll remove RowRange, since it was only used in a previous version of the > HTable.exec() signatures. It's nicely parallel to Row, but with only a > single implementation (Scan), seems useless. yeah, remove it then.... good. - stack ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/816/#review1414 ----------------------------------------------------------- On 2010-10-04 16:28:39, Gary Helmling wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > http://review.cloudera.org/r/816/ > ----------------------------------------------------------- > > (Updated 2010-10-04 16:28:39) > > > Review request for hbase, Andrew Purtell and Jonathan Gray. > > > Summary > ------- > > This is really two separate patches in one, though with some overlapping > changes. If necessary I can split them apart for separate review. Please > let me know if that would make review easier. > > Part 1: > ============== > Port over of HADOOP-6422 to the HBase RPC code. The goal of this change is > to allow alternate RPC client/server implementations to be enabled through a > simple configuration change. Ultimately I would like to use this to allow > secure RPC to be enabled through configuration, while not blocking normal > (current) RPC operation on non-secure Hadoop versions. > > This portion of the patch abstracts out two interfaces from the RPC code: > > RpcEngine: HBaseRPC uses this to obtain proxy instances for client calls and > server instances for HMaster and HRegionServer > RpcServer: this allows differing RPC server implementations, breaking the > dependency on HBaseServer > > The bulk of the current code from HBaseRPC is moved into WritableRpcEngine > and is unchanged other than the interface requirements. So the current call > path remains the same, other than the HBaseRPC.getProtocolEngine() > abstraction. > > > Part 2: > =============== > The remaining changes provide server-side hooks for registering new RPC > protocols/handlers (per-region to support coprocessors), and client side > hooks to support dynamic execution of the registered protocols. > > The new RPC protocol actions are constrained to > org.apache.hadoop.hbase.ipc.CoprocessorProtocol implementations (which > extends VersionedProtocol) to prevent arbitrary execution of methods against > HMasterInterface, HRegionInterface, etc. > > For protocol handler registration, HRegionServer provides a new method: > > public <T extends CoprocessorProtocol> boolean registerProtocol( > byte[] region, Class<T> protocol, T handler) > > which builds a Map of region name to protocol instances for dispatching > client calls. > > > Client invocations are performed through HTable, which adds the following > methods: > > > public <T extends CoprocessorProtocol> T proxy(Class<T> protocol, Row row) > > This directly returns a proxy instance to the CoprocessorProtocol > implementation registered for the region serving row "row". Any method calls > will be proxied to the region's server and invoked using the map of > registered region name -> handler instances. > > Calls directed against multiple rows are a bit more complicated. They are > supported with the methods: > > public <T extends CoprocessorProtocol, R> void exec( > Class<T> protocol, List<? extends Row> rows, > BatchCall<T,R> callable, BatchCallback<R> callback) > > public <T extends CoprocessorProtocol, R> void exec( > Class<T> protocol, RowRange range, > BatchCall<T,R> callable, BatchCallback<R> callback) > > where BatchCall and BatchCallback are simple interfaces defining the methods > to be called and a callback instance to be invoked for each result. > > For the sample CoprocessorProtocol interface: > > interface PingProtocol extends CoprocessorProtocol { > public String ping(); > public String hello(String name); > } > > a client invocation might look like: > > final Map<byte[],R> results = new TreeMap<byte[],R>(...) > List<Row> rows = ... > table.exec(PingProtocol.class, rows, > new HTable.BatchCall<PingProtocol,String>() { > public String call(PingProtocol instance) { > return instance.ping(); > } > }, > new BatchCallback<R>(){ > public void update(byte[] region, byte[] row, R value) { > results.put(region, value); > } > }); > > The BatchCall.call() method will be invoked for each row in the passed in > list, and the BatchCallback.update() method will be invoked for each return > value. However, currently the PingProtocol.ping() invocation will result in > a separate RPC call per row, which is less that ideal. > > Support is in place to make use of the HRegionServer.multi() invocations for > batched RPC (see the org.apache.hadoop.hbase.client.Exec class), but this > does not mesh well with the current client-side interface. > > In addition to standard code review, I'd appreciate any thoughts on the > client interactions in particular, and whether they would meet some of the > anticipated uses of coprocessors. > > > This addresses bug HBASE-2002. > http://issues.apache.org/jira/browse/HBASE-2002 > > > Diffs > ----- > > src/main/java/org/apache/hadoop/hbase/client/Action.java 556ea81 > src/main/java/org/apache/hadoop/hbase/client/Batch.java PRE-CREATION > src/main/java/org/apache/hadoop/hbase/client/Exec.java PRE-CREATION > src/main/java/org/apache/hadoop/hbase/client/ExecResult.java PRE-CREATION > src/main/java/org/apache/hadoop/hbase/client/HConnection.java 65f7618 > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java > 866ba92 > src/main/java/org/apache/hadoop/hbase/client/HTable.java 0dbf263 > src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 74593bf > src/main/java/org/apache/hadoop/hbase/client/MultiAction.java c6ea838 > src/main/java/org/apache/hadoop/hbase/client/MultiResponse.java 91bd04b > src/main/java/org/apache/hadoop/hbase/client/RowRange.java PRE-CREATION > src/main/java/org/apache/hadoop/hbase/client/Scan.java 29b3cb0 > src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 83f623d > src/main/java/org/apache/hadoop/hbase/ipc/ConnectionHeader.java > PRE-CREATION > src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java > PRE-CREATION > src/main/java/org/apache/hadoop/hbase/ipc/ExecRPCInvoker.java PRE-CREATION > src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 2b5eeb6 > src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java e23a629 > src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java e4c356d > src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java ee5dd8f > src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java PRE-CREATION > src/main/java/org/apache/hadoop/hbase/ipc/RpcEngine.java PRE-CREATION > src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java PRE-CREATION > src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java > PRE-CREATION > src/main/java/org/apache/hadoop/hbase/master/HMaster.java fb1e834 > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 0a4fbce > src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java > 89f499a > src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java d4166cf > src/main/resources/hbase-default.xml 5fafe65 > > src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java > PRE-CREATION > > Diff: http://review.cloudera.org/r/816/diff > > > Testing > ------- > > > Thanks, > > Gary > >
