On Thu, May 17, 2012 at 6:28 AM, Andrew Purtell <[email protected]> wrote:
>> Are HBase coprocessors explicitly wrong for this use case if the app
>> logic needs to access multiple regions in a single call?
>
> Not coprocessors in general. The client side support for Endpoints
> (Exec, etc.) gives the developer the fiction of addressing the cluster
> as a range of rows, and will parallelize per-region Endpoint
> invocations, and collect the responses, and can return them all to the
> caller as "a single call".

But on the deadlock problem, an Endpoint behaves the same way as an
Observer: Endpoints are also executed by the RegionServer's RPC handlers.

> However for RegionObservers, if you want to
> do something cross-region, so therefore issue one or more RPCs which
> must complete *before you can complete the RPC* you are currently
> processing, then this is inherently problematic and deadlock prone. If
> on the other hand you schedule the cross-region work with an Executor
> or similar and return on the current RPC, that would be ok.

This means that once the RPC handlers are blocked, the cluster can be
considered dead, because coprocessors are written by users and any kind
of code may end up running on the server side.

If the Executor approach is also feasible for an Endpoint, then how are
the results returned to the client that is waiting for them? Maybe the
client needs an extra loop that keeps issuing RPCs until the results are
ready. It also means that the Endpoint has to keep the results on the
server, so the Endpoint has to be stateful. That raises another question
I am unsure about: should coprocessors be stateful or stateless? What if
the client dies before it can retrieve the results? Should a lease be
created for those results, just as the RegionServer does for scanners?
It looks messy, but it is possible anyway.

--
Best Regards!

Fei Ding
[email protected]
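
P.S. To make the "schedule the work with an Executor, return at once,
and let the client poll" idea concrete, here is a minimal sketch in plain
Java. It does not use any HBase API. The class DeferredResults, its
methods submit/poll/expire, the ticket numbers, and the lease length are
all hypothetical names made up for illustration, and a real Endpoint
would need to wire something like this into the coprocessor lifecycle.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch, not HBase API: server-side state for deferred results. */
class DeferredResults {
    /** A pending result plus the time after which it may be thrown away. */
    private static final class Lease {
        final Future<String> future;
        volatile long expiresAtMillis;
        Lease(Future<String> future, long expiresAtMillis) {
            this.future = future;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    // Separate pool, so slow cross-region work never occupies an RPC handler.
    private final ExecutorService workers = Executors.newFixedThreadPool(2);
    private final Map<Long, Lease> pending = new ConcurrentHashMap<>();
    private final AtomicLong nextTicket = new AtomicLong();
    private final long leaseMillis;

    DeferredResults(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    /** Handler side: queue the work and return a ticket without waiting for it. */
    long submit(Callable<String> crossRegionWork) {
        long ticket = nextTicket.incrementAndGet();
        Future<String> future = workers.submit(crossRegionWork);
        pending.put(ticket, new Lease(future, System.currentTimeMillis() + leaseMillis));
        return ticket;
    }

    /** Client side: null means "not ready yet"; a delivered result is forgotten. */
    String poll(long ticket) throws Exception {
        Lease lease = pending.get(ticket);
        if (lease == null) {
            throw new IllegalStateException("unknown or expired ticket " + ticket);
        }
        // Each poll renews the lease, much like a scanner lease is renewed by use.
        lease.expiresAtMillis = System.currentTimeMillis() + leaseMillis;
        if (!lease.future.isDone()) {
            return null;
        }
        pending.remove(ticket);
        return lease.future.get();
    }

    /** Housekeeping: drop results whose client stopped polling (e.g. it died). */
    int expire(long nowMillis) {
        int removed = 0;
        for (Map.Entry<Long, Lease> e : pending.entrySet()) {
            Lease lease = e.getValue();
            if (lease.expiresAtMillis <= nowMillis && pending.remove(e.getKey(), lease)) {
                lease.future.cancel(true);
                removed++;
            }
        }
        return removed;
    }

    void shutdown() {
        workers.shutdownNow();
    }

    public static void main(String[] args) throws Exception {
        DeferredResults store = new DeferredResults(30_000);
        long ticket = store.submit(() -> {
            Thread.sleep(100); // stands in for RPCs to other regions
            return "rows from other regions";
        });
        String result;
        while ((result = store.poll(ticket)) == null) {
            Thread.sleep(20); // the extra retry loop on the client
        }
        System.out.println(result);
        store.shutdown();
    }
}
```

The point of the sketch is only that the handler thread returns right
after submit(), so it cannot be held hostage by the cross-region work.
The cost is exactly the statefulness I am worried about: the pending map
lives on the server and needs expire() to be called periodically.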
