Hi,

I think I understand the answer.

My question was based on incorrect premises. (You can tell I am new to this.)

The coprocessorService() method sends requests to all region servers 
serving regions within a given key range, so each coprocessor instance 
handles just one region. I suppose one could write badly behaved code in a 
coprocessor instance that reaches across servers, but the natural architecture 
of an Endpoint coprocessor is to work on one region locally.

The client code that calls coprocessorService() is responsible for processing 
the set of responses that come back from the region servers that were called.

So in my example, some client side code has to loop through these, adding 
together the results from each response.
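
A minimal sketch of that client-side loop, assuming the per-region results 
arrive as a map from region name to a partial byte count (as I understand it, 
this is the shape that Table.coprocessorService(...) hands back when the 
callable returns a Long). The class name MemstoreTotals, the method 
sumRegionResults, and the sample values are made up for illustration; the 
HBase call itself is left out so the example runs on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of HBase: combines the per-region partial
// results that a client gets back from an endpoint coprocessor call.
class MemstoreTotals {

    // Adds up one partial result per region. A region that returned no
    // value (null) contributes nothing to the total.
    static long sumRegionResults(Map<byte[], Long> perRegion) {
        long total = 0L;
        for (Long regionBytes : perRegion.values()) {
            if (regionBytes != null) {
                total += regionBytes;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for the map returned by the coprocessor call.
        // Keys are region names; the values here are invented sample data.
        Map<byte[], Long> perRegion = new LinkedHashMap<>();
        perRegion.put("region-a".getBytes(), 1024L);
        perRegion.put("region-b".getBytes(), 2048L);
        perRegion.put("region-c".getBytes(), 512L);

        System.out.println("Total memstore bytes: " + sumRegionResults(perRegion));
    }
}
```

Each coprocessor instance only reports on its own region, and the addition 
happens once in the client, so no RegionServer has to call another one.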

Thanks,

Dave

From: Dave Birdsall
Sent: Monday, July 24, 2017 9:30 AM
To: [email protected]
Subject: EndpointCoprocessors with multi-region access

Hi,

I have a basic question about Endpoint coprocessors.

Suppose I want to write a coprocessor that returns the total number of memstore 
bytes used by a table.

I can write code that loops through all the regions, asking their region 
servers to tell me the memstore bytes for each given region, and then add them 
all up.

Such code, of course, will have a RegionServer talking to other RegionServers 
in the cluster.

Is there any problem with this? For example, when a RegionServer does an RPC to 
another RegionServer, does that tie up a thread in the calling RegionServer? 
And if so, and if my coprocessor is popular, might I get deadlocks or thread 
exhaustion errors if multiple RegionServers run my coprocessor?

The more general architectural question is, should an Endpoint coprocessor 
limit itself to the regions that are on its own RegionServer? Or does HBase 
possess appropriate layers to robustly manage prolific cross-server traffic?

Thanks,

Dave
