OK. Here is a summary of what I know:

 

After some amount of scanning, a region server can begin to get
ClosedChannelException when it tries to respond to the client.
Unfortunately, this only affects the response to the client. The region
server apparently continues to tell ZooKeeper "I'm alive and OK".
Consequently, the region server is never shut down, so the client still
attempts to access regions on the effectively dead server. Each such
request eventually times out on the client side, since all the client
sees is "I sent a request and never received any response on the
socket." The client, however, has no way to inform the master of the
problem.
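
For reference, here is a minimal standalone sketch of the JDK behavior
behind that exception. It is not HBase code, and the class and method
names are made up for illustration. It only shows that writing to an NIO
channel that has already been closed throws ClosedChannelException, so
the response bytes never leave the process. That would be consistent
with a client that sends a request and then hears nothing back.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of HBase.
public class ClosedChannelDemo {

    // Tries to write a response on a channel that is already closed and
    // returns the simple name of the exception raised, or "none".
    public static String writeAfterClose() throws IOException {
        SocketChannel channel = SocketChannel.open();
        // Stands in for a client connection that was closed earlier.
        channel.close();

        ByteBuffer response =
                ByteBuffer.wrap("result".getBytes(StandardCharsets.UTF_8));
        try {
            channel.write(response);
            return "none";
        } catch (ClosedChannelException e) {
            // The write is rejected locally; nothing is sent to the peer.
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeAfterClose());
    }
}
```

Note that the exception surfaces only in the thread doing the write.
Nothing in this sketch, or in the JDK, ties it to a separate heartbeat
path, which may be why the server still looks healthy from the outside.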

 

If I manually shut down the region server where the problem exists, its
regions get redistributed to other region servers automatically. The
client then receives the new locations of those regions and can begin
functioning again. However, the problem soon reappears on a different
region server.

 

 

-geoff

 

 
