On 7/13/17 1:48 PM, Tanujit Ghosh wrote:
Hi All,
We are facing a problem in our cluster as stated below.
We have a long-running Java process which runs various SELECT queries
against the underlying Phoenix/HBase tables and returns the data. This
process gets requests from other upstream apps and responds with
results from Phoenix/HBase.
The issue is that if any one of the HBase region servers goes down, we
start getting a NotServingRegionException when we run a query on a
table whose regions were on the region server that went down. Although
the cluster has by then reassigned those regions to other region
servers, this is somehow not reflected in the Phoenix query layer.
This is expected to a degree. Even after Regions move on the cluster,
the client will not re-query a Region's location until a request finds
that the Region is no longer where the client thinks it is (at which
point it invalidates its cached Region->RS mapping and re-queries the
location from hbase:meta).
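
To make the caching behaviour concrete, here is a toy model of it. This
is only a sketch, not the HBase client code: RegionCacheDemo, meta,
cache, rpc, locate and read are all hypothetical names, and the two
maps merely stand in for hbase:meta and the client's location cache.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a client-side region location cache.
// Every name here is hypothetical; this is NOT the HBase client API.
class RegionCacheDemo {
    // Stand-in for hbase:meta: the authoritative region -> server mapping.
    static final Map<String, String> meta = new HashMap<>();
    // The client's cached view of that mapping, which may be stale.
    static final Map<String, String> cache = new HashMap<>();
    // Counts how often the "client" had to consult meta.
    static int metaLookups = 0;

    static class NotServing extends RuntimeException {
        NotServing(String msg) { super(msg); }
    }

    // Simulated RPC: succeeds only if `server` really hosts `region`.
    static String rpc(String server, String region) {
        if (!server.equals(meta.get(region))) {
            throw new NotServing(region + " is not on " + server);
        }
        return "rows from " + region + " via " + server;
    }

    // Return the cached location, consulting meta only on a cache miss.
    static String locate(String region) {
        return cache.computeIfAbsent(region, r -> {
            metaLookups++;
            return meta.get(r);
        });
    }

    // Read using the cached location; on failure, drop the stale entry,
    // look the region up again and retry once.
    static String read(String region) {
        try {
            return rpc(locate(region), region);
        } catch (NotServing e) {
            cache.remove(region);
            return rpc(locate(region), region);
        }
    }

    public static void main(String[] args) {
        meta.put("region-1", "rs1");
        System.out.println(read("region-1"));
        meta.put("region-1", "rs2"); // region reassigned after rs1 went down
        System.out.println(read("region-1"));
    }
}
```

Note that the second read only learns about the move because its first
attempt against the old server fails. That failed attempt corresponds
to the exception you see reported.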
If you see this transiently for a region after it moves, that is
expected. You do not have to do anything; the client recovers
automatically and is just informing you. However, if your client
becomes "stuck" (looping with NotServingRegionExceptions), that's a
completely different problem and would likely be an HBase bug.
I'm not sure which case you're describing.
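
One rough way to tell the two cases apart from the application side is
to bound the number of attempts and see whether the query ever
succeeds. The sketch below is an illustration only: BoundedRetryDemo,
withRetries and the NotServing exception are hypothetical names, and
the HBase client already retries internally, so a wrapper like this
would sit on top of that behaviour rather than replace it.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of Phoenix or HBase.
class BoundedRetryDemo {
    // Stand-in for the exception surfaced when a region has moved.
    static class NotServing extends RuntimeException {
        NotServing(String msg) { super(msg); }
    }

    // Run `query` up to maxAttempts times. A transient failure succeeds
    // on a later attempt; a "stuck" client exhausts every attempt.
    static <T> T withRetries(Supplier<T> query, int maxAttempts) {
        NotServing last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return query.get();
            } catch (NotServing e) {
                last = e; // remember the failure and try again
            }
        }
        throw new IllegalStateException(
            "still failing after " + maxAttempts + " attempts; client may be stuck", last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails twice (region in transition), then succeeds.
        String result = withRetries(() -> {
            if (calls[0]++ < 2) throw new NotServing("region is moving");
            return "ok";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

If the wrapped query keeps failing until the attempts run out, long
after the regions have been reassigned, that points at the "stuck"
case rather than the transient one.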
As per Phoenix documentation, we are creating a new PhoenixConnection
object for each query and running the select statements.
Has anyone faced a similar issue?
Any suggestions/help in this regard will be appreciated.
Thanks and Regards,
Tanujit