So it is very weird that on some of my servers I did not get the error,
and so the cache was not cleaned.
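
For reference, here is a minimal sketch of the client-side behaviour
discussed in this thread: a location cache, removal of the stale entry when
the server returns an error, and a sleep between retries. It is plain Java,
not HBase code. Every name in it (LocationCacheSketch, callServer,
failureDelayMillis, getWithTimeout) is hypothetical, and it models a region
move rather than a split to stay short. It also assumes that the error
arrives late on the affected servers, which is only a guess at what happens
here.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class LocationCacheSketch {
    // Hypothetical stand-in for the cluster: which server really hosts each row.
    static final Map<String, String> actualLocation = new ConcurrentHashMap<>();
    // Client-side location cache (per connection in the real client).
    static final Map<String, String> cachedLocation = new ConcurrentHashMap<>();
    // Assumption: how long a wrong server takes before it reports the error.
    static volatile long failureDelayMillis = 0;

    // Simulated server call: fails (after the delay) if the row is hosted elsewhere.
    static String callServer(String server, String row) throws InterruptedException {
        if (!server.equals(actualLocation.get(row))) {
            Thread.sleep(failureDelayMillis);
            throw new IllegalStateException("region not online on " + server);
        }
        return "value-from-" + server;
    }

    // Try the cached location; on error drop the entry, sleep, look it up again.
    static String get(String row, int maxRetries, long sleepMillis) throws InterruptedException {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            String server = cachedLocation.get(row);
            if (server == null) {
                server = actualLocation.get(row); // stands in for the meta lookup
                if (server == null) {
                    throw new IllegalArgumentException("unknown row " + row);
                }
                cachedLocation.put(row, server);
            }
            try {
                return callServer(server, row);
            } catch (IllegalStateException e) {
                cachedLocation.remove(row); // the cache is cleaned before the sleep
                Thread.sleep(sleepMillis);
            }
        }
        throw new IllegalStateException("retries exhausted for " + row);
    }

    // Caller-side timeout, as in the 1.5-second callable task; returns null on timeout.
    static String getWithTimeout(String row, long timeoutMillis) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Callable<String> task = () -> get(row, 3, 10);
        Future<String> future = pool.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the task before it ever sees the error
            return null;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        actualLocation.put("row1", "rs2");  // the region now lives on rs2
        cachedLocation.put("row1", "rs1");  // but the client still remembers rs1

        failureDelayMillis = 0;             // error returned immediately
        System.out.println(get("row1", 3, 10));          // value-from-rs2
        System.out.println(cachedLocation.get("row1"));  // rs2

        cachedLocation.put("row1", "rs1");  // stale again
        failureDelayMillis = 500;           // error arrives after the caller's timeout
        System.out.println(getWithTimeout("row1", 100)); // null
        System.out.println(cachedLocation.get("row1"));  // rs1, still stale
    }
}
```

In the second case the task is cancelled while it is still waiting for the
error, so the line that removes the cache entry never runs and every later
request starts from the same stale location.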

2012/8/10 N Keywal <[email protected]>

> If it's a single row, I would expect the server to return the error
> immediately. Then you will have the sleep I was mentioning previously,
> but the cache should be cleaned before the sleep...
>
> On Fri, Aug 10, 2012 at 1:32 PM, deanforwever2010
> <[email protected]> wrote:
> > Hi Keywal,
> > My HBase version is 0.94.
> > My query just gets a limited set of columns from a single row.
> > I run it in a callable task with a 1.5-second timeout, so maybe it did not
> > fail but was cancelled by my process, and so the region cache was not
> > cleared even after many requests.
> > My question is: why does the failure take so long? It also behaves
> > differently across my servers, and there is no problem with the network.
> >
> > 2012/8/10 N Keywal <[email protected]>
> >
> >> Hi,
> >>
> >> What are your queries exactly? What's the HBase version?
> >>
> >> The mechanism is:
> >> - There is a location cache, per HConnection, on the client
> >> - The client first tries the region server in its cache
> >> - if it fails, the client removes this entry from the cache and enters
> >> the retry loop
> >> - there is a limited amount of retries and a sleep between the retries
> >> - most of the time, the client will connect to meta to get the new
> >> location
> >>
> >> When there are multiple queries, before HBASE-5924, the errors were
> >> analyzed only after the other region servers had returned as well. That
> >> could be an explanation. HBASE-5877 exists as well, but only for
> >> moves, not for splits...
> >>
> >> Cheers,
> >>
> >> N.
> >>
> >>
> >> On Fri, Aug 10, 2012 at 11:26 AM, deanforwever2010
> >> <[email protected]> wrote:
> >> > In the region server's log: 2012-08-10 11:49:50,796 DEBUG
> >> > org.apache.hadoop.hbase.regionserver.HRegionServer:
> >> > NotServingRegionException; Region is not online:
> >> > test_list,zWPpyme,1342510667492.91486e7fa0ac39048276848a2618479b.
> >> >
> >> > After the region split, the client did not get a result within the
> >> > timeout (1.5 seconds), so the task was cancelled by my program and
> >> > HConnectionManager did not delete the cachedLocation;
> >> > the client still queries the old region id, which no longer exists.
> >> >
> >> > Moreover, some of my processes updated the region location info and
> >> > some did not. I'm sure the network is fine.
> >> >
> >> > How can I fix the problem? Why does it take so long to detect the new
> >> > regions?
> >>
>
