If it's a single row, I would expect the server to return the error
immediately. You will then hit the sleep I mentioned previously,
but the cache should be cleaned before the sleep...
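
A rough sketch of that ordering, in case it helps. This is not the HBase client code: it is a toy model, and every class and method name in it (LocationCacheSketch, MetaLookup, RegionCall, getWithRetries, ...) is made up for illustration. The only point is that the stale entry is dropped before the sleep, so a caller that interrupts the sleep should still leave the cache cleaned.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: a toy model of the retry loop, NOT the HBase client code.
// Every class and method name here is hypothetical.
class LocationCacheSketch {

    /** Stand-in for the server saying it no longer hosts the region. */
    static class NotServingException extends RuntimeException {
        NotServingException(String server) {
            super("region not online on " + server);
        }
    }

    /** Stand-in for a meta lookup: row key -> current server name. */
    interface MetaLookup { String locate(String rowKey); }

    /** Stand-in for the actual call to a region server. */
    interface RegionCall { String get(String server, String rowKey); }

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final MetaLookup meta;
    private final RegionCall call;

    LocationCacheSketch(MetaLookup meta, RegionCall call) {
        this.meta = meta;
        this.call = call;
    }

    /** Pre-populate the cache, e.g. with a location that later becomes stale. */
    void seed(String rowKey, String server) { cache.put(rowKey, server); }

    /** Returns the cached server for the row, or null if there is no entry. */
    String cached(String rowKey) { return cache.get(rowKey); }

    String getWithRetries(String rowKey, int maxRetries, long sleepMs) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        NotServingException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            // Use the cached location if there is one, otherwise ask meta.
            String server = cache.get(rowKey);
            if (server == null) {
                server = meta.locate(rowKey);
                if (server == null) {
                    throw new IllegalStateException("no location for " + rowKey);
                }
                cache.put(rowKey, server);
            }
            try {
                return call.get(server, rowKey);
            } catch (NotServingException e) {
                last = e;
                // Drop the stale entry BEFORE sleeping, so that a caller who
                // interrupts the sleep still leaves the cache cleaned.
                cache.remove(rowKey);
            }
            if (attempt < maxRetries) {
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException ie) {
                    // Restore the interrupt flag and give up on the retries.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
            }
        }
        throw last;
    }
}
```

With a stale entry seeded for a row, the first attempt fails, the entry is removed, and the next attempt goes through the (hypothetical) meta lookup.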

On Fri, Aug 10, 2012 at 1:32 PM, deanforwever2010
<deanforwever2...@gmail.com> wrote:
> Hi Keywal,
> My HBase version is 0.94.
> My query just gets a limited set of columns from one row.
> I run it as a callable task with a 1.5-second timeout, so maybe it did not
> fail but was canceled by my process, so the region cache was not cleared
> even after many requests.
> My question is: why should the failure take so long? It also behaves
> differently between my servers, and there is no problem with the network.
>
> 2012/8/10 N Keywal <nkey...@gmail.com>
>
>> Hi,
>>
>> What are your queries exactly? What's the HBase version?
>>
>> The mechanism is:
>> - There is a location cache, per HConnection, on the client
>> - The client first tries the region server in its cache
>> - if it fails, the client removes this entry from the cache and enters
>> the retry loop
>> - there is a limited amount of retries and a sleep between the retries
>> - most of the time, the client will connect to meta to get the new
>> location
>>
>> When there are multiple queries, before HBASE-5924, the errors are
>> analyzed only after the other region servers have returned as well. That
>> could be an explanation. HBASE-5877 exists as well, but only for
>> moves, not for splits...
>>
>> Cheers,
>>
>> N.
>>
>>
>> On Fri, Aug 10, 2012 at 11:26 AM, deanforwever2010
>> <deanforwever2...@gmail.com> wrote:
>> > On the region server's log: 2012-08-10 11:49:50,796 DEBUG
>> > org.apache.hadoop.hbase.regionserver.HRegionServer:
>> > NotServingRegionException; Region is not online:
>> > test_list,zWPpyme,1342510667492.91486e7fa0ac39048276848a2618479b.
>> >
>> > After the region split, the client didn't get a result within the timeout
>> > setting (1.5 seconds), then the task was canceled by my program, so the
>> > HConnectionManager didn't delete the cachedLocation;
>> > the client still queries the old region id, which no longer exists.
>> >
>> > Also, some of my processes updated the region location info and some
>> > did not. I'm sure the network is fine.
>> >
>> > How can I fix the problem? Why does it take so long to detect the new
>> > regions?
>>
