On Tue, Sep 15, 2009 at 3:13 PM, Barney Frank <[email protected]> wrote:
....
> **** This is despite the fact that I set hbase.pause to be 25 ms and the
> retries.number = 2. ****
>
>
Yeah, this is down in the guts of the hadoop rpc we use. Around connection
setup it has its own config, which is not well aligned with ours (ours being
the retries and pause settings).
The max retries down in ipc is:
    this.maxRetries = conf.getInt("ipc.client.connect.max.retries", 10);
That's for an IOE other than a timeout. For a timeout, it does this:
    } catch (SocketTimeoutException toe) {
      /* The max number of retries is 45,
       * which amounts to 20s*45 = 15 minutes retries.
       */
      handleConnectionFailure(timeoutFailures++, 45, toe);
Let me file an issue to address the above. The retries should be our
retries, and in here it has a hardcoded 1000ms that should instead be our
pause. Not hard to fix.
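To put rough numbers on the mismatch, here is a minimal sketch of the worst-case wait on each path. It only multiplies the figures quoted above; the class and method names are made up for illustration, and it assumes the hardcoded 1000ms sleep is what separates the 10 non-timeout retries. Time spent inside each connect attempt is ignored, so treat these as lower bounds.

```java
import java.time.Duration;

// Illustrative only: not part of hadoop or hbase.
public class ConnectRetryBudget {

    /** Worst-case wait before giving up: one wait per attempt. */
    static Duration worstCase(int attempts, Duration perAttempt) {
        return perAttempt.multipliedBy(attempts);
    }

    public static void main(String[] args) {
        // SocketTimeoutException path: 45 attempts at 20s each (from the ipc comment).
        Duration timeoutPath = worstCase(45, Duration.ofSeconds(20));
        // Other IOE path: ipc.client.connect.max.retries default of 10,
        // assuming the hardcoded 1000ms sleep between attempts.
        Duration otherIoePath = worstCase(10, Duration.ofMillis(1000));
        // What the client config asked for: 2 retries with a 25ms pause.
        Duration requested = worstCase(2, Duration.ofMillis(25));

        System.out.println("timeout path:   " + timeoutPath.toMinutes() + " min");
        System.out.println("other IOE path: " + otherIoePath.getSeconds() + " s");
        System.out.println("requested:      " + requested.toMillis() + " ms");
    }
}
```

So a client configured to give up after about 50 ms can still sit in connection setup for roughly 10 seconds, or 15 minutes if the connects are timing out.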
> I restart the Master and RegionServer and then send more client requests
> through HTablePool. It has the same "Retrying to connect to server:"
> messages. I noticed that the port number it is using is the old port for
> the region server and not the new one assigned after the restart. The
> HbaseClient does not seem to recover unless I restart the client app. When
> I do not use HTablePool and only Htable it works fine.
>
We've not done work to make the pool ride over a restart.
> Two issues:
> 1) Setting and using hbase.client.pause and hbase.client.retries.number
> parameters. I have rarely gotten them to work. It seems to default to 2
> sec and 10 retries no matter if I overwrite the defaults on the client and
> the server. Yes, I made sure my client doesn't have anything in the
> classpath it might pick-up.
> <property>
> <name>hbase.client.pause</name>
> <value>20</value>
> </property>
> <property>
> <name>hbase.client.retries.number</name>
> <value>2</value>
> </property>
>
Please make an issue for this and I'll investigate. I've already added a note
to an existing HBaseClient ipc issue and will fix the above items as part of it.
> 2) Running HTablePool under Pseudo mode, the client doesn't seem to refresh
> with the new regionserver port after the master/regions are back up. It
> gets "stuck" with the info from the settings prior to the master goin down.
>
> I would appreciate any thoughts or help.
>
Do you need to use the pool? Is your app highly threaded, with all threads
(hundreds) connecting to hbase?
St.Ack