Yes, it makes sense. Even a 1-minute timeout is not ideal in this case: we know that the work to do server side is trivial, and we know it's idempotent, so we can retry. So I would tend to use a specific setting for such operations.
Could you please create a JIRA for this?

Thanks,

Nicolas

On Tue, Aug 6, 2013 at 9:46 AM, Julian Zhou <[email protected]> wrote:

> Hi Community,
>
> Could you help check whether this case makes sense for 0.94 or trunk?
> The default of "hbase.rpc.timeout" is 60000 ms (1 min). Users sometimes
> increase it to a bigger value such as 600000 ms (10 mins) for client
> applications that do many concurrent loads. Some users share the same
> hbase-site.xml for both client and server.
> HRegionServer#tryRegionServerReport reports to the live master via the
> rpc channel, but there is a window in the master failover scenario: the
> region server is attempting to connect to the master, which was just
> killed; the backup master takes the active role immediately and
> registers itself in /hbase/master, but the region server is still
> waiting for the rpc timeout from connecting to the dead master. If
> "hbase.rpc.timeout" is too long, the master failover process will be
> long due to the long rpc timeout against the dead master.
>
> If so, could we separate this into 2 options: "hbase.rpc.timeout" stays
> for the hbase client, while "hbase.rpc.internal.timeout" is for the
> regionserver/master rpc channel, which could be set to a shorter value
> without affecting the real client rpc timeout value?
>
> --
> Best Regards, Julian
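
For illustration only, the split discussed in this thread could be sketched as below. This is a minimal sketch, not HBase code: the configuration is modelled as a plain dictionary, "hbase.rpc.internal.timeout" is the name proposed in the quoted message rather than an existing setting, the helper function names are hypothetical, and the 10000 ms value is an arbitrary assumption. The fallback to "hbase.rpc.timeout" when the internal key is unset is also an assumption, chosen so that existing configurations would keep their current behavior.

```python
# Sketch of resolving separate client and internal RPC timeouts from one
# shared configuration. Plain dict stands in for hbase-site.xml contents.

CLIENT_TIMEOUT_KEY = "hbase.rpc.timeout"
# Proposed in the thread; hypothetical, not an existing HBase setting here.
INTERNAL_TIMEOUT_KEY = "hbase.rpc.internal.timeout"
# Default for hbase.rpc.timeout as quoted in the thread (1 minute).
DEFAULT_CLIENT_TIMEOUT_MS = 60000


def client_rpc_timeout_ms(conf):
    """Timeout used for client RPCs."""
    return int(conf.get(CLIENT_TIMEOUT_KEY, DEFAULT_CLIENT_TIMEOUT_MS))


def internal_rpc_timeout_ms(conf):
    """Timeout for regionserver/master RPCs.

    Assumption: fall back to the client timeout when the internal key
    is not set, so existing configurations behave as before.
    """
    value = conf.get(INTERNAL_TIMEOUT_KEY)
    if value is None:
        return client_rpc_timeout_ms(conf)
    return int(value)


# One configuration shared by client and server, as described in the thread:
# a long client timeout for heavy loads, a short one for internal reports.
shared_conf = {
    CLIENT_TIMEOUT_KEY: "600000",    # 10 minutes, from the thread
    INTERNAL_TIMEOUT_KEY: "10000",   # assumed illustrative value
}

client_ms = client_rpc_timeout_ms(shared_conf)
internal_ms = internal_rpc_timeout_ms(shared_conf)
```

With such a split, a region server reporting to a dead master would give up after the short internal timeout and retry against the new active master, while client loads keep the long timeout.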
