Typically it is better to use caching and batch size to limit the number of rows returned, and thus the amount of processing required between calls to next() during a scan. Still, it would be nice if HBase provided a way to manually refresh a lease, similar to Hadoop's context.progress(). In a cluster that is used for many different applications, raising the global lease timeout is a heavy-handed solution. Even being able to override the timeout on a per-scan basis would be nice.
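For what it's worth, here is a rough sketch of how one might size the caching value so that client-side processing of one fetched batch fits inside the lease. It assumes the lease is renewed only when the client goes back to the region server for the next batch of rows. The function name, the fetch_overhead_ms parameter, and the 50% safety margin are all made up for illustration and are not part of any HBase API.

```python
import math


def max_safe_caching(lease_ms, per_row_ms, fetch_overhead_ms=0.0, safety=0.5):
    """Estimate the largest scanner caching value that fits within the lease.

    lease_ms:          scanner lease period (e.g. 60000 for the default 60s)
    per_row_ms:        measured client-side processing time per row
    fetch_overhead_ms: hypothetical allowance for the fetch round trip itself
    safety:            fraction of the lease we are willing to spend (0 < safety <= 1)
    """
    if lease_ms <= 0 or per_row_ms <= 0:
        raise ValueError("lease_ms and per_row_ms must be positive")
    if not 0 < safety <= 1:
        raise ValueError("safety must be in (0, 1]")
    # Time we allow ourselves between two consecutive fetches.
    budget_ms = lease_ms * safety - fetch_overhead_ms
    # Always fetch at least one row, even if a single row blows the budget.
    return max(1, math.floor(budget_ms / per_row_ms))


if __name__ == "__main__":
    # Example: 60s lease, roughly 100 ms of application work per row.
    print(max_safe_caching(60000, 100))
```

If a single row already takes longer than the budget, the function returns 1, and at that point only a longer lease (or a way to refresh it) would help.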
Thoughts on that, Ted?

On Wed, Mar 20, 2013 at 1:00 PM, Ted Yu <[email protected]> wrote:

> In 0.94, there is only one setting.
> See release notes of HBASE-6170 which is in 0.95
>
> Looks like this should help (in 0.95):
>
> https://issues.apache.org/jira/browse/HBASE-2214
> Do HBASE-1996 -- setting size to return in scan rather than count of rows
> -- properly
>
> From your description, you should be able to raise the timeout since the
> writes are relatively fast.
>
> Cheers
>
> On Wed, Mar 20, 2013 at 9:32 AM, Dan Crosta <[email protected]> wrote:
>
> > I'm confused -- I only see one setting in CDH manager, what is the name of
> > the other setting?
> >
> > Our load is moderately frequent small writes (in batches of 1000 cells at
> > a time, typically split over a few hundred rows -- these complete very
> > fast, we haven't seen any timeouts there), and infrequent batches of large
> > reads (scans), which is where we do see timeouts. My guess is that the
> > timeout is more due to our application taking some time -- apparently more
> > than 60s -- to process the results of each scan's output, rather than due
> > to slowness in HBase itself, which tends to be only moderately loaded
> > (judging by CPU, network, and disk) while we do the reads.
> >
> > Thanks,
> > - Dan
> >
> > On Mar 17, 2013, at 2:20 PM, Ted Yu wrote:
> >
> > > The lease timeout is used by row locking too.
> > > That's the reason behind splitting the setting into two config parameters.
> > >
> > > How is your load composition ? Do you mostly serve reads from HBase ?
> > >
> > > Cheers
> > >
> > > On Sun, Mar 17, 2013 at 1:56 PM, Dan Crosta <[email protected]> wrote:
> > >
> > >> Ah, thanks Ted -- I was wondering what that setting was for.
> > >>
> > >> We are using CDH 4.2.0, which is HBase 0.94.2 (give or take a few
> > >> backports from 0.94.3).
> > >>
> > >> Is there any harm in setting the lease timeout to something larger, like 5
> > >> or 10 minutes?
> > >>
> > >> Thanks,
> > >> - Dan
> > >>
> > >> On Mar 17, 2013, at 1:46 PM, Ted Yu wrote:
> > >>
> > >>> Which HBase version are you using ?
> > >>>
> > >>> In 0.94 and prior, the config param is hbase.regionserver.lease.period
> > >>>
> > >>> In 0.95, it is different. See release notes of HBASE-6170
> > >>>
> > >>> On Sun, Mar 17, 2013 at 11:46 AM, Dan Crosta <[email protected]> wrote:
> > >>>
> > >>>> We occasionally get scanner timeout errors such as "66698ms passed since
> > >>>> the last invocation, timeout is currently set to 60000" when iterating a
> > >>>> scanner through the Thrift API. Is there any reason not to raise the
> > >>>> timeout to something larger than the default 60s? Put another way, what
> > >>>> resources (and how much of them) does a scanner take up on a thrift server
> > >>>> or region server?
> > >>>>
> > >>>> Also, to confirm -- I believe "hbase.rpc.timeout" is the setting in
> > >>>> question here, but someone please correct me if I'm wrong.
> > >>>>
> > >>>> Thanks,
> > >>>> - Dan
