I would suggest looking at the full context of that sentence:

*Higher caching values will enable faster scanners but will eat up more memory and some calls of next may take longer and longer times when the cache is empty*

When the caching value is large, a call to next() that finds the cache empty has to block until the cache is refilled, and refilling a large cache takes longer. The remaining calls to next() are served from the cache and return very quickly.

Conversely, a smaller value for this property means more calls to next() have to reload the cache, but each of those calls takes less time because the cache is smaller.
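
To make the trade-off concrete, here is a rough sketch in plain Java. It is a toy model, not the HBase client: ToyScanner, scanAll and the row counts are all made up for illustration, and a "refill" simply stands in for the blocking round trip to the region server. It counts how many calls to next() have to refill the cache and how many rows each of those calls pulls in.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a client-side scanner cache. This is NOT the HBase client;
// every name here is hypothetical and exists only to illustrate the trade-off.
public class ScannerCacheSketch {

    /** Serves rows from a local cache and refills it in batches of `caching` rows. */
    static final class ToyScanner {
        private final int totalRows;
        private final int caching;
        private final Deque<Integer> cache = new ArrayDeque<>();
        private int nextRowToFetch = 0;
        private int refills = 0;        // number of next() calls that had to refill
        private int largestRefill = 0;  // most rows fetched by a single refill

        ToyScanner(int totalRows, int caching) {
            if (caching <= 0) {
                throw new IllegalArgumentException("caching must be positive");
            }
            this.totalRows = totalRows;
            this.caching = caching;
        }

        /** Returns the next row id, or null once the scan is exhausted. */
        Integer next() {
            if (cache.isEmpty()) {
                refill(); // the "slow" path: stands in for a blocking fetch
            }
            return cache.poll();
        }

        private void refill() {
            int fetched = Math.min(caching, totalRows - nextRowToFetch);
            if (fetched <= 0) {
                return; // nothing left to fetch
            }
            for (int i = 0; i < fetched; i++) {
                cache.add(nextRowToFetch++);
            }
            refills++;
            largestRefill = Math.max(largestRefill, fetched);
        }
    }

    /** Scans every row and returns {rows read, slow next() calls, rows per slow call}. */
    static int[] scanAll(int totalRows, int caching) {
        ToyScanner scanner = new ToyScanner(totalRows, caching);
        int rows = 0;
        while (scanner.next() != null) {
            rows++;
        }
        return new int[] {rows, scanner.refills, scanner.largestRefill};
    }

    public static void main(String[] args) {
        for (int caching : new int[] {10, 100, 1000}) {
            int[] r = scanAll(1000, caching);
            System.out.println("caching=" + caching
                    + " rows=" + r[0]
                    + " slow next() calls=" + r[1]
                    + " rows per slow call=" + r[2]);
        }
    }
}
```

For 1000 rows, a caching value of 10 gives 100 slow calls that each fetch 10 rows, while a caching value of 1000 gives a single slow call that fetches all 1000 rows at once. In a real scan that single call is the one that can take long enough to matter for the scanner timeout.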

Rajeshkumar J wrote:
Hi,

hbase.client.scanner.caching

Description

Number of rows that we try to fetch when calling next on a scanner if it is
not served from (local, client) memory. This configuration works together
with hbase.client.scanner.max.result.size to try and use the network
efficiently. The default value is Integer.MAX_VALUE by default so that the
network will fill the chunk size defined by
hbase.client.scanner.max.result.size
rather than be limited by a particular number of rows since the size of
rows varies table to table. If you know ahead of time that you will not
require more than a certain number of rows from a scan, this configuration
should be set to that row limit via Scan#setCaching. Higher caching values
will enable faster scanners but will eat up more memory and *some calls of
next may take longer and longer times when the cache is empty*. Do not set
this value such that the time between invocations is greater than the
scanner timeout; i.e. hbase.client.scanner.timeout.period

Default

2147483647


Can anyone explain the lines below?


*some calls of next may take longer and longer times when the cache is
empty*


Thanks
