Hi Geoff,

We are seeing the same issues when the scanner uses scan.addColumn() and the column is sparse, in the sense that many rows have some other column in the same column family but not the requested one. I suspect your problems will go away if you use scan.addFamily() instead. The same behavior may appear when you read from a region after a massive delete (the timeouts settle down after a major compaction), or when you use server-side Filters.

Switching from scan.addColumn() to scan.addFamily() brings some overhead, which I think could be avoided if the RegionServer renewed the scanner's lease while reading data, not only on the first entry to HRegionServer.next().
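
To make the idea concrete, here is a toy model of the two renewal policies. It is only a sketch of the behavior described above, not HBase code: the class ScannerLeaseSketch, the RenewPolicy enum, the leaseExpires() method and the 1 ms per-row cost are all made up for illustration, and the clock is simulated.

```java
// Toy model only: none of these names exist in HBase.
class ScannerLeaseSketch {

    /** When the simulated server renews the scanner lease. */
    enum RenewPolicy { ON_ENTRY_ONLY, WHILE_READING }

    /**
     * Simulates a single next() call that has to skip rowsSkipped rows
     * (e.g. rows lacking the sparse column) before it can return.
     * Each skipped row costs msPerRow of simulated time.
     * Returns true if the lease would expire during the call.
     */
    static boolean leaseExpires(RenewPolicy policy, long rowsSkipped,
                                long msPerRow, long leaseTimeoutMs) {
        long now = 0;            // simulated clock in ms
        long lastRenewal = now;  // both policies renew on entry to next()
        for (long i = 0; i < rowsSkipped; i++) {
            now += msPerRow;     // time spent reading one non-matching row
            if (now - lastRenewal > leaseTimeoutMs) {
                return true;     // lease expired in the middle of next()
            }
            if (policy == RenewPolicy.WHILE_READING) {
                lastRenewal = now;  // proposed change: renew as rows are read
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long timeoutMs = 60000;   // timeout from Geoff's report
        long rows = 61107;        // assumed: 1 ms per skipped row
        System.out.println("renew on entry only: expires = "
                + leaseExpires(RenewPolicy.ON_ENTRY_ONLY, rows, 1, timeoutMs));
        System.out.println("renew while reading: expires = "
                + leaseExpires(RenewPolicy.WHILE_READING, rows, 1, timeoutMs));
    }
}
```

With renewal on entry only, the lease expires as soon as the total time spent skipping rows exceeds the timeout. With renewal while reading, it expires only if a single row takes longer than the timeout.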

Would this be worth opening a JIRA?

Jan

On 6.9.2011 04:11, Geoff Hendrey wrote:
Hi -



I found some odd behavior with ResultScanner.next(). Usually the times
for next() are a couple hundred ms. But occasionally the call to next()
spikes VERY long. In fact, I have the timeout set to 60 seconds (60000),
but once in a while the call to next() itself is interrupted by the
ScannerTimeoutException after more than 60 seconds. It seems odd that
the call to next() itself can be interrupted because "61107ms passed since
the last invocation, timeout is currently set to 60000".

The only thing I can think of is that a GC kicks in after the call to
next begins, but before the call returns, and the server is still
ticking the timeout. But this seems to happen periodically with
regularity. The odds of the GC kicking in at that exact instant, so
often, seem kind of low.

-geoff



