Scanner caching (in 1.1 and 2.0) is now a _limit_. I.e. normally you leave it
disabled (the default of Integer.MAX_VALUE) unless you know ahead of time that
you'll only look at the first N rows returned. In that case you'd set it to N.
I thought we had renamed it from "caching" to "limit" but looking at the code,
that is not the case.
In 0.98 and 1.0.x we need to keep it around defaulting to 100 for backwards
compatibility.
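
To make the "limit" behaviour concrete, here is a minimal sketch in Java of how
a single scanner batch could be cut off by whichever bound is hit first: the
row count (caching) or the accumulated size (maxResultSize). This is a
simplified model, not the HBase implementation; the class name, method name,
and row sizes are hypothetical. In the real client the two knobs are
Scan.setCaching(int) and Scan.setMaxResultSize(long).

```java
import java.util.Arrays;
import java.util.List;

/** Simplified model of how one scanner batch is bounded; not HBase code. */
final class ScanBatchSketch {

    /**
     * Returns how many rows go into the next batch. The batch ends when the
     * row limit ("caching") is reached or the accumulated size reaches
     * maxResultSize, whichever happens first. The size check runs before each
     * row is added, so the batch may overshoot the size limit by one row.
     */
    static int rowsInBatch(List<Long> rowSizes, int caching, long maxResultSize) {
        int rows = 0;
        long bytes = 0;
        while (rows < rowSizes.size() && rows < caching && bytes < maxResultSize) {
            bytes += rowSizes.get(rows);
            rows++;
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical row sizes in bytes.
        List<Long> rowSizes = Arrays.asList(400L, 400L, 400L, 400L, 400L);

        // Row limit effectively disabled: the size limit ends the batch.
        System.out.println(rowsInBatch(rowSizes, Integer.MAX_VALUE, 1000L));

        // Caller only wants the first 2 rows: the row limit ends the batch.
        System.out.println(rowsInBatch(rowSizes, 2, 1000L));
    }
}
```

With the row limit left at its maximum, only the size bound matters, which is
the "normally you leave it disabled" case above. Setting it to N only makes
sense when you know you will stop after N rows.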
-- Lars
From: Dave Latham <[email protected]>
To: [email protected]
Sent: Wednesday, April 8, 2015 9:09 PM
Subject: remove scanner caching?
After debugging an issue with scans missing data while migrating to 0.98 (thanks
Andrew, Jonathon, Josh, and Lars for the help), I'm left wondering why we
have both caching and maxResultSize for scans. It seems to be more client
api complexity than it's worth. Why would someone need to set caching when
maxResultSize is available? Indeed, the first patch proposed by some
fellow in HBASE-1996 simply replaced caching with maxResultSize. Can we
deprecate and eventually remove caching? Is there a good case for keeping
it in the client API surface?
Dave