On Mon, Jan 24, 2011 at 7:26 PM, Dani Rayan <[email protected]> wrote:
> ResultScanner refscanner = table.getScanner(Bytes.toBytes("ColA")); // Looks expensive.
> The getscanner operation looks expensive. Am I m(i,e)ssing something ?

This shouldn't be expensive. What happens under the hood is that the
client makes an "openScanner" RPC call to the RegionServer, to which
the RS responds with a scanner ID. The state of the scanner is stored
in the RS. The current RPC protocol forces you to make this call.

You can't seek back with a scanner. Once you have moved forward, the
only way to go back is to close the scanner and open a new one.

Opening a scanner shouldn't take long; we're talking about
milliseconds (I'm seeing ~2ms in one of our production clusters at
StumbleUpon). Are your RegionServers very busy? Have you seen
anything that might look like excessive GCing or lock contention?

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com
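
To make the scanner lifecycle described above concrete, here is a toy, in-memory sketch. It is not HBase code and does not use the HBase client: `ToyRegionServer`, `openScanner()`, `next()`, and `close()` are hypothetical names chosen for illustration. It only models the idea that opening a scanner returns an ID, that the cursor state lives on the server side, and that the cursor moves forward only, so going back means closing and opening again.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for a RegionServer. NOT HBase code, only a model of the idea. */
class ToyRegionServer {
    private final List<String> rows;                              // rows of one region, in order
    private final Map<Long, Integer> cursors = new HashMap<>();   // scanner ID -> next position
    private long nextScannerId = 1;

    ToyRegionServer(List<String> rows) {
        this.rows = rows;
    }

    /** Models the "openScanner" call: allocate server-side state, return its ID. */
    long openScanner() {
        long id = nextScannerId++;
        cursors.put(id, 0);
        return id;
    }

    /** Returns the next row for this scanner, or null when exhausted. Forward-only. */
    String next(long scannerId) {
        Integer pos = cursors.get(scannerId);
        if (pos == null) {
            throw new IllegalStateException("unknown scanner " + scannerId);
        }
        if (pos >= rows.size()) {
            return null;
        }
        cursors.put(scannerId, pos + 1);
        return rows.get(pos);
    }

    /** Models closing the scanner: the server-side state is released. */
    void close(long scannerId) {
        cursors.remove(scannerId);
    }

    /** Number of scanners whose state is currently held by the server. */
    int openScannerCount() {
        return cursors.size();
    }
}

class ScannerDemo {
    public static void main(String[] args) {
        ToyRegionServer rs = new ToyRegionServer(Arrays.asList("row1", "row2", "row3"));

        long first = rs.openScanner();
        System.out.println(rs.next(first));   // prints "row1"
        System.out.println(rs.next(first));   // prints "row2"

        // There is no "seek back": close the scanner and open a new one.
        rs.close(first);
        long second = rs.openScanner();
        System.out.println(rs.next(second));  // prints "row1" again
    }
}
```

In the real client, the equivalent steps are `table.getScanner(...)`, `ResultScanner.next()`, and `ResultScanner.close()`. Closing the scanner when you are done matters for the same reason as in the toy model: the state is held on the RegionServer.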
