On Tue, Jan 25, 2011 at 2:14 PM, Dani Rayan <[email protected]> wrote:
> But opening and closing the scanner inside this nested loop is taking
> multiple seconds to complete on just 3000 rows :(

Something is wrong with your cluster or the way you use it. The overhead of opening / closing the scanner is normally absolutely negligible compared to the overhead to scan the full table, even with a table as small as just 3000 rows.

Does your table fit entirely in one region? How big are the rows? Are you writing a lot to your table? Are you typically inserting cells or overwriting stuff in existing ones?

Is your pseudo-distributed HBase running on a single machine? If yes, why not use a non-distributed HBase setup (without HDFS)?

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com
