On Tue, Jan 25, 2011 at 2:14 PM, Dani Rayan <[email protected]> wrote:
> But opening and closing the scanner inside this nested loop is taking
> multiple seconds to complete on just 3000 rows :(

Something is wrong with your cluster or the way you use it.  The
overhead of opening / closing the scanner is normally negligible
compared to the cost of scanning the full table, even with a table
as small as 3000 rows.

Does your table fit entirely in one region?  How big are the rows?
Are you writing a lot to your table?  Are you typically inserting
cells or overwriting stuff in existing ones?

Is your pseudo-distributed HBase running on a single machine?  If yes,
why not use a non-distributed HBase setup (without HDFS)?
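
For reference, a minimal sketch of what a standalone (non-distributed)
hbase-site.xml might look like.  The directory below is only a
placeholder; point it at any local path you can write to:

```xml
<configuration>
  <!-- Store data on the local filesystem instead of HDFS.
       /tmp/hbase-standalone is a hypothetical path. -->
  <property>
    <name>hbase.rootdir</name>
    <value>file:///tmp/hbase-standalone</value>
  </property>
  <!-- false = standalone mode: everything runs in a single JVM. -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>false</value>
  </property>
</configuration>
```

That takes HDFS out of the picture, which should make it easier to tell
whether the slowness comes from HBase itself or from the layers
underneath it.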

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com
