This is quite odd because I do the same thing on a multi-million row table
and get multiple regions ...
You do have multiple regions, right? What happens if you only specify the
-loadKey parameter and none of the others?
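For concreteness, a minimal variant of the load below with only -loadKey set (table name and column are taken from Lukas's script; note that without HBaseBinaryConverter the column may need to be read as chararray rather than long):

```
-- Hedged sketch: same LOAD, but with only -loadKey and no -caster/-caching
items = LOAD 'hbase://some-table' USING
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('family:column', '-loadKey')
    AS (key:bytearray, a_column:chararray);
```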

On Thu, Jan 20, 2011 at 8:24 AM, Mr. Lukas <[email protected]> wrote:

> Hi pig users,
> I'm also using pig 0.8 together with HBase 0.20.6 and think my problem is
> related to Ian's. When processing a table with millions of rows (stored in
> multiple regions), HBaseStorage won't scan the full table but only a few
> hundred records.
>
> The following minimal example reproduces my problem (for this table):
>
> REGISTER '/path/to/guava-r07.jar';
> SET DEFAULT_PARALLEL 30;
> items = LOAD 'hbase://some-table' USING
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('family:column', '-caster
> HBaseBinaryConverter -caching 500 -loadKey') AS (key:bytearray,
> a_column:long);
> items = GROUP items ALL;
> item_count = FOREACH items GENERATE COUNT_STAR($1);
> DUMP item_count;
>
> Pig issues just one mapper, and I guess that it scans just one region of
> the table. Or did I miss some fundamental configuration option?
>
> Best regards,
> Lukas
>
