c = FOREACH b GENERATE group as key, COUNT(a);

will give you the number of rows in 'a' for each key.

a_all = group a ALL;
a_count = FOREACH a_all GENERATE COUNT(a);

will give you the total number of rows in 'a'.
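
Putting the two together for your example, here is a rough sketch (untested
against your data). The aliases c_all and c_count and the field name n are
names I made up for illustration; 'sth' is the path from your message:

```pig
-- Load the data and count rows per key, as above.
a = LOAD 'sth' AS (key, value);
b = GROUP a BY key;
c = FOREACH b GENERATE group AS key, COUNT(a) AS n;

-- Total number of rows in a: collapse a into a single group, then count it.
a_all = GROUP a ALL;
a_count = FOREACH a_all GENERATE COUNT(a);

-- Total number of rows in c: same pattern, applied to c.
c_all = GROUP c ALL;
c_count = FOREACH c_all GENERATE COUNT(c);

DUMP a_count;
DUMP c_count;
```

Since c has one row per distinct key, c_count is the number of distinct keys
in a. One caveat, as far as I know: COUNT skips tuples whose first field is
null, and newer Pig releases have COUNT_STAR if you need those counted too.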

Does that answer your question?


On Tue, Feb 23, 2010 at 3:54 PM, jiang licht <[email protected]> wrote:

> Excuse me, I may have missed an important part of the Pig documentation and
> be asking a trivial question here :) What is the best way to find out the
> total number of tuples (rows) in a bag of loaded data? For example, after
> "a = LOAD 'sth' AS (key, value); b = GROUP a BY key; c = FOREACH b GENERATE
> key;" I want to know how many tuples were loaded into 'a' and how many are
> left in 'c'. One way might be to use a UDF, but is there built-in support
> for counting this in Pig?
>
> Thanks,
>
> Michael
>
>
>
