c = FOREACH b GENERATE group AS key, COUNT(a);

will give you the number of rows in a per key.
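
In case it helps, here is a minimal end-to-end sketch of the per-key count. The path 'sth' and the (key, value) schema come from your example; the alias n and the DUMP at the end are only my additions for illustration.

```pig
-- Path and schema are taken from the question; adjust to your data.
a = LOAD 'sth' AS (key, value);

-- After GROUP, each tuple holds the grouping field (named "group")
-- and a bag named "a" containing that key's rows.
b = GROUP a BY key;

-- One output row per key, with the number of rows in that key's bag.
-- "n" is a hypothetical alias, not something from the original script.
c = FOREACH b GENERATE group AS key, COUNT(a) AS n;

DUMP c;
```

Note that inside the FOREACH the grouping field has to be referenced as group, not key, which is why it is renamed with AS.
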
a_all = GROUP a ALL;
a_count = FOREACH a_all GENERATE COUNT(a);

will give you the total number of rows in a.

Does that answer your question?

On Tue, Feb 23, 2010 at 3:54 PM, jiang licht <[email protected]> wrote:
> Excuse me I could have missed important part of PIG document and asked this
> trivial question here :) What is the best way to find out the total number
> of tuples (rows) in the bag of data loaded? For example, after "a = LOAD
> 'sth' AS (key, value); b = GROUP a BY key; c = FOREACH b GENERATE key;" I
> want to know how many tuples are loaded to 'a' and total number left in 'c'.
> One way might be to use a udf function. But is there a support of counting
> this in PIG?
>
> Thanks,
>
> Michael
