On Wed, Aug 29, 2012 at 4:51 PM, Jonathan Coveney <[email protected]> wrote:
> COUNT is a UDF that takes in a Bag and outputs a Long.
>
> Relations are not Bags, so that's one way of thinking about it. But of
> course, we could have coerced the syntax to make it work.
>
> I like to think of it as such:
>
> A foreach is a transformation on the rows of a relation. Thus, applying
> COUNT directly to a relation doesn't make any sense, since you're doing an
> aggregate transformation. This is why grouping is necessary: you're putting
> all of the rows of the relation into one row (with the catch-all key
> "all"), so that you can run a function on them.
>

Thanks! I think I get it.

> Don't know if that helps.
>
> 2012/8/29 Mohit Anchlia <[email protected]>
>
> > Thanks! Why is grouping necessary? Is it to send it to the reducer?
> >
> > On Wed, Aug 29, 2012 at 4:03 PM, Alan Gates <[email protected]> wrote:
> >
> > > A = load 'foo';
> > > B = group A all;
> > > C = foreach B generate COUNT(A);
> > >
> > > Alan.
> > >
> > > On Aug 29, 2012, at 3:51 PM, Mohit Anchlia wrote:
> > >
> > > > How do I get count of all the rows? All the examples of COUNT use
> > > > group by.
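
For anyone reading the archive later, the explanation above can be sketched as a small toy model in Python. This is not Pig and not how Pig is implemented; the helper names group_all, count and foreach_generate are made up for illustration, and the sample rows are invented. It only tries to show why "group A all" produces a single row whose bag COUNT can then consume.

```python
# Toy model of the Pig script in this thread. All helper names are hypothetical.

def group_all(relation):
    """Mimic `B = group A all;`: one output row, keyed 'all',
    whose second field is a bag holding every row of the input."""
    return [("all", list(relation))]

def count(bag):
    """Mimic COUNT on a bag: count tuples whose first field is not null."""
    return sum(1 for row in bag if len(row) > 0 and row[0] is not None)

def foreach_generate(relation, fn):
    """Mimic `foreach ... generate`: apply fn to each row,
    producing one output row per input row."""
    return [fn(row) for row in relation]

# Invented sample data standing in for `A = load 'foo';`
A = [("a", 1), ("b", 2), ("c", 3)]

B = group_all(A)                                        # a single row: ("all", bag)
C = foreach_generate(B, lambda row: (count(row[1]),))   # COUNT applied to that bag

print(C)  # [(3,)]
```

Because the foreach works row by row, running it over A directly would hand the function one plain row at a time and never the whole set, which is the point made above about needing the group first.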
