There have been several requests for this. I'm not a fan of it, because it makes it too easy to forget that you're forcing a single-reducer MapReduce job to accomplish this. But I'm open to persuasion if everyone else disagrees.

Alan.

On Jun 11, 2010, at 7:27 PM, Russell Jurney wrote:

This would be great. Save us from GROUP ALL/FOREACH, which is awkward.

On Fri, Jun 11, 2010 at 7:14 PM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:

It would be cool to just treat relations as bags in the general case. They
kind of are, and kind of are not. Causes lots of user confusion.
There are obvious users-doing-dumb-stuff scenarios that arise, though.
I guess the Pig philosophy is that the user is the optimizer, so
maybe it's ok.

-D

On Fri, Jun 11, 2010 at 6:42 PM, Russell Jurney <russell.jur...@gmail.com> wrote:

Would it be possible, and not a ton of work, to make the builtin SIZE()
work on a relation? The reason is that I frequently do this:

B = GROUP A ALL;
C = FOREACH B GENERATE SIZE(A) AS total;
DUMP C;

And I would rather do this:

DUMP SIZE(A);

Russ


