There have been several requests for this. I'm not a fan of it,
because it makes it too easy to forget that you're forcing a
single-reducer MR job to accomplish this. But I'm open to persuasion if
everyone else disagrees.
Alan.
On Jun 11, 2010, at 7:27 PM, Russell Jurney wrote:
This would be great. Save us from GROUP ALL/FOREACH, which is
awkward.
On Fri, Jun 11, 2010 at 7:14 PM, Dmitriy Ryaboy <dvrya...@gmail.com>
wrote:
It would be cool to just treat relations as bags in the general case.
They kind of are, and kind of are not, which causes lots of user
confusion. There are obvious users-doing-dumb-stuff scenarios that
arise, though. I guess the Pig philosophy is that the user is the
optimizer, though, so maybe it's OK.
-D
On Fri, Jun 11, 2010 at 6:42 PM, Russell Jurney <russell.jur...@gmail.com>
wrote:
Would it be possible, and not a ton of work, to make the builtin SIZE()
work on a relation? Reason being, I frequently do this:
B = GROUP A ALL;
C = FOREACH B GENERATE SIZE(A) AS total;
DUMP C;
And I would rather do this:
DUMP SIZE(A);
Russ
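
For reference, here is a minimal sketch of the GROUP ALL workaround from
this thread, written with the builtin COUNT rather than SIZE. The input
path and schema are hypothetical, chosen only for illustration. COUNT is
an algebraic function, so Pig can do partial counting in the combiner,
but GROUP ... ALL still funnels the final aggregation through a single
reducer, which is the cost Alan describes. Also note that COUNT, as
documented, skips tuples whose first field is null, so it may not match
SIZE on data with nulls.

```pig
-- Hypothetical input: path and field names are assumptions for this sketch.
A = LOAD 'input.txt' AS (name:chararray, value:int);

-- Collapse the whole relation into one group; this forces a single reducer.
B = GROUP A ALL;

-- COUNT is algebraic, so most of the counting can happen map-side.
C = FOREACH B GENERATE COUNT(A) AS total;

DUMP C;
```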