I'm trying to select a fixed number of groups of tuples using LIMIT and
FLATTEN, but the result is different from what I expected.

I wonder whether this is intended behavior or a bug.

--------------------
Example (selecting 2 groups based on the value of 'k'):

data = LOAD 'data' AS (k, v);
DUMP data;

(1, A)
(1, B)
(2, C)
(3, D)
(3, E)
(3, F)

grouped = GROUP data BY k;
selected = LIMIT grouped 2;
flattened = FOREACH selected GENERATE FLATTEN (data);

DUMP flattened;

(1, A)
(1, B)

What I expected was all tuples from 2 groups, e.g.:
(1, A)
(1, B)
(2, C)


EXPLAIN showed that the LIMIT 2 was also being applied to 'flattened', not
only to 'grouped'.
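
To make the difference concrete, here is a small Python sketch (not Pig) that
models the two interpretations on the sample data above. The helper names
(group_by_key, flatten, limit) are my own, and I am assuming the groups come
out in key order, which Pig does not guarantee. It is only meant to illustrate
what I expected versus what the plan appears to do.

```python
from itertools import islice

# Sample rows from the example above, as (k, v) pairs.
data = [(1, "A"), (1, "B"), (2, "C"), (3, "D"), (3, "E"), (3, "F")]

def group_by_key(rows):
    """Rough model of GROUP data BY k: (key, bag) pairs in first-seen key order."""
    groups = {}
    for k, v in rows:
        groups.setdefault(k, []).append((k, v))
    return list(groups.items())

def flatten(groups):
    """Rough model of FOREACH ... GENERATE FLATTEN(data): one row per bag element."""
    return [row for _, bag in groups for row in bag]

def limit(rows, n):
    """Rough model of LIMIT: keep at most the first n items."""
    return list(islice(rows, n))

grouped = group_by_key(data)

# What I expected: LIMIT applies to the groups only, then FLATTEN expands them.
expected = flatten(limit(grouped, 2))

# What EXPLAIN suggests: the same LIMIT is applied again after FLATTEN.
observed = limit(flatten(limit(grouped, 2)), 2)

print(expected)
print(observed)
```

In this model, the first result keeps every tuple from the two selected groups,
while the second is cut down to two tuples, which matches the output I am seeing.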

Is this intended behavior? If so, what is the correct way to get the
desired result?

P.S. I tried this on Pig 0.8.0 and 0.8.1, with and without -t All or
-t LimitOptimizer. The results were the same in every case.
