Hi,

I am new to the list. I've been working on the Pig code base,
adding my own blocking map-side POs (e.g., map-side join, map-side
grouping) for cases where assertions can be made about the
fragmentation of input relations. This is partly inspired by the new
block placement policy possibilities in hadoop-2.

Anyway, my question to the list is the following. Whilst looking at
the code for POCollectedGroup I noticed that this PO expects split
content to be sorted. On the other hand, the Collectable loader
interface only seems to indicate that keys are unique across splits
(i.e., no key occurs in more than one split). Why the discrepancy?
Is there a good reason not to have an indicator interface that
captures all input requirements, e.g., something like
OrderedCollectableLoadFunc?
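
To make the idea concrete, here is a rough sketch of what I have in
mind. Everything in it is an assumption for illustration only: the
CollectableLoadFunc below is a simplified stand-in rather than the
real Pig interface, and OrderedCollectableLoadFunc, SortedLoader,
UnsortedLoader and canUseCollectedGroup are hypothetical names.

```java
import java.io.IOException;

class OrderedCollectableSketch {

    // Simplified stand-in for Pig's collectable loader interface
    // (assumption: the real interface may differ in detail).
    interface CollectableLoadFunc {
        void ensureAllKeyInstancesInSameSplit() throws IOException;
    }

    // Hypothetical indicator interface: a loader implementing it promises
    // both that keys do not span splits AND that each split is sorted by key.
    interface OrderedCollectableLoadFunc extends CollectableLoadFunc {
    }

    // Hypothetical loader that makes both guarantees.
    static class SortedLoader implements OrderedCollectableLoadFunc {
        @Override
        public void ensureAllKeyInstancesInSameSplit() {
            // nothing to do in this sketch
        }
    }

    // Hypothetical loader that only guarantees keys do not span splits.
    static class UnsortedLoader implements CollectableLoadFunc {
        @Override
        public void ensureAllKeyInstancesInSameSplit() {
            // nothing to do in this sketch
        }
    }

    // A planner-side check could then test for the stronger contract
    // before choosing an operator that relies on sorted split content.
    static boolean canUseCollectedGroup(Object loader) {
        return loader instanceof OrderedCollectableLoadFunc;
    }

    public static void main(String[] args) {
        System.out.println(canUseCollectedGroup(new SortedLoader()));
        System.out.println(canUseCollectedGroup(new UnsortedLoader()));
    }
}
```

With something like this, the sortedness requirement would be stated
in the type the loader implements instead of being implicit in the
operator.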


regards,
Vasco
