Hi, I am new to the list. I've been working on the Pig code base, adding my own blocking map-side POs (e.g., map-side join, map-side grouping) for cases where assertions can be made about the fragmentation of input relations. This is partly inspired by the new block placement policy possibilities in hadoop-2.
Anyway, my question to the list is the following. While looking at the code for POCollectedGroup I noticed that this PO expects the content of each split to be sorted. The collectable loader interface (CollectableLoadFunc), on the other hand, only seems to indicate that each key is confined to a single split. Why is there this discrepancy? Is there a good reason not to have an indicator interface that captures all input requirements, e.g., something like OrderedCollectableLoadFunc?

Regards,
Vasco
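
P.S. To make the suggestion concrete, here is a rough sketch of what I have in mind. Nothing below exists in Pig: OrderedCollectableLoadFunc and ensureKeysSortedWithinSplit are hypothetical names, and the CollectableLoadFunc declared here is only a local stand-in so the example compiles without Pig on the classpath (its single method mirrors, as far as I can tell, the one on Pig's interface). The two helper methods just spell out the two input properties I am talking about.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OrderedCollectableSketch {

    // Local stand-in for Pig's CollectableLoadFunc (assumption: one method
    // asking the loader to keep all instances of a key in the same split).
    interface CollectableLoadFunc {
        void ensureAllKeyInstancesInSameSplit() throws IOException;
    }

    // Hypothetical indicator interface: adds the ordering requirement that
    // POCollectedGroup appears to rely on. Not part of Pig.
    interface OrderedCollectableLoadFunc extends CollectableLoadFunc {
        void ensureKeysSortedWithinSplit() throws IOException;
    }

    // Property 1: keys inside one split are in non-decreasing order,
    // which also means equal keys sit next to each other.
    public static <K extends Comparable<K>> boolean isSortedWithinSplit(List<K> keys) {
        for (int i = 1; i < keys.size(); i++) {
            if (keys.get(i - 1).compareTo(keys.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    // Property 2: no key shows up in more than one split.
    public static <K> boolean keysConfinedToOneSplit(List<List<K>> splits) {
        Map<K, Integer> owner = new HashMap<>();
        for (int s = 0; s < splits.size(); s++) {
            for (K key : splits.get(s)) {
                Integer previous = owner.putIfAbsent(key, s);
                if (previous != null && previous != s) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

A loader implementing only the first interface would promise property 2; one implementing the hypothetical second interface would promise both, which is what a collected group seems to need.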