Hey James,

Have you looked at DataFu, LinkedIn's collection of Pig UDFs (
http://engineering.linkedin.com/open-source/introducing-datafu-open-source-collection-useful-apache-pig-udfs
)?

In particular, they have a UDF called BagSplit (
https://github.com/linkedin/datafu/blob/master/src/java/datafu/pig/bags/BagSplit.java).
It might not do exactly what you want, since it splits a bag into bags of a
fixed size n rather than into a fixed number (10) of equal-sized bags, but it
shouldn't be too hard to write your own UDF using BagSplit.java as a reference.
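
To give a rough idea of the chunking logic such a UDF would need, here is a
minimal sketch in plain Java. It is not taken from DataFu, and the class and
method names (EqualSplit, split) are made up for illustration. It works on a
java.util.List rather than a Pig DataBag, so you would still have to wrap it
in a UDF and iterate over the input bag yourself. When the size is not
divisible by k, the first few chunks get one extra element, so sizes differ
by at most one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pig or DataFu.
class EqualSplit {
    // Splits items into exactly k consecutive chunks whose sizes differ by at most one.
    static <T> List<List<T>> split(List<T> items, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        int n = items.size();
        int base = n / k;   // minimum size of every chunk
        int extra = n % k;  // the first `extra` chunks get one more element
        List<List<T>> chunks = new ArrayList<List<T>>(k);
        int start = 0;
        for (int i = 0; i < k; i++) {
            int size = base + (i < extra ? 1 : 0);
            // Copy the slice so each chunk is independent of the input list.
            chunks.add(new ArrayList<T>(items.subList(start, start + size)));
            start += size;
        }
        return chunks;
    }
}
```

Note that if the bag has fewer than k elements, some of the chunks come back
empty; whether that is acceptable depends on what you do with them next.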

Dan F.



On Wed, Apr 11, 2012 at 8:53 AM, James Newhaven <[email protected]> wrote:

> Hi,
>
> I need to divide a large bag into 10 smaller bags of equal size. Does
> anyone know of a function that can do this easily? I've had a look at the
> standard functions and the PiggyBank and can't find anything appropriate.
>
> Thanks,
> James
>
