Ah, I suppose I was just proving it could be done.
To make a new one, you'd do:
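import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.BagFactory;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;

public class MyUdf extends EvalFunc<DataBag> {
    private static final BagFactory mBagFactory = BagFactory.getInstance();

    public DataBag exec(Tuple input) throws IOException {
        // Build a fresh bag and copy every tuple of the input bag into it.
        DataBag output = mBagFactory.newDefaultBag();
        for (Tuple t : (DataBag)input.get(0)) {
            output.add(t);
        }
        return output;
    }
}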
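
If the new bag should actually differ from the input, a variant along these
lines might be closer to what was asked. This is only a sketch: the class name
DropEmptyTuples and the rule of skipping null or empty tuples are made up for
illustration. The outputSchema method follows Kris's suggestion quoted below,
and handles the FrontendException that Schema.getField can throw by falling
back to an unknown schema.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.BagFactory;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.FrontendException;
import org.apache.pig.impl.logicalLayer.schema.Schema;

// Hypothetical example: returns a new bag holding only the non-empty tuples.
public class DropEmptyTuples extends EvalFunc<DataBag> {
    private static final BagFactory BAG_FACTORY = BagFactory.getInstance();

    @Override
    public DataBag exec(Tuple input) throws IOException {
        // A missing or null argument yields a null result.
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        DataBag output = BAG_FACTORY.newDefaultBag();
        for (Tuple t : (DataBag) input.get(0)) {
            // Assumed filter rule: keep tuples that have at least one field.
            if (t != null && t.size() > 0) {
                output.add(t);
            }
        }
        return output;
    }

    @Override
    public Schema outputSchema(Schema input) {
        // Without a usable input schema, leave the output schema unknown.
        if (input == null || input.size() == 0) {
            return null;
        }
        try {
            // The output bag has the same shape as the input bag.
            Schema output = new Schema();
            output.add(input.getField(0));
            return output;
        } catch (FrontendException e) {
            return null;
        }
    }
}
```

Once the jar is registered, it should be callable like any other eval function,
e.g. FOREACH grouped GENERATE DropEmptyTuples(items); (the relation and field
names there are placeholders).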
2013/3/18 Kris Coward <[email protected]>
>
> But he asked for a function that returns *another* bag ;)
>
> Snark aside, when returning bags or tuples, it's also worthwhile to at
> least consider defining the output schema, which for your example
> code would probably mean
>
> public Schema outputSchema(Schema input) {
>     Schema output = new Schema();
>     output.add(input.getField(0));
>     return output;
> }
>
> (possibly with some omitted exception handling)
>
> -Kris
>
> On Mon, Mar 18, 2013 at 11:19:17AM +0100, Jonathan Coveney wrote:
> > Absolutely.
> >
> > public class MyUdf extends EvalFunc<DataBag> {
> >     public DataBag exec(Tuple input) throws IOException {
> >         return (DataBag)input.get(0);
> >     }
> > }
> >
> >
> > A dummy example, but there you go. DataBag is a valid pig type like any
> > other, so you just return it like you would normally.
> >
> >
> > 2013/3/18 pranjal rajput <[email protected]>
> >
> > > Hi,
> > > Can we define a UDF in pig that takes a bag as an input and returns
> > > another bag as output?
> > > How can this be done?
> > > Thanks,
> > > --
> > > regards
> > > Pranjal
> > >
>
> --
> Kris Coward http://unripe.melon.org/
> GPG Fingerprint: 2BF3 957D 310A FEEC 4733 830E 21A4 05C7 1FEB 12B3
>