Hi
I would like to create a bag of tuples using an eval UDF.
I wrote a simple eval method, but when I use it Pig cannot figure out the
schema of the UDF's output.
When I call "describe" on the output I get {(null)}.
I tried to set the schema using the "as" clause (i.e. "as
{(w1:chararray, w2:chararray)}"), but Pig cannot parse this.
To test this I wrote the eval method below and called it using the
following commands:
A = Load 'test' as (x:int,y:int);
B = ForEach A Generate CreateBag(x,y);
Describe B;
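import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.BagFactory;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;

public class CreateBag extends EvalFunc<DataBag> {
    TupleFactory mTupleFactory = TupleFactory.getInstance();
    BagFactory mBagFactory = BagFactory.getInstance();

    @Override
    public DataBag exec(Tuple input) throws IOException {
        // Both input fields are loaded as int, so they arrive as Integer objects.
        int a = (Integer) input.get(0);
        int b = (Integer) input.get(1);
        DataBag result = mBagFactory.newDefaultBag();
        // Build two tuples of (a+1, b+1) and add both to the bag.
        Tuple t1 = mTupleFactory.newTuple(2);
        t1.set(0, a + 1);
        t1.set(1, b + 1);
        Tuple t2 = mTupleFactory.newTuple(2);
        t2.set(0, a + 1);
        t2.set(1, b + 1);
        result.add(t1);
        result.add(t2);
        return result;
    }
}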
How can I call it so that Pig sees the correct schema for the output, i.e.
{(n1:int,n2:int)}
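One direction that may work, as a minimal sketch only: EvalFunc allows a UDF to declare its result schema by overriding outputSchema(Schema). The version below assumes the standard org.apache.pig Schema and DataType classes are on the classpath. The aliases "b" (for the bag) and "t" (for the tuple) are hypothetical names chosen for illustration, and exec is shortened to a single tuple to keep the example small.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.BagFactory;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;
import org.apache.pig.impl.logicalLayer.FrontendException;
import org.apache.pig.impl.logicalLayer.schema.Schema;

public class CreateBag extends EvalFunc<DataBag> {
    private final TupleFactory tupleFactory = TupleFactory.getInstance();
    private final BagFactory bagFactory = BagFactory.getInstance();

    @Override
    public DataBag exec(Tuple input) throws IOException {
        int a = (Integer) input.get(0);
        int b = (Integer) input.get(1);
        DataBag result = bagFactory.newDefaultBag();
        // One tuple (a+1, b+1) is enough to illustrate the shape of the output.
        Tuple t = tupleFactory.newTuple(2);
        t.set(0, a + 1);
        t.set(1, b + 1);
        result.add(t);
        return result;
    }

    // Describes the output as a bag of tuples with two int fields, n1 and n2.
    @Override
    public Schema outputSchema(Schema input) {
        try {
            Schema tupleSchema = new Schema();
            tupleSchema.add(new Schema.FieldSchema("n1", DataType.INTEGER));
            tupleSchema.add(new Schema.FieldSchema("n2", DataType.INTEGER));
            // "t" and "b" are illustrative aliases, not required names.
            Schema bagSchema = new Schema(
                    new Schema.FieldSchema("t", tupleSchema, DataType.TUPLE));
            return new Schema(
                    new Schema.FieldSchema("b", bagSchema, DataType.BAG));
        } catch (FrontendException e) {
            // Returning null leaves the schema unknown to Pig.
            return null;
        }
    }
}
```

With such an override in place, Describe B should report the nested n1 and n2 int fields instead of {(null)}, though the exact output format may differ between Pig versions.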
Thanks
Manu Cohen-Yashar
Senior Architect, Cloud Computing and Application Security
Sela Group
Phone: 972-4-9881203
Mobile: 972-52-5574551