Hi

 

I would like to create a bag of tuples using an eval UDF.

I wrote a simple eval method, but when I use it Pig cannot figure out the
schema of the UDF's output.

When I call "describe" on the output I get {(null)}

I tried to set the schema using the "as" clause (i.e. "as
{(w1:chararray, w2:chararray)}"), but Pig cannot parse this.

 

To test this I wrote the following eval method and called it using the
following commands:

 

A = Load 'test' as (x:int,y:int);
B = ForEach A Generate CreateBag(x,y);
Describe B;

 

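import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.BagFactory;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;

public class CreateBag extends EvalFunc<DataBag> {

       private final TupleFactory mTupleFactory = TupleFactory.getInstance();
       private final BagFactory mBagFactory = BagFactory.getInstance();

       @Override
       public DataBag exec(Tuple input) throws IOException {
                     // Read the two int fields passed in from the Pig script.
                     int a = (Integer) input.get(0);
                     int b = (Integer) input.get(1);

                     DataBag result = mBagFactory.newDefaultBag();

                     // First tuple: (a+1, b+1)
                     Tuple t1 = mTupleFactory.newTuple(2);
                     t1.set(0, a+1);
                     t1.set(1, b+1);

                     // Second tuple (same values as the first in this test)
                     Tuple t2 = mTupleFactory.newTuple(2);
                     t2.set(0, a+1);
                     t2.set(1, b+1);

                     result.add(t1);
                     result.add(t2);

                     return result;
       }
}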

 

How can I call it so that Pig sees the correct schema, i.e.
{(n1:int,n2:int)}?

 

Thanks

 

Manu Cohen-Yashar

Senior Architect, Cloud Computing and Application Security

Sela Group

 

Phone: 972-4-9881203

Mobile: 972-52-5574551

 
