You can FLATTEN the result, or you can override outputSchema() so that the UDF itself declares the schema of its output.
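For the outputSchema route, a rough sketch of what the UDF could look like is below. It assumes Pig 0.9.x on the classpath. The exec() body is only a hypothetical stand-in (it pairs up the first two input fields), not your actual co-occurrence logic, and the field names b, t, x and y are arbitrary.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.BagFactory;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;
import org.apache.pig.impl.logicalLayer.schema.Schema;
import org.apache.pig.impl.util.Utils;

public class UrlCoOccurence extends EvalFunc<DataBag> {

    @Override
    public DataBag exec(Tuple input) throws IOException {
        // Hypothetical stand-in body: emit a single (chararray, chararray)
        // tuple built from the first two input fields.
        DataBag bag = BagFactory.getInstance().newDefaultBag();
        if (input == null || input.size() < 2) {
            return bag;
        }
        Tuple pair = TupleFactory.getInstance().newTuple(2);
        pair.set(0, String.valueOf(input.get(0)));
        pair.set(1, String.valueOf(input.get(1)));
        bag.add(pair);
        return bag;
    }

    @Override
    public Schema outputSchema(Schema input) {
        try {
            // Declare the result as a bag of two-chararray tuples so Pig
            // knows the shape without a cast in the script.
            return Utils.getSchemaFromString(
                "b:bag{t:tuple(x:chararray,y:chararray)}");
        } catch (Exception e) {
            // The checked exception type differs between Pig versions,
            // so catch broadly and rethrow unchecked.
            throw new RuntimeException("Could not parse output schema", e);
        }
    }
}
```

With the schema declared this way, the script should no longer need the (bag{...}) cast that triggers ERROR 1052.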
You can find an example here:
http://pig.apache.org/docs/r0.9.1/udf.html#eval-functions

Or, if you're using trunk, you can use Dmitriy's @OutputSchema annotation, so you'd do

@OutputSchema("b:bag{t:tuple(x:chararray,y:chararray)}")

and it would do the magic for you :)

If you're not using trunk, it is definitely still easier to use Utils.getSchemaFromString() instead of building it up.

2011/11/29 Prashant Kommireddi <[email protected]>

> You could possibly FLATTEN out the results from your UDF
>
> u = foreach g generate FLATTEN(UrlCoOccurence($1)) as (v1, v2);
>
> On Tue, Nov 29, 2011 at 3:49 PM, Ayon Sinha <[email protected]> wrote:
>
> > Hi,
> > I have a UDF that is:
> > public DataBag exec(Tuple input) throws IOException
> >
> > This bag has tuples with 2 String fields each.
> > How do I tell in Pig to expect a bag{tuple(chararray, chararray)} from the
> > UDF call
> >
> > u = foreach g generate UrlCoOccurence($1) as pairs;
> >
> > I tried this
> > u = foreach g generate (bag{tuple(chararray, chararray)})UrlCoOccurence($1) as pairs;
> >
> > this gives me:
> > 2011-11-29 15:47:07,048 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> > ERROR 1052: Cannot cast bag with schema bag to bag with schema
> > bag({(chararray,chararray)})
> >
> > Basically my UDF returns a bag of tuples which have 2 values. I need to
> > flatten it and v1 & v2.
> >
> > -Ayon
