You can FLATTEN the bag, or you can override outputSchema() in your UDF so
that Pig knows the schema of the UDF's output.

You can find an example here:
http://pig.apache.org/docs/r0.9.1/udf.html#eval-functions

Or, if you're using trunk, you can use Dmitriy's @OutputSchema annotation,
so you'd do

@OutputSchema("b:bag{t:tuple(x:chararray,y:chararray)}")

and it would do the magic for you :)

If you're not using trunk, it is definitely still easier to call
Utils.getSchemaFromString() in outputSchema() than to build the Schema up
by hand.
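
In case it helps to see what the FLATTEN route does to the bag your UDF
returns, here is a rough plain-Java model of it. It deliberately uses no Pig
classes: FlattenSketch and flattenAll() are hypothetical names made up for this
illustration, each String[] stands in for one (v1, v2) tuple, and each inner
list stands in for the bag returned for one input record. Treat it as a sketch
of the semantics, not as how Pig implements it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Plain-Java model of "foreach g generate FLATTEN(udf($1)) as (v1, v2)".
// Hypothetical names; no Pig classes are involved.
class FlattenSketch {

    // bagsPerRecord.get(i) models the bag the UDF returned for input record i.
    // FLATTEN turns every tuple in every bag into its own output row, so a
    // record whose bag is empty contributes no rows at all.
    static List<String[]> flattenAll(List<List<String[]>> bagsPerRecord) {
        List<String[]> rows = new ArrayList<>();
        for (List<String[]> bag : bagsPerRecord) {
            for (String[] tuple : bag) {
                // Copy the two fields so the row is independent of the bag.
                rows.add(new String[] { tuple[0], tuple[1] });
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<String[]>> bags = new ArrayList<>();
        // Record 1: the UDF found two co-occurring URL pairs (made-up data).
        bags.add(Arrays.asList(
                new String[] { "a.example", "b.example" },
                new String[] { "a.example", "c.example" }));
        // Record 2: the UDF returned an empty bag.
        bags.add(new ArrayList<String[]>());
        // Record 3: one pair.
        bags.add(Collections.singletonList(
                new String[] { "x.example", "y.example" }));

        // Print one line per flattened row, fields separated by a tab.
        for (String[] row : flattenAll(bags)) {
            System.out.println(row[0] + "\t" + row[1]);
        }
    }
}
```

The point of the model is that after FLATTEN there is no bag left to describe,
only rows with two fields, which is why naming them with "as (v1, v2)" is
enough on that route.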

2011/11/29 Prashant Kommireddi <[email protected]>

> You could possibly FLATTEN out the results from your UDF
> u = foreach g generate FLATTEN(UrlCoOccurence($1)) as (v1, v2);
>
> On Tue, Nov 29, 2011 at 3:49 PM, Ayon Sinha <[email protected]> wrote:
>
> > Hi,
> > I have a UDF that is:
> > public DataBag exec(Tuple input) throws IOException
> >
> > This bag has tuples with 2 String fields each.
> > How do I tell in Pig to expect a bag{tuple(chararray, chararray)} from the
> > UDF call
> >
> > u = foreach g generate UrlCoOccurence($1) as pairs;
> >
> >
> >
> > I tried this
> > u = foreach g generate (bag{tuple(chararray,
> > chararray)})UrlCoOccurence($1) as pairs;
> >
> > this gives me:
> > 2011-11-29 15:47:07,048 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> > ERROR 1052: Cannot cast bag with schema bag to bag with schema
> > bag({(chararray,chararray)})
> >
> >
> > Basically my UDF returns a bag of tuples which have 2 values. I need to
> > flatten it and get v1 & v2.
> >
> > -Ayon
> >
>
