Thanks, that's ultimately what I went with (I saw how it was done in the
AvroStorage class). I thought there might be a cleaner/simpler/better way
I was missing.

--jacob
@thedatachef

On Tue, 2011-02-01 at 21:22 +0530, Harsh J wrote:
> I remember facing this problem when trying to implement a Load/Store
> quite a while ago.
> 
> The issue (not really an issue, I guess) is that checkSchema is a
> front-end method: it is called, perhaps multiple times, by Pig's
> front-end code. It isn't called by Pig's back-end code, which runs on
> a given platform (Local or Hadoop), so an instance variable set there
> is not available by the time putNext runs.
> 
> To persist your schema, ensure you put it onto the 'JobConf' (in loose
> terms). Pig lets you do this through the UDFContext class for UDFs.
> Get a UDFContext for your UDF, then set a property in it with a key
> identifying your schema (or other data) and the data itself as the
> value. Retrieve it the same way in the other methods, wherever you
> need it (getOutputFormat, putNext, etc.).
> 
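
For anyone finding this thread later, here is a rough sketch of the
hand-off described above. It deliberately does not depend on Pig: a plain
java.util.Properties stands in for the properties you would get from
UDFContext, and the class name, the key prefix and the "signature" string
are all made up for illustration. In a real StoreFunc the Properties object
would, as far as I understand the API, come from something like
UDFContext.getUDFContext().getUDFProperties(...), so treat this as the
shape of the idea rather than working Pig code.

```java
import java.util.Properties;

// Hypothetical stand-in for a StoreFunc; not a real Pig class.
class SchemaHandoffSketch {
    // Made-up key prefix; any key unique to this UDF instance would do.
    private static final String KEY_PREFIX = "sketch.schema.";

    private final Properties udfProps; // stand-in for the UDFContext properties
    private final String signature;    // distinguishes multiple store statements
    private String[] fieldNames;       // lazily restored on the "back end"

    SchemaHandoffSketch(Properties udfProps, String signature) {
        this.udfProps = udfProps;
        this.signature = signature;
    }

    // "Front end": save the field names as a string property instead of
    // relying on an instance variable.
    void checkSchema(String[] names) {
        udfProps.setProperty(KEY_PREFIX + signature, String.join(",", names));
    }

    // "Back end": a fresh instance reads the names back before using them.
    String putNext(Object[] tuple) {
        if (fieldNames == null) {
            String stored = udfProps.getProperty(KEY_PREFIX + signature);
            if (stored == null) {
                throw new IllegalStateException("no schema stored for " + signature);
            }
            fieldNames = stored.split(",");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < tuple.length && i < fieldNames.length; i++) {
            if (i > 0) {
                out.append(',');
            }
            out.append(fieldNames[i]).append('=').append(tuple[i]);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Properties conf = new Properties(); // plays the role of the job configuration
        SchemaHandoffSketch front = new SchemaHandoffSketch(conf, "store-1");
        front.checkSchema(new String[] {"id", "name"});

        // A separate instance, as on the back end, shares only the properties.
        SchemaHandoffSketch back = new SchemaHandoffSketch(conf, "store-1");
        System.out.println(back.putNext(new Object[] {1, "ann"})); // prints id=1,name=ann
    }
}
```

The point of the two instances in main is that nothing is shared between
them except the properties, which mirrors why the instance variable set in
checkSchema was null in putNext.
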
> On Tue, Feb 1, 2011 at 10:16 AM, Jacob Perkins
> <[email protected]> wrote:
> > Trying to write a simple StoreFunc that makes use of the input data's
> > field names. Is there a way to gain access to these inside the call to
> > putNext? Ostensibly you could set a variable with the schema during the
> > call to checkSchema, e.g. as in HBaseStorage, but as far as I can tell
> > it is null by the time putNext is called. Is there some other way, or
> > am I missing something obvious?
> >
> > --jacob
> > @thedatachef
> >
> >
> 
> 
> 

