Thanks, that's ultimately what I went with. (Saw how it was done in the AvroStorage class). Thought there might be a cleaner/simpler/better way I was missing.
--jacob
@thedatachef

On Tue, 2011-02-01 at 21:22 +0530, Harsh J wrote:
> I remember facing this problem when trying to implement a Load/Store
> quite a while ago.
>
> The issue (not really an issue I guess) is that checkSchema is a
> front-end method. One that is used, perhaps multiple times, in the
> Pig's front-end code. It isn't called by the back-end code of Pig that
> runs on a given platform (Local or Hadoop).
>
> To persist your schema, ensure you put it onto the 'JobConf' (in loose
> terms). Pig lets you do this by using the UDFContext class for UDFs.
> Get a UDFContext for your UDF, then set a property in it with a key
> signifying your schema/other data and the value. Similarly, retrieve
> it in the other methods using a similar way, wherever you need it
> (getOutputFormat, putNext, etc.).
>
> On Tue, Feb 1, 2011 at 10:16 AM, Jacob Perkins
> <[email protected]> wrote:
> > Trying to write a simple storefunc that makes use of the input data's
> > field names. Is there a way to gain access to this inside of the call to
> > putNext? Ostensibly you could set a variable with the schema during the
> > call to checkSchema, eg. in HBaseStorage, but as far as I can tell this
> > is null by the time putNext is called. Is there some other way or am I
> > missing something obvious?
> >
> > --jacob
> > @thedatachef
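
For anyone finding this thread later, here is a rough sketch of the hand-off Harsh describes. It deliberately does not depend on Pig: FakeUdfContext is a hypothetical stand-in for the Properties object that, as far as I understand it, you would get in a real StoreFunc from UDFContext.getUDFContext().getUDFProperties(...), and FieldNameStorer, its simplified method signatures, and the key "example.field.names" are made up for illustration. The point is only that the instance seeing checkSchema on the front end is not the instance seeing putNext on the back end, so the field names have to travel through the shared properties rather than through a member variable.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for Pig's UDFContext: one Properties bag per UDF signature.
// In real Pig the properties travel with the job configuration; a static map
// simulates that hand-off inside a single JVM.
class FakeUdfContext {
    private static final Map<String, Properties> STORE = new HashMap<>();

    static Properties getUdfProperties(String signature) {
        return STORE.computeIfAbsent(signature, k -> new Properties());
    }
}

// Hypothetical store function. Method names mirror StoreFunc, types are simplified.
class FieldNameStorer {
    private static final String SCHEMA_KEY = "example.field.names"; // made-up key
    private final String signature;
    private List<String> cachedNames; // stays null on a freshly created instance

    FieldNameStorer(String signature) {
        this.signature = signature;
    }

    // Front end: write the field names into the shared properties,
    // because this instance will not be the one that stores tuples.
    void checkSchema(List<String> fieldNames) {
        FakeUdfContext.getUdfProperties(signature)
                .setProperty(SCHEMA_KEY, String.join(",", fieldNames));
    }

    // Back end: recover the field names from the properties on first use.
    String putNext(List<String> values) {
        if (cachedNames == null) {
            String stored = FakeUdfContext.getUdfProperties(signature)
                    .getProperty(SCHEMA_KEY);
            if (stored == null) {
                throw new IllegalStateException("schema was never stored");
            }
            cachedNames = Arrays.asList(stored.split(","));
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(cachedNames.get(i)).append('=').append(values.get(i));
        }
        return sb.toString();
    }
}

class SchemaHandoffSketch {
    public static void main(String[] args) {
        // "Front end": one instance sees the schema.
        new FieldNameStorer("store-1").checkSchema(Arrays.asList("id", "name"));
        // "Back end": a different instance with the same signature writes tuples.
        FieldNameStorer backEnd = new FieldNameStorer("store-1");
        System.out.println(backEnd.putNext(Arrays.asList("7", "pig")));
    }
}
```

In a real StoreFunc the signature would be whatever Pig hands to setStoreFuncUDFContextSignature, and the schema would more likely be serialized from the ResourceSchema than joined with commas; both of those details are simplified away here.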
