It's pretty straightforward; that's why the LoadMetadata interface exists.
You just have to implement it in your loader and translate however you store
the schema into a Pig Schema object.
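
Here is a minimal sketch of the translation step, assuming a hypothetical
meta file with one "name type" pair per line. That file format, the class
SchemaFileSketch, and the method toPigSchemaString are made up for
illustration; none of them are part of Pig. Inside your loader's
LoadMetadata.getSchema(location, job) you could read such a file, build a
schema string like this one, and (as far as I recall, so check it against
your Pig version) parse it with org.apache.pig.impl.util.Utils.getSchemaFromString
and wrap the result in a ResourceSchema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Pig: turns the lines of a simple
// "name type" meta file into a Pig schema string such as "a:int, b:chararray".
class SchemaFileSketch {

    static String toPigSchemaString(List<String> metaLines) {
        List<String> fields = new ArrayList<>();
        for (String raw : metaLines) {
            String line = raw.trim();
            // Skip blank lines and '#' comments in the (assumed) meta file format.
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            // Each remaining line must be exactly: <column name> <pig type>
            String[] parts = line.split("\\s+");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected 'name type', got: " + raw);
            }
            fields.add(parts[0] + ":" + parts[1]);
        }
        return String.join(", ", fields);
    }

    public static void main(String[] args) {
        List<String> meta = Arrays.asList(
                "# columns of the dataset",
                "user_id long",
                "name chararray",
                "score double");
        // Prints: user_id:long, name:chararray, score:double
        System.out.println(toPigSchemaString(meta));
    }
}
```

Going through a schema string keeps the meta-file parsing independent of Pig,
so the only Pig-specific code left in the loader is the getSchema() method
itself.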

PigStorageSchema reads a JSON file that describes the schema, so you can
look at how it's done there (actually, PigStorage itself will do that in
trunk).

You can also check out what the Elephant-Bird library does for loading
protocol buffers and thrift objects, where schema is derived from the
object itself.

-Dmitriy

On Fri, Feb 3, 2012 at 4:35 AM, praveenesh kumar <[email protected]> wrote:

> Hey guys,
>
> I am new to Pig.
> I was wondering whether it is possible to pass a schema to the Pig LOAD
> statement when loading the data for the first time.
>
> Suppose I have a huge dataset containing around 100 columns. Is there a
> way to pass a schema defined in some other file (some kind of meta file)
> into the Pig LOAD statement, or do I have to define it every time inside
> the LOAD statement?
>
> Thanks,
> Praveenesh
>
