Yeah, I tried that.
Here's what I get for a small sample data set:

{
"fields":
[
    {"name":"name","type":55,"description":"autogenerated from Pig Field Schema","schema":null},
    {"name":"age","type":10,"description":"autogenerated from Pig Field Schema","schema":null},
    {"name":"gpa","type":20,"description":"autogenerated from Pig Field Schema","schema":null}
],

"version":0,
"sortKeys":[],
"sortKeyOrders":[]
}


I want to see whether I can decode this format, define my own schema in
the same way, and use it in a Pig load function.
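
As a rough sketch of what that decoding could look like: the snippet below
assumes the integer type codes are the constants from
org.apache.pig.data.DataType (for example 10 = int, 20 = float,
55 = chararray, which matches the sample above). The helper names
describe_schema and build_schema are made up for illustration, and the
generated JSON simply mirrors the keys seen in the sample file. Whether
PigStorage accepts a hand-built file like this has not been verified here.

```python
import json

# Assumed mapping, based on the constants in org.apache.pig.data.DataType.
PIG_TYPE_NAMES = {
    5: "boolean",
    10: "int",
    15: "long",
    20: "float",
    25: "double",
    50: "bytearray",
    55: "chararray",
    100: "map",
    110: "tuple",
    120: "bag",
}
PIG_TYPE_CODES = {name: code for code, name in PIG_TYPE_NAMES.items()}


def describe_schema(schema_json):
    """Turn the JSON text of a .pig_schema file into 'name:type, ...'."""
    schema = json.loads(schema_json)
    parts = []
    for field in schema["fields"]:
        code = field["type"]
        type_name = PIG_TYPE_NAMES.get(code, "unknown({})".format(code))
        parts.append("{}:{}".format(field["name"], type_name))
    return ", ".join(parts)


def build_schema(columns):
    """Build .pig_schema-style JSON text from (name, type_name) pairs.

    The key layout copies the sample file above; it is not taken from
    any official specification.
    """
    fields = [
        {
            "name": name,
            "type": PIG_TYPE_CODES[type_name],
            "description": "autogenerated from Pig Field Schema",
            "schema": None,
        }
        for name, type_name in columns
    ]
    return json.dumps(
        {"fields": fields, "version": 0, "sortKeys": [], "sortKeyOrders": []}
    )


sample = build_schema([("name", "chararray"), ("age", "int"), ("gpa", "float")])
print(describe_schema(sample))  # name:chararray, age:int, gpa:float
```

Keeping the mapping in one dictionary means a plain-text column list with
100 columns could be converted in one pass, but nested types (map, tuple,
bag) would need the "schema" key filled in, which this sketch does not do.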

Thanks,
Praveenesh

On Mon, Feb 6, 2012 at 2:41 PM, Dmitriy Ryaboy <[email protected]> wrote:

> It reads the schema file *it creates*. So, you process some data, store
> it, then read it back later, and the schema is back.
> Like I said, the JSON is not very human-readable -- the types are integers
> rather than words like "chararray", etc.
> Try saving something and check out the .pig_schema file to see an example.
>
> D
>
> On Sun, Feb 5, 2012 at 10:59 PM, praveenesh kumar <[email protected]> wrote:
>
> > Okay, so how can I make use of the -schema option with PigStorage?
> >
> > Suppose my JSON schema is:
> >
> > {
> >        "name":"Student_Data",
> >        "properties":
> >        {
> >                "id":
> >                {
> >                        "type":"INTEGER",
> >                        "description":"Student id"
> >                },
> >                "name":
> >                {
> >                        "type":"CHARARRAY",
> >                        "description":"Name of the student"
> >
> >                },
> >                "marks":
> >                {
> >                        "type":"INTEGER",
> >                        "description":"Marks of the student"
> >                }
> >
> >        }
> > }
> >
> > I tried to write the above schema using Pig data types. Can I use it, or is
> > there a different way to use the "-schema" option?
> > -schema: Reads/Stores the schema of the relation using a hidden JSON file.
> >
> > Or is there some other way to pass a schema defined in a separate
> > plain-text file and read it using PigStorage?
> >
> > Thanks,
> > Praveenesh
> >
> >
> > On Mon, Feb 6, 2012 at 12:18 PM, Dmitriy Ryaboy <[email protected]>
> > wrote:
> >
> > > It's a JSON serialization of the Pig schema object, and isn't really
> > > meant to be created by hand.
> > > Patches to make it more human-friendly would be quite welcome.
> > >
> > > D
> > >
> > > On Sun, Feb 5, 2012 at 10:35 PM, praveenesh kumar <[email protected]> wrote:
> > >
> > > > Thanks,
> > > > I was also looking at the -schema option in PigStorage.
> > > > But can anyone explain how we can define that JSON schema file?
> > > > A tutorial or small example would be very helpful.
> > > >
> > > > Praveenesh
> > > >
> > > > On Mon, Feb 6, 2012 at 11:55 AM, Dmitriy Ryaboy <[email protected]>
> > > > wrote:
> > > >
> > > > > It's pretty straightforward; that's why the LoadMetadata interface
> > > > > exists. You just have to implement it and translate however you store
> > > > > the schema to a Pig Schema object.
> > > > >
> > > > > PigStorageSchema will read a JSON file that describes the schema; you
> > > > > can look at how that's done there (actually, PigStorage itself will do
> > > > > that in trunk).
> > > > >
> > > > > You can also check out what the Elephant-Bird library does for loading
> > > > > protocol buffers and thrift objects, where the schema is derived from
> > > > > the object itself.
> > > > >
> > > > > -Dmitriy
> > > > >
> > > > > On Fri, Feb 3, 2012 at 4:35 AM, praveenesh kumar <[email protected]> wrote:
> > > > >
> > > > > > Hey guys,
> > > > > >
> > > > > > I am new to Pig.
> > > > > > I was wondering whether it is possible to pass a schema to the Pig
> > > > > > LOAD statement when loading data for the first time.
> > > > > >
> > > > > > Suppose I have a huge dataset containing around 100 columns. Is there
> > > > > > a way to pass a schema defined in some other file (some kind of meta
> > > > > > file) into the Pig LOAD statement, or do I have to define it every
> > > > > > time inside the LOAD statement?
> > > > > >
> > > > > > Thanks,
> > > > > > Praveenesh
> > > > > >
> > > > >
> > > >
> > >
> >
>
