David Ciemiewicz commented on PIG-760:


> Still flat-schemas only, haven't gotten around to wrestling the Jackson 
> Parser on this one. David - do you need nested schemas?

No, at this time, and for the foreseeable future, I do not need nested schemes 
(hierarchical schemes) in relationship to PigStorage.

In general, PigStorage users are doing single atomic value per column. So your 
current implementation sounds like it will suffice for now, with appropriate 
caveats in any docs.

BTW, is there a pointer to documentation of the .pig_schema file likes like.  
Guess there should be one for .pig_header too. :-)

> Serialize schemas for PigStorage() and other storage types.
> -----------------------------------------------------------
>                 Key: PIG-760
>                 URL: https://issues.apache.org/jira/browse/PIG-760
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: David Ciemiewicz
>            Assignee: Dmitriy V. Ryaboy
>             Fix For: 0.6.0
>         Attachments: pigstorageschema-2.patch, pigstorageschema.patch
> I'm finding PigStorage() really convenient for storage and data interchange 
> because it compresses well and imports into Excel and other analysis 
> environments well.
> However, it is a pain when it comes to maintenance because the columns are in 
> fixed locations and I'd like to add columns in some cases.
> It would be great if load PigStorage() could read a default schema from a 
> .schema file stored with the data and if store PigStorage() could store a 
> .schema file with the data.
> I have tested this out and both Hadoop HDFS and Pig in -exectype local mode 
> will ignore a file called .schema in a directory of part files.
> So, for example, if I have a chain of Pig scripts I execute such as:
> A = load 'data-1' using PigStorage() as ( a: int , b: int );
> store A into 'data-2' using PigStorage();
> B = load 'data-2' using PigStorage();
> describe B;
> describe B should output something like { a: int, b: int }

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

Reply via email to