pig-user  

Re: BinStorage and versioning

Kevin Weil
Wed, 08 Oct 2008 13:42:15 -0700

Thanks! Just tested -- it does not work in the release, but it does in the
types branch:

> cat f3.txt
a b c

> cat f4.txt
A B C D

a = load 'f3.txt' using PigStorage(' ') as (f1, f2, f3);
b = load 'f4.txt' using PigStorage(' ') as (F1, F2, F3, F4);
both = union a, b;
store both into 'fields.bz2' using BinStorage();

c = load 'fields.bz2' using BinStorage() as (p1, p2, p3, p4);
dump c;
(a, b, c)
(A, B, C, D)

k = foreach c generate p4;
dump k;
(D)
(NULL)
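
For anyone who wants to reason about this outside of Pig, here is a rough Python model of the behavior seen above. It is only an illustration, not how BinStorage is implemented: the helper name pad_to_schema is made up, and Python's None stands in for Pig's null. It assumes that a tuple shorter than the declared schema is padded with nulls on load.

```python
# Illustrative model only; this is NOT Pig's BinStorage implementation.
# pad_to_schema is a hypothetical helper; None stands in for Pig's null.

def pad_to_schema(row, schema_width):
    """Return the row as a tuple of exactly schema_width fields, null-padded."""
    fields = list(row[:schema_width])
    fields += [None] * (schema_width - len(fields))
    return tuple(fields)

# The two stored tuples from the test above (3 fields and 4 fields).
stored = [("a", "b", "c"), ("A", "B", "C", "D")]

# c = load ... as (p1, p2, p3, p4);
c = [pad_to_schema(row, 4) for row in stored]

# k = foreach c generate p4;
k = [(row[3],) for row in c]

# Rows written before the fourth field existed are the ones where p4 is null.
old_rows = [row for row in c if row[3] is None]

print(k)
```

In this model, rows written before the fourth field was added can be picked out by checking the last field for null, which is what old_rows does.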



On Wed, Oct 8, 2008 at 12:59 PM, Olga Natkovich <[EMAIL PROTECTED]> wrote:

> It should work just fine; the missing data will be replaced with null at
> least in the types branch.
>
> Olga
>
> > -----Original Message-----
> > From: Kevin Weil [EMAIL PROTECTED]
> > Sent: Wednesday, October 08, 2008 12:00 PM
> > To: pig-user@incubator.apache.org
> > Subject: BinStorage and versioning
> >
> > Hi,
> >
> > Say that I write out a tuple with three fields using
> > BinStorage.  And then in a couple months, I add a parameter,
> > so now I write out a tuple with a new fourth field.  If I'm
> > loading a directory containing the files that have both of
> > these tuples, and I say
> >
> > a = LOAD 'my_directory' using BinStorage() as (f1, f2, f3, f4)
> >
> > what will happen when BinStorage tries to load the first
> > tuple with only three fields?  Will f4 just be NULL?
> >
> > Thanks,
> > Kevin
> >
>