Mohit,
A = LOAD '/user/apuser/test/data1' AS b:bag{
Here you are naming your data bag b, so "generate b" just returns the whole bag again.
If you want to refer to the values inside the bag, try b.a or b.b.
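As a rough sketch of what that projection looks like (the alias P is just a name made up for illustration; the path and schema are the ones from your grunt session):

```pig
-- Load each line as a single bag field named b.
A = LOAD '/user/apuser/test/data1' AS b:bag{t:tuple(a:chararray, b:chararray)};
-- b.a and b.b each project one field out of every tuple in the bag,
-- so each result is itself a bag of single-field tuples.
P = FOREACH A GENERATE b.a, b.b;
DUMP P;
```

Compare that with "generate b", which passes the bag through untouched.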
The sample data I gave you was just something random. If you are trying to
skip over nulls, you can do so by using FILTER.
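A minimal sketch of that, assuming you flatten the bag first so that FILTER can see the inner fields (the aliases F and N and the field names a_val and b_val are made up for illustration):

```pig
-- Load each line as a single bag field named b.
A = LOAD '/user/apuser/test/data1' AS b:bag{t:tuple(a:chararray, b:chararray)};
-- FLATTEN un-nests the bag, producing one output row per tuple inside it.
F = FOREACH A GENERATE FLATTEN(b) AS (a_val:chararray, b_val:chararray);
-- Keep only the rows whose second field is not null.
N = FILTER F BY b_val IS NOT NULL;
DUMP N;
```

With the sample data from my earlier mail, only the row that has both values, (5,10), should be left.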
Take a look at http://pig.apache.org/docs/r0.11.0/
-Harsha
On Thursday, March 7, 2013 at 11:58 AM, Mohit Anchlia wrote:
> So I did this. I took your example and put it in a file and ran some pig
> commands through grunt but I am getting same results from a bag and
> generating from tuple. I might be doing something wrong here.
>
> grunt> A = LOAD '/user/apuser/test/data1' AS b:bag{t:tuple(a:chararray,
> b:chararray)};
> grunt> dump A;
> 2013-03-07 14:55:25,125 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input
> paths to process : 1
> ({(1,)})
> ({(3,)})
> ({(5,10)})
> ({(7,)})
>
> grunt> b = foreach A generate b;
> grunt> dump b;
> 2013-03-07 14:57:59,509 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input
> paths to process : 1
> ({(1,)})
> ({(3,)})
> ({(5,10)})
> ({(7,)})
> grunt>
>
> I get the same output again.
>
>
> On Thu, Mar 7, 2013 at 11:40 AM, Mohit Anchlia <[email protected]> wrote:
>
> > good suggestion. Let me try that
> >
> >
> > On Thu, Mar 7, 2013 at 11:27 AM, Harsha <[email protected]> wrote:
> >
> > > It will be easier if you have some sample data and run it through the
> > > grunt shell.
> > > Let's say you have a dataset like this:
> > > ({(1,)})
> > > ({(3,)})
> > > ({(5,10)})
> > > ({(7,)})
> > >
> > > Some rows have nulls for "b" and some rows have values for "b".
> > > If you do a "generate" on the above, it will run through each row
> > > and try to fetch the value for b; if there is none it will output (),
> > > something like this:
> > >
> > > ({()})
> > > ({()})
> > > ({(10)})
> > > ({()})
> > >
> > >
> > >
> > >
> > > --
> > > Harsha
> > >
> > >
> > > On Thursday, March 7, 2013 at 11:15 AM, Mohit Anchlia wrote:
> > >
> > > > sorry, yes my question was about accessing b not $1. What's the effect of
> > > > writing empty () to a file? Say if I did store b into temp, then should I
> > > > expect a line, or does nothing get written at all in the file?
> > > >
> > > > On Thu, Mar 7, 2013 at 10:53 AM, Harsha <[email protected]> wrote:
> > > >
> > > > > From your schema b:bag{t:tuple(a:chararray, b:chararray)}
> > > > > your tuple is inside a bag, so on the next line if you are trying to
> > > > > access it through $1, pig will throw an error saying non-existent column.
> > > > > But if your question is about accessing b, then it will print empty ()
> > > > > if there is no value present (as you are setting it as null).
> > > > >
> > > > > --
> > > > > Harsha
> > > > >
> > > > >
> > > > > On Thursday, March 7, 2013 at 10:35 AM, Mohit Anchlia wrote:
> > > > >
> > > > > > Thanks! Does "generate" skip over that? If I did b = foreach B generate $1,
> > > > > > what should be the expected outcome of alias "b"?
> > > > > >
> > > > > > On Thu, Mar 7, 2013 at 10:31 AM, Harsha <[email protected]> wrote:
> > > > > >
> > > > > > > Hi Mohit,
> > > > > > > it won't convert into the string literal 'NULL' since it's a tuple;
> > > > > > > you'll see results like
> > > > > > > ('Hello',)
> > > > > > >
> > > > > > > --
> > > > > > > Harsha
> > > > > > >
> > > > > > >
> > > > > > > On Thursday, March 7, 2013 at 10:10 AM, Mohit Anchlia wrote:
> > > > > > >
> > > > > > > > Any help would be appreciated. I'll also write something shortly and see
> > > > > > > > what happens.
> > > > > > > >
> > > > > > > > On Wed, Mar 6, 2013 at 4:58 PM, Mohit Anchlia <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > If I define and set tuple like this:
> > > > > > > > >
> > > > > > > > > Tuple t1 = mTupleFactory.newTuple(2);
> > > > > > > > > t1.set(0, "Hello");
> > > > > > > > > t1.set(1, null);
> > > > > > > > >
> > > > > > > > > and have schema like:
> > > > > > > > >
> > > > > > > > > b:bag{t:tuple(a:chararray, b:chararray)}
> > > > > > > > >
> > > > > > > > > and then in the pig script if I do:
> > > > > > > > >
> > > > > > > > > page = foreach B generate b;
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > What should be the expected outcome? Would "generate" convert NULL into
> > > > > > > > > the literal 'NULL' as a string? Or does it skip over that NULL?