Ah, but you can have the output schema be an echo of the input schema, and pass
your bag in as an (ignored) argument.
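
A rough sketch of what I mean, assuming a hypothetical UDF called NullBag
(not part of Pig; the jar and package names below are made up too) whose
exec() ignores its argument and always returns null, and whose
outputSchema() just hands back the schema of the bag it was given. I have
not tested this on either pig8 or pig9:

```pig
-- Hypothetical jar and class: NullBag is NOT shipped with Pig.
register myudfs.jar;
define NullBag com.example.pig.NullBag();

bag_of_stuff = load 'thing' as (x:int);
a = group bag_of_stuff all;
-- Both branches of the bincond are now declared as bags with the same
-- schema, so the null branch should no longer show up as a bytearray.
b = foreach a generate FLATTEN((IsEmpty(bag_of_stuff) ? NullBag(bag_of_stuff) :
    bag_of_stuff)) as stuff:int;
dump b;
```

The idea is only that the type checker sees a bag on both sides; whether
flattening the null result keeps the row the way the bare null did would
still need to be checked.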

On Dec 5, 2011, at 5:52 PM, Jonathan Coveney <[email protected]> wrote:

> 1) There is an error in the above. In pig8, the *following* worked (the
> two snippets above are the same):
> 
> bag_of_stuff = load 'thing' as (x:int);
> a = group bag_of_stuff all;
> b = foreach a generate FLATTEN((IsEmpty(bag_of_stuff) ? null :
> bag_of_stuff)) as stuff; --no :int
> dump b;
> 
> 2) Dmitriy, I thought about doing something like that, but I don't know
> that it would work. If the UDF just outputs a single null, then its schema
> is going to be "null," and I imagine you'd see the same error (though I can
> of course test that). To avoid the error, it'd have to be a bag with a null
> element, but then it'd have the same issue the code is trying to avoid: if
> you flatten a bag with a null, the row disappears.
> 
> 2011/12/5 Dmitriy Ryaboy <[email protected]>
> 
>> s/null/UdfThatContainsASingleNull/ ?
>> 
>> On Mon, Dec 5, 2011 at 5:04 PM, Jonathan Coveney <[email protected]>
>> wrote:
>>> In pig8, the following worked:
>>> 
>>> bag_of_stuff = load 'thing' as (x:int);
>>> a = group bag_of_stuff all;
>>> b = foreach a generate FLATTEN((IsEmpty(bag_of_stuff) ? null :
>>> bag_of_stuff)) as stuff:int;
>>> dump b;
>>> 
>>> In pig9, however, in some cases, this could lead to an error, because you
>>> need to explicitly set the type of "stuff," which leads to:
>>> 
>>> bag_of_stuff = load 'thing' as (x:int);
>>> a = group bag_of_stuff all;
>>> b = foreach a generate FLATTEN((IsEmpty(bag_of_stuff) ? null :
>>> bag_of_stuff)) as stuff:int;
>>> dump b;
>>> 
>>> However, this doesn't work in pig8.
>>> 
>>> 2011-12-06 00:50:54,949 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>>> ERROR 1022: Type mismatch merging schema prefix. Field Schema: bytearray.
>>> Other Field Schema: stuff: int
>>> 
>>> I'm not sure what the best way around this is. You can't explicitly cast
>>> (int)null, because then you get:
>>> 
>>> 2011-12-06 01:02:11,962 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>>> ERROR 1050: Unsupported input type for BinCond: left hand side: int; right
>>> hand side: bag
>>> 
>>> Any suggestions would be welcome. Maybe it'd be worth making a flatten
>>> that, in the case of an empty bag, returns a null row instead of getting
>>> washed out? I know it's sort of annoying given I know how to make it work
>>> in pig9, but I'd like for the script that uses this to work in both pig8
>>> and pig9, ideally...
>> 
