Have tried both ways foo is null OR foo == '\n', doesn't work in pig.
Why would null values be saved as \N in a file ? Is there a reason, is this
hive or hadoop way which pig cant understand ?


On 5/17/10 11:53 AM, "Dmitriy Ryaboy" <[email protected]> wrote:

> Arguably, that's a Hive bug. What does hive do if you *want* to have a \n as
> a value?
> 
> For your case, I think it's as simple as foreach rel generate ( foo is null
> OR foo == '\n' ? 'U' : foo);
> 
> -D
> 
> On Mon, May 17, 2010 at 11:42 AM, Syed Wasti <[email protected]> wrote:
> 
>> Well Dmitriy, my bad, I was looking at the data through a hive query and it
>> shows as NULL, but when I looked into the flat file all the NULL values are
>> are seen as \N.
>> Hive is able to understand \N as NULL but pig is not... How can I resolve
>> this ?
>> 
>> On 5/16/10 4:33 PM, "Dmitriy Ryaboy" <[email protected]> wrote:
>> 
>>> In that case, maybe it's the data, and what you think is null is actually
>>> '\n' ?
>>> 
>>> -D
>>> 
>>> On Sun, May 16, 2010 at 4:07 PM, Syed Wasti <[email protected]> wrote:
>>> 
>>>> Doing absolutely the same thing and I am using pig 6 too.
>>>> Tried with the fake data on both local and mapreduce modes, works fine.
>>>> But on my script against actual data in mapreduce mode, it fails to do
>> the
>>>> same thing, places \N instead of U.
>>>> 
>>>> grunt> rel1 = LOAD '/user/swasti/data' USING PigStorage('\t') as (num);
>>>> grunt> dump rel1;
>>>> (1)
>>>> (2)
>>>> (3)
>>>> ()
>>>> (5)
>>>> grunt> find_null = FOREACH rel1 GENERATE (num is null?'U':num);
>>>> grunt> dump find_null;
>>>> (1)
>>>> (2)
>>>> (3)
>>>> (U)
>>>> (5)
>>>> 
>>>> 
>>>> On 5/16/10 2:23 PM, "Dmitriy Ryaboy" <[email protected]> wrote:
>>>> 
>>>>> So what I am saying is, check that you are not inserting some weird
>>>>> non-ascii quotes in your actual script.
>>>>> I just ran this on Pig 6, it worked:
>>>>> 
>>>>> grunt> data = load 'tmp/nulltest' using PigStorage() as (num);
>>>>> grunt> dump data;
>>>>> (1)
>>>>> (2)
>>>>> ()
>>>>> (3)
>>>>> grunt> find_nulls = foreach data generate ( num is null ? 'U' : num );
>>>>> grunt> dump find_nulls;
>>>>> (1)
>>>>> (2)
>>>>> (U)
>>>>> (3)
>>>>> 
>>>>> I double-checked just in case, and it works in both local and mapreduce
>>>>> modes.
>>>>> 
>>>>> -Dmitriy
>>>>> 
>>>>> On Sun, May 16, 2010 at 1:49 PM, Syed Wasti <[email protected]>
>> wrote:
>>>>> 
>>>>>> Hmm not sure why, I used quotes in this mail, let me rewrite,
>>>>>> SQL(U is within single quotes): NVL(city,U) city
>>>>>> Pig(U is within single quotes): (city is null?U:city) AS city
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 5/16/10 1:36 PM, "Dmitriy Ryaboy" <[email protected]> wrote:
>>>>>> 
>>>>>>> Syed,
>>>>>>> The samples you pasted include all kinds of extraneous characters.
>> Are
>>>>>> you
>>>>>>> sure your script is properly encoded?
>>>>>>> 
>>>>>>> 
>>>>>>> On Sun, May 16, 2010 at 1:16 PM, Syed Wasti <[email protected]>
>>>> wrote:
>>>>>>> 
>>>>>>>> I am trying the SQL ³NVL(city, ŒU¹) city² in pig I am using the
>>>> bincond
>>>>>>>> operator, ³(city is null?'U': city) AS city², which is of chararray
>>>>>> type,
>>>>>>>> the result file shows Œ\N¹ instead of U.  Any ideas ?
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>>> 
>> 
>> 
>> 


Reply via email to