Did you try escaping the slash?


Dano

On Thu, May 17, 2012 at 11:57 AM, Nerius Landys <[email protected]> wrote:

> I'm having problems using Pig's STRSPLIT (on Amazon's cloud computing
> environment).
> I also noticed that STRSPLIT isn't documented in the Pig Latin
> Reference Manual, so I found out about it through other sources of
> information.
>
> My problem is that in certain cases STRSPLIT returns null.  I have no
> idea why.  Here is an actual session I ran to demonstrate the problem:
>
>
>
> grunt> CAT s3://otg-nlandys/pig-tut/bin-proto-4;
> Meta    1234567890      foo     34
> Movement        1234567890      Rambetter       1/1     2/3
> Movement        1234567890      Freddyman       10/1    10/2
>
> grunt> A = LOAD 's3://otg-nlandys/pig-tut/bin-proto-4';
> grunt> DUMP A;
> (Meta,1234567890,foo,34)
> (Movement,1234567890,Rambetter,1/1,2/3)
> (Movement,1234567890,Freddyman,10/1,10/2)
>
> grunt> MOVEMENT = FILTER A BY (chararray) $0 == 'Movement';
> grunt> DUMP MOVEMENT;
> (Movement,1234567890,Rambetter,1/1,2/3)
> (Movement,1234567890,Freddyman,10/1,10/2)
>
> grunt> TEST = FOREACH MOVEMENT GENERATE $3 AS startpos:chararray;
> grunt> DUMP TEST;
> (1/1)
> (10/1)
>
> grunt> POSA = FOREACH TEST GENERATE STRSPLIT(startpos,'/');
> grunt> DUMP POSA;
> ()
> ()
>
> _________________________________________________________________
>
>
> grunt> CAT s3://otg-nlandys/pig-tut/bin-proto-5;
> 1/1
> 10/1
>
> grunt> B = LOAD 's3://otg-nlandys/pig-tut/bin-proto-5' AS startpos:chararray;
> grunt> DUMP B;
> (1/1)
> (10/1)
>
> grunt> POSB = FOREACH B GENERATE STRSPLIT(startpos,'/');
> grunt> DUMP POSB;
> ((1,1))
> ((10,1))
>
>
> _________________________________________________________________
>
>
> My question is why POSA contains empty rows and POSB doesn't, when
> it seems that they should be identical.
>
> I'm kind of new to Pig and realize that the problem might be a
> shortcoming of UDFs and how Pig works with data of varying column
> count, but would like an explanation.  Thanks.
>
> Also one other minor bug with STRSPLIT that I noticed.  If your first
> argument to STRSPLIT is bytearray instead of chararray, it will return
> null.  So you have to explicitly cast bytearray to chararray for it to
> work.  Seems that this could be automated in the language, no?
>
> - Nerius
>
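
For reference, here is a minimal sketch of the explicit cast described in the quoted message, applied to the first session. It is untested and rests on an assumption: that the AS startpos:chararray clause in the FOREACH only labels the field and leaves the underlying value a bytearray, so STRSPLIT still receives a bytearray in the POSA case. The relation names and the S3 path are taken from the session above.

```pig
-- Load without a schema, so every field is a bytearray.
A = LOAD 's3://otg-nlandys/pig-tut/bin-proto-4';
MOVEMENT = FILTER A BY (chararray) $0 == 'Movement';

-- Assumption: an explicit (chararray) cast, rather than a type in the
-- AS clause, is what actually converts the field before STRSPLIT sees it.
TEST = FOREACH MOVEMENT GENERATE (chararray) $3 AS startpos;
POSA = FOREACH TEST GENERATE STRSPLIT(startpos, '/');
DUMP POSA;
```

If the assumption holds, POSA should match the POSB output, since in the second session the chararray type is declared on the LOAD itself.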
