Dan, 
    If you have double quotes in your data, try the following:
B = FILTER A BY itemid != '""';  -- enclose the double quotes in single quotes
It works in Pig 0.9.2.
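
One more thing that may explain why == '' never matches: as far as I know, PigStorage loads an empty field as null rather than as an empty string, and a comparison against null is never true. Below is a rough sketch, not tested against your file. It assumes the data is tab-delimited (the PigStorage default) and reuses the alias A from your script. The other alias names are made up for illustration.

```pig
-- Rows whose itemid is missing; checks both null and '' to be safe.
empty_items = FILTER A BY itemid IS NULL OR itemid == '';

-- Rows that do have an itemid.
with_items = FILTER A BY itemid IS NOT NULL AND itemid != '';

-- If the values are wrapped in double quotes in the file,
-- the quotes are part of the string and must appear in the literal.
quoted_match = FILTER A BY itemid == '"591837"';
```

If the file uses a delimiter other than tab, the whole line would land in the first column and none of the filters on itemid or actiontype would match, so it may be worth running DUMP or DESCRIBE on A first to check how the columns were split.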

Thanks,
Harsha


On Tuesday, February 26, 2013 at 5:11 PM, Dan Yi wrote:

> Hi, 
> 
> I have this simple file and tried to remove the lines where the itemid column is an
> empty string (''), but it won't work. I also tried comparing with == against a valid
> itemid from the file to see whether I could filter out those lines, but that still
> doesn't work. Does anyone know how to use '=='?
> 
> Pig script:
> 
> A = LOAD '$DATA' AS 
> (timestamp:chararray,itemid:chararray,actiontype:chararray,actionid:chararray,anonid:chararray,deviceid:chararray,userid:chararray,mediouserid:chararray);
> B = DISTINCT A PARALLEL 2;
> 
> -- None of the following filters work.
> C = FILTER B BY itemid == '';
> D = FILTER B BY itemid == '591837';
> E = FILTER B BY actiontype == 'AddToCart';
> 
> STORE A INTO 'OUTPUT1/A' USING PigStorage();
> STORE B INTO 'OUTPUT1/B' USING PigStorage();
> STORE C INTO 'OUTPUT1/C' USING PigStorage();
> STORE D INTO 'OUTPUT1/D' USING PigStorage();
> STORE E INTO 'OUTPUT1/E' USING PigStorage();
> 
> 
> 
> The data file is attached to this email.
> 
> Thanks.
> 
