Hi,

I am trying to use Pig to aggregate data from an application's log lines.

Most lines in the input file have the following tab-separated format:
        A       B       C       D       E       F

I am aggregating the data as follows:

A= load '$in_dir' using PigStorage('\t') as (A, B,C,D,E,F);
D = group A by (A, B,C,D,E,F);
E = FOREACH D GENERATE FLATTEN(group) as (A, B,C,D,E,F ),COUNT(A) as hit;
STORE E INTO '$in_dir._1' using PigStorage('\t');

In some cases the input lines contain only:  A       B       C       D
(the E and F columns are missing).
Would the Pig script ignore such lines?
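
If such lines are not ignored automatically, I assume I could drop them
explicitly with something like the sketch below. It relies on my
understanding that, when a schema is declared, missing trailing fields are
loaded as nulls; I have not verified this on my data. The alias names
log_lines and complete_lines are made up for this example.

```pig
-- Sketch only: assumes short lines load with E and F set to null.
log_lines = LOAD '$in_dir' USING PigStorage('\t') AS (A, B, C, D, E, F);

-- Keep only the lines where both trailing columns are present.
complete_lines = FILTER log_lines BY (E IS NOT NULL) AND (F IS NOT NULL);
```

The grouping and counting would then run on complete_lines instead of the
raw input.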

Thanks & Regards,
Arun
