Hi,
I am trying to use Pig to aggregate data from an application's log lines.
Most lines in the input file have the following tab-separated format:
A B C D E F
I am aggregating the data as follows:
A = load '$in_dir' using PigStorage('\t') as (A, B, C, D, E, F);
D = group A by (A, B, C, D, E, F);
E = FOREACH D GENERATE FLATTEN(group) as (A, B, C, D, E, F), COUNT(A) as hit;
STORE E INTO '$in_dir._1' using PigStorage('\t');
In some cases the input lines contain only A B C D (the E and F columns are missing).
Would the Pig script ignore such lines?
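In case such lines are not skipped automatically, here is a minimal sketch of how I assume they could be dropped explicitly. It relies on the assumption that PigStorage fills the missing trailing fields with nulls when a schema is declared, which I have not verified on my Pig version. The relation names (logs, complete, grouped, counts) are placeholders I picked for illustration so that they do not clash with the column names.

```pig
-- Load the tab-separated log lines with the six expected columns.
logs = LOAD '$in_dir' USING PigStorage('\t') AS (A, B, C, D, E, F);

-- Assumption: short lines arrive with E and F set to null.
-- Keep only the lines where both trailing columns are present.
complete = FILTER logs BY E IS NOT NULL AND F IS NOT NULL;

-- Same aggregation as before, applied to the complete lines only.
grouped = GROUP complete BY (A, B, C, D, E, F);
counts = FOREACH grouped GENERATE FLATTEN(group) AS (A, B, C, D, E, F), COUNT(complete) AS hit;

STORE counts INTO '$in_dir._1' USING PigStorage('\t');
```

If the short lines should be counted rather than dropped, the FILTER step could simply be left out, in which case I would expect them to show up as groups with empty E and F values.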
Thanks & Regards,
Arun