I'm wondering about the performance of COUNT after a GROUP:
A = LOAD 'test.csv' AS (a1, a2, a3);
B = GROUP A BY a1;
-- Which is preferred?
C = FOREACH B GENERATE COUNT(A);
-- Or would this send only a single field through COUNT and be more performant?
C = FOREACH B GENERATE COUNT(A.a2); 

