Contents of sample.txt:

android,u1,taobao1
android,u1,taobao1
,u2,taobao2

The Pig script:

RR = LOAD '/user/www/udc/output/bugfind/sample.txt' USING PigStorage(',') AS (platform, machineID, productID);
RB = GROUP RR BY (productID);
RES = FOREACH RB {
    ITEMUV = DISTINCT RR.machineID;
    GENERATE FLATTEN(group), COUNT(ITEMUV) AS UV, COUNT(RR) AS PV;
};
DUMP RES;

Output:

(taobao1,1,2)
(taobao2,1,0)

Why is PV 0 for taobao2 while UV is 1?

I looked at the source code of the COUNT function. If the first field of a tuple is null, cnt is not incremented:

while (it.hasNext()) {
    Tuple t = (Tuple)it.next();
    if (t != null && t.size() > 0 && t.get(0) != null)
        cnt++;
}

-- 
cente...@gmail.com | 齐忠
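
A minimal sketch of why the two counts differ, assuming the quoted loop is the whole of COUNT's logic. The taobao2 row has an empty platform field, which PigStorage loads as null, so the tuple in RR starts with null and is skipped. The tuple in ITEMUV holds only machineID (u2), so it is counted. The class CountNullDemo and the method countLikePig are hypothetical names used for illustration, and tuples are modelled as plain Java lists rather than Pig's Tuple type.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for Pig's COUNT; not the Pig implementation itself.
class CountNullDemo {

    // Same condition as the quoted loop: a tuple is counted only when it is
    // non-null, non-empty, and its FIRST field is non-null.
    public static long countLikePig(List<List<String>> bag) {
        long cnt = 0;
        Iterator<List<String>> it = bag.iterator();
        while (it.hasNext()) {
            List<String> t = it.next();
            if (t != null && t.size() > 0 && t.get(0) != null) {
                cnt++;
            }
        }
        return cnt;
    }

    public static void main(String[] args) {
        // RR for group taobao2: (platform=null, machineID=u2, productID=taobao2)
        List<List<String>> rr =
            Arrays.asList(Arrays.asList((String) null, "u2", "taobao2"));

        // ITEMUV for group taobao2: DISTINCT RR.machineID gives the single tuple (u2)
        List<List<String>> itemUv = Arrays.asList(Arrays.asList("u2"));

        System.out.println("PV = " + countLikePig(rr));     // PV = 0
        System.out.println("UV = " + countLikePig(itemUv)); // UV = 1
    }
}
```

If the goal is to count every row regardless of nulls, Pig versions that ship the COUNT_STAR builtin can use COUNT_STAR(RR) in place of COUNT(RR). Another option is to project a field that is never null before counting, for example COUNT(RR.productID).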