Is there any chance you can test this using the version of Pig in trunk?
That'd be very helpful. If it's still an issue, file a JIRA and I'll take a
look.
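
As a sanity check on what the counts should look like while testing, here is a minimal sketch (plain Python, not Pig; the function name and sample keys are made up for illustration). It assumes B is the left side of the LEFT OUTER join, and shows that the output row count equals the left input's row count only when each left key matches at most one row on the right:

```python
from collections import Counter


def left_outer_join_count(left_keys, right_keys):
    """Number of rows a left outer join emits, given each side's join keys."""
    right_counts = Counter(right_keys)
    # Each left row is emitted once per matching right row,
    # or exactly once (padded with nulls) when nothing matches.
    return sum(max(1, right_counts[key]) for key in left_keys)


# Hypothetical join keys, for illustration only.
b_keys = [1, 2, 2, 3]   # left input
a_unique = [1, 2]       # right input with unique keys
a_dupes = [1, 2, 2]     # right input with a duplicated key

print(left_outer_join_count(b_keys, a_unique))  # 4, same as len(b_keys)
print(left_outer_join_count(b_keys, a_dupes))   # 6, more rows than the left input
```

So if the join keys on the right side are unique, the output count matching the real row count of the left input is what you would expect.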

2012/3/27 Subir S <[email protected]>

> Hi,
>
> There is a trivial issue with PigStats (during HASH_JOIN): it does not print
> the correct record count. My job does a LEFT OUTER join, so the row count of
> input B should match that of output C. After seeing the difference in counts
> I cross-checked, and it seems to be only a printing issue. I hope this is a
> bug that has already been fixed by now. Can somebody advise?
>
>
>
> 2012-03-27 06:28:39,200 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:
>
> HadoopVersion   PigVersion      UserId  StartedAt       FinishedAt      Features
> 0.20.2-cdh3u0   0.8.0-cdh3u0    ssasik0 2012-03-27 06:26:30     2012-03-27 06:28:39     HASH_JOIN
>
> Success!
>
> Job Stats (time in seconds):
> JobId   Maps    Reduces MaxMapTime      MinMapTIme      AvgMapTime      MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature Outputs
> job_201203261530_0597   30      4       28      7       17      100     97      98      gold_price_link,items,items_price_link,items_price_link1,work_price_link        HASH_JOIN       /tmp/pricing_hub/work/ssasik0/output/iteration1/hierarchy/20120319/hierarchy_items_final,
>
> Input(s):
> Successfully read 9552894 records from: "/A/*"
> Successfully read *9552894* records from: *"/B/*"*
>
> Output(s):
> Successfully stored 12277671 records (2625930049 bytes) in: "/C/"
>
> Counters:
> Total records written : 12277671
> Total bytes written : 2625930049
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 0
> Total records proactively spilled: 0
>
> Job DAG:
> job_201203261530_0597
>
>
> 2012-03-27 06:28:39,211 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> -bash-3.2$ hadoop fs -cat /B/* | wc -l
> *12277671*
>
