That doesn't look like the complete set of counters -- all the map-side data is missing, for example. Kind of sounds like you've got extreme skew and you're stuck in the shuffle phase. Two reduce input groups fanning out to 209,774,926 reduce output records is a bit worrying.

If it is skew, a quick way to confirm is to count records per group key and see whether a couple of keys dominate -- sketch below.
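Something like this (relation and field names are placeholders; adjust the LOAD to match your data):

    data       = LOAD 'your_input' AS (key:chararray, val:chararray);
    grouped    = GROUP data BY key;
    key_counts = FOREACH grouped GENERATE group, COUNT(data) AS n;
    by_size    = ORDER key_counts BY n DESC;
    top_keys   = LIMIT by_size 20;
    -- a couple of huge keys here means one or two reducers are doing all the work
    DUMP top_keys;

If the blowup is happening in a join rather than a group, Pig's skewed join (JOIN ... USING 'skewed') can spread a hot key across multiple reducers.

D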
On Tue, Jan 3, 2012 at 6:50 AM, Tal Chalozin <[email protected]> wrote:
> Hey,
>
> I've been running Pig tasks on EMR for a long time now and everything worked great.
> Last night, for some reason, all my tasks started hanging in the reduce stage.
> I've tried smaller data chunks and stronger/bigger clusters, but no luck.
>
> I can't seem to find any data in the job tracker besides the 66.89% complete figure.
> These are the counters in the job tracker:
>
> File Output Format Counters
>   Bytes Written                                    0
> FileSystemCounters
>   FILE_BYTES_READ                            131,072
>   FILE_BYTES_WRITTEN                      13,216,935
>   HDFS_BYTES_WRITTEN                  26,851,148,226
> Map-Reduce Framework
>   Reduce input groups                              2
>   Combine output records                           0
>   Reduce shuffle bytes                             0
>   Physical memory (bytes) snapshot       396,390,400
>   Reduce output records                  209,774,926
>   Spilled Records                                  0
>   CPU time spent (ms)                      1,004,170
>   Total committed heap usage (bytes)     181,600,256
>   Virtual memory (bytes) snapshot      3,681,701,888
>   Combine input records                            0
>   Reduce input records                         1,894
>
> Even when running in local mode with -d DEBUG, I get nothing.
>
> Where should I look for more debugging data to understand what's going on there?
>
> (Needless to say, I'm running Pig 0.9.1, but this was working on EMR even after Amazon's Hadoop update.)
>
> Thank you so much.
