Hi folks,

 

I am a little puzzled: records that I am emitting from my combiner do
not seem to show up under 'Combine output records' (they appear to be
disappearing). Here's some evidence:

 

Mapred says:

Combine input records   230,803,567
Combine output records  112,533,683

 

I am maintaining three counters and bump one of them each time I emit a
record from the combiner (i.e., the combiner emits three types of
key-value pairs):

COMBINERJOIN   28,264,088
COMBINERPASS  199,193,336
COMBINERKEYS    3,346,143

 

As can be seen, the total number of combiner outputs (the sum of the
three counters above) is the same as the combine input records, and that
is exactly what I expect from my program. However, something is going
wrong somewhere, and not all of the emitted records show up in the
combine output records. There are no exceptions in the logs, and the
output.collect() call does not report any error.
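
For reference, here is a minimal sketch of the arithmetic behind that claim. It uses only the counter values quoted above; the class and method names are made up for illustration and are not part of my job or of Hadoop.

```java
public class CombinerCounterCheck {
    // Framework counters, as reported by the job (values quoted above).
    static final long COMBINE_INPUT_RECORDS = 230_803_567L;
    static final long COMBINE_OUTPUT_RECORDS = 112_533_683L;

    // My own counters, one bumped per record emitted from the combiner.
    static final long COMBINERJOIN = 28_264_088L;
    static final long COMBINERPASS = 199_193_336L;
    static final long COMBINERKEYS = 3_346_143L;

    // Total records the combiner emitted according to my counters.
    static long emittedByCounters() {
        return COMBINERJOIN + COMBINERPASS + COMBINERKEYS;
    }

    // Records counted by my counters but missing from the framework's count.
    static long unaccountedRecords() {
        return emittedByCounters() - COMBINE_OUTPUT_RECORDS;
    }

    public static void main(String[] args) {
        System.out.println("emitted (my counters): " + emittedByCounters());
        System.out.println("matches combine input: "
                + (emittedByCounters() == COMBINE_INPUT_RECORDS));
        System.out.println("unaccounted records:   " + unaccountedRecords());
    }
}
```

With the numbers above, my counters add up to 230,803,567 (equal to the combine input records), which leaves 118,269,884 emitted records unaccounted for in 'Combine output records'.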

 

Any ideas what's going on? Is this a pathological case (a combiner
emitting the same number of output records as input records)?

 

Thanks,

 

Joydeep
