I suspect this is a bug in Hadoop 0.17, because I am able to reproduce the problem. However, I have not been able to figure out where exactly it happens. Can someone please help me with this?
Thanks

novice user wrote:
>
> To my surprise, only one output value of the mapper is not reaching the
> combiner, and this is consistent when I repeat the experiment. The same
> point reaches the reducer directly without going through the combiner.
> How can this happen?
>
>
> novice user wrote:
>>
>> Regarding the conclusion,
>> I am parsing the inputs in the combiner and the reducer differently. For
>> example, the output value of the mapper is "s:d" whereas the output value
>> of the combiner is "s,d". So, in the reducer, I am assuming the input is
>> "s,d" and trying to parse it. There I got the exception because it got
>> the input as "s:d".
>>
>> I am using hadoop-17.
>>
>> I couldn't get exactly what you meant by no guarantee on the number of
>> times a combiner is run. Can you please elaborate a bit on this?
>>
>> Thanks
>>
>>
>> Arun C Murthy-2 wrote:
>>>
>>> On Jul 1, 2008, at 4:04 AM, novice user wrote:
>>>
>>>>
>>>> Hi all,
>>>> I have a query regarding the functionality of the combiner.
>>>> Is it possible for the combiner code to be skipped for some of the
>>>> mapper outputs, so that they are sent directly to the reducer even
>>>> though a combiner is specified in the job configuration?
>>>> I ask because I found that, when I am running on large amounts of
>>>> data, some of the mapper output reaches the reducer directly. I am
>>>> wondering how this is possible when I have specified a combiner in
>>>> the job configuration. Can anyone please let me know if this happens?
>>>>
>>>
>>> Can you elaborate on how you reached the conclusion that the output
>>> of some maps isn't going through the combiner?
>>>
>>> Also, what version of hadoop are you using? hadoop-0.18 onwards there
>>> aren't guarantees on the number of times a combiner is run...
>>>
>>> Arun
>>>
>>>>
>>>> --
>>>> View this message in context: http://www.nabble.com/Combiner-is-optional-though-it-is-specified--tp18213887p18213887.html
>>>> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>>>>
>>>
>>
>

--
View this message in context: http://www.nabble.com/Combiner-is-optional-though-it-is-specified--tp18213887p18310279.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
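
The parsing mismatch described in the quoted messages (mapper values like "s:d", combiner values like "s,d") could be worked around on the reducer side while the root cause is investigated. Below is a minimal sketch in plain Java, with no Hadoop classes involved; the class name `PairValueParser` and the method `parse` are hypothetical and only illustrate the idea of accepting either separator. It assumes each value has exactly two fields and that neither field contains ':' or ',' itself.

```java
import java.util.Arrays;

public class PairValueParser {

    // Hypothetical helper: splits a value on the first ':' or ','.
    // Assumption: values look like "s:d" (mapper) or "s,d" (combiner).
    static String[] parse(String value) {
        // Limit 2 keeps everything after the first separator in one piece.
        String[] parts = value.split("[:,]", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Unexpected value format: " + value);
        }
        return parts;
    }

    public static void main(String[] args) {
        // A value that skipped the combiner and one that went through it.
        System.out.println(Arrays.toString(parse("s:d"))); // prints [s, d]
        System.out.println(Arrays.toString(parse("s,d"))); // prints [s, d]
    }
}
```

A reducer that parses its input this way would not throw when a value arrives without having passed through the combiner. A possibly cleaner alternative, if the job allows it, is to have the combiner emit values in the same format it receives, so the reducer never needs to distinguish the two cases.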
