I have seen this happen when hostname-to-IP-address lookups are inconsistent across the cluster. A node running a reducer gets a different IP address for the node name, so it does not connect to the host that actually has the map output.
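
One way to check for this is to resolve every cluster hostname on every node and compare the answers. Below is a rough sketch of that idea, not a Hadoop tool: the function names `resolve` and `find_mismatches` are made up for illustration, and the hostnames you pass in are whatever your own cluster uses. Run the script on each node with the same list of hostnames, then compare the printed lines (or feed the collected results to `find_mismatches`).

```python
import socket
import sys


def resolve(hostname):
    """Return the sorted IPv4 addresses this machine resolves hostname to."""
    try:
        _, _, addresses = socket.gethostbyname_ex(hostname)
    except OSError:
        # Lookup failed on this machine; an empty list marks that.
        return []
    return sorted(addresses)


def find_mismatches(lookups_by_node):
    """Given {node: {hostname: [ips]}}, return the hostnames that nodes disagree on.

    The result maps each disputed hostname to {node: (ips...)} so you can
    see which node got which answer.
    """
    hostnames = set()
    for lookups in lookups_by_node.values():
        hostnames.update(lookups)

    mismatches = {}
    for hostname in sorted(hostnames):
        answers = {
            node: tuple(lookups.get(hostname, []))
            for node, lookups in lookups_by_node.items()
        }
        # More than one distinct answer means the nodes disagree.
        if len(set(answers.values())) > 1:
            mismatches[hostname] = answers
    return mismatches


if __name__ == "__main__":
    # Usage: python check_hosts.py <hostname> [<hostname> ...]
    for name in sys.argv[1:]:
        print(name, " ".join(resolve(name)) or "UNRESOLVED")
```

If any hostname prints a different address on one node than on the others, that is the lookup to fix (hosts file or DNS) before rerunning the job.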

On Mon, Jul 27, 2009 at 9:46 AM, Geoffry Roberts <[email protected]> wrote:

> Thanks for the response.
>
> Now how do I fix this? Is the problem most likely in my MR code? or in my
> hadoop configuration? or what?
>
>
> On Mon, Jul 27, 2009 at 9:33 AM, Harish Mallipeddi <
> [email protected]> wrote:
>
>>
>> On Mon, Jul 27, 2009 at 9:42 PM, Geoffry Roberts <
>> [email protected]> wrote:
>>
>>> All,
>>>
>>> I am attempting to run my first map reduce job and I am getting the
>>> following error. Does anyone know what it means?
>>>
>>> Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
>>>
>>>
>> After the maps are complete, reducers need to fetch the intermediate
>> map-outputs so they can reduce() them (this is part of the "shuffle" phase).
>> It seems like in your case, for some reason the reducers are unable to fetch
>> the map-ouputs from the corresponding TaskTracker nodes even after
>> MAX_FAILED_UNIQUE_FETCHES attempts. The TaskTrackers (actually a Jetty
>> webserver running on them) are responsible for serving these map-outputs.
>>
>> --
>> Harish Mallipeddi
>> http://blog.poundbang.in
>>
>
>

--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
