On 08/07/2011 16:25, Juan P. wrote:
Here's another thought. I realized that the reduce operation in my map/reduce jobs runs in a flash. But it goes reaaaaaaaaally slow until the mappers end. Is there a way to configure the cluster to make the reduce wait for the map operations to complete? Especially considering my hardware constraints.
Take a look to see if it's usually the same machine that's taking too long; test your HDDs to see if there are any signs of problems in the SMART messages. Then turn on speculation. A slow mapper could be caused by disk problems or an overloaded server.
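
As a rough sketch of what that could look like in mapred-site.xml (or in the per-job configuration): the property names below assume a Hadoop 0.20.x / 1.x cluster and may differ on other versions, so check them against your release's mapred-default.xml. The first property enables speculative execution for map tasks. The second is the fraction of map tasks that must finish before reducers are scheduled; a value of 1.0 should hold the reducers back until all maps are done, which is the "make the reduce wait" behaviour asked about above.

```xml
<!-- Assumed Hadoop 0.20.x / 1.x property names; verify against your version. -->
<property>
  <!-- Launch backup attempts for map tasks that run much slower than their peers. -->
  <name>mapred.map.tasks.speculative.execution</name>
  <value>true</value>
</property>
<property>
  <!-- Fraction of maps that must complete before reduces are scheduled.
       1.0 = do not start reducers until every map has finished. -->
  <name>mapred.reduce.slowstart.completed.maps</name>
  <value>1.0</value>
</property>
```

Holding the reducers back frees their slots and memory for the mappers, which may help on constrained hardware, at the cost of losing the overlap between the map phase and the reducers' copy phase.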
