Hi all,

I'm currently running some experiments with Hadoop 2.2, and I get a lot of IO_ERR failures on the terasort benchmark on my 20-node cluster. The problem is caused by ShuffleHandler throwing an exception when it tries to send map output. The actual exception in the NodeManager log looks like this:

    2014-02-07 13:49:29,033 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error [id: 0x40d5d4a7, /10.73.24.14:15736 => /10.73.24.20:13562] EXCEPTION: java.io.IOException: Resource temporarily unavailable

I believe the exception is thrown by transferToFully in SocketOutputStream.java, where I found this comment:

    /*
     * Ideally we should wait after transferTo returns 0. But because of a bug
     * in JRE on Linux (http://bugs.sun.com/view_bug.do?bug_id=5103988), which
     * throws an exception instead of returning 0, we wait for the channel to
     * be writable before writing to it. If you ever see IOException with
     * message "Resource temporarily unavailable" thrown here, please let us
     * know.
     *
     * Once we move to JAVA SE 7, wait should be moved to correct place.
     */

My cluster runs a specialized version of RedHat 4 with Java 1.6.0_07, and switching to JDK 7 is not an option in this environment for now.

I'm not sure which mailing list this belongs on, but I would be very grateful if someone could help.
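To make the failure mode in that comment concrete, here is a minimal sketch of a transfer loop that treats the "Resource temporarily unavailable" IOException as if transferTo had returned 0 and then retries. This is not Hadoop's SocketOutputStream code. The class name EagainTolerantTransfer, the methods isEagain and transferFully, and the parameters maxRetries and backoffMillis are all hypothetical, and a plain sleep stands in for waiting until the channel is writable. Matching on the exception message is fragile and is only an assumption about how the JRE bug shows up.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper, not part of Hadoop. Written to compile on Java 6.
public class EagainTolerantTransfer {
    // Message text taken from the exception reported in the log above.
    static final String EAGAIN_MESSAGE = "Resource temporarily unavailable";

    // True if the exception looks like the EAGAIN case from the JRE bug.
    public static boolean isEagain(IOException e) {
        String msg = e.getMessage();
        return msg != null && msg.contains(EAGAIN_MESSAGE);
    }

    // Transfers count bytes starting at position from src to dst.
    // An EAGAIN-style IOException is handled like a 0-byte transfer:
    // back off briefly and retry, giving up after maxRetries
    // consecutive attempts that made no progress.
    public static void transferFully(FileChannel src, WritableByteChannel dst,
                                     long position, long count,
                                     int maxRetries, long backoffMillis)
            throws IOException {
        int stalled = 0;
        while (count > 0) {
            long n;
            try {
                n = src.transferTo(position, count, dst);
            } catch (IOException e) {
                if (!isEagain(e)) {
                    throw e; // a real I/O failure, do not mask it
                }
                n = 0; // assumption: the socket buffer was simply full
            }
            if (n > 0) {
                position += n;
                count -= n;
                stalled = 0;
                continue;
            }
            if (++stalled > maxRetries) {
                throw new IOException("No progress after " + maxRetries
                        + " retries, " + count + " bytes left");
            }
            try {
                // Simplification: sleep instead of selecting for writability.
                Thread.sleep(backoffMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new IOException("Interrupted while waiting to retry", ie);
            }
        }
    }
}
```

A caller would pass the map output file's channel as src and the socket channel as dst. Whether something like this could be applied to the shuffle path without patching Hadoop itself is an open question.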
Thank you!

--
Dasheng Jiang
Peking University, Beijing
Email: [email protected]
Phone: 18810775811
