It seems this may be another JDK bug: jdk1.6.0_07 causes the problem. After I changed to 1.6.0_45, the IO_ERROR disappeared.
Thanks~

2014-02-07 15:27 GMT+08:00 蒋达晟 <[email protected]>:

> Hi all,
>
> I'm currently running some experiments with version 2.2, and I get a lot
> of IO_ERR on the terasort benchmark on my 20-node cluster. The problem is
> that ShuffleHandler throws an exception when it tries to send map output.
>
> The actual exception in the nodemanager looks like this:
>
> 2014-02-07 13:49:29,033 ERROR org.apache.hadoop.mapred.ShuffleHandler:
> Shuffle error [id: 0x40d5d4a7, /10.73.24.14:15736 => /10.73.24.20:13562]
> EXCEPTION: java.io.IOException: Resource temporarily unavailable
>
> I believe transferToFully in SocketOutputStream.java throws the
> exception, and I found a comment there that says:
>
> /*
>  * Ideally we should wait after transferTo returns 0. But because of
>  * a bug in JRE on Linux (http://bugs.sun.com/view_bug.do?bug_id=5103988),
>  * which throws an exception instead of returning 0, we wait for the
>  * channel to be writable before writing to it. If you ever see
>  * IOException with message "Resource temporarily unavailable"
>  * thrown here, please let us know.
>  *
>  * Once we move to JAVA SE 7, wait should be moved to correct place.
>  */
>
> My cluster environment is a specialized version of RedHat 4 with Java
> 1.6.0_07, and changing to JDK 7 is not an option for this environment now.
>
> I'm not sure which mailing list I should post this to, but I would be
> very grateful if someone could help.
>
> Thank you!
>
> --
> *Dasheng Jiang*
> *Peking University, Beijing*
> *Email: [email protected] <[email protected]>*
> *Phone: 18810775811*

--
*Dasheng Jiang*
*Peking University, Beijing*
*Email: [email protected] <[email protected]>*
*Phone: 18810775811*
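
For reference, here is a minimal sketch of the pattern the quoted comment describes: a `transferTo` loop that treats "Resource temporarily unavailable" the same as a zero-byte transfer and retries. It is not Hadoop's code. The class `TransferSketch`, the method `transferFully`, and the 10 ms back-off are hypothetical names and values chosen for illustration, and matching on the exception message is an assumption about how the affected JRE reports the condition.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InterruptedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical illustration only; not the Hadoop SocketOutputStream implementation.
final class TransferSketch {

    // Assumed back-off between retries; the value is arbitrary.
    private static final long RETRY_SLEEP_MS = 10;

    /** Copies {@code count} bytes from {@code src}, starting at {@code position}, to {@code dst}. */
    public static void transferFully(FileChannel src, long position, long count,
                                     WritableByteChannel dst) throws IOException {
        while (count > 0) {
            long n;
            try {
                n = src.transferTo(position, count, dst);
            } catch (IOException e) {
                // Assumption: the buggy JRE signals "would block" with this message
                // instead of returning 0. Any other IOException is rethrown.
                String msg = e.getMessage();
                if (msg == null || !msg.contains("Resource temporarily unavailable")) {
                    throw e;
                }
                n = 0;
            }
            if (n == 0) {
                // Avoid spinning forever if the file is shorter than requested.
                if (position >= src.size()) {
                    throw new EOFException("file ended with " + count + " bytes still to send");
                }
                try {
                    Thread.sleep(RETRY_SLEEP_MS); // wait before retrying
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new InterruptedIOException("interrupted while waiting to retry");
                }
                continue;
            }
            position += n;
            count -= n;
        }
    }
}
```

A real server would wait on a selector for the socket to become writable rather than sleep. The sleep is used here only to keep the sketch short.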
