Thanks, Igor, for your input. I'm not using DFS; I'm using the local FS on a NetApp, with Hadoop 0.11.2 on 64-bit SUSE Linux.
Venkat

--- Igor Bolotin <[EMAIL PROTECTED]> wrote:

> Just observed similar behavior this morning. The thread dump on one of
> the Jetty servers showed that there was one thread trying to open a file on
> DFS, while all other threads waited for this one because of
> synchronization:
>
> "pool-2-thread-110" prio=1 tid=0x08aa8bc8 nid=0x7570 runnable [0x0015f000..0x00160130]
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
>         - locked <0xaaa02f30> (a java.io.BufferedInputStream)
>         at java.io.DataInputStream.readFully(DataInputStream.java:176)
>         at java.io.DataInputStream.readLong(DataInputStream.java:380)
>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:615)
>         - locked <0xaaa02f98> (a org.apache.hadoop.dfs.DFSClient$DFSInputStream)
>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:702)
>         - locked <0xaaa02f98> (a org.apache.hadoop.dfs.DFSClient$DFSInputStream)
>         at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:189)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
>         - locked <0xaaa03040> (a org.apache.hadoop.fs.FSDataInputStream$Buffer)
>         at java.io.DataInputStream.readFully(DataInputStream.java:176)
>         at java.io.DataInputStream.readFully(DataInputStream.java:152)
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:60)
>         at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:279)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:262)
>         at com.collarity.cdata.CDataSegmentReader.openDataStream(CDataSegmentReader.java:132)
>         at com.collarity.cdata.CDataSegmentReader.getDataStream(CDataSegmentReader.java:118)
>         - locked <0x1bd29208> (a com.collarity.cdata.CDataSegmentReader)
>         at com.collarity.cdata.CDataSegmentReader.readDocument(CDataSegmentReader.java:95)
>         ...
>
> Unfortunately, as it happened on a production system, I really didn't
> have any time to research it further.
> Once I stopped mapreduce (Job tracker, task trackers) and killed all
> outstanding tasks, Jetty got back to normal.
>
> The only suspicious line in the NameNode logs around that timeframe was:
>
> 2007-02-24 09:25:58,792 INFO [EMAIL PROTECTED] [] StateChange : BLOCK* NameSystem.heartbeatCheck: lost heartbeat from sf3-1:50010
>
> Well, if/when it happens again, I'll try to investigate it further.
>
> Igor
>
> P.S. We are using Hadoop version 0.10.1. I guess the first thing for us to
> try would be to upgrade to the latest version.
>
> -----Original Message-----
> From: Venkat Seeth [mailto:[EMAIL PROTECTED]
> Sent: Saturday, February 24, 2007 9:32 AM
> To: [email protected]
> Subject: Reduce hangs at times
>
> Hi there,
>
> Howdy. I observe at times that a few of the reduce tasks hang during the copy
> phase and do not result in failures either. Hence these tasks never
> complete, nor are they rerun on timeout.
>
> reduce > copy (1510 of 1540 at 1.57 MB/s) >
>
> At the same time, I see that Jetty is out of threads in its thread pool.
> Don't know if these 2 are related.
>
> I also see the following exception for many of the MR operations.
>
> 24 Feb 2007 02:52:14,438 WARN - No thread for Socket[addr=/10.163.63.137,port=56019,localport=50060] - at org.mortbay.util.ThreadPool.run(ThreadPool.java:373)
> 24 Feb 2007 02:52:24,440 WARN - No thread for Socket[addr=/10.163.63.137,port=56023,localport=50060] - at org.mortbay.util.ThreadPool.run(ThreadPool.java:373)
> 24 Feb 2007 02:53:01,582 WARN - getMapOutput(task_0005_m_000432_0,30) failed :
> java.net.SocketException: Connection reset
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>         at org.mortbay.http.ChunkingOutputStream.bypassWrite(ChunkingOutputStream.java:151)
>         at org.mortbay.http.BufferedOutputStream.write(BufferedOutputStream.java:139)
>         at org.mortbay.http.HttpOutputStream.write(HttpOutputStream.java:423)
>         at org.mortbay.jetty.servlet.ServletOut.write(ServletOut.java:54)
>         at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:1574)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
>         at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
>         at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
>         at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
>         at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
>         at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
>         at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
>         at org.mortbay.http.HttpServer.service(HttpServer.java:954)
>         at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
>         at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
>         at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
>         at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
>         at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
>         at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
>  - at org.apache.hadoop.mapred.TaskTracker.doGet(TaskTracker.java:1600)
> 24 Feb 2007 02:53:01,583 WARN - getMapOutput(task_0005_m_001465_0,55) failed :
>
> Has anyone experienced this? Any thoughts are greatly appreciated.
>
> Thanks,
> Venkat
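
To check whether a thread dump shows the pattern Igor describes (one thread holding a monitor while it sits in a blocking read, with other threads queued behind it), the lock lines in the dump can be tallied mechanically. The following is only a rough sketch under stated assumptions: it assumes HotSpot-style dump text in which thread headers start with a quoted name, held monitors appear as `- locked <0x...>`, and contended monitors appear as `- waiting to lock <0x...>`. The class `LockHolders` and its methods are hypothetical helpers written for illustration; they are not part of Hadoop or Jetty.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Hadoop or Jetty.
final class LockHolders {
    // Thread header lines start with the quoted thread name.
    private static final Pattern THREAD = Pattern.compile("^\"([^\"]+)\"");
    // Monitor currently held by the thread whose stack is being read.
    private static final Pattern LOCKED = Pattern.compile("- locked <(0x[0-9a-fA-F]+)>");
    // Monitor the thread is blocked on.
    private static final Pattern WAITING = Pattern.compile("- waiting to lock <(0x[0-9a-fA-F]+)>");

    /** Maps each monitor address to the name of the thread holding it. */
    static Map<String, String> holders(String dump) {
        Map<String, String> owners = new LinkedHashMap<>();
        String current = null;
        for (String raw : dump.split("\\R")) {
            String line = raw.trim();
            Matcher t = THREAD.matcher(line);
            if (t.find()) {
                current = t.group(1);
                continue;
            }
            Matcher l = LOCKED.matcher(line);
            if (current != null && l.find()) {
                owners.putIfAbsent(l.group(1), current);
            }
        }
        return owners;
    }

    /** Maps each lock-holding thread to the threads blocked on one of its monitors. */
    static Map<String, List<String>> blockedBy(String dump) {
        Map<String, String> owners = holders(dump);
        Map<String, List<String>> blocked = new LinkedHashMap<>();
        String current = null;
        for (String raw : dump.split("\\R")) {
            String line = raw.trim();
            Matcher t = THREAD.matcher(line);
            if (t.find()) {
                current = t.group(1);
                continue;
            }
            Matcher w = WAITING.matcher(line);
            if (current != null && w.find()) {
                String owner = owners.get(w.group(1));
                if (owner != null) {
                    blocked.computeIfAbsent(owner, k -> new ArrayList<>()).add(current);
                }
            }
        }
        return blocked;
    }

    // Usage: java LockHolders <thread-dump-file>
    public static void main(String[] args) throws IOException {
        String dump = Files.readString(Path.of(args[0]));
        blockedBy(dump).forEach((owner, waiters) ->
            System.out.println(owner + " blocks " + waiters.size() + " thread(s): " + waiters));
    }
}
```

A thread that shows up as the owner for many waiters, and whose own stack ends in a socket or file read, is a candidate for the kind of stall seen above. This only summarizes what the dump already says; it does not explain why the read itself is slow.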
