[ https://issues.apache.org/jira/browse/HADOOP-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561567#action_12561567 ]
stack commented on HADOOP-2682: ------------------------------- I see a second instance later in the logs: {code} Thread 28 (IPC Server handler 0 on 60020): State: RUNNABLE Blocked count: 19232 Waited count: 178149 Stack: java.net.SocketInputStream.socketRead0(Native Method) java.net.SocketInputStream.read(Unknown Source) java.io.BufferedInputStream.fill(Unknown Source) java.io.BufferedInputStream.read(Unknown Source) java.io.DataInputStream.readShort(Unknown Source) org.apache.hadoop.dfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:773) org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1000) org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1096) java.io.DataInputStream.readFully(Unknown Source) java.io.DataInputStream.readFully(Unknown Source) org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1383) org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1360) org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1349) org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1344) org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:254) org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:242) org.apache.hadoop.hbase.HStoreFile$BloomFilterMapFile$Reader.<init>(HStoreFile.java:816) org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936) org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057) org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203) {code} getReader is synchronized. It looks like the lock could be removed (At least some of the tests done in the method are against finalized data members). Doesn't address why we're hung reading from hdfs. We're trying to read the SequenceFile version block, it looks like. Your dfs.socket.timeout is reasonable? We set this on the socket as read timeout so you'd think the reads would eventually give up (and then the datanode gets added to list of deadnodes). > regionserver wedged: thread holds lock during blocking read (?) > --------------------------------------------------------------- > > Key: HADOOP-2682 > URL: https://issues.apache.org/jira/browse/HADOOP-2682 > Project: Hadoop > Issue Type: Bug > Components: contrib/hbase > Affects Versions: 0.15.0 > Reporter: Michael Bieniosek > > Using a patched version of 0.15.0, my regionserver freezes up, saying > repeatedly: > 2008-01-22 23:10:12,617 WARN org.apache.hadoop.ipc.Server: IPC Server handler > 1 on 60020, call getProtocolVersion(org. > apache.hadoop.hbase.HRegionInterface, 1) from xxx.xxx.xxx.178:55116: > discarded for being too old (46464) > Then I get: > 2008-01-22 23:11:12,644 WARN org.apache.hadoop.ipc.Server: IPC Server handler > 7 on 60020, call openScanner(-ROOT-,,0, > [Lorg.apache.hadoop.io.Text;@19958bf9, .META., 1201043326058, null) from > xxx.xxx.45.192:44993: output error > java.nio.channels.ClosedChannelException > at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(Unknown Source) > at sun.nio.ch.SocketChannelImpl.write(Unknown Source) > at > org.apache.hadoop.ipc.SocketChannelOutputStream.flushBuffer(SocketChannelOutputStream.java:108) > at > org.apache.hadoop.ipc.SocketChannelOutputStream.write(SocketChannelOutputStream.java:89) > at java.io.BufferedOutputStream.flushBuffer(Unknown Source) > at java.io.BufferedOutputStream.flush(Unknown Source) > at java.io.DataOutputStream.flush(Unknown Source) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:615) > 2008-01-22 23:11:12,660 INFO org.apache.hadoop.ipc.Server: Process Thread > Dump: Discarding call > getProtocolVersion(org.apache.hadoop.hbase.HRegionInterface, 1) from > xxx.xxx.45.224:34506 > 29 active threads > Thread 4329 (IPC Client connection to xxx.xxx.44.135:10000): > State: WAITING > Blocked count: 1 > Waited count: 4 > Waiting on [EMAIL PROTECTED] > Stack: > java.lang.Object.wait(Native Method) > java.lang.Object.wait(Unknown Source) > org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:216) > org.apache.hadoop.ipc.Client$Connection.run(Client.java:255) > Thread 37 (IPC Server handler 9 on 60020): > State: BLOCKED > Blocked count: 1 > Waited count: 202 > Blocked on [EMAIL PROTECTED] > Blocked by 29 (IPC Server handler 1 on 60020) > Stack: > org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936) > org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057) > org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203) > org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025) > org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699) > org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271) > org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499) > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) > sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) > java.lang.reflect.Method.invoke(Unknown Source) > org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379) > org.apache.hadoop.ipc.Server$Handler.run(Server.java:596) > Thread 36 (IPC Server handler 8 on 60020): > State: BLOCKED > Blocked count: 1 > Waited count: 201 > Blocked on [EMAIL PROTECTED] > Blocked by 29 (IPC Server handler 1 on 60020) > Stack: > org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936) > org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057) > org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203) > org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025) > org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699) > org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271) > org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499) > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) > sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) > java.lang.reflect.Method.invoke(Unknown Source) > org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379) > org.apache.hadoop.ipc.Server$Handler.run(Server.java:596) > Thread 35 (IPC Server handler 7 on 60020): > State: RUNNABLE > Blocked count: 4 > Waited count: 240 > Stack: > sun.management.ThreadImpl.getThreadInfo0(Native Method) > sun.management.ThreadImpl.getThreadInfo(Unknown Source) > sun.management.ThreadImpl.getThreadInfo(Unknown Source) > > org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:114) > > org.apache.hadoop.util.ReflectionUtils.logThreadInfo(ReflectionUtils.java:162) > org.apache.hadoop.ipc.Server$Handler.run(Server.java:579) > Thread 34 (IPC Server handler 6 on 60020): > State: BLOCKED > Blocked count: 1 > Waited count: 203 > Blocked on [EMAIL PROTECTED] > Blocked by 29 (IPC Server handler 1 on 60020) > Stack: > org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936) > org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057) > org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203) > org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025) > org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699) > org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271) > org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499) > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) > sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) > java.lang.reflect.Method.invoke(Unknown Source) > org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379) > org.apache.hadoop.ipc.Server$Handler.run(Server.java:596) > Thread 33 (IPC Server handler 5 on 60020): > State: BLOCKED > Blocked count: 1 > Waited count: 212 > Blocked on [EMAIL PROTECTED] > Blocked by 29 (IPC Server handler 1 on 60020) > Stack: > org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936) > org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057) > org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203) > org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025) > org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699) > org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271) > org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499) > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) > sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) > java.lang.reflect.Method.invoke(Unknown Source) > org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379) > org.apache.hadoop.ipc.Server$Handler.run(Server.java:596) > Thread 32 (IPC Server handler 4 on 60020): > State: BLOCKED > Blocked count: 1 > Waited count: 202 > Blocked on [EMAIL PROTECTED] > Blocked by 29 (IPC Server handler 1 on 60020) > Stack: > org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936) > org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057) > org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203) > org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025) > org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699) > org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271) > org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499) > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) > sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) > java.lang.reflect.Method.invoke(Unknown Source) > org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379) > org.apache.hadoop.ipc.Server$Handler.run(Server.java:596) > Thread 31 (IPC Server handler 3 on 60020): > State: BLOCKED > Blocked count: 1 > Waited count: 210 > Blocked on [EMAIL PROTECTED] > Blocked by 29 (IPC Server handler 1 on 60020) > Stack: > org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936) > org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057) > org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203) > org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025) > org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699) > org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271) > org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499) > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) > sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) > java.lang.reflect.Method.invoke(Unknown Source) > org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379) > org.apache.hadoop.ipc.Server$Handler.run(Server.java:596) > Thread 30 (IPC Server handler 2 on 60020): > State: BLOCKED > Blocked count: 1 > Waited count: 216 > Blocked on [EMAIL PROTECTED] > Blocked by 29 (IPC Server handler 1 on 60020) > Stack: > org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936) > org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057) > org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203) > org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025) > org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699) > org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271) > org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499) > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) > sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) > java.lang.reflect.Method.invoke(Unknown Source) > org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379) > org.apache.hadoop.ipc.Server$Handler.run(Server.java:596) > Thread 29 (IPC Server handler 1 on 60020): > State: RUNNABLE > Blocked count: 3 > Waited count: 229 > Stack: > java.net.SocketInputStream.socketRead0(Native Method) > java.net.SocketInputStream.read(Unknown Source) > java.io.BufferedInputStream.fill(Unknown Source) > java.io.BufferedInputStream.read(Unknown Source) > java.io.DataInputStream.readShort(Unknown Source) > > org.apache.hadoop.dfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:773) > > org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1000) > org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1096) > java.io.DataInputStream.readFully(Unknown Source) > java.io.DataInputStream.readFully(Unknown Source) > org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1383) > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1360) > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1349) > org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1344) > org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:263) > org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:242) > > org.apache.hadoop.hbase.HStoreFile$BloomFilterMapFile$Reader.<init>(HStoreFile.java:816) > org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936) > org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057) > org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203) > Thread 28 (IPC Server handler 0 on 60020): > State: BLOCKED > Blocked count: 1 > Waited count: 200 > Blocked on [EMAIL PROTECTED] > Blocked by 29 (IPC Server handler 1 on 60020) > Stack: > org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:936) > org.apache.hadoop.hbase.HStore$StoreFileScanner.<init>(HStore.java:2057) > org.apache.hadoop.hbase.HStore$HStoreScanner.<init>(HStore.java:2203) > org.apache.hadoop.hbase.HStore.getScanner(HStore.java:2025) > org.apache.hadoop.hbase.HRegion$HScanner.<init>(HRegion.java:1699) > org.apache.hadoop.hbase.HRegion.getScanner(HRegion.java:1271) > org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1499) > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) > sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) > java.lang.reflect.Method.invoke(Unknown Source) > org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379) > org.apache.hadoop.ipc.Server$Handler.run(Server.java:596) > Thread 14 (IPC Server listener on 60020): > State: RUNNABLE > Blocked count: 0 > Waited count: 0 > Stack: > sun.nio.ch.PollArrayWrapper.poll0(Native Method) > sun.nio.ch.PollArrayWrapper.poll(Unknown Source) > sun.nio.ch.PollSelectorImpl.doSelect(Unknown Source) > sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source) > sun.nio.ch.SelectorImpl.select(Unknown Source) > sun.nio.ch.SelectorImpl.select(Unknown Source) > org.apache.hadoop.ipc.Server$Listener.run(Server.java:285) > ... etc. > Note that most of the threads are blocked on thread 29, which is in turn > blocked on a socket read. This node was hosting the ROOT region, so when it > got wedged it made hbase unusable (since the master won't reassign ROOT until > the server actually goes down). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.