[ 
https://issues.apache.org/jira/browse/HBASE-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-930:
------------------------

    Attachment: 930.patch

Patch that shuts down the regionserver when we fail the close of the write-ahead-log.
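
Roughly, the idea: if the close of the current hlog file throws an IOException, treat it as fatal and ask the regionserver to shut down rather than keep taking writes on top of a wedged DFSClient.  A minimal sketch of that idea follows; the class, interface, and method names are illustrative stand-ins, not the code actually touched by 930.patch.

{code}
import java.io.Closeable;
import java.io.IOException;

// Illustrative sketch only -- NOT the contents of 930.patch.
public class LogRollSketch {
  // Hypothetical callback standing in for the hosting regionserver's shutdown path.
  interface Aborter {
    void abort(String why, Throwable cause);
  }

  private final Closeable writer;  // stands in for the hlog writer
  private final Aborter server;    // stands in for the HRegionServer

  LogRollSketch(Closeable writer, Aborter server) {
    this.writer = writer;
    this.server = server;
  }

  // Close the current hlog file as part of a roll.  If the close fails we may
  // have lost edits, so request shutdown of the regionserver instead of
  // carrying on.
  void closeCurrentLog() throws IOException {
    try {
      if (writer != null) {
        writer.close();
      }
    } catch (IOException e) {
      server.abort("Failed close of write-ahead-log", e);
      throw e;
    }
  }
}
{code}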

> RegionServer stuck: HLog: Could not append. Requesting close of log 
> java.io.IOException: Could not get block locations. Aborting...
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-930
>                 URL: https://issues.apache.org/jira/browse/HBASE-930
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>         Attachments: 930.patch
>
>
> HDFS went wonky.  It manifested itself in the regionserver as below: first we 
> can't replicate, then we get exceptions on every append, and then on every 
> attempt to rotate the log file.  The exception is always "IOException: 
> Could not get block locations. Aborting..."  Meantime HDFS has been fixed, but 
> for whatever reason the HRS just keeps failing as below, as though 
> the error is stuck in the DFSClient.
> {code}
> ...
> 2008-10-15 02:54:59,867 DEBUG org.apache.hadoop.hbase.regionserver.HStore: 
> Completed compaction of 1399814750/alternate_url store size is 2.9m
> 2008-10-15 02:54:59,868 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> compaction completed on region 
> enwiki_old,7qE8TUI5v-NIZu_9oZhaJF==,1221856888271 in 38sec
> 2008-10-15 02:59:08,507 INFO org.apache.hadoop.dfs.DFSClient: 
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File 
> /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.1224038569764
>  could only be replicated to 0 nodes, instead of 1
>         at 
> org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1117)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>         at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>         at org.apache.hadoop.ipc.Client.call(Client.java:715)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2440)
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2323)
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1735)
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1912)
> ...
> 2008-10-15 02:59:11,357 WARN org.apache.hadoop.dfs.DFSClient: 
> NotReplicatedYetException sleeping 
> /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.1224038569764
>  retries left 1
> 2008-10-15 02:59:14,566 WARN org.apache.hadoop.dfs.DFSClient: DataStreamer 
> Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File 
> /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.1224038569764
>  could only be replicated to 0 nodes, instead of 1
>         at 
> org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1117)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>         at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>         at org.apache.hadoop.ipc.Client.call(Client.java:715)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2440)
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2323)
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1735)
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1912)
> 2008-10-15 02:59:14,566 WARN org.apache.hadoop.dfs.DFSClient: Error Recovery 
> for block null bad datanode[0]
> 2008-10-15 02:59:14,566 FATAL org.apache.hadoop.hbase.regionserver.HLog: 
> Could not append. Requesting close of log
> java.io.IOException: Could not get block locations. Aborting...
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> 2008-10-15 02:59:14,567 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
> Rolling hlog. Number of entries: 81
> 2008-10-15 02:59:14,567 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 8 on 60020, call batchUpdate([EMAIL PROTECTED], row => 
> Dnsw4RukHJ5-0FQddY35mF==, {column => misc:upload_time, value => '...', column 
> => page:mime, value => '...', column => page:content, value => '...', column 
> => page:url, value => '...'}, 
> -1) from 208.76.44.183:59031: error: java.io.IOException: Could not get block 
> locations. Aborting...
> java.io.IOException: Could not get block locations. Aborting...
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> 2008-10-15 02:59:14,588 ERROR org.apache.hadoop.hbase.regionserver.LogRoller: 
> Log rolling failed
> java.io.IOException: Could not get block locations. Aborting...
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at 
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> ...
> {code}
> For now, adding an abort of the regionserver if we can't close the log file 
> during a roll.  If we can't close it, then we're losing edits.  If this 
> behavior causes us to close too often, then we need to dig in more.
> Happened on the pset cluster running Hadoop 0.18.1 and HBase 0.18.0.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
