If you run hdfs fsck, does it show missing blocks? What happens if you try to copy /hbase/data/default/MEDIA/ecd1e565ab8a8bfba77cab46ed023539/F/5eacfeb8a2eb419cb6fe348df0540145 from HDFS to the local filesystem (hdfs dfs -copyToLocal)?
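Something like the below (paths copied from your log; the local destination is just an example):

  # Does HDFS itself think blocks are missing/corrupt under this region dir?
  hdfs fsck /hbase/data/default/MEDIA/ecd1e565ab8a8bfba77cab46ed023539 -files -blocks -locations

  # Can the store file be read at all? If its blocks are gone, this fails.
  hdfs dfs -copyToLocal \
      /hbase/data/default/MEDIA/ecd1e565ab8a8bfba77cab46ed023539/F/5eacfeb8a2eb419cb6fe348df0540145 \
      /tmp/5eacfeb8a2eb419cb6fe348df0540145

If the copy fails too, the file is gone as far as HDFS is concerned, not just as far as HBase is concerned.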
Try moving aside the problematic files?
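For example, something like the below (the sideline directory is just a suggestion; note that if the NameNode really has no such file, as the FileNotFoundException says, there will be nothing to move and it is the referencing region that needs attention):

  # Stash the suspect store file outside /hbase instead of deleting it
  hdfs dfs -mkdir -p /hbase_sideline/MEDIA
  hdfs dfs -mv \
      /hbase/data/default/MEDIA/ecd1e565ab8a8bfba77cab46ed023539/F/5eacfeb8a2eb419cb6fe348df0540145 \
      /hbase_sideline/MEDIA/

  # Then retry the assignment
  hbase hbck -fixAssignments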
Your AWS giving you grief?

St.Ack

On Tue, Apr 19, 2016 at 8:29 PM, Michal Medvecky <[email protected]> wrote:

> Hello,
>
> After several network outages in AWS (never ever run HBase there!), my
> HBase cluster was seriously damaged. After taking steps like restarting
> the namenodes, running hdfs fsck, and restarting all regionservers and
> the HBase master, I still have 8 offline regions that I am unable to
> bring online.
>
> When running hbck with any combination of repair parameters, it always
> gets stuck on messages like:
>
> 2016-04-20 03:26:16,812 INFO [hbasefsck-pool1-t45] util.HBaseFsckRepair: Region still in transition, waiting for it to become assigned: {ENCODED => 8fe9d66a1f4c4739dd1929e3c38bf951, NAME => 'MEDIA,\x01rvkUDKIuye0\x00YT,1460997677820.8fe9d66a1f4c4739dd1929e3c38bf951.', STARTKEY => '\x01rvkUDKIuye0\x00YT', ENDKEY => '\x01stefanonoferini/club-edition-17'}
>
> When looking into the regionserver logs, I see messages like:
>
> 2016-04-19 23:27:54,969 ERROR [RS_OPEN_REGION-prod-aws-hbase-data-0010:16020-80] handler.OpenRegionHandler: Failed open of region=MEDIA,\x05JEklcNpOKos\x00YT,1461001150488.20d48fd40c94c7c81049cbc506de4ad4., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.io.FileNotFoundException: File does not exist: /hbase/data/default/MEDIA/ecd1e565ab8a8bfba77cab46ed023539/F/5eacfeb8a2eb419cb6fe348df0540145
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1828)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1712)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:587)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:365)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 2016-04-19 23:27:54,957 INFO [StoreOpener-20d48fd40c94c7c81049cbc506de4ad4-1] hfile.CacheConfig: blockCache=LruBlockCache{blockCount=2, currentSize=3285448, freeSize=3198122040, maxSize=3201407488, heapSize=3285448, minSize=3041337088, minFactor=0.95, multiSize=1520668544, multiFactor=0.5, singleSize=760334272, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
> 2016-04-19 23:27:54,957 INFO [StoreOpener-20d48fd40c94c7c81049cbc506de4ad4-1] compactions.CompactionConfiguration: size [134217728, 9223372036854775807); files [3, 10); ratio 1.200000; off-peak ratio 5.000000; throttle point 2684354560; major period 604800000, major jitter 0.500000, min locality to compact 0.700000
> 2016-04-19 23:27:54,962 INFO [StoreFileOpenerThread-F-1] regionserver.StoreFile$Reader: Loaded Delete Family Bloom (CompoundBloomFilter) metadata for 5eacfeb8a2eb419cb6fe348df0540145
> 2016-04-19 23:27:54,969 ERROR [RS_OPEN_REGION-prod-aws-hbase-data-0010:16020-80] regionserver.HRegion: Could not initialize all stores for the region=MEDIA,\x05JEklcNpOKos\x00YT,1461001150488.20d48fd40c94c7c81049cbc506de4ad4.
> 2016-04-19 23:27:54,969 WARN [StoreOpener-20d48fd40c94c7c81049cbc506de4ad4-1] ipc.Client: interrupted waiting to send rpc request to server
> java.lang.InterruptedException
>         at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>         at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1054)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1449)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1407)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>         at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
>         at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy17.getFileInfo(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
>         at com.sun.proxy.$Proxy18.getFileInfo(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
>         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
>         at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
>         at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.createStoreDir(HRegionFileSystem.java:171)
>         at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:220)
>         at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:4973)
>         at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:925)
>         at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:922)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
>
> I did all kinds of recovery magic, like restarting all components and
> cleaning ZK.
>
> I found this thread:
> http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/31308
> which suggests creating empty HFiles, but I'm a bit afraid to do that.
>
> I'm using HBase 1.1.3 with Hadoop 2.7.1 (both binary downloads from
> their websites) on Ubuntu 14.04.
>
> Thank you for any help.
>
> Michal
