What was the value of the hbase.master.wait.for.log.splitting config parameter? The default value is false.
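In case it helps, here is a minimal sketch for checking what a given site file sets for that parameter. It assumes the usual Hadoop-style `<configuration>`/`<property>` XML layout; the helper name `get_hadoop_property` and the sample XML (including its `true` value) are hypothetical and only for illustration. It reads a single file, so it does not reflect defaults or overrides coming from other resources.

```python
import xml.etree.ElementTree as ET


def get_hadoop_property(xml_text, name, default=None):
    """Return the value of `name` from Hadoop-style configuration XML.

    Hypothetical helper: if the property is defined more than once, the
    last definition is used (an assumption of this sketch). Returns
    `default` when the property is absent or has no <value> element.
    """
    root = ET.fromstring(xml_text)
    result = default
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            value = prop.findtext("value")
            if value is not None:
                result = value.strip()
    return result


# Hypothetical sample content, not taken from the cluster in this thread.
SAMPLE = """<configuration>
  <property>
    <name>hbase.master.wait.for.log.splitting</name>
    <value>true</value>
  </property>
</configuration>"""

print(get_hadoop_property(SAMPLE, "hbase.master.wait.for.log.splitting", "false"))
```

To run it against a real file, pass the contents of the master's hbase-site.xml as `xml_text`; if the property is absent, the supplied default (false, as noted above) is returned.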
Cheers

On May 9, 2013, at 12:41 AM, lars hofhansl <la...@apache.org> wrote:

> Potential jiras that went into 0.94.7 that could be responsible:
> HBASE-7824
> HBASE-8246
> HBASE-8276
> HBASE-8288
> HBASE-8212
> HBASE-8081
> HBASE-8211
> HBASE-8211
>
> -- Lars
>
> ----- Original Message -----
> From: lars hofhansl <la...@apache.org>
> To: "dev@hbase.apache.org" <dev@hbase.apache.org>; lars hofhansl <la...@apache.org>
> Cc:
> Sent: Thursday, May 9, 2013 12:23 AM
> Subject: Re: All region server died due to "Parent directory doesn't exist"
>
> All the directories in .logs have the -splitting suffix, so this seems by
> design.
> The problem is that even though all logs are split, each time I startup a
> region server now, its log dir is renamed to ...-splitting and the region
> server shuts itself down.
>
> -- Lars
>
> ----- Original Message -----
> From: lars hofhansl <la...@apache.org>
> To: hbase-dev <dev@hbase.apache.org>
> Cc:
> Sent: Wednesday, May 8, 2013 11:39 PM
> Subject: All region server died due to "Parent directory doesn't exist"
>
> We just had all RegionServers die in a test cluster. All with the following
> exception.
> (This is CDH4.2.1 with HBase 0.94.7 build against it)
>
> Strangely HDFS is up and running (I can ls all directories, create files in
> it, etc. HDFS's fsck reports that all is well), yet we had the RSs die with
> this.
> This almost looks like a race where the directories under .logs were yanked
> away while they were still in use.
>
> I plan to investigate this further. In any event, has anybody seen this issue
> (or anything similar to this) before?
> When this happened there was no load on the cluster (other than some write
> from OTSDB).
>
> Thanks.
>
> -- Lars
>
> 2013-05-08 16:02:41,178 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server <host>,60020,1367614452787: IOE in log roller
> java.io.IOException: Exception in createWriter
>     at org.apache.hadoop.hbase.regionserver.wal.HLogFileSystem.createWriter(HLogFileSystem.java:66)
>     at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriterInstance(HLog.java:715)
>     at org.apache.hadoop.hbase.regionserver.wal.HLog.rollWriter(HLog.java:648)
>     at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:95)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: cannot get log writer
>     at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:771)
>     at org.apache.hadoop.hbase.regionserver.wal.HLogFileSystem.createWriter(HLogFileSystem.java:60)
>     ... 4 more
> Caused by: java.io.IOException: java.io.FileNotFoundException: Parent directory doesn't exist: /hbase/.logs/<host>,60020,1367614452787
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.verifyParentDir(FSNamesystem.java:1726)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1848)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:1770)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1747)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:418)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:205)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44068)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1695)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1691)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1689)
>
>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.init(SequenceFileLogWriter.java:173)
>     at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:768)
>     ... 5 more