Thanks for that. I had two blocks that I had to delete with hadoop fsck / -delete because they were corrupted, but I am unsure whether I lost data from hbase. It looks like I still have data; I am just not sure what was in the corrupted blocks. If I did lose some info, it was not much.
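In case it helps anyone hitting the same thing: before running -delete, fsck can report which files the corrupt blocks belong to, so you at least know what was affected. A rough sketch of what I mean (exact output varies a bit between Hadoop versions):

  # list each file with its blocks and locations; corrupt/missing blocks are flagged
  hadoop fsck / -files -blocks -locations
  # then either move the affected files aside to /lost+found or delete them
  hadoop fsck / -move
  hadoop fsck / -delete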
I would think there would be a way to put the datanode/region server into a safe mode and start closing files without corrupting anything. That may not be an option right now, but it is something I would like to see in a future production release.

Billy

"stack" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> See if the last item in the FAQ fixes your issue, Billy:
> http://wiki.apache.org/lucene-hadoop/Hbase/FAQ
> St.Ack
>
> Billy wrote:
>> I have tried to load hbase several times and it always keeps failing:
>>
>> 2007-12-18 14:21:45,062 FATAL org.apache.hadoop.hbase.HRegionServer:
>> Replay of hlog required. Forcing server restart
>> org.apache.hadoop.hbase.DroppedSnapshotException: java.io.IOException:
>> Too many open files
>>
>> That's the error I keep getting before all the problems start. Then, when
>> I shut down and restart everything, the region server complains about not
>> being able to find a block:
>>
>> 2007-12-18 14:30:10,610 INFO org.apache.hadoop.fs.DFSClient: Could not
>> obtain block blk_-8122648511302257739 from any node:
>> java.io.IOException: No live nodes contain current block
>>
>> 2007-12-18 14:30:22,221 ERROR org.apache.hadoop.hbase.HRegionServer:
>> Unable to close log in abort
>> java.io.IOException: java.io.IOException: Too many open files
>>         at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
>>         at java.lang.ProcessImpl.start(ProcessImpl.java:65)
>>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:451)
>>         at java.lang.Runtime.exec(Runtime.java:591)
>>         at java.lang.Runtime.exec(Runtime.java:464)
>>         at org.apache.hadoop.fs.ShellCommand.runCommand(ShellCommand.java:63)
>>         at org.apache.hadoop.fs.ShellCommand.run(ShellCommand.java:57)
>>         at org.apache.hadoop.fs.DF.getAvailable(DF.java:72)
>>         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:264)
>>         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:294)
>>         at org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite(LocalDirAllocator.java:155)
>>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.newBackupFile(DFSClient.java:1462)
>>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.openBackupStream(DFSClient.java:1429)
>>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:1571)
>>         at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:141)
>>         at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:124)
>>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1724)
>>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
>>         at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
>>         at org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:918)
>>         at org.apache.hadoop.hbase.HLog.close(HLog.java:390)
>>         at org.apache.hadoop.hbase.HRegionServer.run(HRegionServer.java:860)
>>         at java.lang.Thread.run(Thread.java:595)
>>
>> I have my setup configured to split the table at 16 MB so I can do some
>> testing; the problem starts to show up when the table gets to about 6-7
>> splits. I am running this on a 3-node setup: 1 master/namenode and 2
>> datanode/region server machines. I attached a tail of the last 5000 lines
>> of the region server log. I could not get much from the master log, as it
>> keeps trying to connect and there is tons of data; I scanned it for WARN
>> and FATAL but found nothing that looks like an error message.
>>
>> Billy
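P.S. For the archives, if I read the FAQ item St.Ack pointed at correctly, the fix for the "Too many open files" errors comes down to raising the file-descriptor limit for the user that runs the datanode and region server. Roughly something like this (the "hadoop" user name and the 32768 value are only examples, adjust for your setup):

  # check the current per-process limit for the user running hadoop/hbase
  ulimit -n
  # raise it permanently in /etc/security/limits.conf, for example:
  #   hadoop  soft  nofile  32768
  #   hadoop  hard  nofile  32768
  # then log the user back in and restart the datanode and region server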