Thank you Wayne. I'll dig in.... Weds (I'm not by a computer till then). St.Ack
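
P.S. In the meantime, a couple of read-only checks that may be worth a look
once things are stable again (just a sketch; 'mytable' is a placeholder for
one of the affected tables, and the exact options depend on your HBase
version):

  # report inconsistencies between .META., region deployment, and HDFS
  hbase hbck

  # from the shell, eyeball the .META. rows for one table
  hbase shell
  scan '.META.', {STARTROW => 'mytable,', LIMIT => 20}

Neither of these changes anything on its own, so they should be safe to run
while gathering diagnostics.
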
On Sun, Jul 3, 2011 at 9:56 AM, Wayne <[email protected]> wrote:
> I have uploaded the logs. I do not have a snapshot of the .META. table in
> the messed-up state. The root partition ran out of space at 2:30 am. Below
> are links to the various logs. It appears everything but the data nodes
> went south. The data nodes kept repeating the same errors shown in the log
> below.
>
> Master log
> http://pastebin.com/WmBAC0Xm
>
> Namenode log
> http://pastebin.com/tjRqfCaC
>
> Node 2 region server log
> http://pastebin.com/M3EH02bP
>
> Node 2 data node log
> http://pastebin.com/XKgUAMTK
>
> Thanks.
>
> On Sun, Jul 3, 2011 at 12:40 AM, Stack <[email protected]> wrote:
>
>> You have a snapshot of the state of .META. at the time you noticed it was
>> messed up? And the master log from around the time of the startup
>> post-fs-fillup?
>> St.Ack
>>
>> On Sat, Jul 2, 2011 at 7:27 PM, Wayne <[email protected]> wrote:
>> > Like most problems, we brought it on ourselves. To me the bigger issue
>> > is how to get out. Since region definitions are the core of what HBase
>> > does, it would be great to have a bulletproof recovery process that we
>> > can invoke to get us out. Bugs and human error will bring on problems
>> > and nothing will ever change that, but not having tools to help recover
>> > out of the hole is where I think it is lacking. HDFS is very stable.
>> > The HBase .META. table (and -ROOT-?) are the core of how HBase manages
>> > things. If this gets out of whack, all is lost. I think it would be
>> > great to have an automatic backup of the meta table and the ability to
>> > recover everything based on the HDFS data out there and the backup.
>> > Something like a recovery mode that goes through, sees what is out
>> > there, and rebuilds the meta based on it. With corrupted data, lost
>> > regions, and so on, like any relational database there should be one or
>> > more recovery modes that go through everything and rebuild it
>> > consistently. Data may be lost, but at least the cluster will be left
>> > in a 100% consistent/clean state. Manual editing of .META. is not
>> > something anyone should do (especially me). It is prone to human
>> > error... it should be easy to have well-tested recovery tools that can
>> > do the hard work for us.
>> >
>> > Below is an attempt at the play-by-play in case it helps. It all
>> > started with the root partition of the namenode/hmaster filling up due
>> > to a table export.
>> >
>> > When I restarted Hadoop, this error was in the namenode log:
>> > "java.io.IOException: Incorrect data format. logVersion is -18 but
>> > writables.length is 0"
>> >
>> > So I found this thread
>> > <https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/e35ee876da1a3bbc>,
>> > which mentioned editing the namenode log files. After verifying that
>> > our namenode log files seemed to have the same symptom, I copied each
>> > namenode "name" file to root's home directory and followed their
>> > advice.
>> >
>> > That allowed the namenode to start, but then HDFS wouldn't come up. It
>> > kept hanging in safe mode with the repeated error:
>> > "The ratio of reported blocks 0.9925 has not reached the threshold
>> > 0.9990. Safe mode will be turned off automatically."
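>> >
>> > (In hindsight, the safe-mode state and the files behind that
>> > missing-block ratio could have been inspected before forcing an exit.
>> > A rough sketch of the kind of commands involved, assuming the hadoop
>> > script is on the PATH; the report path is just an example:
>> >
>> >   # confirm whether the namenode is still in safe mode
>> >   hadoop dfsadmin -safemode get
>> >   # list every file with its blocks; files with missing blocks are
>> >   # flagged in the per-file output
>> >   hadoop fsck / -files -blocks -locations > /tmp/fsck-report.txt
>> >   grep -i missing /tmp/fsck-report.txt
>> >
>> > That at least shows which files would be left with holes once safe mode
>> > is forced off.)
>> >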
>> > So I turned safe mode off with "hadoop dfsadmin -safemode leave" and
>> > tried to run "hadoop fsck" a few times; it still showed HDFS as
>> > "corrupt", so I ran "hadoop fsck -move", and this is the last part of
>> > the output:
>> >
>> > ..........Status: CORRUPT
>> >  Total size:    1423140871890 B (Total open files size: 668770828 B)
>> >  Total dirs:    3172
>> >  Total files:   2584 (Files currently being written: 11)
>> >  Total blocks (validated):      23095 (avg. block size 61621167 B) (Total open file blocks (not validated): 10)
>> >   ********************************
>> >   CORRUPT FILES:        65
>> >   MISSING BLOCKS:       173
>> >   MISSING SIZE:         8560948988 B
>> >   CORRUPT BLOCKS:       173
>> >   ********************************
>> >  Minimally replicated blocks:   22922 (99.25092 %)
>> >  Over-replicated blocks:        0 (0.0 %)
>> >  Under-replicated blocks:       0 (0.0 %)
>> >  Mis-replicated blocks:         0 (0.0 %)
>> >  Default replication factor:    3
>> >  Average block replication:     2.9775276
>> >  Corrupt blocks:                173
>> >  Missing replicas:              0 (0.0 %)
>> >  Number of data-nodes:          10
>> >  Number of racks:               1
>> >
>> > I ran it again and got this:
>> >
>> > .Status: HEALTHY
>> >  Total size:    1414579922902 B (Total open files size: 668770828 B)
>> >  Total dirs:    3272
>> >  Total files:   2519 (Files currently being written: 11)
>> >  Total blocks (validated):      22922 (avg. block size 61712761 B) (Total open file blocks (not validated): 10)
>> >  Minimally replicated blocks:   22922 (100.0 %)
>> >  Over-replicated blocks:        0 (0.0 %)
>> >  Under-replicated blocks:       0 (0.0 %)
>> >  Mis-replicated blocks:         0 (0.0 %)
>> >  Default replication factor:    3
>> >  Average block replication:     3.0
>> >  Corrupt blocks:                0
>> >  Missing replicas:              0 (0.0 %)
>> >  Number of data-nodes:          10
>> >  Number of racks:               1
>> >
>> > The filesystem under path '/' is HEALTHY
>> >
>> > So I started everything and it seemed to be superficially functional.
>> >
>> > I then shut down Hadoop and restarted. Hadoop came up in a matter of a
>> > few minutes; HBase then took about ten minutes of seeming to copy files
>> > around, based on the HBase master logs.
>> >
>> > After this we saw "region not found" client errors on some tables. I
>> > ran hbase hbck to look for problems and saw the errors I reported in
>> > the original post. Add in the Ganglia problems and a botched attempt to
>> > edit the .META. table, which brought us even further down the rabbit
>> > hole. I then decided to drop the affected tables, but lo and behold,
>> > one cannot disable a table that has messed-up regions... so I manually
>> > deleted the data, but some of the .META. table entries were still
>> > there. Finally, this afternoon we reformatted the entire cluster.
>> >
>> > Thanks.
>> >
>> > On Sat, Jul 2, 2011 at 5:25 PM, Stack <[email protected]> wrote:
>> >
>> >> On Sat, Jul 2, 2011 at 9:55 AM, Wayne <[email protected]> wrote:
>> >> > It just returns a ton of errors (import: command not found). Our
>> >> > cluster is hosed anyway. I am waiting to get it completely
>> >> > re-installed from scratch. Hope has long since flown out the window.
>> >> > I just changed my opinion of what it takes to manage HBase. A Java
>> >> > engineer is required on staff. I also realized now that a backup
>> >> > strategy is more important than for an RDBMS. Having RF=3 in HDFS
>> >> > offers no insurance against HBase losing its shirt and .META.
>> >> > getting corrupted. I think I just found the Achilles heel.
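>> >>
>> >> (On the "import: command not found" part: the Export/Import tools are
>> >> MapReduce jobs, so they have to be launched through the hbase script
>> >> rather than typed as bare shell commands. Roughly, with the table name
>> >> and HDFS path below as placeholders:
>> >>
>> >>   hbase org.apache.hadoop.hbase.mapreduce.Export mytable /backup/mytable
>> >>   hbase org.apache.hadoop.hbase.mapreduce.Import mytable /backup/mytable
>> >>
>> >> I can't tell from here what exactly was run, so treat this as a guess
>> >> at the symptom rather than a diagnosis.)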
>> >>
>> >> Yeah, stability is primary, but I do not know how you got into the
>> >> circumstance you find yourself in. All I can offer is to try and do
>> >> diagnostics, since avoiding hitting this situation again is of utmost
>> >> importance.
>> >>
>> >> St.Ack
