Chris, I really appreciate your detailed fix description! I've run into similar problems (due to old hardware and bad sectors) and could never figure out how to fix a broken table. Hbck always seemed to just make things worse until I would give up and recreate the table.
Can you publish your utility that you used to create valid/empty HFiles?

--Tom

On Sun, Dec 9, 2012 at 6:08 PM, Kevin O'dell <[email protected]> wrote:

> Chris,
>
> Thank you for the very descriptive update.
>
> On Sun, Dec 9, 2012 at 6:29 PM, Chris Waterson <[email protected]> wrote:
>
>> Well, I upgraded to 0.92.2, since the version I was running on (0.92.1)
>> didn't have those options for "hbck".
>>
>> That helped.
>>
>> It took me a while to realize that I had to make the root filesystem
>> writable so that "hbck -repair" could create itself a directory. So, once
>> that was done, it at least ran through to completion.
>>
>> But the problem persisted in that there were blocks in META that didn't
>> exist on the filesystem. One poor region server was assigned the sad task
>> of attempting to open the non-existent directory, which it slavishly
>> reattempted again and again, filling its log with FileNotFoundException
>> stack traces.
>>
>> For example,
>>
>> 2012-12-09 00:14:33,315 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=referrers,com.free-hdwallpapers.www/wallpapers/animals/mici/595718.jpg|com.free-hdwallpapers.www/wallpaper/animals/husky/270579,1354964606745.0c54fe59c58ddd6b34042ec98171bff7.
>> java.io.FileNotFoundException: File does not exist: /hbase/referrers/2cb553c74d52ddcbf31940f6c7128c63/main/33f1fd9efb944c4e982ba719cd7dde84
>> etc., etc.
>>
>> In particular, the directory above "/hbase/referrers/2cb553...c63" simply
>> did not exist at all in HDFS.
>>
>> So I took matters into my own hands and created the missing
>> "/hbase/referrers/2cb553...c63" directory, its subdirectory "main", and
>> attempted to create a zero-length file "331fd9...e84". This changed the
>> firehose of exceptions from FileNotFoundException to CorruptHFileException.
>>
>> So, I wrote a small program to emit a valid, empty HFile, and proceeded to
>> place these files at whatever places in HDFS that a FileNotFoundException
>> was being thrown. After creating three or four of them, the exceptions
>> stopped.
>>
>> I then ran "hbck -repair" again, and upon completion it declared victory.
>>
>> Again, I suspect that I got myself into this problem because I ran a
>> machine out of disk space. It's likely that most folks are more clever
>> than me, and so this problem hasn't arisen before. :)
>>
>> On Dec 9, 2012, at 3:00 PM, "Kevin O'dell" <[email protected]> wrote:
>>
>> > can you run hbase hbck -fixMeta -fixAssignments
>> >
>> > This should assign those region servers and fix the hole.
>> >
>> > On Sat, Dec 8, 2012 at 11:30 PM, Chris Waterson <[email protected]> wrote:
>> >
>> >> Hello! I've gotten myself into trouble where I'm missing files on HDFS
>> >> that HBase thinks ought to be there. In particular, running "hbase hbck"
>> >> yields the below message: two regions are "not deployed on any region
>> >> server" (because there is no file in HDFS for the region), and "there is a
>> >> hole in the region chain".
>> >>
>> >> (FWIW, I suspect that this problem is due to a recent incident where we
>> >> ran the cluster out of disk space.)
>> >>
>> >> I'm running 0.92.1, and have been staggering around trying to figure out
>> >> what procedure I ought to use to correct the problem, but my Google-fu is
>> >> too poor to have yielded results. Any pointers would be appreciated!
>> >>
>> >> thanks,
>> >> chris
>> >>
>> >> ERROR: Region referrers,com.free-hdwallpapers.www/wallpapers/animals/mici/595718.jpg|com.free-hdwallpapers.www/wallpaper/animals/husky/270579,1354964606745.0c54fe59c58ddd6b34042ec98171bff7. not deployed on any region server.
>> >> ERROR: Region referrers,com.free-hdwallpapers.www/wallpapers/anime/mici/78285.jpg|com.free-hdwallpapers.www/wallpaper/anime/wolf-furry/90641,1354964606745.d2451e8db0f2b9546cc42c6d260a2ab8. not deployed on any region server.
>> >> ERROR: There is a hole in the region chain between com.free-hdwallpapers.www/wallpapers/animals/mici/595718.jpg|com.free-hdwallpapers.www/wallpaper/animals/husky/270579 and com.free-hdwallpapers.www/wallpapers/entertainment/mici/11840.jpg|com.free-hdwallpapers.www/wallpaper/entertainment/new-moon-bella-and-edward/12951.
>> >> You need to create a new regioninfo and region dir in hdfs to plug the
>> >> hole.
>> >
>> > --
>> > Kevin O'Dell
>> > Customer Operations Engineer, Cloudera
>
> --
> Kevin O'Dell
> Customer Operations Engineer, Cloudera
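
For anyone attempting the same repair: the first step Chris describes, finding every HDFS path that the region server reports with a FileNotFoundException, can be scripted. What follows is only a minimal Python sketch and is not his utility. The function name missing_paths is hypothetical, and the regular expression assumes the exact message wording shown in the quoted log excerpt ("File does not exist: <path>"), which may differ in other Hadoop or HBase versions.

```python
import re

# Assumption: the log message looks like the one quoted in the thread,
# "java.io.FileNotFoundException: File does not exist: /hbase/...".
MISSING = re.compile(r"FileNotFoundException: File does not exist:\s*(\S+)")


def missing_paths(lines):
    """Return the distinct missing HDFS paths, in first-seen order.

    `lines` is any iterable of log lines, for example an open log file.
    (Hypothetical helper; it only reads text and never touches HDFS.)
    """
    seen = []
    for line in lines:
        match = MISSING.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

Passing an open region server log to missing_paths yields one entry per missing file, however many times the open was retried. Each entry is a candidate location for a placeholder HFile of the kind Chris describes; creating those files is outside the scope of this sketch.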
