I guess one way might be to write your own DFS reader that ignores the exceptions and reads whatever it can.
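
Roughly what I mean, as a toy sketch rather than real HDFS client code. It is plain Python against a simulated file, and every name in it (`FlakyFile`, `MissingBlockError`, `salvage`) is made up for illustration; none of them is a Hadoop API. The idea is only to walk the file one block at a time, catch the error for a lost block, note the gap, and move on.

```python
class MissingBlockError(IOError):
    """Hypothetical stand-in for the error a DFS client raises on a lost block."""


class FlakyFile:
    """Simulated file (not HDFS): any read that touches a missing block fails."""

    def __init__(self, data, missing_blocks, block_size):
        self.data = data
        self.missing_blocks = set(missing_blocks)
        self.block_size = block_size

    def read_at(self, offset, length):
        # Work out which blocks the requested byte range overlaps.
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        for block in range(first, last + 1):
            if block in self.missing_blocks:
                raise MissingBlockError("block %d is missing" % block)
        return self.data[offset:offset + length]


def salvage(source, size, block_size):
    """Read block by block, skipping blocks whose read raises.

    Returns (recovered, skipped): recovered is a list of (offset, bytes)
    pairs, skipped is a list of (start, end) byte ranges that were lost.
    """
    recovered, skipped = [], []
    offset = 0
    while offset < size:
        length = min(block_size, size - offset)
        try:
            recovered.append((offset, source.read_at(offset, length)))
        except MissingBlockError:
            # Ignore the failure, remember the gap, continue with the next block.
            skipped.append((offset, offset + length))
        offset += length
    return recovered, skipped


if __name__ == "__main__":
    # Tiny 8-byte "blocks" for the demo; real HDFS blocks are tens of MB.
    data = bytes(range(40))
    source = FlakyFile(data, missing_blocks={1, 3}, block_size=8)
    recovered, skipped = salvage(source, len(data), 8)
    print("skipped byte ranges:", skipped)
    print("recovered bytes:", sum(len(chunk) for _, chunk in recovered))
```

In a real SequenceFile, the bytes after a gap will not start on a record boundary, so the reader would also have to resynchronize before it could parse records again. As far as I know, `SequenceFile.Reader` has a `sync(long position)` method that seeks to the next sync mark past a given position, which looks like the natural hook for that, but I have not tried it on files with missing blocks.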

Sent from my iPad

On Nov 23, 2012, at 6:12 PM, Hs <[email protected]> wrote:

> Hi,
>
> I am running hadoop 1.0.3 and hbase-0.94.0 on a 12-node cluster. For unknown
> operational faults, 6 datanodes have suffered a complete data loss (hdfs data
> directory gone). When I restart hadoop, it reports "The ratio of reported
> blocks 0.8252".
>
> I have a folder in hdfs containing many important files in hadoop
> SequenceFile format. The hadoop fsck tool shows that (in this folder)
>
> Total size: 134867556461 B
> Total dirs: 16
> Total files: 251
> Total blocks (validated): 2136 (avg. block size 63140241 B)
> ********************************
> CORRUPT FILES: 167
> MISSING BLOCKS: 405
> MISSING SIZE: 25819446263 B
> CORRUPT BLOCKS: 405
> ********************************
>
> I wonder if I can read these corrupted SequenceFiles with missing blocks
> skipped ? Or, what else can I do now to recover these SequenceFiles as much
> as possible ?
>
> Please save me.
>
> Thanks !
>
> (Sorry for duplicating this post on user and hdfs-dev list, I do not know
> where exactly i should put it.)
