Adam Antal created HDFS-13818:
---------------------------------
Summary: Extend OIV to detect offline FSImage corruption
Key: HDFS-13818
URL: https://issues.apache.org/jira/browse/HDFS-13818
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs
Reporter: Adam Antal
Assignee: Adam Antal
This is a follow-up Jira for HDFS-13031. It proposes extending the Offline
Image Viewer (OIV) so that corruption like HDFS-13101 can be detected offline.
The reasoning is as follows. Apart from a NN startup throwing the error,
the customer has no way to verify whether an FSImage is good or corrupted.
Although a truly full check of the FSImage is only possible by the NN,
bringing up a tertiary NN just to catch the corruption cases behind the
observed stack traces is overkill.
The OIV would be a handy choice: it already loads the fsimage and constructs
the folder structure, so we only have to add an option for detecting the
null INodes.
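To illustrate the kind of check meant here, below is a minimal sketch, not
the OIV's actual code. It assumes the inode IDs and the directory-to-children
references have already been read from the image into plain Java collections;
the class name DanglingINodeCheck and the method findDanglingChildren are
hypothetical and do not exist in Hadoop. A child reference whose ID has no
matching inode is the situation that would otherwise surface as a null INode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only; not part of the OIV.
public class DanglingINodeCheck {

    /**
     * Returns one "parentId -> childId" entry for every child reference
     * that points to an inode ID missing from the set of known inodes.
     * Entries follow the iteration order of the dirChildren map.
     */
    public static List<String> findDanglingChildren(
            Set<Long> inodeIds, Map<Long, List<Long>> dirChildren) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<Long, List<Long>> dir : dirChildren.entrySet()) {
            for (Long childId : dir.getValue()) {
                // A referenced child with no inode entry would resolve to null.
                if (!inodeIds.contains(childId)) {
                    problems.add(dir.getKey() + " -> " + childId);
                }
            }
        }
        return problems;
    }
}
```

An empty result means every directory entry resolves to a known inode. This
only covers dangling references; it is not a substitute for the full
validation the NN performs when loading the image.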
For example, the Delimited OIV processor can already use an on-disk
MetadataMap, which reduces memory consumption. There may also be room for
parallelization: iterating through the INodes, for example, could be
distributed, increasing efficiency, and we would not need a high-memory,
high-CPU setup just to check the FSImage.