[ 
https://issues.apache.org/jira/browse/HDFS-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal updated HDFS-13818:
------------------------------
    Description: 
A follow-up Jira for HDFS-13031: an improvement of the OIV is suggested for 
detecting corruptions like HDFS-13101 in an offline way.

The reasoning is the following. Apart from a NN startup throwing the error, 
there is nothing at the customer's disposal that could tell them whether the 
FSImage is good or corrupted.

Although a real, full check of the FSImage is only possible by the NN, for the 
stack traces associated with the observed corruption cases, putting up a 
tertiary NN is a bit of overkill. The OIV would be a handy choice, as it 
already has functionality like loading the fsimage and constructing the folder 
structure; we just have to add the option of detecting the null INodes. For 
example, the Delimited OIV processor can already use an on-disk MetadataMap, 
which reduces memory consumption. There may also be room for parallelizing: 
iterating through INodes, for example, could be distributed, increasing 
efficiency, and we wouldn't need a high-memory, high-CPU setup just for 
checking the FSImage.

  was:
A follow-up Jira for HDFS-13031: an improvement of the OIV is suggested for 
detecting corruptions like HDFS-13101 in an offline way.

The reasoning is the following. Apart from a NN startup throwing the error, 
there is nothing at the customer's disposal that could tell them whether the 
FSImage is good or corrupted.

Although a real, full check of the FSImage is only possible by the NN, for the 
stack traces associated with the observed corruption cases, putting up a 
tertiary NN is a bit of overkill. 

The OIV would be a handy choice, as it already has functionality like loading 
the fsimage and constructing the folder structure; we just have to add the 
option of detecting the null INodes.

For example, the Delimited OIV processor can already use an on-disk 
MetadataMap, which reduces memory consumption. There may also be room for 
parallelizing: iterating through INodes, for example, could be distributed, 
increasing efficiency, and we wouldn't need a high-memory, high-CPU setup just 
for checking the FSImage.


> Extend OIV to detect FSImage corruption
> ---------------------------------------
>
>                 Key: HDFS-13818
>                 URL: https://issues.apache.org/jira/browse/HDFS-13818
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>            Reporter: Adam Antal
>            Assignee: Adam Antal
>            Priority: Major
>
> A follow-up Jira for HDFS-13031: an improvement of the OIV is suggested for 
> detecting corruptions like HDFS-13101 in an offline way.
> The reasoning is the following. Apart from a NN startup throwing the error, 
> there is nothing at the customer's disposal that could tell them whether the 
> FSImage is good or corrupted.
> Although a real, full check of the FSImage is only possible by the NN, for 
> the stack traces associated with the observed corruption cases, putting up a 
> tertiary NN is a bit of overkill. The OIV would be a handy choice, as it 
> already has functionality like loading the fsimage and constructing the 
> folder structure; we just have to add the option of detecting the null 
> INodes. For example, the Delimited OIV processor can already use an on-disk 
> MetadataMap, which reduces memory consumption. There may also be room for 
> parallelizing: iterating through INodes, for example, could be distributed, 
> increasing efficiency, and we wouldn't need a high-memory, high-CPU setup 
> just for checking the FSImage.
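
The check described above can be pictured as a single pass over directory 
entries that flags every child id with no matching INode. What follows is only 
a minimal sketch under stated assumptions: it treats a "null INode" as a child 
reference whose id is absent from the set of loaded INode ids, and it assumes 
the image has already been read into plain in-memory collections. The class 
NullInodeCheck and the method findDanglingChildren are hypothetical names for 
illustration, not part of the OIV or any Hadoop API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names exist in Hadoop.
final class NullInodeCheck {

    /**
     * Returns one "parentId->childId" string for every directory entry
     * whose child id has no corresponding INode (assumed meaning of a
     * "null INode" in this sketch).
     */
    static List<String> findDanglingChildren(
            Set<Long> inodeIds, Map<Long, List<Long>> dirEntries) {
        List<String> dangling = new ArrayList<>();
        for (Map.Entry<Long, List<Long>> entry : dirEntries.entrySet()) {
            for (Long childId : entry.getValue()) {
                if (!inodeIds.contains(childId)) {
                    dangling.add(entry.getKey() + "->" + childId);
                }
            }
        }
        return dangling;
    }

    public static void main(String[] args) {
        // Toy data standing in for a loaded image: INodes 1, 2 and 3 exist.
        Set<Long> inodeIds = new HashSet<>(Arrays.asList(1L, 2L, 3L));

        // Directory 1 lists children 2 and 3; directory 2 lists child 4,
        // which has no INode.
        Map<Long, List<Long>> dirEntries = new TreeMap<>();
        dirEntries.put(1L, Arrays.asList(2L, 3L));
        dirEntries.put(2L, Arrays.asList(4L));

        System.out.println(findDanglingChildren(inodeIds, dirEntries));
    }
}
```

Because each directory's entries are checked independently against the id 
set, a loop of this shape could in principle be split across workers, which is 
the kind of parallelism the description hints at. Keeping the id set on disk 
rather than in memory would be a separate design choice.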



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
