[ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665493#action_12665493 ]
Raghu Angadi commented on HADOOP-4663:
--------------------------------------

> The data corruption you have seen occurred because the
> generation-stamp-update-protocol is not triggered
> during a block transfer request. This patch correctly handles
> block-transfer-requests and should prevent the data
> corruption issue from occurring.

Many different types of data corruption occurred recently with 0.18, mainly because of a combination of bugs. The corruption caused by the issue in this jira has little to do with the generation stamp for transfers. The primary cause is this:

# DN promotes all files created in 0.17 from the /tmp directory, which it should never have done.
# When it moved the files, it did not generate a gen stamp for the metadata files.
# DN reports those blocks as valid to NN.
# Later DN marks these files as corrupt since there is no metadata.

The above is one of the biggest sources of corruption. Various other bugs contributed as well; many of these were fixed in 0.18.3.

> Datanode should delete files under tmp when upgraded from 0.17
> --------------------------------------------------------------
>
>                 Key: HADOOP-4663
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4663
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.19.1
>
>         Attachments: deleteTmp.patch, deleteTmp2.patch, deleteTmp_0.18.patch, handleTmp1.patch
>
>
> Before 0.18, when a Datanode restarts, it deletes the files under the
> data-dir/tmp directory, since these files are no longer valid. In 0.18,
> however, it moves these files to the normal directory, incorrectly making
> them valid blocks. One of the following would work:
> - remove the tmp files during upgrade, or
> - if the files under /tmp are in pre-18 format (i.e. no generation stamp),
>   delete them.
> Currently the effect of this bug is that these files end up failing block
> verification and eventually get deleted, but they cause incorrect
> over-replication at the namenode before that.
> Also, it looks like our policy regarding the treatment of files under tmp
> needs to be defined better. Right now there are probably one or two more
> bugs with it. Dhruba, please file them if you remember.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
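
The second option in the issue description (delete tmp files that are in pre-18 format) could look roughly like the sketch below. It is not taken from any of the attached patches. The class `TmpBlockFilter` and the method `preGenStampFiles` are hypothetical names, and the file-naming convention is an assumption: block data in `blk_<id>`, 0.18+ metadata in `blk_<id>_<genstamp>.meta`, and pre-0.18 metadata in `blk_<id>.meta`. The helper only decides which names to delete; it does not touch the disk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: picks the files in a datanode tmp
// directory listing that look like pre-0.18 leftovers (no generation stamp).
class TmpBlockFilter {
    // Assumed naming convention (see the note above the code block).
    private static final Pattern DATA = Pattern.compile("blk_(-?\\d+)");
    private static final Pattern OLD_META = Pattern.compile("blk_(-?\\d+)\\.meta");
    private static final Pattern NEW_META = Pattern.compile("blk_(-?\\d+)_\\d+\\.meta");

    // Returns the names that should be deleted rather than promoted, in input order.
    static List<String> preGenStampFiles(Collection<String> names) {
        // First pass: remember block IDs that have a gen-stamped metadata file.
        Set<String> stamped = new HashSet<>();
        for (String name : names) {
            Matcher m = NEW_META.matcher(name);
            if (m.matches()) {
                stamped.add(m.group(1));
            }
        }
        // Second pass: flag old-style metadata files, and block data files
        // whose block ID has no gen-stamped metadata.
        List<String> toDelete = new ArrayList<>();
        for (String name : names) {
            Matcher data = DATA.matcher(name);
            if (OLD_META.matcher(name).matches()) {
                toDelete.add(name);
            } else if (data.matches() && !stamped.contains(data.group(1))) {
                toDelete.add(name);
            }
        }
        return toDelete;
    }

    public static void main(String[] args) {
        List<String> tmp = Arrays.asList(
            "blk_-42", "blk_-42.meta",     // pre-0.18 pair: no generation stamp
            "blk_7", "blk_7_1001.meta");   // 0.18+ pair: gen stamp 1001
        System.out.println(preGenStampFiles(tmp)); // prints [blk_-42, blk_-42.meta]
    }
}
```

Names that match none of the assumed patterns are left alone, so a real cleanup would still need a policy for unrecognised files under tmp.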