[
https://issues.apache.org/jira/browse/HADOOP-12780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15139632#comment-15139632
]
madhumita chakraborty commented on HADOOP-12780:
------------------------------------------------
[~cnauroth] Could you please take a look at the patch?
> During atomic rename handle crash when one directory has been renamed but not
> file under it.
> --------------------------------------------------------------------------------------------
>
> Key: HADOOP-12780
> URL: https://issues.apache.org/jira/browse/HADOOP-12780
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/azure
> Affects Versions: 2.8.0
> Reporter: madhumita chakraborty
> Assignee: madhumita chakraborty
> Priority: Critical
> Attachments: HADOOP-12780.001.patch
>
>
> During the preparation phase of an atomic folder rename, we record the
> proposed change in a metadata file (-renamePending.json).
> Say we are renaming parent/folderToRename to parent/renamedFolder.
> folderToRename has an inner folder, innerFolder, and innerFolder has a file,
> innerFile.
> The content of the -renamePending.json file will be:
> { OldFolderName: "parent/folderToRename",
>   NewFolderName: "parent/renamedFolder",
>   FileList: [ "innerFolder", "innerFolder/innerFile" ]
> }
> We first rename all files within the source directory, and rename the source
> directory itself in the last step.
> The steps are:
> 1. Rename innerFolder.
> 2. Rename innerFolder/innerFile.
> 3. Rename the source directory folderToRename.
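
The ordering above can be sketched in a few lines of Python. This is only an illustration, not the code in the patch: it assumes the pending record is stored as valid JSON with quoted keys, and planned_renames is a hypothetical helper name.

```python
import json

# Hypothetical record mirroring the fields described above
# (assumption: keys are quoted so the text parses as plain JSON).
pending = json.loads("""
{
  "OldFolderName": "parent/folderToRename",
  "NewFolderName": "parent/renamedFolder",
  "FileList": ["innerFolder", "innerFolder/innerFile"]
}
""")


def planned_renames(record):
    """Return (source, destination) pairs in the order described above:
    every FileList entry first, the source directory itself last."""
    old, new = record["OldFolderName"], record["NewFolderName"]
    steps = [(f"{old}/{name}", f"{new}/{name}") for name in record["FileList"]]
    steps.append((old, new))
    return steps


for src, dst in planned_renames(pending):
    print(src, "->", dst)
```

A crash between any two of these pairs leaves the record on disk, which is what the redo path later replays.
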
> Say the process crashes after step 1.
> So innerFolder has been renamed.
> Note that Azure storage does not natively support folders. So when a
> directory is created by the mkdir command, we create an empty placeholder
> blob with metadata for the directory.
> So after step 1, the empty blob corresponding to the directory innerFolder
> has been renamed.
> When the process comes back up, the redo path goes through the
> -renamePending.json file and tries to redo the renames.
> For each file in the file list of the renamePending file, it checks whether
> the source file exists and, if so, renames it. When it gets to innerFolder,
> it calls filesystem.exists(innerFolder). Now filesystem.exists(innerFolder)
> returns true, because a file under that folder exists even though the empty
> blob corresponding to that folder does not. So it tries to rename this
> folder, and because the empty blob has already been deleted, this fails with
> the exception "source blob does not exist".
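
The failure can be reproduced with a toy model of a flat blob namespace. This is a minimal sketch under stated assumptions: FlatBlobStore, exists and rename_blob are hypothetical names for illustration and are not the Azure SDK or the Hadoop fs/azure API. The model only captures the two behaviours described above: a folder "exists" if any blob lives under it, and a rename needs a real blob with the exact source name.

```python
class FlatBlobStore:
    """Toy model of a flat blob namespace (not the real Azure or Hadoop API)."""

    def __init__(self, blobs):
        self.blobs = set(blobs)

    def exists(self, path):
        # True for an exact blob name, or for an implicit directory,
        # that is, any blob whose name starts with path + "/".
        return path in self.blobs or any(
            b.startswith(path + "/") for b in self.blobs
        )

    def rename_blob(self, src, dst):
        # Renaming requires a real blob with exactly the source name.
        if src not in self.blobs:
            raise FileNotFoundError(f"source blob does not exist: {src}")
        self.blobs.remove(src)
        self.blobs.add(dst)


old, new = "parent/folderToRename", "parent/renamedFolder"
store = FlatBlobStore([
    old,                              # placeholder blob for folderToRename
    f"{old}/innerFolder",             # placeholder blob for innerFolder
    f"{old}/innerFolder/innerFile",   # the actual file
])

# Step 1 completes, then the process crashes.
store.rename_blob(f"{old}/innerFolder", f"{new}/innerFolder")

# Redo path: the existence check passes because innerFile is still under
# the old folder, but the placeholder blob itself is already gone.
src = f"{old}/innerFolder"
source_seems_present = store.exists(src)
try:
    store.rename_blob(src, f"{new}/innerFolder")
    redo_error = None
except FileNotFoundError as err:
    redo_error = str(err)

print(source_seems_present, redo_error)
```

In this model the existence check and the rename disagree about what "the source" is, which is the mismatch the redo path has to tolerate.
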