[ 
https://issues.apache.org/jira/browse/HADOOP-14512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Xu updated HADOOP-14512:
----------------------------
    Status: In Progress  (was: Patch Available)

> WASB atomic rename should not throw exception if the file is neither in src 
> nor in dst when doing the rename
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-14512
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14512
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>    Affects Versions: 2.8.0
>            Reporter: Duo Xu
>            Assignee: Duo Xu
>         Attachments: HADOOP-14512.001.patch, HADOOP-14512.002.patch
>
>
> During an atomic rename operation, WASB creates a rename pending JSON file 
> that records which files need to be renamed and their destination. WASB then 
> reads this file and renames the files one by one.
> A recent customer incident in HBase exposed a potential bug in the atomic 
> rename implementation.
> For example, below is a rename pending JSON file:
> {code}
> {
>   FormatVersion: "1.0",
>   OperationUTCTime: "2017-04-29 06:08:57.465",
>   OldFolderName: "hbase\/data\/default\/abc",
>   NewFolderName: "hbase\/.tmp\/data\/default\/abc",
>   FileList: [
>     ".tabledesc",
>     ".tabledesc\/.tableinfo.0000000001",
>     ".tmp",
>     "08e698e0b7d4132c0456b16dcf3772af",
>     "08e698e0b7d4132c0456b16dcf3772af\/.regioninfo",
>     "08e698e0b7d4132c0456b16dcf3772af\/0\/617294e0737e4d37920e1609cf539a83",
>     "08e698e0b7d4132c0456b16dcf3772af\/recovered.edits\/185.seqid",
>     "08e698e0b7d4132c0456b16dcf3772af\/.regioninfo",
>     "08e698e0b7d4132c0456b16dcf3772af\/0",
>     "08e698e0b7d4132c0456b16dcf3772af\/0\/617294e0737e4d37920e1609cf539a83",
>     "08e698e0b7d4132c0456b16dcf3772af\/recovered.edits",
>     "08e698e0b7d4132c0456b16dcf3772af\/recovered.edits\/185.seqid"
>   ]
> }
> {code}  
> While the HBase regionserver process (which uses the WASB driver underneath) 
> was renaming "08e698e0b7d4132c0456b16dcf3772af\/.regioninfo", the 
> regionserver process crashed or the VM was rebooted for system maintenance. 
> When the regionserver process started running again, it found the rename 
> pending JSON file and tried to redo the rename operation.
> However, when it read the first file in the file list, ".tabledesc", it could 
> find the file in neither the src folder nor the destination folder. The file 
> was missing from the src folder because it had already been renamed/moved to 
> the destination folder. It was missing from the destination folder because 
> HBase cleans up all the files under /hbase/.tmp when it starts.
> The current implementation throws an exception in this case:
> {code}
> else {
>         throw new IOException(
>             "Attempting to complete rename of file " + srcKey + "/" + fileName
>             + " during folder rename redo, and file was not found in source "
>             + "or destination.");
>       }
> {code}
> This causes HBase HMaster initialization to fail, and restarting HMaster does 
> not help because the same exception is thrown again.
> My proposal is that if, during the redo, WASB finds a file that is in neither 
> src nor dst, it should skip the file and process the next one rather than 
> throw an error and leave the user to fix it manually. The reasons are:
> 1. Since the rename pending JSON file lists file A, if file A is not in src, 
> it must have been renamed.
> 2. If file A is in neither src nor dst, the upper-layer service must have 
> removed it. Note that the folder is locked during the atomic rename, so the 
> only situation in which the file can be deleted is when the VM reboots or 
> the service process crashes. When the service process restarts, some 
> operations may run before the atomic rename redo, as in the HBase example 
> above.
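
The proposed redo behavior can be illustrated with a small in-memory sketch.
This is not the WASB driver code or the attached patch: the class name
RenameRedoSketch, the method redo, and the use of plain sets to model the src
and dst folders are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory model of the redo step; not the WASB driver code. */
class RenameRedoSketch {

    /**
     * Replays a pending folder rename over an in-memory model.
     * src and dst hold the file names currently present under the old and
     * new folder. Returns the entries found in neither place, which are
     * skipped instead of causing an IOException.
     */
    static List<String> redo(Set<String> src, Set<String> dst,
                             List<String> fileList) {
        List<String> skipped = new ArrayList<>();
        for (String fileName : fileList) {
            if (src.contains(fileName)) {
                // Not renamed before the crash: move it now.
                src.remove(fileName);
                dst.add(fileName);
            } else if (dst.contains(fileName)) {
                // Already renamed before the crash: nothing to do.
                continue;
            } else {
                // In neither src nor dst: assume it was renamed and then
                // removed by the upper-layer service, so skip it.
                skipped.add(fileName);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        // Models the incident: ".tabledesc" was renamed before the crash and
        // then removed from the destination by the startup cleanup.
        Set<String> src = new HashSet<>(
            Arrays.asList("region/.regioninfo", "region/0"));
        Set<String> dst = new HashSet<>();
        List<String> fileList = Arrays.asList(
            ".tabledesc", "region/.regioninfo", "region/0");

        List<String> skipped = redo(src, dst, fileList);
        System.out.println("Skipped: " + skipped);
        System.out.println("Left in src: " + src.size());
    }
}
```

In this sketch a missing entry is recorded and the loop continues, so the
remaining files are still moved. A real change would presumably log the
skipped file rather than return it.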



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
