[
https://issues.apache.org/jira/browse/HADOOP-12678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086136#comment-15086136
]
Chris Nauroth commented on HADOOP-12678:
----------------------------------------
Thank you, [~madhuch-ms]. The error handling in {{deleteRenamePendingFile}}
still needs some work. Here is the code from your v005 patch.
{code}
} catch (IOException e) {
  // If the rename metadata was not found then somebody probably
  // raced with us and finished the delete first
  Throwable t = e.getCause();
  if (t != null && t instanceof StorageException) {
    StorageException se = (StorageException) t;
    if (se.getErrorCode().equals(("BlobNotFound"))) {
      LOG.warn("rename pending file " + redoFile + " is already deleted");
    } else {
      throw e;
    }
  }
}
{code}
If there is a general {{IOException}} not caused by an Azure
{{StorageException}}, then this logic would silently swallow the exception
without either rethrowing it or logging it. An example of this could be loss of
network connectivity to the Azure Storage backend, which Java would report as
an {{IOException}} with no cause and a message describing the network error.
We'd want to make sure errors like this propagate to the caller, so please
stick with the code I gave in my last comment:
{code}
} catch (IOException e) {
  Throwable cause = e.getCause();
  if (cause != null && cause instanceof StorageException &&
      "BlobNotFound".equals(((StorageException) cause).getErrorCode())) {
    LOG.warn("rename pending file " + redoFile + " is already deleted");
  } else {
    throw e;
  }
}
{code}
This ensures that only the BlobNotFound error would get swallowed, and any
other {{IOException}}, whether or not its root cause is in Azure Storage, would
propagate to the caller. It also clarifies that there are really only two
cases for this code: swallow BlobNotFound, else rethrow.
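For illustration, here is a minimal standalone sketch of that two-case behavior. It uses a stand-in class for the Azure SDK's {{StorageException}} (only the error code matters for this check) and a hypothetical {{handle}} helper mirroring the catch logic, so it is a sketch of the idea rather than the actual patch code:
{code}
import java.io.IOException;

public class RenamePendingDemo {
  // Stand-in for the Azure SDK's StorageException; only the error
  // code is relevant to the check under discussion.
  static class StorageException extends Exception {
    private final String errorCode;
    StorageException(String errorCode) { this.errorCode = errorCode; }
    String getErrorCode() { return errorCode; }
  }

  // Mirrors the recommended catch logic: swallow BlobNotFound,
  // rethrow any other IOException unchanged.
  static boolean handle(IOException e) throws IOException {
    Throwable cause = e.getCause();
    if (cause != null && cause instanceof StorageException &&
        "BlobNotFound".equals(((StorageException) cause).getErrorCode())) {
      return true; // swallowed: another process already finished the delete
    }
    throw e;
  }

  public static void main(String[] args) throws Exception {
    // Case 1: IOException caused by BlobNotFound -> swallowed.
    IOException notFound =
        new IOException("delete failed", new StorageException("BlobNotFound"));
    System.out.println(handle(notFound));

    // Case 2: plain IOException with no cause (e.g. network loss) -> rethrown.
    try {
      handle(new IOException("connection reset"));
    } catch (IOException expected) {
      System.out.println("rethrown: " + expected.getMessage());
    }
  }
}
{code}
Running it shows the first exception being absorbed with a warning-style result and the second propagating to the caller, which is exactly the distinction the patch needs to preserve.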
The JavaDoc warnings from the last pre-commit run don't require any action.
These are pre-existing warnings unrelated to this patch. The patch is shifting
the line numbers and therefore making it appear that new warnings were
introduced.
> Handle empty rename pending metadata file during atomic rename in redo path
> ---------------------------------------------------------------------------
>
> Key: HADOOP-12678
> URL: https://issues.apache.org/jira/browse/HADOOP-12678
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/azure
> Reporter: madhumita chakraborty
> Assignee: madhumita chakraborty
> Priority: Critical
> Attachments: HADOOP-12678.001.patch, HADOOP-12678.002.patch,
> HADOOP-12678.003.patch, HADOOP-12678.004.patch, HADOOP-12678.005.patch
>
>
> Handle empty rename pending metadata file during atomic rename in redo path
> During atomic rename we create a metadata file for the rename
> (-renamePending.json). We create it in two steps:
> 1. We create an empty blob corresponding to the .json file in its real
> location
> 2. We create a scratch file to which we write the contents of the rename
> pending, which is then copied over into the blob described in 1
> If a process crash occurs after step 1 and before step 2 completes, we will
> be left with a zero-size blob corresponding to the pending rename metadata
> file.
> This kind of scenario can happen in the /hbase/.tmp folder because it is
> considered a candidate folder for atomic rename. When HMaster starts up, it
> executes listStatus on the .tmp folder to clean up pending data. At this
> stage, due to the lazy pending-rename completion process, we look for these
> json files. On seeing an empty file, the process simply throws a fatal
> exception, assuming something went wrong.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)