[
https://issues.apache.org/jira/browse/HDFS-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262316#comment-13262316
]
Todd Lipcon commented on HDFS-3049:
-----------------------------------
- can you please add some comments to the code explaining the behavior of
getInputStream? eg perhaps a comment above the definition of "best" saying "the
set of streams which all contain the same number of transactions, starting from
this point".
- I think some of the new INFO logs are better off being debug
- Am I correct in understanding that MergedEditLogInputStream only provides
failover functionality between multiple logs which contain the same exact set
of transactions? In that case, I think a better name is something like
FailoverEditLogInputStream. Also, I think you should iterate through the
streams in the constructor and verify that they all have matching metadata
(version, first tx, last tx, length), throwing an IllegalArgumentException if
not.
- You can use {{MultipleIOException}} to group all the exception causes together
- The new class needs an interface audience annotation (Private) - or can it be
made package-private?
- If you have an error on the very first transaction read, then {{needSkip}}
gets set true but {{prevTxId}} is still {{INVALID_TXID}} right? Maybe instead
initialize {{prevTxId}} to {{firstTxId-1}}? Or make the variable be
{{nextTxId}}?
- Can you add some comments to nextOp()? I am having trouble following the
control flow -- why don't you just do the "skipping" in the error handling path
instead of using this "needSkip" flag?
- In the exceptions you throw, you should include the name of the underlying
stream
- the debug message "now trying to read from" should probably be combined with
the "got error" message above.
- Is it really worth adding a new parameter to MiniDFSCluster for this, instead
of just manually setting the name/edit dir configs for the few test cases where
we need to ensure only a single dir?
- Refactor out the code to corrupt a log file, since it now appears multiple
times, and will make the test easier to digest
- No need to catch the exception and fail() in the test case - if you just let
the exception fall through, it will fail on its own
{code}
+ public boolean accept(File dir, String name) {
+ if
(name.startsWith(NNStorage.getFinalizedEditsFileName(startErrorTxId,
+ endErrorTxId))) {
+ return true;
+ }
+ return false;
+ }
{code}
why does this have to be so complicated? How would we ever have more than one
file starting with edits_N-M in the directory? i.e. why not just do: {{{File
editLog = new File(new File(f1, "current"), NNStorage.getFinalized...)))}}
> During the normal loading NN startup process, fall back on a different
> EditLog if we see one that is corrupt
> ------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-3049
> URL: https://issues.apache.org/jira/browse/HDFS-3049
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: name-node
> Affects Versions: 0.23.0
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Priority: Minor
> Attachments: HDFS-3049.001.patch, HDFS-3049.002.patch,
> HDFS-3049.003.patch
>
>
> During the NameNode startup process, we load an image, and then apply edit
> logs to it until we believe that we have all the latest changes.
> Unfortunately, if there is an I/O error while reading any of these files, in
> most cases, we simply abort the startup process. We should try harder to
> locate a readable edit log and/or image file.
> *There are three main use cases for this feature:*
> 1. If the operating system does not honor fsync (usually due to a
> misconfiguration), a file may end up in an inconsistent state.
> 2. In certain older releases where we did not use fallocate() or similar to
> pre-reserve blocks, a disk full condition may cause a truncated log in one
> edit directory.
> 3. There may be a bug in HDFS which results in some of the data directories
> receiving corrupt data, but not all. This is the least likely use case.
> *Proposed changes to normal NN startup*
> * We should try a different FSImage if we can't load the first one we try.
> * We should examine other FSEditLogs if we can't load the first one(s) we try.
> * We should fail if we can't find EditLogs that would bring us up to what we
> believe is the latest transaction ID.
> Proposed changes to recovery mode NN startup:
> we should list out all the available storage directories and allow the
> operator to select which one he wants to use.
> Something like this:
> {code}
> Multiple storage directories found.
> 1. /foo/bar
> edits__curent__XYZ size:213421345 md5:2345345
> image size:213421345 md5:2345345
> 2. /foo/baz
> edits__curent__XYZ size:213421345 md5:2345345345
> image size:213421345 md5:2345345
> Which one would you like to use? (1/2)
> {code}
> As usual in recovery mode, we want to be flexible about error handling. In
> this case, this means that we should NOT fail if we can't find EditLogs that
> would bring us up to what we believe is the latest transaction ID.
> *Not addressed by this feature*
> This feature will not address the case where an attempt to access the
> NameNode name directory or directories hangs because of an I/O error. This
> may happen, for example, when trying to load an image from a hard-mounted NFS
> directory, when the NFS server has gone away. Just as now, the operator will
> have to notice this problem and take steps to correct it.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira