[jira] [Commented] (HDFS-3049) During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt

Todd Lipcon (JIRA) Wed, 25 Apr 2012 19:01:43 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262316#comment-13262316
 ]


Todd Lipcon commented on HDFS-3049:
-----------------------------------

- can you please add some comments to the code explaining the behavior of 
getInputStream? eg perhaps a comment above the definition of "best" saying "the 
set of streams which all contain the same number of transactions, starting from 
this point".
- I think some of the new INFO logs are better off being debug
- Am I correct in understanding that MergedEditLogInputStream only provides 
failover functionality between multiple logs which contain the same exact set 
of transactions? In that case, I think a better name is something like 
FailoverEditLogInputStream. Also, I think you should iterate through the 
streams in the constructor and verify that they all have matching metadata 
(version, first tx, last tx, length), throwing an IllegalArgumentException if 
not.
- You can use {{MultipleIOException}} to group all the exception causes together
- The new class needs an interface audience annotation (Private) - or can it be 
made package-private?
- If you have an error on the very first transaction read, then {{needSkip}} 
gets set true but {{prevTxId}} is still {{INVALID_TXID}} right? Maybe instead 
initialize {{prevTxId}} to {{firstTxId-1}}? Or make the variable be 
{{nextTxId}}?
- Can you add some comments to nextOp()? I am having trouble following the 
control flow -- why don't you just do the "skipping" in the error handling path 
instead of using this "needSkip" flag?

- In the exceptions you throw, you should include the name of the underlying 
stream
- the debug message "now trying to read from" should probably be combined with 
the "got error" message above.
- Is it really worth adding a new parameter to MiniDFSCluster for this, instead 
of just manually setting the name/edit dir configs for the few test cases where 
we need to ensure only a single dir?

- Refactor out the code to corrupt a log file, since it now appears multiple 
times, and will make the test easier to digest
- No need to catch the exception and fail() in the test case - if you just let 
the exception fall through, it will fail on its own

{code}
+        public boolean accept(File dir, String name) {
+          if 
(name.startsWith(NNStorage.getFinalizedEditsFileName(startErrorTxId, 
+                                  endErrorTxId))) {
+            return true;
+          }
+          return false;
+        }
{code}

why does this have to be so complicated? How would we ever have more than one 
file starting with edits_N-M in the directory? i.e. why not just do: {{{File 
editLog = new File(new File(f1, "current"), NNStorage.getFinalized...)))}}

                
> During the normal loading NN startup process, fall back on a different 
> EditLog if we see one that is corrupt
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3049
>                 URL: https://issues.apache.org/jira/browse/HDFS-3049
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>         Attachments: HDFS-3049.001.patch, HDFS-3049.002.patch, 
> HDFS-3049.003.patch
>
>
> During the NameNode startup process, we load an image, and then apply edit 
> logs to it until we believe that we have all the latest changes.  
> Unfortunately, if there is an I/O error while reading any of these files, in 
> most cases, we simply abort the startup process.  We should try harder to 
> locate a readable edit log and/or image file.
> *There are three main use cases for this feature:*
> 1. If the operating system does not honor fsync (usually due to a 
> misconfiguration), a file may end up in an inconsistent state.
> 2. In certain older releases where we did not use fallocate() or similar to 
> pre-reserve blocks, a disk full condition may cause a truncated log in one 
> edit directory.
> 3. There may be a bug in HDFS which results in some of the data directories 
> receiving corrupt data, but not all.  This is the least likely use case.
> *Proposed changes to normal NN startup*
> * We should try a different FSImage if we can't load the first one we try.
> * We should examine other FSEditLogs if we can't load the first one(s) we try.
> * We should fail if we can't find EditLogs that would bring us up to what we 
> believe is the latest transaction ID.
> Proposed changes to recovery mode NN startup:
> we should list out all the available storage directories and allow the 
> operator to select which one he wants to use.
> Something like this:
> {code}
> Multiple storage directories found.
> 1. /foo/bar
>     edits__curent__XYZ          size:213421345       md5:2345345
>     image                                  size:213421345       md5:2345345
> 2. /foo/baz
>     edits__curent__XYZ          size:213421345       md5:2345345345
>     image                                  size:213421345       md5:2345345
> Which one would you like to use? (1/2)
> {code}
> As usual in recovery mode, we want to be flexible about error handling.  In 
> this case, this means that we should NOT fail if we can't find EditLogs that 
> would bring us up to what we believe is the latest transaction ID.
> *Not addressed by this feature*
> This feature will not address the case where an attempt to access the 
> NameNode name directory or directories hangs because of an I/O error.  This 
> may happen, for example, when trying to load an image from a hard-mounted NFS 
> directory, when the NFS server has gone away.  Just as now, the operator will 
> have to notice this problem and take steps to correct it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HDFS-3049) During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt

Reply via email to