[ https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231837#comment-13231837 ]

Colin Patrick McCabe commented on HDFS-3004:
--------------------------------------------

bq. Why is the following necessary (instead of mark(1))? Doesn't decode itself 
still only read 1 byte?  in.mark(in.available());

decodeOp reads however many bytes are necessary to decode the operation, and 
there is no maximum operation length -- so mark(1) would not give reset() a 
large enough readlimit.
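To make that concrete, here is a minimal sketch -- not the HDFS code; the toy length-prefixed op format is invented -- of why the readlimit passed to mark() has to cover everything a decode might consume rather than a single byte:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Toy stand-in for decodeOp: reads a length byte, then that many payload
// bytes, so the total number of bytes consumed is not known in advance.
public class MarkSketch {
    static int decodeOp(InputStream in) throws IOException {
        int len = in.read();
        for (int i = 0; i < len; i++) {
            in.read();                      // consume variable-length payload
        }
        return len;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new BufferedInputStream(
            new ByteArrayInputStream(new byte[]{3, 10, 11, 12, 0}));
        // mark(1) only promises reset() works after at most 1 byte is read;
        // decodeOp may read arbitrarily many, so the readlimit must cover
        // all the bytes the stream can currently supply.
        in.mark(in.available());
        int len = decodeOp(in);             // consumes 1 + 3 = 4 bytes here
        in.reset();                         // valid: readlimit was big enough
        System.out.println(len + " " + in.available());
    }
}
```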

bq. Per above now that we never throw EditLogInputException we can nuke the 
class and its uses

Yeah.

bq. Can we avoid caching opcodes?

Not really.  The issue is that in order to seek to at least transaction ID X, 
you have to read everything up to and possibly including X.

If you read transaction ID X, but have no way of putting it back in the input 
stream, you're screwed.  The earlier code finessed this issue by assuming that 
there were never any gaps between transactions-- in other words, that 
transaction ID N+1 always followed N.  So if you want N, just seek to N-1 and 
then you're set.  For obvious reasons, recovery can't assume this.
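A rough sketch of the cache idea (illustrative names only -- Op, readOp, and skipUntil are not the patch's actual APIs): the op decoded while scanning for the target txid gets parked in a one-slot cache instead of being lost:

```java
import java.util.Arrays;
import java.util.Iterator;

// Sketch: to position the stream "at least at" txid X you must decode ops
// up to and possibly including X; an op decoded too far is cached so the
// next readOp() can still return it.
public class CachingOpReader {
    static class Op { final long txid; Op(long t) { txid = t; } }

    private final Iterator<Op> source;
    private Op cached;                 // op read past the target, re-served later

    CachingOpReader(Iterator<Op> source) { this.source = source; }

    // Return the next op, consulting the one-slot cache first.
    Op readOp() {
        if (cached != null) { Op op = cached; cached = null; return op; }
        return source.hasNext() ? source.next() : null;
    }

    // Skip until the next op has txid >= target.  Because txids may have
    // gaps, we may decode the op at or after the target itself; cache it
    // so the caller still sees it.
    void skipUntil(long target) {
        Op op;
        while ((op = readOp()) != null) {
            if (op.txid >= target) { cached = op; return; }
        }
    }

    public static void main(String[] args) {
        CachingOpReader r = new CachingOpReader(Arrays.asList(
            new Op(1), new Op(2), new Op(5), new Op(6)).iterator());
        r.skipUntil(3);                        // gap: no txid 3 or 4 exists
        System.out.println(r.readOp().txid);   // the cached op
    }
}
```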

I guess there is one alternative-- we could force all of the input streams to 
implement seek(), but that seems like massive overkill.

For what it's worth, Todd and I discussed the rewind() issue today a little bit 
and he seems to think the cache should work (although I don't think he's 
reviewed this patch yet).

[naming change suggestions]

I like these suggestions-- will implement.
                
> Implement Recovery Mode
> -----------------------
>
>                 Key: HDFS-3004
>                 URL: https://issues.apache.org/jira/browse/HDFS-3004
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: tools
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-3004.010.patch, HDFS-3004.011.patch, 
> HDFS-3004.012.patch, HDFS-3004.013.patch, HDFS-3004.015.patch, 
> HDFS-3004.016.patch, HDFS-3004.017.patch, HDFS-3004.018.patch, 
> HDFS-3004__namenode_recovery_tool.txt
>
>
> When the NameNode metadata is corrupt for some reason, we want to be able to 
> fix it.  Obviously, we would prefer never to get into this state.  In a 
> perfect world, we never would.  However, bad data on disk can happen from 
> time to time, because of hardware errors or misconfigurations.  In the past 
> we have had to correct it manually, which is time-consuming and which can 
> result in downtime.
>
> Recovery Mode is initiated by the system administrator.  When the NameNode 
> starts up in Recovery Mode, it will try to load the FSImage file, apply all 
> the edits from the edit log, and then write out a new image.  Then it will 
> shut down.
>
> Unlike the normal startup process, the Recovery Mode startup process will 
> be interactive.  When the NameNode finds something that is inconsistent, it 
> will prompt the operator as to what it should do.  The operator can also 
> choose to take the first option for all prompts by starting up with the 
> '-f' flag, or by typing 'a' at one of the prompts.
>
> I have reused as much code as possible from the NameNode in this tool.  
> Hopefully, the effort that was spent developing this will also make the 
> NameNode editLog and image processing even more robust than it already is.
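The prompting behavior described in the issue might look roughly like this sketch (purely illustrative; none of these names come from the patch, and the real tool's prompts will differ):

```java
import java.util.Scanner;

// Sketch of the described behavior: prompt the operator on each
// inconsistency, let 'a' take the first option for all remaining prompts,
// and let a force flag (like '-f') skip prompting entirely.
public class RecoveryPrompt {
    private boolean alwaysFirst;            // set by '-f' or by answering 'a'
    private final Scanner stdin;

    RecoveryPrompt(boolean force, Scanner stdin) {
        this.alwaysFirst = force;
        this.stdin = stdin;
    }

    // Returns the chosen option index; option 0 is the first/default choice.
    int ask(String problem, String[] options) {
        if (alwaysFirst) return 0;          // no prompt once 'a' or '-f' is set
        System.out.println(problem);
        for (int i = 0; i < options.length; i++) {
            System.out.println("  " + (i + 1) + ") " + options[i]);
        }
        System.out.println("choice (or 'a' for first option always):");
        String answer = stdin.next();
        if (answer.equalsIgnoreCase("a")) { alwaysFirst = true; return 0; }
        return Integer.parseInt(answer) - 1;
    }

    public static void main(String[] args) {
        // Simulate an operator who answers 'a' at the first prompt.
        RecoveryPrompt p = new RecoveryPrompt(false, new Scanner("a\n"));
        int first = p.ask("Found a corrupt edit op",
                          new String[]{"skip it", "stop"});
        int second = p.ask("Found another corrupt edit op",
                           new String[]{"skip it", "stop"});
        System.out.println(first + " " + second);   // second prompt suppressed
    }
}
```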

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
