[ 
https://issues.apache.org/jira/browse/HDFS-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044133#comment-13044133
 ] 

Todd Lipcon commented on HDFS-2003:
-----------------------------------

BTW, I thought of another reason why an EOFException that comes in the middle 
of a transaction shouldn't be treated the same as a clean end of file:

Many of the transaction serialization formats use length-prefixed strings. When 
the file is corrupt, I often find that the reader expects a length prefix but 
instead gets some other random bytes (e.g. part of a filename). This causes it 
to issue a read() for a very large number of bytes, which, depending on how 
much heap is available, usually results in an OutOfMemoryError or an early 
EOFException. In the case of the EOFException, we don't want to treat it as a 
successful log read, which is what the code does now.
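
A minimal sketch of the distinction, assuming a toy format in which each 
transaction is one opcode byte followed by one length-prefixed string. This is 
not the real HDFS edit log format or the FSEditLogLoader code; the class name, 
the methods and the MAX_STRING_LENGTH bound are all hypothetical. It treats end 
of stream as success only at a transaction boundary, and rejects an implausible 
length before allocating a buffer for it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class EditLogEofSketch {
    // Hypothetical sanity bound; a real format would pick its own limit.
    static final int MAX_STRING_LENGTH = 8192;

    // Reads a 4-byte length followed by that many bytes of UTF-8 text.
    static String readString(DataInputStream in) throws IOException {
        int len = in.readInt();
        // Corrupt bytes read as a length are often huge or negative.
        // Check before allocating, so corruption cannot exhaust the heap.
        if (len < 0 || len > MAX_STRING_LENGTH) {
            throw new IOException("Implausible string length " + len);
        }
        byte[] buf = new byte[len];
        in.readFully(buf); // throws EOFException if the stream ends early
        return new String(buf, StandardCharsets.UTF_8);
    }

    // Counts complete transactions in the toy format.
    static int countTransactions(byte[] log) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(log));
        int count = 0;
        while (true) {
            int opcode = in.read();
            if (opcode == -1) {
                return count; // clean EOF: we are at a transaction boundary
            }
            try {
                readString(in);
            } catch (EOFException e) {
                // EOF inside a transaction is an error, not a successful read.
                throw new IOException("Truncated or corrupt transaction after "
                        + count + " complete transactions", e);
            }
            count++;
        }
    }

    // Test helper: writes one toy transaction (opcode 0) per path.
    static byte[] encode(String... paths) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (String path : paths) {
            byte[] raw = path.getBytes(StandardCharsets.UTF_8);
            out.writeByte(0);
            out.writeInt(raw.length);
            out.write(raw);
        }
        return bytes.toByteArray();
    }
}
```

With this sketch, countTransactions(encode("/a", "/b/c")) returns 2, while the 
same bytes with the last byte cut off raise an IOException instead of being 
reported as a successful read of one transaction.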

> Separate FSEditLog reading logic from editLog memory state building logic
> -------------------------------------------------------------------------
>
>                 Key: HDFS-2003
>                 URL: https://issues.apache.org/jira/browse/HDFS-2003
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: Edit log branch (HDFS-1073)
>
>         Attachments: HDFS-2003.diff, HDFS-2003.diff, HDFS-2003.diff
>
>
> Currently FSEditLogLoader has code for reading from an InputStream 
> interleaved with code which updates the FSNamesystem and FSDirectory. This 
> makes it difficult to read an edit log without having a whole load of other 
> objects initialised, which is problematic if you want to do things like count 
> how many transactions are in a file. 
> This patch separates the reading of the stream and the building of the memory 
> state. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira