[ 
https://issues.apache.org/jira/browse/HDFS-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3510:
---------------------------------------

    Description: 
In the FSEditLog, we want to avoid running out of space in the middle of 
writing an edit log operation to the disk.  We do this by a process called 
"preallocation"-- reserving space on the disk for the upcoming edit log entries 
before beginning to write them.

branch-1 has some major problems with the way it does preallocation.  These 
problems can lead to corrupt edit logs when the disk runs out of space during 
an edit log sync operation.

The problems are:

* We use FileChannel#write without checking for short writes, but 
WritableByteChannel explicitly documents that they are possible, and the 
FileChannel subclass is silent on the issue.
* We only try to do preallocation when the current position is less than 4096 
bytes from the end of the file.  However, bufReady starts out at 512 KB and 
only grows from there, so a 4 KB margin cannot be enough space to reserve.
* The current code seems to be based on a misunderstanding of how space is 
allocated to files on Linux.  In FileChannel#write(ByteBuffer, long), the 
second argument is the offset to start writing at.  Since we set this to 
fc.position() + 1024*1024, this means that we *start* writing a megabyte after 
the end of the file.  This is guaranteed to create a sparse file on Linux, 
defeating the point of pre-allocation.
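
A minimal sketch of how the three problems above might be avoided; this is 
not the contents of the attached patch.  It loops until FileChannel#write has 
consumed the whole buffer, sizes the reservation from the number of bytes 
about to be flushed rather than a fixed 4096-byte margin, and starts filling 
at the current end of the file so that no hole is left behind.  The names 
PreallocSketch, writeFully and preallocate, the 1 MB chunk size, and the zero 
fill byte are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class PreallocSketch {
    // Hypothetical chunk size for fill writes; not taken from branch-1.
    static final int CHUNK = 1024 * 1024;

    /** Writes every remaining byte of buf at the given offset, retrying after short writes. */
    static void writeFully(FileChannel fc, ByteBuffer buf, long offset) throws IOException {
        while (buf.hasRemaining()) {
            // A positional write returns the number of bytes actually written,
            // which may be fewer than buf.remaining().
            offset += fc.write(buf, offset);
        }
    }

    /**
     * Extends the file until at least `needed` bytes lie between the channel's
     * current position and the end of the file. Filling begins at the current
     * file size, so the new region is contiguous with the existing data.
     */
    static void preallocate(FileChannel fc, long needed) throws IOException {
        long size = fc.size();
        long target = fc.position() + needed;
        ByteBuffer fill = ByteBuffer.allocate(CHUNK); // zero-filled (assumed fill byte)
        while (size < target) {
            fill.clear();
            fill.limit((int) Math.min(CHUNK, target - size));
            writeFully(fc, fill, size);
            size += fill.limit();
        }
    }

    public static void main(String[] args) throws Exception {
        Path p = Files.createTempFile("edits", ".log");
        try (FileChannel fc = FileChannel.open(p, StandardOpenOption.WRITE)) {
            // Reserve room for a 512 KB buffer before flushing it.
            preallocate(fc, 512 * 1024);
            System.out.println("file size after preallocation: " + fc.size());
        } finally {
            Files.deleteIfExists(p);
        }
    }
}
```

A caller would invoke preallocate with the size of the buffer it is about to 
flush, then write that buffer with writeFully.  Whether writing fill bytes 
actually reserves disk blocks depends on the filesystem, so this should be 
read as an illustration of the offsets and the loop, not as a guarantee.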

  was:
In the FSEditLog, we want to avoid running out of space in the middle of 
writing an edit log operation to the disk.  We do this by a process called 
"preallocation"-- reserving space on the disk for the upcoming edit log entries 
before beginning to write them.

branch-1 has some major problems with the way it does preallocation.  These 
problems can lead to corrupt edit logs when the disk runs out of space during 
an edit log sync operation.

        Summary: FSEditLog pre-allocation does not work in branch-1  (was: Some 
FSEditLog pre-allocation problems in branch-1)
    
> FSEditLog pre-allocation does not work in branch-1
> --------------------------------------------------
>
>                 Key: HDFS-3510
>                 URL: https://issues.apache.org/jira/browse/HDFS-3510
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>             Fix For: 1.0.0
>
>         Attachments: HDFS-3510-b1.001.patch
>
>
> In the FSEditLog, we want to avoid running out of space in the middle of 
> writing an edit log operation to the disk.  We do this by a process called 
> "preallocation"-- reserving space on the disk for the upcoming edit log 
> entries before beginning to write them.
> branch-1 has some major problems with the way it does preallocation.  These 
> problems can lead to corrupt edit logs when the disk runs out of space during 
> an edit log sync operation.
> The problems are:
> * We use FileChannel#write without checking for short writes, but 
> WritableByteChannel explicitly documents that they are possible, and the 
> FileChannel subclass is silent on the issue.
> * We only try to do preallocation when the current position is less than 4096 
> bytes from the end of the file.  However, bufReady starts out at 512 KB and 
> only grows from there, so a 4 KB margin cannot be enough space to reserve.
> * The current code seems to be based on a misunderstanding of how space is 
> allocated to files on Linux.  In FileChannel#write(ByteBuffer, long), the 
> second argument is the offset to start writing at.  Since we set this to 
> fc.position() + 1024*1024, this means that we *start* writing a megabyte 
> after the end of the file.  This is guaranteed to create a sparse file on 
> Linux, defeating the point of pre-allocation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
