[ https://issues.apache.org/jira/browse/HADOOP-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635637#action_12635637 ]
Brian Bockelman commented on HADOOP-4298:
-----------------------------------------

Hey Owen, all,

In fuse_dfs.c, I replaced this line in the function dfs_read:

if (fh->sizeBuffer == 0 || offset < fh->startOffset || offset > (fh->startOffset + fh->sizeBuffer) )

with the following:

if (fh->sizeBuffer == 0 || offset < fh->startOffset || offset >= (fh->startOffset + fh->sizeBuffer) || (offset+size) >= (fh->startOffset + fh->sizeBuffer) )

This covers the bug I mentioned below. I can now md5sum files successfully.

However, my application still complains of data corruption on reads (although it does make it further through the file!); unlike md5sum, the application has a very random read pattern. One possibility is that it issues a huge read that overruns the buffer; another is an as-yet-undiscovered bug.

When I figure things out, I'll turn the above fix into a proper patch (although you are welcome to do it for me if you have time). However, I would prefer it if the expert or original author took a peek at this code. (A standalone sketch of the corrected check appears after the quoted issue below.)

> File corruption when reading with fuse-dfs
> ------------------------------------------
>
>                 Key: HADOOP-4298
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4298
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/fuse-dfs
>    Affects Versions: 0.18.1
>         Environment: CentOS 4.6 final; kernel 2.6.9-67.ELsmp; FUSE 2.7.4; hadoop 0.18.1; 64-bit
>                      I hand-altered the fuse-dfs makefile to use 64-bit instead of the hardcoded -m32.
>            Reporter: Brian Bockelman
>            Priority: Critical
>             Fix For: 0.18.2
>
>
> I pulled a 5GB data file into Hadoop using the following command:
>
> hadoop fs -put /scratch/886B9B3D-6A85-DD11-A9AB-000423D6CA6E.root /user/brian/testfile
>
> I have HDFS mounted at /mnt/hadoop using fuse-dfs.
>
> However, when I try to md5sum the file in place (md5sum /mnt/hadoop), or copy the file back to local disk using "cp" and then md5sum it, the checksum is incorrect.
>
> When I pull the file using normal Hadoop means (hadoop fs -get /user/brian/testfile /scratch), the md5sum is correct.
>
> When I repeat the test with a smaller file (512MB, on the theory that there is a problem with some 2GB limit somewhere), the problem remains.
>
> When I repeat the test, the md5sum is consistently wrong - i.e., some part of the corruption is deterministic, and not the apparent fault of a bad disk.
>
> CentOS 4.6 is, unfortunately, not the apparent culprit: I could recreate the corruption issue on CentOS 5.x. The second node was also a 64-bit compile, running CentOS 5.2 (`uname -r` returns 2.6.18-92.1.10.el5).
>
> Thanks for looking into this,
> Brian

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
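
For illustration, here is a minimal, standalone C sketch of the buffer-validity check described in the comment above. It is not the actual fuse_dfs.c code: the struct and the helper function below are hypothetical stand-ins, and only the field names (sizeBuffer, startOffset) and the condition itself come from the comment. The idea is that the cached read-ahead buffer can serve a read only when the whole requested range [offset, offset+size) lies inside [startOffset, startOffset+sizeBuffer).

/* Minimal sketch of the patched buffer-validity check from dfs_read.
 * Hypothetical types/names except sizeBuffer, startOffset, offset, size. */
#include <stddef.h>
#include <sys/types.h>

struct dfs_fh {              /* hypothetical stand-in for the fuse-dfs file handle */
    off_t  startOffset;      /* file offset where the cached buffer begins */
    size_t sizeBuffer;       /* number of valid bytes currently cached     */
};

/* Returns nonzero when the cached buffer cannot serve the read
 * [offset, offset+size) and must be refilled first. */
static int buffer_needs_refill(const struct dfs_fh *fh, off_t offset, size_t size)
{
    off_t buf_end = fh->startOffset + (off_t)fh->sizeBuffer; /* one past the last cached byte */

    return fh->sizeBuffer == 0                /* nothing cached yet                    */
        || offset < fh->startOffset           /* read starts before the cached region  */
        || offset >= buf_end                  /* read starts past the cached region    */
        || (off_t)(offset + size) >= buf_end; /* read would run past the cached region */
}

Strictly speaking, a read that ends exactly at the buffer boundary (offset+size == startOffset+sizeBuffer) is still fully cached, so the last test could use > instead of the patch's >=; with >= the only cost is an occasional extra, harmless refill.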