[ 
https://issues.apache.org/jira/browse/HADOOP-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646593#action_12646593
 ] 

Marc-Olivier Fleury commented on HADOOP-4616:
---------------------------------------------

I tested the patch and it seems to work fine, at least on missing data. (I 
could not find a corrupt file; those are a bit harder to come by.)

I tested it by killing the datanodes while keeping the namenode alive. When 
trying to copy a file to the local file system, I get the following error:

cp : reading <file name> : : Unknown error 255

That is what we were supposed to get, I guess.

A possible improvement would be to return a more meaningful error code, but I 
don't know what options you have regarding error codes.
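
To illustrate what I mean, here is a minimal sketch, assuming the FUSE read 
callback can report failures as a negative errno value (the usual FUSE 
convention). The names sketch_fh and check_read_index are made up for this 
example and are not part of fuse_dfs.c; only bufferReadIndex and bufferSize 
come from the failing assertion. The idea is to turn the assertion into a 
range check that returns -EIO, so that cp would presumably report an 
input/output error rather than an unknown one.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the fuse-dfs file handle; only the field
 * named in the failing assertion is modelled here. */
typedef struct {
    size_t bufferSize;
} sketch_fh;

/* Returns 0 when bufferReadIndex lies inside the buffer, or -EIO when
 * it does not. A FUSE read callback could return this value directly
 * instead of aborting the whole process through assert(). */
static int check_read_index(const sketch_fh *fh, off_t bufferReadIndex)
{
    if (fh == NULL)
        return -EIO;
    if (bufferReadIndex < 0 || (size_t)bufferReadIndex >= fh->bufferSize)
        return -EIO;
    return 0;
}
```

Whether -EIO is the right code for missing blocks is a judgment call; the 
point is only that the mount stays usable after a failed read.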

Thanks for the quick fix!

Marc-O

> assertion makes fuse-dfs exit when reading incomplete data
> ----------------------------------------------------------
>
>                 Key: HADOOP-4616
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4616
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/fuse-dfs
>    Affects Versions: 0.18.2
>            Reporter: Marc-Olivier Fleury
>            Assignee: Pete Wyckoff
>            Priority: Minor
>             Fix For: 0.18.3, 0.19.1, 0.20.0
>
>         Attachments: HADOOP-4616.txt, HADOOP-4616.txt, HADOOP-4616.txt
>
>
> When trying to read a file that is corrupt on HDFS (registered by the 
> namenode, but part of the data is missing on the datanodes), some of the 
> assertions in dfs_read fail, causing the program to abort. This makes it  
> impossible to access the mounted partition until it is mounted again.
> A simple way to reproduce this bug is to remove enough datanodes to have part 
> of the data missing, and to read each file listed in HDFS.
> This is the assertion that fails (fuse_dfs.c:903):
>   assert(bufferReadIndex >= 0 && bufferReadIndex < fh->bufferSize);
> The expected behaviour would be to return either no file or a corrupt file, 
> but continue working afterward.
> Removing the assertion seems to work for now, but special handling is 
> probably needed to deal with this particular problem correctly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
