[
https://issues.apache.org/jira/browse/HADOOP-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12645935#action_12645935
]
Pete Wyckoff commented on HADOOP-4616:
--------------------------------------
Is this caused by the check after the read loop in the following code, where
"if (num_read < 0)" should be "if (num_read <= 0)"?
With that change, the condition is caught, and the read should return an I/O
error and carry on.
I think HDFS should return -1 in this case, but regardless, the fuse code does
not handle this condition and should catch it.
{code}
while (dfs->rdbuffer_size - total_read > 0 &&
       (num_read = hdfsPread(fh->fs, fh->hdfsFH, offset + total_read,
                             fh->buf + total_read,
                             dfs->rdbuffer_size - total_read)) > 0) {
  total_read += num_read;
}
if (num_read < 0) {
  // invalidate the buffer
  fh->bufferSize = 0;
  syslog(LOG_ERR,
         "Read error - pread failed for %s with return code %d %s:%d",
         path, (int)num_read, __FILE__, __LINE__);
  ret = -EIO;
} else {
  fh->bufferSize = total_read;
  fh->buffersStartOffset = offset;
  if (dfs->rdbuffer_size - total_read > 0) {
    isEOF = 1;
  }
}
{code}
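To make the difference between the two checks concrete, here is a rough standalone sketch of the same loop. It does not use libhdfs: fake_file, fake_pread and fill_buffer are hypothetical names invented for this illustration, and fake_pread only imitates a positional read from an in-memory buffer. The strict flag selects the proposed "<= 0" check instead of the current "< 0" one.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for an HDFS file: NOT part of libhdfs. */
typedef struct {
    const char *data;  /* simulated file contents */
    size_t      len;   /* number of bytes that can actually be read */
    int         fail;  /* non-zero: every read reports an error (-1) */
} fake_file;

/* Imitates a positional read: returns bytes copied, 0 when no data is
 * left at the offset, or -1 when the simulated read fails. */
static ssize_t fake_pread(const fake_file *f, off_t offset,
                          void *buf, size_t want)
{
    if (f->fail)
        return -1;
    if ((size_t)offset >= f->len)
        return 0;
    size_t avail = f->len - (size_t)offset;
    size_t n = want < avail ? want : avail;
    memcpy(buf, f->data + offset, n);
    return (ssize_t)n;
}

/* Same loop shape as the snippet above. Returns the number of bytes
 * buffered, or -EIO. strict == 0 uses the current "num_read < 0" check,
 * strict != 0 uses the proposed "num_read <= 0" check. */
static ssize_t fill_buffer(const fake_file *f, off_t offset,
                           char *buf, size_t cap, int strict)
{
    size_t  total_read = 0;
    ssize_t num_read = 0;

    while (cap - total_read > 0 &&
           (num_read = fake_pread(f, offset + (off_t)total_read,
                                  buf + total_read,
                                  cap - total_read)) > 0) {
        total_read += (size_t)num_read;
    }

    if (strict ? (num_read <= 0) : (num_read < 0))
        return -EIO;
    return (ssize_t)total_read;
}
```

With a fake_file that yields no data at all, the non-strict call returns 0, which corresponds to ending up with fh->bufferSize == 0, a state in which "bufferReadIndex < fh->bufferSize" cannot hold. The strict call returns -EIO instead. Note that in this sketch the strict check also reports -EIO for a short read that ends on a zero return, so whether an ordinary end of file has to be told apart from missing data is left open here.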
> assertion makes fuse-dfs exit when reading incomplete data
> ----------------------------------------------------------
>
> Key: HADOOP-4616
> URL: https://issues.apache.org/jira/browse/HADOOP-4616
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/fuse-dfs
> Affects Versions: 0.20.0
> Reporter: Marc-Olivier Fleury
> Priority: Minor
>
> When trying to read a file that is corrupt on HDFS (registered by the
> namenode, but part of the data is missing on the datanodes), some of the
> assertions in dfs_read fail, causing the program to abort. This makes it
> impossible to access the mounted partition until it is mounted again.
> A simple way to reproduce this bug is to remove enough datanodes to have part
> of the data missing, and to read each file listed in HDFS.
> The assertion that fails is at fuse_dfs.c:903:
> assert(bufferReadIndex >= 0 && bufferReadIndex < fh->bufferSize);
> The expected behaviour would be to return either no file or a corrupt file,
> but continue working afterward.
> Removing the assertion seems to work for now, but special handling is
> probably needed to deal with this particular problem correctly.
