Ah-ha, I found it!

Anyone see the problem? If you want to cheat, you can look at the ticket...

    size_t num_read;
    size_t total_read = 0;
    while (size - total_read > 0 &&
           (num_read = hdfsPread(fh->fs, fh->hdfsFH, offset + total_read,
                                 buf + total_read, size - total_read)) > 0) {
      total_read += num_read;
    }
    if (num_read < 0) {
      // invalidate the buffer
      syslog(LOG_ERR, "Read error - pread failed for %s with return code %d %s:%d",
             path, (int)num_read, __FILE__, __LINE__);
      return -EIO;
    }
    return total_read;
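
For anyone who would rather not open the ticket, here is my reading of the snippet: num_read is declared size_t, which is unsigned, so a -1 from hdfsPread wraps to a huge positive value. The "> 0" test in the loop then passes, the "< 0" test after it can never be true, and total_read gets pushed far past the end of buf. Below is a minimal sketch of a signed version of the loop. It does not call libhdfs: pread_fn, mem_src, short_pread and failing_pread are hypothetical stand-ins I made up so the loop can be exercised on its own, and the int32_t return type assumes hdfsPread's tSize is a signed 32-bit integer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical stand-in for hdfsPread: returns bytes read, 0 at EOF, -1 on error. */
typedef int32_t (*pread_fn)(void *ctx, int64_t position, void *buffer, int32_t length);

/* Read up to `size` bytes starting at `offset`.
 * Returns the number of bytes read, or -EIO if any call reports an error. */
static ssize_t read_fully(pread_fn pread_cb, void *ctx, char *buf,
                          size_t size, int64_t offset)
{
    assert(pread_cb != NULL && buf != NULL);

    size_t total_read = 0;
    int32_t num_read = 0;  /* signed, so an error return of -1 stays negative */

    while (total_read < size) {
        size_t want = size - total_read;
        if (want > INT32_MAX)
            want = INT32_MAX;  /* do not overflow the 32-bit length argument */

        num_read = pread_cb(ctx, offset + (int64_t)total_read,
                            buf + total_read, (int32_t)want);
        if (num_read <= 0)
            break;  /* 0 means EOF, negative means error */
        total_read += (size_t)num_read;
    }
    if (num_read < 0)
        return -EIO;
    return (ssize_t)total_read;
}

/* Hypothetical in-memory source that hands back at most 3 bytes per call,
 * to exercise the short-read path. */
struct mem_src { const char *data; size_t len; };

static int32_t short_pread(void *ctx, int64_t position, void *buffer, int32_t length)
{
    const struct mem_src *src = ctx;
    if (position < 0 || (size_t)position >= src->len)
        return 0;  /* EOF */
    size_t n = src->len - (size_t)position;
    if (n > 3)
        n = 3;
    if (n > (size_t)length)
        n = (size_t)length;
    memcpy(buffer, src->data + position, n);
    return (int32_t)n;
}

/* Hypothetical source that always fails, like a pread that returns -1. */
static int32_t failing_pread(void *ctx, int64_t position, void *buffer, int32_t length)
{
    (void)ctx; (void)position; (void)buffer; (void)length;
    return -1;
}
```

With the signed num_read, calling read_fully with failing_pread returns -EIO instead of walking off the end of the buffer, and short_pread shows that partial reads still accumulate until EOF.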


Brian

On Dec 10, 2008, at 6:27 PM, Brian Bockelman wrote:

Hey,

In Hadoop-0.19.0, we've been getting crashing, deadlocking, and other badness from libhdfs (I think: I'm using it through fuse-dfs).

https://issues.apache.org/jira/browse/HADOOP-4775

However, I've been at a complete loss to make any progress in debugging. The problem happens consistently in our workflows, even though it's been problematic to find a simple test case (I suspect the issues are triggered by threading, while debug mode eliminates any threading!). It doesn't appear there are any nice ways to debug or follow along with actions in fuse_dfs or libhdfs: I don't even know how to make DFSClient spill its guts to a log.

Help!

Brian
