Xiaowei Zhu created HDFS-10543:
----------------------------------
Summary: hdfsRead read stops at block boundary
Key: HDFS-10543
URL: https://issues.apache.org/jira/browse/HDFS-10543
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: Xiaowei Zhu
Reproducer:

    char *buf2 = new char[file_info->mSize];
    memset(buf2, 0, (size_t)file_info->mSize);
    int ret = hdfsRead(fs, file, buf2, file_info->mSize);
    delete [] buf2;
    if (ret != file_info->mSize) {
        std::stringstream ss;
        ss << "tried to read " << file_info->mSize << " bytes. but read "
           << ret << " bytes";
        ReportError(ss.str());
        hdfsCloseFile(fs, file);
        continue;
    }
When run against a file of ~1.4 GB, it fails with an error like "tried to
read 1468888890 bytes. but read 134217728 bytes". The HDFS cluster it runs
against has a block size of 134217728 bytes, so hdfsRead appears to stop at
the first block boundary. This looks like a regression. We should retry the
read so that it continues across block boundaries for files with multiple
blocks, as sketched below.
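
As an illustration only, here is a minimal sketch of such a retry loop,
assuming the standard libhdfs C API (hdfsRead, tSize, tOffset); readFully is
a hypothetical helper for this report, not an existing API call:

    #include <algorithm>    // std::min
    #include "hdfs/hdfs.h"  // libhdfs C API; include path may differ per build

    // Hypothetical helper: keep calling hdfsRead() until `total` bytes have
    // been read, EOF is reached, or an error occurs.
    static tOffset readFully(hdfsFS fs, hdfsFile file, char *buf, tOffset total) {
        tOffset nread = 0;
        while (nread < total) {
            // hdfsRead() takes a 32-bit tSize length, so clamp each request.
            tSize want = (tSize)std::min<tOffset>(total - nread, 1 << 30);
            tSize ret = hdfsRead(fs, file, buf + nread, want);
            if (ret < 0) return -1;  // I/O error
            if (ret == 0) break;     // EOF before `total` bytes
            nread += ret;            // short read (e.g. at a block boundary): retry
        }
        return nread;
    }

With a loop like this, the reproducer above would call readFully(fs, file,
buf2, file_info->mSize) instead of a single hdfsRead(), though the short-read
behavior at block boundaries should still be fixed in the library itself.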