[
https://issues.apache.org/jira/browse/HDFS-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaowei Zhu updated HDFS-10543:
-------------------------------
Attachment: HDFS-10543.HDFS-8707.000.patch
The patch fixes the issue where hdfsRead returns only the size of the last
block read rather than the total number of bytes read. It also fixes the bug
where a read whose offset is at the last byte of a block fails with a
block-not-found error.
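A minimal sketch of that edge case, assuming the standard libhdfs C API
(readLastByteOfFirstBlock is a hypothetical helper for illustration, not
part of the patch):

  #include <fcntl.h>  // O_RDONLY
  #include <hdfs.h>   // libhdfs C API; include path may vary by install

  // Seek to the last byte of the first block and read it. Before the fix,
  // this read reportedly failed with a block-not-found error.
  int readLastByteOfFirstBlock(hdfsFS fs, const char *path, tOffset blockSize) {
      hdfsFile file = hdfsOpenFile(fs, path, O_RDONLY, 0, 0, 0);
      if (!file) return -1;
      char b;
      int rc = -1;
      if (hdfsSeek(fs, file, blockSize - 1) == 0 &&  // offset at the block's last byte
          hdfsRead(fs, file, &b, 1) == 1)            // expect the byte, not an error
          rc = 0;
      hdfsCloseFile(fs, file);
      return rc;
  }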
> hdfsRead read stops at block boundary
> -------------------------------------
>
> Key: HDFS-10543
> URL: https://issues.apache.org/jira/browse/HDFS-10543
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Reporter: Xiaowei Zhu
> Attachments: HDFS-10543.HDFS-8707.000.patch
>
>
> Reproducer:
> char *buf2 = new char[file_info->mSize];
> memset(buf2, 0, (size_t)file_info->mSize);
> int ret = hdfsRead(fs, file, buf2, file_info->mSize);
> delete [] buf2;
> if (ret != file_info->mSize) {
>   std::stringstream ss;
>   ss << "tried to read " << file_info->mSize << " bytes. but read " << ret << " bytes";
>   ReportError(ss.str());
>   hdfsCloseFile(fs, file);
>   continue;
> }
> When run against a file ~1.4 GB in size, it returns an error like "tried to
> read 1468888890 bytes. but read 134217728 bytes". The HDFS cluster it runs
> against has a block size of 134217728 bytes, so hdfsRead appears to stop at
> a block boundary. This looks like a regression. We should add a retry to
> continue reading across block boundaries for files with multiple blocks.
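> A minimal sketch of such a retry loop, assuming the standard libhdfs C API
> (readFully is a hypothetical helper; it assumes the request fits in tSize):
>
>   // Loop hdfsRead until the requested count is read, since a single call
>   // may return after only one block.
>   static tSize readFully(hdfsFS fs, hdfsFile file, char *buf, tSize total) {
>     tSize done = 0;
>     while (done < total) {
>       tSize n = hdfsRead(fs, file, buf + done, total - done);
>       if (n == -1) return -1;  // read error
>       if (n == 0) break;       // EOF before `total` bytes were read
>       done += n;
>     }
>     return done;               // bytes actually read
>   }
>
> In the reproducer above, replacing the single hdfsRead call with such a
> loop should then read the full 1468888890 bytes.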
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]