[ https://issues.apache.org/jira/browse/HDFS-10543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15362997#comment-15362997 ]
Colin Patrick McCabe commented on HDFS-10543:
---------------------------------------------
Just to be clear, the existing HDFS Java client can return "short reads" that
contain fewer bytes than were requested, even when more data remains in the
file. This is traditional in POSIX, and nearly all filesystems I'm aware of
have these semantics. The justification is that applications may not want to
wait a long time to fetch more bytes if some bytes are already available
for them to process. Applications that do want the full buffer can just call
read() again until it is filled. APIs like {{readFully}} exist to provide
that fill-the-buffer behavior.
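
For illustration only, here is a minimal C++ sketch of the "call read() again" pattern described above. The names {{ReadFn}} and {{ReadFully}} are hypothetical and are not part of libhdfs or libhdfs++; the sketch assumes only that the underlying read call follows the POSIX-style convention that hdfsRead also uses (bytes read on success, 0 at end of file, -1 on error).

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical callback type (not a libhdfs API): reads up to `len` bytes into
// `buf` and returns the number of bytes read, 0 at end of file, or -1 on error.
using ReadFn = std::function<int32_t(char* buf, int32_t len)>;

// Hypothetical helper (not a libhdfs API): keeps calling `read_some` until
// `len` bytes have been read, end of file is reached, or an error occurs.
// Returns the total number of bytes read (less than `len` only at end of
// file), or -1 if any underlying read fails.
int64_t ReadFully(const ReadFn& read_some, char* buf, int64_t len) {
  assert(buf != nullptr || len == 0);
  int64_t total = 0;
  while (total < len) {
    // A single call can request at most INT32_MAX bytes.
    int64_t remaining = len - total;
    int32_t chunk = remaining > INT32_MAX ? INT32_MAX
                                          : static_cast<int32_t>(remaining);
    int32_t n = read_some(buf + total, chunk);
    if (n < 0) return -1;  // propagate the error
    if (n == 0) break;     // end of file: return what we have
    total += n;            // short read: loop and ask for the rest
  }
  return total;
}
```

With libhdfs, the callback could wrap the real call, for example a lambda that captures fs and file and returns hdfsRead(fs, file, b, n). A caller that wants the whole buffer then checks the result of ReadFully rather than the result of a single hdfsRead.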
> hdfsRead read stops at block boundary
> -------------------------------------
>
> Key: HDFS-10543
> URL: https://issues.apache.org/jira/browse/HDFS-10543
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Reporter: Xiaowei Zhu
> Fix For: HDFS-8707
>
> Attachments: HDFS-10543.HDFS-8707.000.patch,
> HDFS-10543.HDFS-8707.001.patch, HDFS-10543.HDFS-8707.002.patch,
> HDFS-10543.HDFS-8707.003.patch, HDFS-10543.HDFS-8707.004.patch
>
>
> Reproducer:
> char *buf2 = new char[file_info->mSize];
> memset(buf2, 0, (size_t)file_info->mSize);
> int ret = hdfsRead(fs, file, buf2, file_info->mSize);
> delete [] buf2;
> if (ret != file_info->mSize) {
>     std::stringstream ss;
>     ss << "tried to read " << file_info->mSize << " bytes. but read "
>        << ret << " bytes";
>     ReportError(ss.str());
>     hdfsCloseFile(fs, file);
>     continue;
> }
> When run against a file of about 1.4 GB, it reports an error like "tried to
> read 1468888890 bytes. but read 134217728 bytes". The HDFS cluster it runs
> against has a block size of 134217728 bytes, so it seems hdfsRead stops
> at a block boundary. Looks like a regression. We should add a retry so that
> reads continue across blocks for files with multiple blocks.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)