[
https://issues.apache.org/jira/browse/HDFS-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794970#comment-13794970
]
Jing Zhao commented on HDFS-5343:
---------------------------------
Thanks [~sathish.gurram] for the fix!
I agree that this looks more like an issue in DFSInputStream#readWithStrategy.
Previously we only tested DFSInputStream#read(long position, byte[] buffer, int
offset, int length), which works fine. However, read(byte[], int, int) and
read(ByteBuffer) do not take the file length into account.
For the current patch, instead of changing the value of blockEnd, can we modify
the readWithStrategy method? That is, in
{code}
// currentNode can be left as null if previous read had a checksum
// error on the same block. See HDFS-3067
if (pos > blockEnd || currentNode == null) {
  currentNode = blockSeekTo(pos);
}
int realLen = (int) Math.min(len, (blockEnd - pos + 1L));
int result = readBuffer(strategy, off, realLen, corruptedBlockMap);
{code}
we can add an extra check:
{code}
int realLen = (int) Math.min(len, (blockEnd - pos + 1L));
if (locatedBlocks.isLastBlockComplete()) {
  realLen = (int) Math.min(realLen, locatedBlocks.getFileLength());
}
{code}
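To make the effect of the extra check concrete, here is a minimal, self-contained sketch of the length computation. It is a model only, not the actual DFSInputStream code: the class ReadLengthSketch and its methods are hypothetical, and plain parameters stand in for blockEnd, locatedBlocks.getFileLength() and locatedBlocks.isLastBlockComplete(). One assumption goes beyond the snippet above: the sketch clamps to the bytes remaining before the file length (fileLength - pos) rather than to the file length itself, since for pos > 0 clamping to the file length alone could still allow a read past the end of the file.

```java
// Hypothetical model of the read-length clamp; not the actual DFSInputStream code.
final class ReadLengthSketch {

    /** Length bounded only by the end of the current block, as in the existing code. */
    static int blockBoundedLength(long pos, int len, long blockEnd) {
        return (int) Math.min(len, blockEnd - pos + 1L);
    }

    /**
     * Length bounded by the block end and, when the last block is complete,
     * by the bytes remaining before fileLength.
     * Assumption: the bound is fileLength - pos, not fileLength itself.
     */
    static int fileBoundedLength(long pos, int len, long blockEnd,
                                 long fileLength, boolean lastBlockComplete) {
        int realLen = blockBoundedLength(pos, len, blockEnd);
        if (lastBlockComplete) {
            long remaining = Math.max(0L, fileLength - pos);
            realLen = (int) Math.min(realLen, remaining);
        }
        return realLen;
    }

    public static void main(String[] args) {
        // The snapshot recorded a 10-byte file; a later append grew the
        // underlying block to 20 bytes, so blockEnd is 19.
        long snapshotLength = 10L;
        long blockEnd = 19L;

        // Bounded by the block only: the read covers the appended bytes too.
        System.out.println(blockBoundedLength(0L, 4096, blockEnd)); // 20

        // Bounded by the file length as well: the read stops at the snapshot length.
        System.out.println(fileBoundedLength(0L, 4096, blockEnd, snapshotLength, true)); // 10
        System.out.println(fileBoundedLength(5L, 4096, blockEnd, snapshotLength, true)); // 5
    }
}
```

With these example values, a read starting at pos = 5 that was clamped to the file length alone would get min(15, 10) = 10 bytes and run to offset 15, which is why the sketch uses the remaining byte count instead.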
> When cat command is issued on snapshot files getting unexpected result
> ----------------------------------------------------------------------
>
> Key: HDFS-5343
> URL: https://issues.apache.org/jira/browse/HDFS-5343
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Reporter: sathish
> Assignee: sathish
> Attachments: HDFS-5343-001.patch
>
>
> If we create a file with some content, take a snapshot of it, and then append
> more data to the file through the append method, running the cat command on
> the snapshot of that file should display only the data written by the create
> operation. Instead, it displays all of the data, i.e. the created plus the
> appended data.
> However, if we perform the same steps and read the contents of the snapshot
> file through an input stream, only the data that existed when the snapshot
> was taken is displayed.
> The behaviour of the cat command and of reading through an input stream
> therefore differs.