[
https://issues.apache.org/jira/browse/HDFS-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794881#comment-13794881
]
sathish commented on HDFS-5343:
-------------------------------
The code in DFSInputStream that currently updates the position is:

// update current position
if (updatePosition) {
  pos = offset;
  blockEnd = blk.getStartOffset() + blk.getBlockSize() - 1;
  currentLocatedBlock = blk;
}
Yes, the above check is the cause of the unexpected bytes being read from a snapshot file. In addition, the cat command goes through the readWithStrategy() method to read the bytes of a file, and that method uses blockEnd as the limit for how much to read; so even for a snapshot file it does not take the snapshot length as the bound on the content to read:
private int readWithStrategy(ReaderStrategy strategy, int off, int len)
    throws IOException {
  dfsClient.checkOpen();
  if (closed) {
    throw new IOException("Stream closed");
  }
  Map<ExtendedBlock, Set<DatanodeInfo>> corruptedBlockMap
      = new HashMap<ExtendedBlock, Set<DatanodeInfo>>();
  failures = 0;
  if (pos < getFileLength()) {
    int retries = 2;
    while (retries > 0) {
      try {
        // currentNode can be left as null if previous read had a checksum
        // error on the same block. See HDFS-3067
        if (pos > blockEnd || currentNode == null) {
          currentNode = blockSeekTo(pos);
        }
        int realLen = (int) Math.min(len, (blockEnd - pos + 1L));
        int result = readBuffer(strategy, off, realLen, corruptedBlockMap);
Here blockEnd is used as the bound for how much of the file to read, so for snapshot files we can cap blockEnd at the file length, like this:
  // update current position
  if (updatePosition) {
    pos = offset;
    blockEnd = Math.min((locatedBlocks.getFileLength() - 1),
        (blk.getStartOffset() + blk.getBlockSize() - 1));
    currentLocatedBlock = blk;
  }
  return blk;
}
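To illustrate the effect of that cap on the read length, here is a small self-contained sketch (plain Java with made-up numbers, not HDFS code; the class name and variable values are purely illustrative):

public class BlockEndCapSketch {
  public static void main(String[] args) {
    long fileLength = 1024L; // length recorded for the snapshot file
    long blockStart = 0L;    // start offset of the block being read
    long blockSize  = 2048L; // block kept growing after the snapshot via append
    long pos        = 1000L; // current read position inside the snapshot file
    int  len        = 512;   // number of bytes the caller asked for

    // Without the cap, blockEnd runs to the end of the appended block.
    long blockEndOld = blockStart + blockSize - 1;            // 2047
    long realLenOld  = Math.min(len, blockEndOld - pos + 1L); // 512, reads past offset 1023

    // With the proposed cap, blockEnd never passes the snapshot length.
    long blockEndNew = Math.min(fileLength - 1,
        blockStart + blockSize - 1);                          // 1023
    long realLenNew  = Math.min(len, blockEndNew - pos + 1L); // 24, stops at the snapshot EOF

    System.out.println("realLen without cap = " + realLenOld
        + ", realLen with cap = " + realLenNew);
  }
}

With the cap, the client stops at the snapshot's recorded length, which matches what reading the snapshot file through an input stream already returns.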
> When cat command is issued on snapshot files getting unexpected result
> ----------------------------------------------------------------------
>
> Key: HDFS-5343
> URL: https://issues.apache.org/jira/browse/HDFS-5343
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Reporter: sathish
> Assignee: sathish
>
> First, if we create a file with some length and take a snapshot of it,
> and then append some data to that file through the append method, a cat
> operation on the snapshot of that file should display only the data
> written by the create operation, but it displays the total data, i.e.
> created + appended data.
> However, if we do the same operations and read the contents of the
> snapshot file through an input stream, it displays only the data that
> was in the file when it was snapshotted.
> So the behaviour of the cat command and of reading through an input
> stream differs.
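A minimal sketch of the scenario described above (the /snapTest directory, file name, and contents are illustrative, not from the JIRA; this assumes a running HDFS and permission to allow snapshots on the directory):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class SnapshotReadRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    DistributedFileSystem fs = (DistributedFileSystem) FileSystem.get(conf);
    Path dir = new Path("/snapTest");
    Path file = new Path(dir, "f1");

    fs.mkdirs(dir);
    fs.allowSnapshot(dir);

    // write the original content
    try (FSDataOutputStream out = fs.create(file)) {
      out.writeBytes("original");
    }

    // take the snapshot, then append more data to the live file
    fs.createSnapshot(dir, "s1");
    try (FSDataOutputStream out = fs.append(file)) {
      out.writeBytes("appended");
    }

    // reading the snapshot copy should return only "original";
    // with the uncapped blockEnd described above, a cat-style read
    // can keep going into the appended bytes
    Path snapshotFile = new Path(dir, ".snapshot/s1/f1");
    try (FSDataInputStream in = fs.open(snapshotFile)) {
      byte[] buf = new byte[64];
      int n = in.read(buf);
      System.out.println(new String(buf, 0, n, "UTF-8"));
    }
  }
}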
--
This message was sent by Atlassian JIRA
(v6.1#6144)