Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-3680: Reset the file read offset for failed hdfs cache 
reads
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3313/1//COMMIT_MSG
Commit Message:

PS1, Line 12: re-issuing the whole set of scan ranges
> what "set" of scan ranges? and where does this happen?
When the ReadFromCache() fails, it increments the internal file read offset but 
doesn't read any data. Then we  fallback to ReadRange(), which again fails 
because we try to read past the file_size (due to incorrect offset). Then we 
enter [1], where impalad thinks the data is still left to be read and tries to 
read past scan_range and hence uses DEFAULT_READ_PAST_SIZE(=1KB). This 
repeatedly starts a large no. of 1KB size scanranges till the whole range is 
read. 

[1] 
https://github.com/cloudera/Impala/blob/cdh5-trunk/be/src/exec/scanner-context.cc#L145


http://gerrit.cloudera.org:8080/#/c/3313/1/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

Line 432:       hdfsSeek(fs_, hdfs_file_->file(), 0);
> how did this work before, and how does this work?  Shouldn't we seek to fil
- Yea sorry, this should be offset_. hdfs seems to be incrementing its internal 
file offset even though the read fails and when we call hdfsReadFile() after 
fallback, its resuming from that offset again, because we are reusing 
hdfs_file_->file() object. So per my understanding, this is still required even 
if we are not calling ScanRange::Open()

- Regarding how this worked before, I replied to your comment on the commit 
message.


-- 
To view, visit http://gerrit.cloudera.org:8080/3313
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0a9ea19dd8571b01d2cd5b87da1c259219f6297a
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-HasComments: Yes

Reply via email to