Bharath Vissapragada has submitted this change and it was merged.

Change subject: IMPALA-3680: Cleanup the scan range state after failed hdfs 
cache reads
......................................................................


IMPALA-3680: Cleanup the scan range state after failed hdfs cache reads

Currently we don't reset the file read offset if ZCR fails. Due to
this, when we switch to the normal read path, we hit the eosr of
the scan-range even before reading the expected data length. If both
the ReadFromCache() and ReadRange() calls fail without reading any
data, we end up creating a whole list of scan-ranges, each with size
1KB (DEFAULT_READ_PAST_SIZE) assuming we are reading past the scan
range. This gives a huge performance hit. This patch just calls
ScanRange::Close() after the failed cache reads to clean up the
file system state so that the re-reads start from beginning of
the scan range.

This was hit as a part of debugging IMPALA-3679, where the queries
on 1gb cached data were running ~20x slower compared to non-cached
runs.

Change-Id: I0a9ea19dd8571b01d2cd5b87da1c259219f6297a
Reviewed-on: http://gerrit.cloudera.org:8080/3313
Reviewed-by: Michael Brown <[email protected]>
Tested-by: Bharath Vissapragada <[email protected]>
---
M be/src/runtime/disk-io-mgr-scan-range.cc
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
M tests/query_test/test_hdfs_caching.py
3 files changed, 83 insertions(+), 5 deletions(-)

Approvals:
  Michael Brown: Looks good to me, approved
  Bharath Vissapragada: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/3313
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I0a9ea19dd8571b01d2cd5b87da1c259219f6297a
Gerrit-PatchSet: 10
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: David Knupp <[email protected]>
Gerrit-Reviewer: Michael Brown <[email protected]>

Reply via email to