Gergely Fürnstáhl has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18191 )


Change subject: IMPALA-9433: Improved caching of HdfsFileHandles
......................................................................

IMPALA-9433: Improved caching of HdfsFileHandles

Separated LRU caching functionality into a templated LruMultiCache class.

Replaced std::multimap with std::unordered_map combined with std::list
for O(1) lookups and less memory overhead, as each key is stored only
once. Added boost::intrusive::list to track LRU order with less
overhead. Added an O(1) release method, replacing the previous O(n)
one, at minimal memory cost. Implemented an RAII Accessor so that users
are no longer responsible for releasing the objects.
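
For illustration only, the layout described above can be sketched with
standard containers. This is not the code in lru-multi-cache.h: the
names SimpleLruMultiCache, Emplace, num_available and LruMultiCacheDemo
are hypothetical, a plain std::list stands in for
boost::intrusive::list, and the sketch assumes single-threaded use and
that accessors do not outlive the cache.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical, simplified stand-in for the LruMultiCache described above.
template <typename K, typename V>
class SimpleLruMultiCache {
  struct Node { K key; V value; };
  using NodeList = std::list<Node>;
  using NodeIt = typename NodeList::iterator;

 public:
  // RAII handle: gives the value back to the cache when it goes out of scope.
  class Accessor {
   public:
    Accessor() = default;
    Accessor(SimpleLruMultiCache* cache, NodeIt it) : cache_(cache), it_(it) {}
    Accessor(Accessor&& o) noexcept : cache_(o.cache_), it_(o.it_) { o.cache_ = nullptr; }
    Accessor(const Accessor&) = delete;
    Accessor& operator=(const Accessor&) = delete;
    ~Accessor() { Release(); }
    V* Get() { return cache_ != nullptr ? &it_->value : nullptr; }
    void Release() {  // explicit release; safe to call more than once
      if (cache_ != nullptr) cache_->Release(it_);
      cache_ = nullptr;
    }
   private:
    SimpleLruMultiCache* cache_ = nullptr;
    NodeIt it_{};
  };

  explicit SimpleLruMultiCache(std::size_t capacity) : capacity_(capacity) {}

  // Returns an unused value stored under `key`, or an empty accessor.
  Accessor Get(const K& key) {
    auto found = available_.find(key);
    if (found == available_.end()) return Accessor();
    NodeIt it = found->second.back();  // most recently released for this key
    found->second.pop_back();
    if (found->second.empty()) available_.erase(found);
    in_use_.splice(in_use_.end(), lru_, it);  // O(1), iterator stays valid
    return Accessor(this, it);
  }

  // Stores a new value under `key` and hands it out immediately.
  Accessor Emplace(const K& key, V value) {
    if (size() >= capacity_) EvictOne();
    in_use_.push_back(Node{key, std::move(value)});
    return Accessor(this, std::prev(in_use_.end()));
  }

  std::size_t size() const { return lru_.size() + in_use_.size(); }
  std::size_t num_available() const { return lru_.size(); }

 private:
  // O(1): move the node to the back of the LRU list and index it by key.
  void Release(NodeIt it) {
    lru_.splice(lru_.end(), in_use_, it);
    available_[it->key].push_back(it);
    while (size() > capacity_ && EvictOne()) {}
  }

  // Drops the least recently used unused value. If everything is in use,
  // nothing is evicted and the cache temporarily overshoots its capacity.
  bool EvictOne() {
    if (lru_.empty()) return false;
    auto found = available_.find(lru_.front().key);
    found->second.pop_front();  // oldest release of this key is at the front
    if (found->second.empty()) available_.erase(found);
    lru_.pop_front();
    return true;
  }

  std::size_t capacity_;
  NodeList lru_;     // unused values, least recently used first
  NodeList in_use_;  // values currently held by an Accessor
  std::unordered_map<K, std::list<NodeIt>> available_;  // each key stored once
};

// Small usage example for the sketch above.
inline void LruMultiCacheDemo() {
  SimpleLruMultiCache<std::string, int> cache(2);
  {
    auto a = cache.Emplace("file_a", 1);
    auto b = cache.Emplace("file_a", 2);  // second value under the same key
    assert(cache.num_available() == 0);
  }  // both accessors release automatically here
  assert(cache.num_available() == 2);
  auto c = cache.Emplace("file_b", 3);  // cache is full: evicts value 2
  assert(cache.size() == 2);
  auto d = cache.Get("file_a");
  assert(d.Get() != nullptr && *d.Get() == 1);
}
```

Because unused values of one key are released to the back of both the
per-key list and the global LRU list, the globally oldest entry is
always at the front of its own key's list, so eviction needs no search.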

Wrapped the cache accessor and the related DiskIOManager metrics in a
FileHandleCache::Accessor. Removed the Release*() call trees from
FileHandleCache and DiskIOManager, and removed the scoped exit from
HdfsFileReader, as releases are now handled automatically.
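
As a rough sketch of the wrapping idea above (not the actual
FileHandleCache::Accessor), an accessor can own both the cached object
and the metric update, so callers need no explicit release call. Gauge,
InnerAccessor and MeteredAccessor are hypothetical names, and the single
"outstanding" counter is an assumed example metric.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical metric, e.g. the number of handles currently handed out.
struct Gauge { std::atomic<long> value{0}; };

// Hypothetical inner accessor standing in for the cache's own accessor.
class InnerAccessor {
 public:
  explicit InnerAccessor(int* handle = nullptr) : handle_(handle) {}
  int* Get() const { return handle_; }
  void Release() { handle_ = nullptr; }
 private:
  int* handle_;
};

// Couples the inner accessor with the metric: the gauge is incremented
// while a handle is held and decremented exactly once on release.
class MeteredAccessor {
 public:
  MeteredAccessor(InnerAccessor inner, Gauge* outstanding)
      : inner_(inner),
        outstanding_(inner.Get() != nullptr ? outstanding : nullptr) {
    if (outstanding_ != nullptr) ++outstanding_->value;
  }
  MeteredAccessor(MeteredAccessor&& o) noexcept
      : inner_(o.inner_), outstanding_(o.outstanding_) {
    o.inner_.Release();  // the moved-from accessor no longer holds anything
    o.outstanding_ = nullptr;
  }
  MeteredAccessor(const MeteredAccessor&) = delete;
  MeteredAccessor& operator=(const MeteredAccessor&) = delete;
  ~MeteredAccessor() { Release(); }

  int* Get() const { return inner_.Get(); }

  // Explicit release; the destructor calls this automatically.
  void Release() {
    if (outstanding_ == nullptr) return;
    inner_.Release();
    --outstanding_->value;
    outstanding_ = nullptr;
  }

 private:
  InnerAccessor inner_;
  Gauge* outstanding_;
};
```

With this shape a reader can simply let the accessor go out of scope,
which is the role the removed scoped exit used to play.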

Testing:

Implemented extensive unit tests for the class, covering forced
rehashes, collisions, capacity overshoot, explicit and automatic
release, and destruction.

Ran tests/custom_cluster/test_hdfs_fd_caching.py to verify
FileHandleCache::Accessor behaviour through metrics.

Ran bin/single_node_perf_run.py with TPCH and TPC-DS on Parquet tables;
no visible change in performance:
TPCH   scale=10 iterations=100: Delta(Avg)=-0.67% Delta(GeoMean)=-0.49%
TPC-DS scale=10 iterations= 50: Delta(Avg)=-0.02% Delta(GeoMean)= 0.00%

Change-Id: I6b5c5e9e2b5db2847ab88c41f667c9ca1b03d51a
---
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/handle-cache.h
M be/src/runtime/io/handle-cache.inline.h
M be/src/runtime/io/hdfs-file-reader.cc
M be/src/util/CMakeLists.txt
A be/src/util/lru-multi-cache-test.cc
A be/src/util/lru-multi-cache.h
A be/src/util/lru-multi-cache.inline.h
9 files changed, 1,188 insertions(+), 274 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/18191/18
--
To view, visit http://gerrit.cloudera.org:8080/18191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6b5c5e9e2b5db2847ab88c41f667c9ca1b03d51a
Gerrit-Change-Number: 18191
Gerrit-PatchSet: 18
Gerrit-Owner: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
