Joe McDonnell has posted comments on this change. Change subject: IMPALA-5352: Age out unused file handles from the cache ......................................................................
Patch Set 1: (14 comments) http://gerrit.cloudera.org:8080/#/c/7640/1/be/src/runtime/disk-io-mgr-handle-cache.h File be/src/runtime/disk-io-mgr-handle-cache.h: PS1, Line 98: Status > comment on when this returns an error Changed to void. This was anticipating IMPALA-5750, but I will just change this in that patch. PS1, Line 126: struct LruListEntry { : LruListEntry(typename MapType::iterator map_entry_in); : typename MapType::iterator map_entry; : uint64_t timestamp_seconds; : }; > It'd be nice to have a validate() method to check internal consistency of t Changed the semantics of the lru_entry on FileHandleEntry so that it can DCHECK that it is being used correctly. Looking into whether there are other checks that make sense. Line 170: /// Periodic check to evict unused file handles > maybe mention "executed by eviction_thread_". Done Line 178: size_t total_capacity_; > total_capacity_ could do with a short comment. Alternatively I'm not sure t Removed PS1, Line 180: Maximum time that a file handle can be unused before > Maximum time before an unused file handle is aged out of the cache. Done. That is much better. Line 187: bool shut_down_ = false; > It would be better to use an atomic so that it's obviously correct and ther Done http://gerrit.cloudera.org:8080/#/c/7640/1/be/src/runtime/disk-io-mgr-handle-cache.inline.h File be/src/runtime/disk-io-mgr-handle-cache.inline.h: PS1, Line 73: Status > Might as well be a void fn Done Line 76: if (unused_handle_timeout_secs_ != 0 && total_capacity_ != 0) { > It's worth considering if we should just always run the thread for simplici Changed to always run the thread. That eliminates the need for total_capacity_. Line 78: &FileHandleCache<NUM_PARTITIONS>::EvictHandlesLoop, this)); > nit: 4spaces Done PS1, Line 187: 1000 > Might be helpful to make this a constant in the class. Done PS1, Line 197: true > Can we combine this line and the one below? Done http://gerrit.cloudera.org:8080/#/c/7640/1/be/src/runtime/disk-io-mgr.cc File be/src/runtime/disk-io-mgr.cc: PS1, Line 101: insures > nit: ensures Done http://gerrit.cloudera.org:8080/#/c/7640/1/tests/custom_cluster/test_hdfs_fd_caching.py File tests/custom_cluster/test_hdfs_fd_caching.py: PS1, Line 137: unused_file_handle_timeout_mins > might be better to take the timeout in seconds, then you can make the timeo Changed the timeout to seconds and lowered the value. PS1, Line 147: 60 > Maybe extract into a local variable and mention that it's derived from unus Changed to take the timeout in seconds. It looks like we don't have any parameters that are floating point, so I thought it would be easier in seconds. I introduced a variable to clarify the test. -- To view, visit http://gerrit.cloudera.org:8080/7640 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
