Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5352: Age out unused file handles from the cache
......................................................................


Patch Set 2:

(9 comments)

some more thoughts

http://gerrit.cloudera.org:8080/#/c/7640/2//COMMIT_MSG
Commit Message:

Line 25: This adds a test to custom_cluster/test_hdfs_fd_caching.py
worth thinking about testing this on a cluster where there is a huge burst of 
FHs that need to get aged out at the same time. the extra time spent cleaning 
up the structures shouldn't be significant, but worth checking it doesn't 
disrupt anything else.


http://gerrit.cloudera.org:8080/#/c/7640/2/be/src/runtime/disk-io-mgr-handle-cache.inline.h
File be/src/runtime/disk-io-mgr-handle-cache.inline.h:

PS2, Line 64: template <size_t NUM_PARTITIONS>
            : FileHandleCache<NUM_PARTITIONS>::~FileHandleCache() {
            :   shut_down_.Store(SHUTDOWN_TRUE);
            :   if (eviction_thread_.get() != nullptr) {
            :     eviction_thread_->Join();
            :   }
            : }
this never actually runs because this has process-lifetime, right?

If we're going to bother with this code, it'd be good to at least have a BE 
test which exercises it. (Maybe there already is?)


PS2, Line 96:         // remove from lru and reset iterator
            :         p.lru_list.erase(elem->lru_entry);
            :         elem->lru_entry = p.lru_list.end();
it'd be nice to pull this out into a fn on FileHandleCachePartition and also 
for the 'returning fh' case, then the lru manipulation stuff can be kept in one 
place and hidden from the FileHandleCache.


PS2, Line 172:     DCHECK(release_elem->lru_entry == p.lru_list.end());
             :     release_elem->lru_entry = 
p.lru_list.emplace(p.lru_list.end(), release_it);
This could be hidden in FileHandleCachePartition as well


PS2, Line 196: FileHandleCache<NUM_PARTITIONS>::FileHandleCachePartition& p
how about moving this function into FileHandleCachePartition, then calling 
p.EvictHandles(oldest_valid_ts, now())

It'd be nice to be encapsulating more of the stuff that's touching the lru into 
FileHandleCachePartition.


PS2, Line 197: uint64_t now = MonotonicSeconds();
maybe this should be determined in EvictHandlesLoop() before you start taking 
the locks, then pass it to each call so that each 'run' of the loop applies the 
same eviction time consistently.


PS2, Line 198: valid
valid is a bit vague


PS2, Line 199: 0
hitting this branch seems extremely unlikely, only if the user sets the timeout 
to something absurd. Since this is a code path though, we should have a test 
case for it. e.g. setting the flag to max uint64 or something.


PS2, Line 202: lru_entry = p.lru_list.front();
             :     typename MapType::iterator evict_it = lru_entry.map_entry;
             :     uint64_t lru_oldest_timestamp
it'd be nice to use the same naming convention for these vars since they're all 
associated with the same oldest element

e.g. oldest_entry, oldest_entry_map_it, oldest_entry_lru_timestamp


-- 
To view, visit http://gerrit.cloudera.org:8080/7640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to