Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18191 )

Change subject: IMPALA-9433: Improved caching of HdfsFileHandles
......................................................................


Patch Set 21:

(8 comments)

Thank you for taking this on. There is a lot of history here. Originally, the 
file handle cache used a generic structure like this:
https://github.com/apache/impala/blob/branch-2.10.0/be/src/util/lru-cache.h

In my rewrite, I switched it to remove the generic structure. This heads back 
in the other direction.

I like that you have backend tests for the generic data structure, which is 
definitely one advantage of that approach. One question I have about moving 
back to a generic structure is whether we would be able to add new 
customization to the file handle cache case. I had been thinking about adding a 
file structure that could contain additional per-file data and/or stats. Is 
that possible with the new generic structure?

http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/runtime/io/handle-cache.inline.h
File be/src/runtime/io/handle-cache.inline.h:

http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/runtime/io/handle-cache.inline.h@167
PS21, Line 167:   // Opening a file handle requires talking to the NameNode so 
it can take some time.
              :   RETURN_IF_ERROR(accessor_tmp.Get()->Init(hdfs_monitor_));
Let me double-check my understanding of the threading here:
The EmplaceAndGet adds an entry, but that entry is in_use=true. If this 
function takes a while, other threads will ignore the entry because it is in 
use.


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/runtime/io/handle-cache.inline.h@170
PS21, Line 170: it/out
Nit: in/out


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.h
File be/src/util/lru-multi-cache.h:

http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.h@125
PS21, Line 125: resf ot
Nit: rest of


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.h@170
PS21, Line 170:     Container_internal& container;
              :     typename Container_internal::iterator it;
Did we consider using an intrusive list for this?


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h
File be/src/util/lru-multi-cache.inline.h:

http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h@147
PS21, Line 147:       // Move the object to the back of the owning list as it 
is no longer available
              :       container.splice(container.end(), container, 
container_it);
Why move it to the back? It could stay where it is and Get() would skip over it 
because in_use=true (i.e. change the break above to a continue). I don't think 
we get a meaningful speedup from avoiding a couple link list hops and shuffling 
the list around.


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h@197
PS21, Line 197:   p_value_internal->timestamp_seconds = MonotonicSeconds()
> I think that we didn't refresh timestamp_second in the old version, so soon
The old behavior does reset the timestamp each time we use it. The code is 
subtle, but what is happening is that each get/release cycle the entry is 
removed from the LRU list and then added back. The timestamp is set each time 
it is added to the LRU list. If something is in continuous use, we would keep 
it open.

One nice thing about the old behavior is that the time-based eviction is O(1), 
because the ordering matches the LRU list ordering. The eviction thread can 
walk from oldest to newest and then stop when it sees the first timestamp that 
isn't eligible for eviction.


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h@201
PS21, Line 201:   // Move the object to the front, keep LRU relation in owning 
list too to
              :   // be able to age out unused objects
              :   container.splice(container.begin(), container, 
p_value_internal->it);
Do we need to be moving entries? I'm not sure this gets us anything.

Even if we leave the objects in the order they were created, the fact that we 
prefer the earlier entries will cause the later entries to be less frequently 
used and age out faster.


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h@222
PS21, Line 222:   if (container.size() == 1) {
              :     // Last element, owning list can be removed to prevent aging
              :     hash_table_.erase(p_value_internal->key);
              :   } else {
              :     // Remove from owning list
              :     container.erase(p_value_internal->it);
              :   }
When/how does the element get removed from the LRU list? (I'm guessing it has 
something to do with auto unlink)



--
To view, visit http://gerrit.cloudera.org:8080/18191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b5c5e9e2b5db2847ab88c41f667c9ca1b03d51a
Gerrit-Change-Number: 18191
Gerrit-PatchSet: 21
Gerrit-Owner: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 22 Feb 2022 07:21:35 +0000
Gerrit-HasComments: Yes

Reply via email to