Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache ......................................................................
Patch Set 8: (16 comments) http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-handle-cache.h File be/src/runtime/disk-io-mgr-handle-cache.h: Line 88: /// with a call to ReleaseFileHandle to release exclusive control. this should also evict handles if a more recent mtime for a file with existing cache entries is being requested (rather than waiting for them to age out). Line 98: static const uint32_t HASH_SEED = 0x87654321; where from? http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-handle-cache.inline.h File be/src/runtime/disk-io-mgr-handle-cache.inline.h: Line 65: // Keep going until find unused entry with the same mtime fix grammar Line 113: // to its place in the map add it? Line 117: if (elem->fh.get() == fh) { single line Line 129: // File can be unbuffered where does the corresponding buffering take place? this looks asymmetrical. http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 417: int last_read = -1; what does this mean? does it count anything, and if so, what? Line 429: return Status(GetHdfsErrorMsg("Error seeking HDFS file: ", file_)); also print position Line 435: return Status(GetHdfsErrorMsg("Error reading from HDFS file: ", file_)); leave todo that you need to retry once to deal with stale cache entries Line 549: remote_bytes_ += stats->totalBytesRead - stats->totalLocalBytesRead; why can't remote_bytes_ go into reader_? http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr.cc File be/src/runtime/disk-io-mgr.cc: Line 1300 this comment disappeared. why is this being called? Line 278: cached_read_options_(nullptr), move into .h file and remove here http://gerrit.cloudera.org:8080/#/c/6478/6/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: Line 494: /// times for a scan range, it is necessary to keep some state about whether this the descriptions makes it sounds as if you need a flag, not a count. http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: Line 498: int64_t remote_bytes_; num_... http://gerrit.cloudera.org:8080/#/c/6478/8/tests/query_test/test_hdfs_fd_caching.py File tests/query_test/test_hdfs_fd_caching.py: Line 74: self.execute_query_expect_success(self.client, count_query) this should give some expected result Line 99: self.execute_query_expect_success(self.client, count_query) you should check for the expected result, depending on the modification type -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
