Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17979 )
Change subject: IMPALA-10791 Add batching reading for remote temporary files
......................................................................


Patch Set 5:

(14 comments)

Looks pretty good. I think my two concerns are resolved after this morning's discussion.

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h
File be/src/runtime/io/disk-file.h:

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@52
PS5, Line 52: class MemBlock
Need a description for the MemBlock class.

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@62
PS5, Line 62: if the memory is reserved or allocated before deletion to the caller
nit. Return whether the memory is reserved or allocated before deletion to the caller.

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@68
PS5, Line 68: int64_t size, const std::unique_lock<SpinLock>& lock)
nit. May switch the order of the two arguments, as the lock appears first in other similar calls.

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@168
PS5, Line 168: /// The data of the memory block.
nit. May describe the content of a memory block a little bit. For example, a memory block contains multiple pages.

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@200
PS5, Line 200: DiskFile(const std::string& path, DiskIoMgr* io_mgr, int64_t file_size,
nit. Missing the comment.

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@206
PS5, Line 206: DiskFileReadBuffCtrl
Since this class is nested within the DiskFile class, suggest calling it ReadBuffer, which owns a number of memory blocks.

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@206
PS5, Line 206: : DiskFileReadBuffCtrl(int64_t
It is a little confusing how these two names relate to mem blocks. See my two comments below.

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@210
PS5, Line 210: read_buffer_size_
Suggest using a name related to memory blocks, say mem_block_size_?

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@213
PS5, Line 213: num_of_read_buffers_
Suggest renaming to num_mem_blocks_.

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.h@326
PS5, Line 326: int read_buffer_idx = offset / read_buffer_size();
Can you please double check? Assume the memory block size is 8MB and the offset is 99MB. Then idx = 99/8 = 12. If there are only 4 mem blocks, this idx of 12 is invalid. Am I missing something?

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.cc
File be/src/runtime/io/disk-file.cc:

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-file.cc@116
PS5, Line 116: DISABLED
Since the memory is actually deleted in this method, maybe we should call the final state DELETED?

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/tmp-file-mgr.h
File be/src/runtime/tmp-file-mgr.h:

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/tmp-file-mgr.h@147
PS5, Line 147: remote_read_buffer_size_
nit. This name sounds like the buffer is remote. Maybe rename it to mem_block_size_?

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/tmp-file-mgr.h@151
PS5, Line 151: remote_num_read_buffers_per_file_
Similar comment as above. num_mem_blocks_per_file instead?

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/tmp-file-mgr.h@158
PS5, Line 158: remote_max_total_read_buffer_size_
max_mem_block_size_?
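
To make the question on disk-file.h line 326 concrete, here is a minimal sketch of the index arithmetic with an explicit bounds check. The function name MemBlockIndex, the constant kMB, and the -1 sentinel are hypothetical and not taken from the patch. The patch may already guarantee the bound elsewhere (for example, if the file size never exceeds the block size times the number of blocks), in which case a DCHECK would be enough.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper constant: one mebibyte in bytes.
constexpr int64_t kMB = 1024 * 1024;

// Hypothetical stand-in for the computation on disk-file.h line 326.
// Maps a byte offset to the index of the memory block that would hold it,
// or returns -1 when the offset falls outside the available blocks.
int64_t MemBlockIndex(int64_t offset, int64_t mem_block_size, int64_t num_mem_blocks) {
  assert(mem_block_size > 0);  // Precondition assumed for this sketch.
  if (offset < 0) return -1;
  const int64_t idx = offset / mem_block_size;  // Integer division, as in the patch.
  return idx < num_mem_blocks ? idx : -1;
}
```

With the numbers from the comment, MemBlockIndex(99 * kMB, 8 * kMB, 4) computes idx = 12, which is past the 4 available blocks, so the sketch reports -1 instead of returning an out-of-range index.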
--
To view, visit http://gerrit.cloudera.org:8080/17979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
Gerrit-Change-Number: 17979
Gerrit-PatchSet: 5
Gerrit-Owner: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Comment-Date: Thu, 04 Nov 2021 20:09:37 +0000
Gerrit-HasComments: Yes