Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15454 )
Change subject: IMPALA-3766: optionally compress spilled data ...................................................................... Patch Set 9: (7 comments) http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h File be/src/runtime/bufferpool/buffer-pool-counters.h: http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h@59 PS9, Line 59: RuntimeProfile::Counter* uncompressed_bytes_written; > I think adding the timers makes a lot of sense. Do we need some tests for the new counters as well? http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr-internal.h File be/src/runtime/tmp-file-mgr-internal.h: http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr-internal.h@94 PS11, Line 94: This is also the : /// offset of the next chunk to allocate in the file. maybe move this to the beginning of the comment block? http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@586 PS11, Line 586: DCHECK(io_mgr_buffer->eosr()); : DCHECK_LE(io_mgr_buffer->len(), read_buffer.len()); : if (io_mgr_ at this point, the compressed_.buffer() should be a nullptr, right? should we add a DCHECK for that? http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@644 PS11, Line 644: return status; : } : not sure I understand what this method is for, but why don't we need to decrypt the data if the data is compressed? http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@779 PS11, Line 779: write_range_->SetData(buffer_to_write.data(), buffer_to_write.len()); if this fails, do we still need to call FreeCompressedBuffer()? http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@807 PS11, Line 807: might be cleaner to have return true/false - true if the compression succeeded, false otherwise. I think it currently relies on checking if 'compressed_len_' has been set, which seems more fragile. i think it simplifies the caller as well, you could re-write it to something like this: MemRange buffer_to_write = buffer; if (parent_->tmp_file_mgr_->compression_enabled()) { if (TryCompress(buffer)) { buffer_to_write = MemRange(compressed_.buffer(), compressed_len_); } } http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@835 PS11, Line 835: Status status = io_ctx->AddWriteRange(write_range_.get()); is it possible for ProcessBlock to fail, but still write some data to the compressed buffer? if it is, then will this DCHECK? at this point compressed_len_ should still be -1 right? which means 'DCHECK(is_compressed())' will fail because 'is_compressed()' should return false? -- To view, visit http://gerrit.cloudera.org:8080/15454 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659 Gerrit-Change-Number: 15454 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Sahil Takiar <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Wed, 25 Mar 2020 17:34:39 +0000 Gerrit-HasComments: Yes
