Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15454 )

Change subject: IMPALA-3766: optionally compress spilled data
......................................................................


Patch Set 9:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h
File be/src/runtime/bufferpool/buffer-pool-counters.h:

http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h@59
PS9, Line 59:   RuntimeProfile::Counter* uncompressed_bytes_written;
> I think adding the timers makes a lot of sense.
Do we need some tests for the new counters as well?


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr-internal.h
File be/src/runtime/tmp-file-mgr-internal.h:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr-internal.h@94
PS11, Line 94: This is also the
             :   /// offset of the next chunk to allocate in the file.
maybe move this to the beginning of the comment block?


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@586
PS11, Line 586:  DCHECK(io_mgr_buffer->eosr());
              :   DCHECK_LE(io_mgr_buffer->len(), read_buffer.len());
              :   if (io_mgr_
at this point, the compressed_.buffer() should be a nullptr, right? should we 
add a DCHECK for that?


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@644
PS11, Line 644: return status;
              : }
              :
not sure I understand what this method is for, but why don't we need to decrypt 
the data if the data is compressed?


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@779
PS11, Line 779:   write_range_->SetData(buffer_to_write.data(), 
buffer_to_write.len());
if this fails, do we still need to call FreeCompressedBuffer()?


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@807
PS11, Line 807:
might be cleaner to have return true/false - true if the compression succeeded, 
false otherwise. I think it currently relies on checking if 'compressed_len_' 
has been set, which seems more fragile.
i think it simplifies the caller as well, you could re-write it to something 
like this:

 MemRange buffer_to_write = buffer;
 if (parent_->tmp_file_mgr_->compression_enabled()) {
   if (TryCompress(buffer)) {
     buffer_to_write = MemRange(compressed_.buffer(), compressed_len_);
    }
 }


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@835
PS11, Line 835:   Status status = io_ctx->AddWriteRange(write_range_.get());
is it possible for ProcessBlock to fail, but still write some data to the 
compressed buffer? if it is, then will this DCHECK? at this point 
compressed_len_ should still be -1 right? which means 'DCHECK(is_compressed())' 
will fail because 'is_compressed()' should return false?



--
To view, visit http://gerrit.cloudera.org:8080/15454
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659
Gerrit-Change-Number: 15454
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Wed, 25 Mar 2020 17:34:39 +0000
Gerrit-HasComments: Yes

Reply via email to