Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/15454 )

Change subject: IMPALA-3766: optionally compress spilled data
......................................................................


Patch Set 13:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool-counters.h
File be/src/runtime/bufferpool/buffer-pool-counters.h:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool-counters.h@55
PS11, Line 55: g for writes to di
nit: Total bytes written to disk. (May be compressed)


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool.cc
File be/src/runtime/bufferpool/buffer-pool.cc:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool.cc@736
PS11, Line 736: [this, page](
If the write op fails, write_io_ops will not be incremented, whereas previously it was. Is that the expected behavior? I don't have a preference either way, just wanted to point out the difference in behavior.


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h
File be/src/runtime/tmp-file-mgr.h:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@161
PS11, Line 161: using 4kb or smaller
nit: HOLE_PUNCH_BLOCK_SIZE_BYTES


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@180
PS11, Line 180: THdfsCompression::type compr
nit: although it's obvious, maybe mention that -1 means compression_level_ is not used.


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@291
PS11, Line 291: // occurs. Returns an error only if no temporary files are usable or the scratch
              : /// limit is exceeded. Must be c
Update comment.


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@482
PS11, Line 482:
Is this before or after compression?

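To make the buffer-pool.cc@736 question concrete, here is a minimal sketch of the two counting behaviors. It is not the code from the patch: Counters, FakeWrite, WriteCountAtIssue and WriteCountOnSuccess are hypothetical names, and it assumes the older behavior counted the write when it was issued while the newer one counts it inside the completion callback.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-in for the profile counter mentioned in the comment.
struct Counters {
  int64_t write_io_ops = 0;
};

// Hypothetical completion callback: receives true if the write succeeded.
using WriteCallback = std::function<void(bool ok)>;

// Hypothetical write that completes immediately with the given outcome.
inline void FakeWrite(bool will_succeed, const WriteCallback& cb) {
  cb(will_succeed);
}

// Assumed older behavior: the op is counted when issued, so a failed
// write still shows up in write_io_ops.
inline void WriteCountAtIssue(Counters* c, bool will_succeed) {
  ++c->write_io_ops;
  FakeWrite(will_succeed, [](bool) {});
}

// Assumed newer behavior: the op is counted in the callback and only
// when the write succeeded, so failures are not counted.
inline void WriteCountOnSuccess(Counters* c, bool will_succeed) {
  FakeWrite(will_succeed, [c](bool ok) {
    if (ok) ++c->write_io_ops;
  });
}
```

With a failing write, the first variant leaves write_io_ops at 1 and the second leaves it at 0, which is the difference in behavior pointed out above.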

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@65
PS11, Line 65: "(Advanced) Limit on the total bytes of compression buffers that will be used for "
nit: maybe mention that this limit is shared across all queries.


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@454
PS11, Line 454: int64_t num_bytes, TmpFile** tmp_file, int64_t* file_offset) {
              : lock_guard<SpinLock> lock(lock_);
              : int64_t scratch_range_bytes =
Shouldn't this check be done after the free_ranges_ recycling underneath? If punch_holes() is false and we already have an allocated file that we can recycle, then we don't need to add to current_bytes_allocated_ and we won't hit this limit.



-- 
To view, visit http://gerrit.cloudera.org:8080/15454
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659
Gerrit-Change-Number: 15454
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Thu, 26 Mar 2020 01:36:16 +0000
Gerrit-HasComments: Yes