Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/18798 )
Change subject: IMPALA-6684: Fix untracked memory in KRPC ...................................................................... Patch Set 16: (19 comments) Patch set 16 switch the tracking implementation from using MemPool to using custom MemTrackerAllocator. The MemTrackerAllocator is a little bit more efficient, as shown by relative measurement from row-batch-serialize-benchmark. I tested both method using single_node_perf_run.py against TPCH-30 data and both show similar performance without major regression. http://gerrit.cloudera.org:8080/#/c/18798/15//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18798/15//COMMIT_MSG@18 PS15, Line 18: : This patch also remove the thrift based s > Should we remove those code? They are not used anymore. Done http://gerrit.cloudera.org:8080/#/c/18798/15//COMMIT_MSG@20 PS15, Line 20: thrift RPC has been removed in prior commit. > Also mention the code change in Planner. Will mention the planner change in the next patch set. http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.h File be/src/runtime/krpc-data-stream-sender.h: http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.h@262 PS15, Line 262: . > nit: bytes This counter is removed. http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.h@265 PS15, Line 265: ackin > nit: bytes This counter is removed. http://gerrit.cloudera.org:8080/#/c/18798/12/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/18798/12/be/src/runtime/krpc-data-stream-sender.cc@1101 PS12, Line 1101: outbound_rb_mem_tracker_->Close(); : } > my hypothesis is that since we set the tracker to nullptr here, when all tr Patch set 15 fixed this. http://gerrit.cloudera.org:8080/#/c/18798/13/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/18798/13/be/src/runtime/krpc-data-stream-sender.cc@1095 PS13, Line 1095: for (int i = 0; i < channels_.size(); ++i) > ok Removed. http://gerrit.cloudera.org:8080/#/c/18798/13/be/src/runtime/krpc-data-stream-sender.cc@1103 PS13, Line 1103: > that will be safe Removed. http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc@780 PS15, Line 780: eos_sent_counter_ = ADD_COUNTER(profile(), "EosSent", TUnit::UNIT); : uncompressed_bytes_counter_ = : ADD_COUNTER(profile(), "UncompressedRowBatchSize", TUnit::BYTES); : total_sent_rows_counter_ = ADD_COUNTER(profile(), "RowsSent", TUnit::UNIT); > I was suggesting this counter addition for research purpose. Wonder if we s This counter is removed. http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc@786 PS15, Line 786: new MemTracker(-1, "RowBatchSerialization", mem_tracker_.get())); > We should nest MemTracker under the KrpcDataStreamSender's mem_tracker_.get Done http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc@1116 PS15, Line 1116: COUNTER_ADD(uncompressed_bytes_counter_, unc > Can be set during Prepare and Init method rather than here. Removed. http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.h File be/src/runtime/row-batch.h: http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.h@70 PS15, Line 70: kudu::Slice TupleDataAsSl > not necessary to define the pointer as unique_ptr, regular pointer should b Removed. http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.h@101 PS15, Line 101: > nit: serialization Removed. http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.h@137 PS15, Line 137: c: > If tuple_data_length_ > tuple_data_capacity_, we need to reallocate data bu Removed. http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.h@142 PS15, Line 142: > serialization. Done http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.h@143 PS15, Line 143: _FLUSH_RESOURCES, > The compression_scratch_, its length and capacity will be swapped Removed. http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.cc File be/src/runtime/row-batch.cc: http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.cc@321 PS15, Line 321: } > In the original code, we throw TErrorCode::ROW_BATCH_TOO_LARGE if size is l Done http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.cc@365 PS15, Line 365: tuple->DeepCopy(**desc, &tuple_data, &offset, /* convert_ptrs */ true); > In the original code, we do not resize here if current length is longer tha Done http://gerrit.cloudera.org:8080/#/c/18798/15/fe/src/main/java/org/apache/impala/planner/DataStreamSink.java File fe/src/main/java/org/apache/impala/planner/DataStreamSink.java: http://gerrit.cloudera.org:8080/#/c/18798/15/fe/src/main/java/org/apache/impala/planner/DataStreamSink.java@64 PS15, Line 64: th > nit: are Done http://gerrit.cloudera.org:8080/#/c/18798/15/fe/src/main/java/org/apache/impala/planner/DataStreamSink.java@75 PS15, Line 75: queryOptions.batch_size : : PlanNode.DEFAULT_ROWBATCH_S > Could you add a comment why set these two variables as 2? Done -- To view, visit http://gerrit.cloudera.org:8080/18798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2ba2b907ce4f275a7a1fb8cf75453c7003eb4b82 Gerrit-Change-Number: 18798 Gerrit-PatchSet: 16 Gerrit-Owner: Omid Shahidi <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Kurt Deschler <[email protected]> Gerrit-Reviewer: Omid Shahidi <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Wed, 12 Oct 2022 04:08:40 +0000 Gerrit-HasComments: Yes
