Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18798 )

Change subject: IMPALA-6684: Fix untracked memory in KRPC
......................................................................


Patch Set 16:

(19 comments)

Patch set 16 switch the tracking implementation from using MemPool to using 
custom MemTrackerAllocator.

The MemTrackerAllocator is a little bit more efficient, as shown by relative 
measurement from row-batch-serialize-benchmark.
I tested both method using single_node_perf_run.py against TPCH-30 data and 
both show similar performance without major regression.

http://gerrit.cloudera.org:8080/#/c/18798/15//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18798/15//COMMIT_MSG@18
PS15, Line 18:
             : This patch also remove the thrift based s
> Should we remove those code? They are not used anymore.
Done


http://gerrit.cloudera.org:8080/#/c/18798/15//COMMIT_MSG@20
PS15, Line 20: thrift RPC has been removed in prior commit.
> Also mention the code change in Planner.
Will mention the planner change in the next patch set.


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.h
File be/src/runtime/krpc-data-stream-sender.h:

http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.h@262
PS15, Line 262: .
> nit: bytes
This counter is removed.


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.h@265
PS15, Line 265: ackin
> nit: bytes
This counter is removed.


http://gerrit.cloudera.org:8080/#/c/18798/12/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/18798/12/be/src/runtime/krpc-data-stream-sender.cc@1101
PS12, Line 1101:     outbound_rb_mem_tracker_->Close();
               :   }
> my hypothesis is that since we set the tracker to nullptr here, when all tr
Patch set 15 fixed this.


http://gerrit.cloudera.org:8080/#/c/18798/13/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/18798/13/be/src/runtime/krpc-data-stream-sender.cc@1095
PS13, Line 1095: for (int i = 0; i < channels_.size(); ++i) 
> ok
Removed.


http://gerrit.cloudera.org:8080/#/c/18798/13/be/src/runtime/krpc-data-stream-sender.cc@1103
PS13, Line 1103:
> that will be safe
Removed.


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc@780
PS15, Line 780:   eos_sent_counter_ = ADD_COUNTER(profile(), "EosSent", 
TUnit::UNIT);
              :   uncompressed_bytes_counter_ =
              :       ADD_COUNTER(profile(), "UncompressedRowBatchSize", 
TUnit::BYTES);
              :   total_sent_rows_counter_ = ADD_COUNTER(profile(), "RowsSent", 
TUnit::UNIT);
> I was suggesting this counter addition for research purpose. Wonder if we s
This counter is removed.


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc@786
PS15, Line 786: new MemTracker(-1, "RowBatchSerialization", 
mem_tracker_.get()));
> We should nest MemTracker under the KrpcDataStreamSender's mem_tracker_.get
Done


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc@1116
PS15, Line 1116:   COUNTER_ADD(uncompressed_bytes_counter_, unc
> Can be set during Prepare and Init method rather than here.
Removed.


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.h@70
PS15, Line 70: kudu::Slice TupleDataAsSl
> not necessary to define the pointer as unique_ptr, regular pointer should b
Removed.


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.h@101
PS15, Line 101:
> nit: serialization
Removed.


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.h@137
PS15, Line 137: c:
> If tuple_data_length_ > tuple_data_capacity_, we need to reallocate data bu
Removed.


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.h@142
PS15, Line 142:
> serialization.
Done


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.h@143
PS15, Line 143: _FLUSH_RESOURCES,
> The compression_scratch_, its length and capacity will be swapped
Removed.


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.cc@321
PS15, Line 321: }
> In the original code, we throw TErrorCode::ROW_BATCH_TOO_LARGE if size is l
Done


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.cc@365
PS15, Line 365:   tuple->DeepCopy(**desc, &tuple_data, &offset, /* convert_ptrs 
*/ true);
> In the original code, we do not resize here if current length is longer tha
Done


http://gerrit.cloudera.org:8080/#/c/18798/15/fe/src/main/java/org/apache/impala/planner/DataStreamSink.java
File fe/src/main/java/org/apache/impala/planner/DataStreamSink.java:

http://gerrit.cloudera.org:8080/#/c/18798/15/fe/src/main/java/org/apache/impala/planner/DataStreamSink.java@64
PS15, Line 64: th
> nit: are
Done


http://gerrit.cloudera.org:8080/#/c/18798/15/fe/src/main/java/org/apache/impala/planner/DataStreamSink.java@75
PS15, Line 75:     queryOptions.batch_size :
             :         PlanNode.DEFAULT_ROWBATCH_S
> Could you add a comment why set these two variables as 2?
Done



--
To view, visit http://gerrit.cloudera.org:8080/18798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2ba2b907ce4f275a7a1fb8cf75453c7003eb4b82
Gerrit-Change-Number: 18798
Gerrit-PatchSet: 16
Gerrit-Owner: Omid Shahidi <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Omid Shahidi <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Wed, 12 Oct 2022 04:08:40 +0000
Gerrit-HasComments: Yes

Reply via email to