Ashwani Raina has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22079 )

Change subject: [compaction] Add memory tracking for better OOM issues triaging
......................................................................


Patch Set 11:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/22079/10/src/kudu/tablet/delta_compaction.h
File src/kudu/tablet/delta_compaction.h:

http://gerrit.cloudera.org:8080/#/c/22079/10/src/kudu/tablet/delta_compaction.h@102
PS10, Line 102:   void MemoryExceededWarnMsgs();
> nit: add a comment for the new method, so it's documented similar to the re
Done


http://gerrit.cloudera.org:8080/#/c/22079/10/src/kudu/tablet/delta_compaction.h@104
PS10, Line 104:   enum MemTrackerAction {
              :     CONSUME,
              :     RELEASE
              :   };
> Is this still needed?
Somehow missed to remove this. Removed now.


http://gerrit.cloudera.org:8080/#/c/22079/10/src/kudu/tablet/delta_compaction.h@109
PS10, Line 109:   // Update memory tracker for compaction operation for the 
tablet under process.
> nit: does it make sense to mention about logging warning messages as well?
Done


http://gerrit.cloudera.org:8080/#/c/22079/10/src/kudu/tablet/delta_compaction.h@111
PS10, Line 111:   FsManager* const fs_manager_;
> nit: add an empty line to separate the declaration of the UpdateMemTracker(
Done


http://gerrit.cloudera.org:8080/#/c/22079/9/src/kudu/tablet/delta_compaction.cc
File src/kudu/tablet/delta_compaction.cc:

http://gerrit.cloudera.org:8080/#/c/22079/9/src/kudu/tablet/delta_compaction.cc@193
PS9, Line 193:
> s/them//
Ack


http://gerrit.cloudera.org:8080/#/c/22079/10/src/kudu/tablet/delta_compaction.cc
File src/kudu/tablet/delta_compaction.cc:

http://gerrit.cloudera.org:8080/#/c/22079/10/src/kudu/tablet/delta_compaction.cc@130
PS10, Line 130: if (process_memory::UnderMemoryPressure(&capacity_pct)) {
> > The main reason behind comparing the current consumption against memory_l
My thought process is from the perspective that we want to emit warnings before 
crossing the hard limit because once process memory crosses the hard limit, it 
may not even reach significant margin limit before getting killed by OOM 
killer. If that happens, all the information about memory usage numbers from 
current compaction, all compaction ops, etc. would be missing from the logs for 
post-mortem analysis.

However, you have a valid point here from the perspective of limiting the 
warning messages as much as possible.
To address this, there are two options:
1. With current logic, limit the number of warning messages by increasing the 
logginginterval from 1 second to say 5 seconds.
2. Instead of significant margin (25%), it could be small margin like 5% along 
with logging interval of 2 seconds.

How does that sound?


http://gerrit.cloudera.org:8080/#/c/22079/11/src/kudu/tablet/delta_compaction.cc
File src/kudu/tablet/delta_compaction.cc:

http://gerrit.cloudera.org:8080/#/c/22079/11/src/kudu/tablet/delta_compaction.cc@133
PS11, Line 133:           "MajorDeltaCompaction op may not complete due to lack 
of sufficient memory "
              :           "(at %.2f%% of capacity), current compaction 
consumption: %ld, total consumption "
              :           "for all running major delta compaction ops: %ld, 
tablet: %s"
> Maybe, be more direct of the expected OOM condition in this message?  The p
How about this?
"Approaching towards an out-of-memory condition that may impact the running 
operations requiring more "
"memory in order to proceed, including current MajorDeltaCompaction op. "
"Memory consumption is at %.2f%% of capacity, current compaction consumption: 
%ld for tablet: %s, total "
"consumption for all running major delta compaction ops: %ld",


http://gerrit.cloudera.org:8080/#/c/22079/11/src/kudu/tablet/delta_compaction.cc@176
PS11, Line 176:     // 1) Get the next batch of base data for the columns we're 
compacting.
> Does it make sense to UpdateMemTracker() with the arena's memory usage when
Yes, I missed to add UpdateMemTracker call here when made changes to address 
this comment: 
https://gerrit.cloudera.org/c/22079/9/src/kudu/tablet/delta_compaction.cc#271 
by adding UpdateMemTracker call at line 263.

Since this arena memory is short-lived and is almost negligible most of the 
times, I had skipped calling UpdateMemTracker() here and on line 263.

I guess it is better to account for each update even if the allocation size is 
negligible.


http://gerrit.cloudera.org:8080/#/c/22079/10/src/kudu/tablet/delta_store.h
File src/kudu/tablet/delta_store.h:

http://gerrit.cloudera.org:8080/#/c/22079/10/src/kudu/tablet/delta_store.h@479
PS10, Line 479:     int64_t slice_data_size_total = 0;
              :     int64_t delta_obj_size_total = prepared_deltas_.size() * 
sizeof(PreparedDelta);
> nit: could we compute the result of this function/method using just one tem
Done



--
To view, visit http://gerrit.cloudera.org:8080/22079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2037582d433730884212e83359bb75bad0d5394
Gerrit-Change-Number: 22079
Gerrit-PatchSet: 11
Gerrit-Owner: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <[email protected]>
Gerrit-Reviewer: Yifan Zhang <[email protected]>
Gerrit-Comment-Date: Fri, 24 Jan 2025 13:22:41 +0000
Gerrit-HasComments: Yes

Reply via email to