Re: [PATCH -next v6 0/2] Make memory reclamation measurable

2024-03-08 Thread Bixuan Cui




On 2024/3/7 17:26, Michal Hocko wrote:

The main reasons for adding static tracepoints are:
1. To subdivide the time spent in the shrinker->count_objects() and
shrinker->scan_objects() functions within the do_shrink_slab function. Using
BPF kprobe, we can only track the time spent in the do_shrink_slab function.
2. When tracing frequently called functions, static tracepoints (BPF
tp/tracepoint) have lower performance impact compared to dynamic tracepoints
(BPF kprobe).

You can track the time a process has been preempted by other means, no? We
have context switching tracepoints in place. Have you considered that
option?

Let me think about it...

Thanks
Bixuan Cui



Re: [PATCH -next v6 0/2] Make memory reclamation measurable

2024-03-07 Thread Michal Hocko
On Thu 07-03-24 15:40:29, Bixuan Cui wrote:
[...]
> Currently, with the help of kernel trace events or tools like Perfetto, we
> can only see that kswapd is competing for CPU and the frequency of memory
> reclamation triggers, but we do not have detailed information or metrics
> about memory reclamation, such as the duration and amount of each
> reclamation, or who is releasing memory (super_cache, f2fs, ext4), etc. This
> makes it impossible to locate the above problems.

I am not sure I agree with you here. We do provide insight into LRU and
shrinker reclaim. Why isn't that enough? In general I would advise you
to focus more on describing why the existing infrastructure is
insufficient (having examples would be really appreciated).

> Currently this patch helps us solve two actual performance problems (kswapd
> preempting the CPU and causing game delays):
> 1. Increased memory allocation in the game (across different versions)
> has led to heavier kswapd reclaim activity.
> This was found by calculating the total amount of Reclaim(page) during
> the game startup phase.
> 
> 2. The adoption of a different file system in the new system version has
> resulted in a slower reclamation rate.
> This was discovered through the OBJ_NAME change. For example, OBJ_NAME
> changed from super_cache_scan to ext4_es_scan.
> 
> Subsequently, it is also possible to calculate the memory reclamation rate
> to evaluate the memory performance of different versions.

Why cannot you achieve this with the existing tracing or /proc/vmstat
infrastructure?

> The main reasons for adding static tracepoints are:
> 1. To subdivide the time spent in the shrinker->count_objects() and
> shrinker->scan_objects() functions within the do_shrink_slab function. Using
> BPF kprobe, we can only track the time spent in the do_shrink_slab function.
> 2. When tracing frequently called functions, static tracepoints (BPF
> tp/tracepoint) have lower performance impact compared to dynamic tracepoints
> (BPF kprobe).

You can track the time a process has been preempted by other means, no? We
have context switching tracepoints in place. Have you considered that
option?
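The approach suggested here can be sketched in user space: given a stream of sched_switch events, the time a task spent preempted (switched out while still runnable) can be summed without any new tracepoints. The event tuples below are synthetic sample data modeled loosely on sched:sched_switch fields, not real trace output.

```python
def preempted_time_us(events, comm):
    """Sum the gaps where `comm` was switched out in state 'R'
    (still runnable, i.e. preempted) until it ran again.

    events: iterable of (timestamp_us, prev_comm, prev_state, next_comm),
    a simplified stand-in for sched:sched_switch records."""
    total = 0
    switched_out_at = None
    for ts, prev_comm, prev_state, next_comm in events:
        if prev_comm == comm and prev_state == "R":
            switched_out_at = ts            # preempted at this point
        elif next_comm == comm and switched_out_at is not None:
            total += ts - switched_out_at   # ran again after this gap
            switched_out_at = None
    return total

# Synthetic trace: the game task is preempted twice by kswapd0,
# for 30us and then 20us.
events = [
    (100, "game", "R", "kswapd0"),
    (130, "kswapd0", "S", "game"),
    (200, "game", "R", "kswapd0"),
    (220, "kswapd0", "S", "game"),
]
print(preempted_time_us(events, "game"))  # 50
```

This measures how long kswapd kept the task off the CPU, but not what reclaim was doing during that time, which is where the thread's disagreement lies.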
-- 
Michal Hocko
SUSE Labs



Re: [PATCH -next v6 0/2] Make memory reclamation measurable

2024-03-06 Thread Bixuan Cui




On 2024/2/21 15:44, Michal Hocko wrote:
It would be really helpful to have more details on why we need those
trace points. It is my understanding that you would like to have more
fine-grained numbers for the time duration of different parts of the
reclaim process. I can imagine this could be useful in some cases but is
it useful enough and for a wider variety of workloads? Is that worth
dedicated static tracepoints? Why are ad-hoc dynamic tracepoints or BPF
for a very special situation not sufficient? In other words, tell us
more about the usecases and why this is generally useful.

Thank you for your reply, I'm sorry that I forgot to describe the
detailed reason.


Memory reclamation usually occurs when there is high memory pressure (or
low memory) and is performed by kswapd. In embedded systems, CPU
resources are limited, and it is common for kswapd and critical
processes (which typically require a large amount of memory and trigger
memory reclamation) to compete for CPU resources, which in turn affects
the execution of these key processes, increasing their execution time
and causing lags, such as dropped frames or slower startup times in
mobile games.

Currently, with the help of kernel trace events or tools like Perfetto,
we can only see that kswapd is competing for CPU and the frequency of
memory reclamation triggers, but we do not have detailed information or
metrics about memory reclamation, such as the duration and amount of
each reclamation, or who is releasing memory (super_cache, f2fs, ext4),
etc. This makes it impossible to locate the above problems.


Currently this patch helps us solve two actual performance problems
(kswapd preempting the CPU and causing game delays):
1. Increased memory allocation in the game (across different versions)
has led to heavier kswapd reclaim activity.
This was found by calculating the total amount of Reclaim(page)
during the game startup phase.

2. The adoption of a different file system in the new system version has
resulted in a slower reclamation rate.
This was discovered through the OBJ_NAME change. For example,
OBJ_NAME changed from super_cache_scan to ext4_es_scan.


Subsequently, it is also possible to calculate the memory reclamation 
rate to evaluate the memory performance of different versions.
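As a sketch of that rate calculation, the per-shrinker samples from the cover letter's example BPF output can be aggregated in ordinary post-processing; the numbers below are copied from that example, and nothing here is kernel code.

```python
# Per-shrinker samples: (obj_name, scan_dur_us, reclaimed_pages),
# taken from the example BPF output in the cover letter.
samples = [
    ("super_cache_scan.cfi_jt", 3225, 128),
    ("super_cache_scan.cfi_jt", 8524, 1024),
]

total_us = sum(s[1] for s in samples)
total_pages = sum(s[2] for s in samples)
rate = total_pages / (total_us / 1000)  # pages reclaimed per ms of scanning

print(f"{total_pages} pages in {total_us} us -> {rate:.1f} pages/ms")
```

Comparing this rate across system versions (or across OBJ_NAME values) is what makes a file-system-induced slowdown visible.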




The main reasons for adding static tracepoints are:
1. To subdivide the time spent in the shrinker->count_objects() and 
shrinker->scan_objects() functions within the do_shrink_slab function. 
Using BPF kprobe, we can only track the time spent in the do_shrink_slab 
function.
2. When tracing frequently called functions, static tracepoints (BPF 
tp/tracepoint) have lower performance impact compared to dynamic 
tracepoints (BPF kprobe).
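Reason 1 can be illustrated with made-up timestamps: a kprobe/kretprobe pair brackets do_shrink_slab as one opaque interval, while the proposed static tracepoints would expose where count_objects() ends and scan_objects() begins. All values and event names below are invented for illustration, not real tracepoint names.

```python
# Timestamps in microseconds; all invented for illustration.
events = {
    "do_shrink_slab_entry": 100,  # visible to a kprobe
    "count_objects_done":   102,  # visible only with a static tracepoint
    "scan_objects_start":   110,  # visible only with a static tracepoint
    "do_shrink_slab_exit":  460,  # visible to a kretprobe
}

# The kprobe view: one total, no breakdown.
kprobe_total = events["do_shrink_slab_exit"] - events["do_shrink_slab_entry"]

# The static-tracepoint view: the same interval split into phases.
count_dur = events["count_objects_done"] - events["do_shrink_slab_entry"]
scan_dur = events["do_shrink_slab_exit"] - events["scan_objects_start"]

print(kprobe_total, count_dur, scan_dur)  # 360 2 350
```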


Thanks
Bixuan Cui



Re: [PATCH -next v6 0/2] Make memory reclamation measurable

2024-02-20 Thread Michal Hocko
On Wed 21-02-24 11:00:53, Bixuan Cui wrote:
> 
> 
> On 2024/2/21 10:22, Steven Rostedt wrote:
> > It's up to the memory management folks to decide on this. -- Steve
> Noted with thanks.

It would be really helpful to have more details on why we need those
trace points.

It is my understanding that you would like to have more fine-grained
numbers for the time duration of different parts of the reclaim process.
I can imagine this could be useful in some cases but is it useful enough
and for a wider variety of workloads? Is that worth dedicated static
tracepoints? Why are ad-hoc dynamic tracepoints or BPF for a very
special situation not sufficient?

In other words, tell us more about the usecases and why this is
generally useful.

Thanks!
-- 
Michal Hocko
SUSE Labs



Re: [PATCH -next v6 0/2] Make memory reclamation measurable

2024-02-20 Thread Bixuan Cui




On 2024/2/21 10:22, Steven Rostedt wrote:

It's up to the memory management folks to decide on this. -- Steve

Noted with thanks.

Bixuan Cui



Re: [PATCH -next v6 0/2] Make memory reclamation measurable

2024-02-20 Thread Steven Rostedt
On Wed, 21 Feb 2024 09:44:32 +0800
Bixuan Cui  wrote:

> ping~
> 

It's up to the memory management folks to decide on this.

-- Steve






Re: [PATCH -next v6 0/2] Make memory reclamation measurable

2024-02-20 Thread Bixuan Cui

ping~





Re: [PATCH -next v6 0/2] Make memory reclamation measurable

2024-01-23 Thread Bixuan Cui

ping~





Re: [PATCH -next v6 0/2] Make memory reclamation measurable

2024-01-14 Thread Bixuan Cui

ping~





[PATCH -next v6 0/2] Make memory reclamation measurable

2024-01-04 Thread Bixuan Cui
When the system memory is low, kswapd reclaims the memory. The key steps
of memory reclamation include
1. shrink_lruvec
  * shrink_active_list, moves folios from the active LRU to the inactive LRU
  * shrink_inactive_list, shrinks folios from the inactive LRU list
2. shrink_slab
  * shrinker->count_objects(), calculates the freeable memory
  * shrinker->scan_objects(), reclaims the slab memory

The existing tracers in vmscan are as follows:

--do_try_to_free_pages
  --shrink_zones
    --trace_mm_vmscan_node_reclaim_begin (tracer)
    --shrink_node
      --shrink_node_memcgs
        --trace_mm_vmscan_memcg_shrink_begin (tracer)
        --shrink_lruvec
          --shrink_list
            --shrink_active_list
              --trace_mm_vmscan_lru_shrink_active (tracer)
            --shrink_inactive_list
              --trace_mm_vmscan_lru_shrink_inactive (tracer)
              --shrink_active_list
        --shrink_slab
          --do_shrink_slab
            --shrinker->count_objects()
            --trace_mm_shrink_slab_start (tracer)
            --shrinker->scan_objects()
            --trace_mm_shrink_slab_end (tracer)
        --trace_mm_vmscan_memcg_shrink_end (tracer)
    --trace_mm_vmscan_node_reclaim_end (tracer)

If we get the duration and quantity of the LRU and slab shrinking,
then we can measure memory reclamation, as follows

Measuring memory reclamation with bpf:
  LRU FILE:
CPU  COMM     ShrinkActive(us)  ShrinkInactive(us)  Reclaim(page)
7    kswapd0  26                51                  32
7    kswapd0  52                47                  13
  SLAB:
CPU  COMM     OBJ_NAME                 Count_Dur(us)  Freeable(page)  Scan_Dur(us)  Reclaim(page)
1    kswapd0  super_cache_scan.cfi_jt  2              341             3225          128
7    kswapd0  super_cache_scan.cfi_jt  0              2247            8524          1024
7    kswapd0  super_cache_scan.cfi_jt  2              3670            0             0

For this, add the new tracer to shrink_active_list/shrink_inactive_list
and shrinker->count_objects().
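A consumer of the new events would pair begin/end records per CPU to produce the durations shown in the tables above. A minimal sketch, assuming begin/end event names patterned after this series (the trace stream here is synthetic, standing in for ftrace/BPF output):

```python
def durations_us(trace, begin, end):
    """Pair begin/end events per CPU; yield (cpu, duration_us)."""
    open_ts = {}
    for cpu, name, ts in trace:
        if name == begin:
            open_ts[cpu] = ts
        elif name == end and cpu in open_ts:
            yield cpu, ts - open_ts.pop(cpu)

# Synthetic stream; the event names are assumptions patterned after
# the events this series adds, not verified kernel identifiers.
trace = [
    (7, "mm_vmscan_lru_shrink_active_start", 1000),
    (7, "mm_vmscan_lru_shrink_active_end", 1026),
    (7, "mm_vmscan_lru_shrink_inactive_start", 1030),
    (7, "mm_vmscan_lru_shrink_inactive_end", 1081),
]
active = list(durations_us(trace, "mm_vmscan_lru_shrink_active_start",
                           "mm_vmscan_lru_shrink_active_end"))
print(active)  # [(7, 26)] -- matches ShrinkActive(us)=26 in the table
```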

Changes:
v6: * Add Reviewed-by from Steven Rostedt.
v5: * Use 'DECLARE_EVENT_CLASS(mm_vmscan_lru_shrink_start_template' to
replace 'TRACE_EVENT(mm_vmscan_lru_shrink_inactive/active_start'
* Add the explanation for adding the new shrink lru events into 'mm: vmscan:
add new event to trace shrink lru'
v4: Add Reviewed-by and Changelog to every patch.
v3: Swap the positions of 'nid' and 'freeable' to prevent a hole in the trace
event.
v2: Modify trace_mm_vmscan_lru_shrink_inactive() in evict_folios() at the same
time to fix a build error.

cuibixuan (2):
  mm: shrinker: add new event to trace shrink count
  mm: vmscan: add new event to trace shrink lru

 include/trace/events/vmscan.h | 80 ++-
 mm/shrinker.c |  4 ++
 mm/vmscan.c   | 11 +++--
 3 files changed, 90 insertions(+), 5 deletions(-)

-- 
2.17.1