Re: [Crash-utility] [PATCH] Fix "ps/vm" commands to display the memory usage for exiting tasks

2023-08-24 Thread lijiang
On Fri, Aug 25, 2023 at 9:15 AM HAGIO KAZUHITO(萩尾 一仁) 
wrote:

> >>> - if (!task_mm(task, TRUE))
> >>> + if (!(addr = task_mm(task, TRUE)))
> >>> + return;
> >>> +
> >>> + if (!readmem(addr + OFFSET(mm_struct_mm_count), KVADDR,
> _count,
> >>> + sizeof(int), "mm_struct mm_count", RETURN_ON_ERROR))
> >>> + return;
>
> tt->mm_struct is filled in task_mm(), I think we can use this:
>
>mm_count = INT(tt->mm_struct + OFFSET(mm_struct_mm_count));
>
> The following code in get_task_mem_usage() also use this.
>

Good idea. I will post v2 with this change later.

Thanks.
Lianbo


>
> Thanks,
> Kazu
--
Crash-utility mailing list
Crash-utility@redhat.com
https://listman.redhat.com/mailman/listinfo/crash-utility
Contribution Guidelines: https://github.com/crash-utility/crash/wiki


Re: [Crash-utility] [PATCH] Fix "ps/vm" commands to display the memory usage for exiting tasks

2023-08-24 Thread 萩尾 一仁
On 2023/08/24 17:29, lijiang wrote:
> On Thu, Aug 24, 2023 at 10:01 AM HAGIO KAZUHITO(萩尾 一仁) 
> wrote:
> 
>> On 2023/08/23 14:44, Lianbo Jiang wrote:
>>> When a task is exiting, usually kernel marks its flags as 'PF_EXITING',
>>> but even so, sometimes the mm_struct has not been freed, it might still
>>> be valid. For such tasks, the "ps/vm" commands won't display the memory
>>> usage. For example:
>>>
>>> crash> ps 47070
>>>   PIDPPID  CPU   TASKST  %MEM  VSZ  RSS
>> COMM
>>> 47070   1   0  9ba7c491  UN   0.000
>> ra_ris.parse
>>> crash> vm 47070
>>> PID: 47070TASK: 9ba7c491  CPU: 0COMMAND:
>> "ra_ris.parse"
>>>MM   PGD  RSSTOTAL_VM
>>>0 00k   0k
>>>
>>> To be honest, this is a corner case, but it has already occurred in
>>> actual production environments. Given that, let's allow the "ps/vm"
>>> commands to try to display the memory usage for this case, but it does
>>> not guarantee that it can work well at any time, which still depends on
>>> how far the mm_struct deconstruction has proceeded.
>>
>> Agree to display it, and looks like the deconstruction is done after
>> task->mm is set to NULL, so it looks fine to me.
>>
>>
> Thank you for the comments, Kazu.
> 
> 
>> void __noreturn do_exit(long code)
>> {
>> ...
>>   exit_signals(tsk);  /* sets PF_EXITING */
>> ...
>>   exit_mm();
>>
>> static void exit_mm(void)
>> {
>>   struct mm_struct *mm = current->mm;
>> ...
>>   current->mm = NULL;  ## task->mm is set to NULL here
>> ...
>>   mmput(mm);   ## release the resources actually
>>
>>
>> On the other hand, the mm->mm_count is decremented in mmput(), is there
>> need to check it?
>>
>>
> Good question. I had the same thoughts, but finally I chose to double check
> with the mm_count. Not sure how to ensure the memory synchronization can be
> done, when the kernel is panicking.

Ok, it's fine to double check just in case, then ...

> 
> If the address of mm pointer is valid and the mm_struct members are always
> legitimate, we won't need to double check.
> 
> But anyway, this is just my thoughts, maybe it's not correct completely. If
> you do not want to have it, I can post v2 and simply remove
> the IS_EXITING(task) from get_task_mem_usage().
> 
> Thanks.
> Lianbo
> 
> 
>> void mmput(struct mm_struct *mm)
>> {
>>   might_sleep();
>>
>>   if (atomic_dec_and_test(>mm_users))
>>   __mmput(mm);
>> }
>>
>> static inline void __mmput(struct mm_struct *mm)
>> {
>> ...
>>   exit_mmap(mm);
>> ...
>>   mmdrop(mm);
>> }
>>
>> static inline void mmdrop(struct mm_struct *mm)
>> {
>> ...
>>   if (unlikely(atomic_dec_and_test(>mm_count)))
>>   __mmdrop(mm);
>> }
>>
>> Thanks,
>> Kazu
>>
>>>
>>> With the patch:
>>> crash> ps 47070
>>>   PIDPPID  CPU   TASKST  %MEM  VSZ  RSS
>> COMM
>>> 47070   1   0  9ba7c491  UN  90.8 38461228 31426444
>> ra_ris.parse
>>> crash> vm 47070
>>> PID: 47070TASK: 9ba7c491  CPU: 0COMMAND:
>> "ra_ris.parse"
>>>MM   PGD  RSSTOTAL_VM
>>> 9bad6e873840  9baee0544000  31426444k  38461228k
>>>   VMA   START   END FLAGS FILE
>>> 9bafdbe1d6c8 40 8c5000 8000875
>> /data1/rishome/ra_cu_cn_412/sbin/ra_ris.parse
>>> ...
>>>
>>> Reported-by: Buland Kumar Singh 
>>> Signed-off-by: Lianbo Jiang 
>>> ---
>>>memory.c | 13 +++--
>>>1 file changed, 11 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/memory.c b/memory.c
>>> index 5d76c5d7fe6f..7d59c0555a0e 100644
>>> --- a/memory.c
>>> +++ b/memory.c
>>> @@ -4792,10 +4792,12 @@ get_task_mem_usage(ulong task, struct
>> task_mem_usage *tm)
>>>{
>>>struct task_context *tc;
>>>long rss = 0, rss_cache = 0;
>>> + int mm_count = 0;
>>> + ulong addr;
>>>
>>>BZERO(tm, sizeof(struct task_mem_usage));
>>>
>>> - if (IS_ZOMBIE(task) || IS_EXITING(task))
>>> + if (IS_ZOMBIE(task))
>>>return;
>>>
>>>tc = task_to_context(task);
>>> @@ -4805,7 +4807,14 @@ get_task_mem_usage(ulong task, struct
>> task_mem_usage *tm)
>>>
>>>tm->mm_struct_addr = tc->mm_struct;
>>>
>>> - if (!task_mm(task, TRUE))
>>> + if (!(addr = task_mm(task, TRUE)))
>>> + return;
>>> +
>>> + if (!readmem(addr + OFFSET(mm_struct_mm_count), KVADDR, _count,
>>> + sizeof(int), "mm_struct mm_count", RETURN_ON_ERROR))
>>> + return;

tt->mm_struct is filled in task_mm(), I think we can use this:

   mm_count = INT(tt->mm_struct + OFFSET(mm_struct_mm_count));

The following code in get_task_mem_usage() also use this.

Thanks,
Kazu
--
Crash-utility mailing list
Crash-utility@redhat.com

Re: [Crash-utility] [PATCH] Fix "ps/vm" commands to display the memory usage for exiting tasks

2023-08-24 Thread lijiang
On Thu, Aug 24, 2023 at 10:01 AM HAGIO KAZUHITO(萩尾 一仁) 
wrote:

> On 2023/08/23 14:44, Lianbo Jiang wrote:
> > When a task is exiting, usually kernel marks its flags as 'PF_EXITING',
> > but even so, sometimes the mm_struct has not been freed, it might still
> > be valid. For such tasks, the "ps/vm" commands won't display the memory
> > usage. For example:
> >
> >crash> ps 47070
> >  PIDPPID  CPU   TASKST  %MEM  VSZ  RSS
> COMM
> >47070   1   0  9ba7c491  UN   0.000
> ra_ris.parse
> >crash> vm 47070
> >PID: 47070TASK: 9ba7c491  CPU: 0COMMAND:
> "ra_ris.parse"
> >   MM   PGD  RSSTOTAL_VM
> >   0 00k   0k
> >
> > To be honest, this is a corner case, but it has already occurred in
> > actual production environments. Given that, let's allow the "ps/vm"
> > commands to try to display the memory usage for this case, but it does
> > not guarantee that it can work well at any time, which still depends on
> > how far the mm_struct deconstruction has proceeded.
>
> Agree to display it, and looks like the deconstruction is done after
> task->mm is set to NULL, so it looks fine to me.
>
>
Thank you for the comments, Kazu.


> void __noreturn do_exit(long code)
> {
> ...
>  exit_signals(tsk);  /* sets PF_EXITING */
> ...
>  exit_mm();
>
> static void exit_mm(void)
> {
>  struct mm_struct *mm = current->mm;
> ...
>  current->mm = NULL;  ## task->mm is set to NULL here
> ...
>  mmput(mm);   ## release the resources actually
>
>
> On the other hand, the mm->mm_count is decremented in mmput(), is there
> need to check it?
>
>
Good question. I had the same thoughts, but finally I chose to double check
with the mm_count. Not sure how to ensure the memory synchronization can be
done, when the kernel is panicking.

If the address of mm pointer is valid and the mm_struct members are always
legitimate, we won't need to double check.

But anyway, this is just my thoughts, maybe it's not correct completely. If
you do not want to have it, I can post v2 and simply remove
the IS_EXITING(task) from get_task_mem_usage().

Thanks.
Lianbo


> void mmput(struct mm_struct *mm)
> {
>  might_sleep();
>
>  if (atomic_dec_and_test(>mm_users))
>  __mmput(mm);
> }
>
> static inline void __mmput(struct mm_struct *mm)
> {
> ...
>  exit_mmap(mm);
> ...
>  mmdrop(mm);
> }
>
> static inline void mmdrop(struct mm_struct *mm)
> {
> ...
>  if (unlikely(atomic_dec_and_test(>mm_count)))
>  __mmdrop(mm);
> }
>
> Thanks,
> Kazu
>
> >
> > With the patch:
> >crash> ps 47070
> >  PIDPPID  CPU   TASKST  %MEM  VSZ  RSS
> COMM
> >47070   1   0  9ba7c491  UN  90.8 38461228 31426444
> ra_ris.parse
> >crash> vm 47070
> >PID: 47070TASK: 9ba7c491  CPU: 0COMMAND:
> "ra_ris.parse"
> >   MM   PGD  RSSTOTAL_VM
> >9bad6e873840  9baee0544000  31426444k  38461228k
> >  VMA   START   END FLAGS FILE
> >9bafdbe1d6c8 40 8c5000 8000875
> /data1/rishome/ra_cu_cn_412/sbin/ra_ris.parse
> >...
> >
> > Reported-by: Buland Kumar Singh 
> > Signed-off-by: Lianbo Jiang 
> > ---
> >   memory.c | 13 +++--
> >   1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/memory.c b/memory.c
> > index 5d76c5d7fe6f..7d59c0555a0e 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -4792,10 +4792,12 @@ get_task_mem_usage(ulong task, struct
> task_mem_usage *tm)
> >   {
> >   struct task_context *tc;
> >   long rss = 0, rss_cache = 0;
> > + int mm_count = 0;
> > + ulong addr;
> >
> >   BZERO(tm, sizeof(struct task_mem_usage));
> >
> > - if (IS_ZOMBIE(task) || IS_EXITING(task))
> > + if (IS_ZOMBIE(task))
> >   return;
> >
> >   tc = task_to_context(task);
> > @@ -4805,7 +4807,14 @@ get_task_mem_usage(ulong task, struct
> task_mem_usage *tm)
> >
> >   tm->mm_struct_addr = tc->mm_struct;
> >
> > - if (!task_mm(task, TRUE))
> > + if (!(addr = task_mm(task, TRUE)))
> > + return;
> > +
> > + if (!readmem(addr + OFFSET(mm_struct_mm_count), KVADDR, _count,
> > + sizeof(int), "mm_struct mm_count", RETURN_ON_ERROR))
> > + return;
> > +
> > + if (IS_EXITING(task) && mm_count <= 0)
> >   return;
> >
> >   if (VALID_MEMBER(mm_struct_rss))
--
Crash-utility mailing list
Crash-utility@redhat.com
https://listman.redhat.com/mailman/listinfo/crash-utility
Contribution Guidelines: https://github.com/crash-utility/crash/wiki


Re: [Crash-utility] [PATCH] Fix "ps/vm" commands to display the memory usage for exiting tasks

2023-08-23 Thread 萩尾 一仁
On 2023/08/23 14:44, Lianbo Jiang wrote:
> When a task is exiting, usually kernel marks its flags as 'PF_EXITING',
> but even so, sometimes the mm_struct has not been freed, it might still
> be valid. For such tasks, the "ps/vm" commands won't display the memory
> usage. For example:
> 
>crash> ps 47070
>  PIDPPID  CPU   TASKST  %MEM  VSZ  RSS  COMM
>47070   1   0  9ba7c491  UN   0.000  
> ra_ris.parse
>crash> vm 47070
>PID: 47070TASK: 9ba7c491  CPU: 0COMMAND: "ra_ris.parse"
>   MM   PGD  RSSTOTAL_VM
>   0 00k   0k
> 
> To be honest, this is a corner case, but it has already occurred in
> actual production environments. Given that, let's allow the "ps/vm"
> commands to try to display the memory usage for this case, but it does
> not guarantee that it can work well at any time, which still depends on
> how far the mm_struct deconstruction has proceeded.

Agree to display it, and looks like the deconstruction is done after 
task->mm is set to NULL, so it looks fine to me.

void __noreturn do_exit(long code)
{
...
 exit_signals(tsk);  /* sets PF_EXITING */
...
 exit_mm();

static void exit_mm(void)
{
 struct mm_struct *mm = current->mm;
...
 current->mm = NULL;  ## task->mm is set to NULL here
...
 mmput(mm);   ## release the resources actually


On the other hand, the mm->mm_count is decremented in mmput(), is there 
need to check it?

void mmput(struct mm_struct *mm)
{
 might_sleep();

 if (atomic_dec_and_test(>mm_users))
 __mmput(mm);
}

static inline void __mmput(struct mm_struct *mm)
{
...
 exit_mmap(mm);
...
 mmdrop(mm);
}

static inline void mmdrop(struct mm_struct *mm)
{
...
 if (unlikely(atomic_dec_and_test(>mm_count)))
 __mmdrop(mm);
}

Thanks,
Kazu

> 
> With the patch:
>crash> ps 47070
>  PIDPPID  CPU   TASKST  %MEM  VSZ  RSS  COMM
>47070   1   0  9ba7c491  UN  90.8 38461228 31426444  
> ra_ris.parse
>crash> vm 47070
>PID: 47070TASK: 9ba7c491  CPU: 0COMMAND: "ra_ris.parse"
>   MM   PGD  RSSTOTAL_VM
>9bad6e873840  9baee0544000  31426444k  38461228k
>  VMA   START   END FLAGS FILE
>9bafdbe1d6c8 40 8c5000 8000875 
> /data1/rishome/ra_cu_cn_412/sbin/ra_ris.parse
>...
> 
> Reported-by: Buland Kumar Singh 
> Signed-off-by: Lianbo Jiang 
> ---
>   memory.c | 13 +++--
>   1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/memory.c b/memory.c
> index 5d76c5d7fe6f..7d59c0555a0e 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -4792,10 +4792,12 @@ get_task_mem_usage(ulong task, struct task_mem_usage 
> *tm)
>   {
>   struct task_context *tc;
>   long rss = 0, rss_cache = 0;
> + int mm_count = 0;
> + ulong addr;
>   
>   BZERO(tm, sizeof(struct task_mem_usage));
>   
> - if (IS_ZOMBIE(task) || IS_EXITING(task))
> + if (IS_ZOMBIE(task))
>   return;
>   
>   tc = task_to_context(task);
> @@ -4805,7 +4807,14 @@ get_task_mem_usage(ulong task, struct task_mem_usage 
> *tm)
>   
>   tm->mm_struct_addr = tc->mm_struct;
>   
> - if (!task_mm(task, TRUE))
> + if (!(addr = task_mm(task, TRUE)))
> + return;
> +
> + if (!readmem(addr + OFFSET(mm_struct_mm_count), KVADDR, _count,
> + sizeof(int), "mm_struct mm_count", RETURN_ON_ERROR))
> + return;
> +
> + if (IS_EXITING(task) && mm_count <= 0)
>   return;
>   
>   if (VALID_MEMBER(mm_struct_rss))
--
Crash-utility mailing list
Crash-utility@redhat.com
https://listman.redhat.com/mailman/listinfo/crash-utility
Contribution Guidelines: https://github.com/crash-utility/crash/wiki