Hi Lianbo, It is waiting on me, I am planning to follow Kazu's idea to add a new option, for example, runq -q.
On Mon, Jun 9, 2025 at 5:09 PM lijiang <liji...@redhat.com> wrote: > On Mon, May 26, 2025 at 5:04 PM <devel-requ...@lists.crash-utility.osci.io> > wrote: > >> Date: Fri, 23 May 2025 05:23:40 +0000 >> From: HAGIO KAZUHITO(萩尾 一仁) <k-hagio...@nec.com> >> Subject: [Crash-utility] Re: [PATCH] Fix the "ps -m" command shows >> wrong duration of RU task >> To: Tao Liu <l...@redhat.com>, Ke Yin <k...@redhat.com> >> Cc: "devel@lists.crash-utility.osci.io" >> <devel@lists.crash-utility.osci.io> >> Message-ID: <23c74e07-d439-422a-bbea-8b2bf49b3...@nec.com> >> Content-Type: text/plain; charset="utf-8" >> >> Hi, >> >> On 2025/05/15 8:43, Tao Liu wrote: >> > Hi Ke Yin, >> > >> > On Wed, May 14, 2025 at 8:58 PM Ke Yin <k...@redhat.com> wrote: >> >> >> >> Hi Tao Liu & Kazu, >> >> >> >> Thanks for replying and sharing your thoughts. >> >> >> >> After a quick review of crash tool code, I found: >> >> >> >> runq -m will call dump_on_rq_milliseconds() to print the amount >> >> of time that the active task on each cpu has been running, >> >> but only for the current running task. >> >> >> >> runq -d will call dump_on_rq_tasks() to print all tasks in the run >> queue >> >> and the task running on cpu without calling translate_nanoseconds(). >> >> >> >> My preliminary idea is to combine these two functions and add a new >> >> parameter, for example -q, to print the tasks on each cpu that has >> >> been waiting in the run queue only. And as well as update doc of runq. >> >> >> >> In short: >> >> runq -q will call new_function which is the modified function based on >> dump_on_rq_tasks() (skip current + translate_nanoseconds). >> >> >> >> What do you think? >> >> I didn't know the "runq -d" option because it's a kind of debugging >> option and has no description in the help page. Also it searches all >> tasks for ones that have on_rq = 1 and doesn't look very efficient >> (nr_tasks * nr_cpus). so ideally, maybe a new function should be based >> on dump_runq() than based on dump_on_rq_tasks(), if possible.. >> >> > Looks like getting a solution: adding a new option(E.g: runq -q) to > achieve this purpose? Or need more discussion? Any update? > > Thanks > Lianbo > > >> Thanks, >> Kazu >> >> >> > >> > I'm OK with your idea in general. Please check if I understood >> > correctly, your implementation is like: >> > cmd_runq() { >> > ... >> > if (-d option) { >> > dump_on_rq_tasks(old path); >> > } else if (-q option) { >> > dump_on_rq_tasks(new path); >> > } >> > } >> > >> > dump_on_rq_tasks(option) >> > { >> > ... >> > for (i = 0; i < RUNNING_TASKS(); i++, tc++) { >> > if (old path) // Old path stay unchanged >> > dump_task_runq_entry(tc, 0); >> > else // New path will output your time duration >> > your_new_function_with_translate_nanoseconds(); >> > } >> > } >> > >> > Thanks, >> > Tao Liu >> > >> >> >> >> Thanks >> >> Kenneth Yin >> >> >> >> >> >> >> >> >> >> On Mon, May 12, 2025 at 1:36 PM Tao Liu <l...@redhat.com> wrote: >> >>> >> >>> Hi Kazu & Kenneth, >> >>> >> >>> Sorry for the late reply, and thanks for your fix and comments! >> >>> >> >>> On Thu, May 8, 2025 at 12:20 PM HAGIO KAZUHITO(萩尾 一仁) >> >>> <k-hagio...@nec.com> wrote: >> >>>> >> >>>> On 2025/05/07 16:16, HAGIO KAZUHITO(萩尾 一仁) wrote: >> >>>>> Hi, >> >>>>> >> >>>>> On 2025/04/28 19:38, Kenneth Yin wrote: >> >>>>>> The RU/TASK_RUNNING stat means the task is runnable. >> >>>>>> It is either currently running or on a run queue waiting to run. >> >>>>>> >> >>>>>> Currently, the crash tool uses the "rq_clock - >> sched_info->last_arrival" formula to >> >>>>>> calculate the duration of task in RU state. This is for the >> scenario of a task running on a CPU. >> >>>>> >> >>>>> The "ps -l" and "ps -m" options display what their help text >> describes, >> >>>>> not the duration of task in RU state. Please see "help ps". >> >>>>> >> >>>>> Also, tasks are sorted by the value, using different values for it >> could >> >>>>> make another confusion. >> >>>>> >> >>>>> The options have been used for a long time with the current code, >> if we >> >>>>> change the semantics of the options, it would be better to be >> careful. >> >>>>> The change might lose a kind of information instead of getting >> another >> >>>>> kind of information. >> >>>>> >> >>>>> On the other hand, I think that the duration of waiting in queue >> might >> >>>>> also be useful information. I'm not sure how we should display >> them, >> >>>>> but for example, how about adding a new option or adding a column >> for >> >>>>> last_queued? >> >>>> >> >>>> I thought of that the "runq" command might be suitable to display the >> >>>> waiting duration, because only tasks in the run queues have it. For >> >>>> example, extending the "runq -m" option or adding a new option. >> just my >> >>>> thought. >> >>>> >> >>>> Thanks, >> >>>> Kazu >> >>>> >> >>>>> >> >>>>> What do you think, folks? >> >>>>> >> >>>>> Thanks, >> >>>>> Kazu >> >>>>> >> >>>>>> >> >>>>>> But for the scenario of a task waiting in the CPU run queue (due >> to some reason >> >>>>>> for example cfs/rt queue throttled), this formula could cause >> misunderstanding. >> >>>>>> >> >>>>>> For example: >> >>>>>> [ 220 10:36:38.026] [RU] PID: 12345 TASK: ffff8d674ab6b180 >> CPU: 1 COMMAND: "task" >> >>>>>> >> >>>>>> Looking closer: >> >>>>>> >> >>>>>> crash> rq.clock ffff8de438a5acc0 >> >>>>>> clock = 87029229985307234, >> >>>>>> >> >>>>>> crash> task -R sched_info,se.exec_start >> >>>>>> PID: 12345 TASK: ffff8d674ab6b180 CPU: 1 COMMAND: "task" >> >>>>>> sched_info = { >> >>>>>> pcount = 33, >> >>>>>> run_delay = 0, >> >>>>>> last_arrival = 67983031958439673, >> >>>>>> last_queued = 87029224561119369 >> >>>>>> }, >> >>>>>> se.exec_start = 67983031958476937, >> >>>>>> >> >>>>>> 67983031 67983031 87029224 >> 87029229 >> >>>>>> |<- running on CPU ->| <- IN ->|<- waiting in queue >> ->| >> >>>>>> >> >>>>>> For this scenario, the "task" was waiting in the run queue of the >> CPU only for 5 seconds, >> >>>>>> we should use the "rq_clock - sched_info->last_queued" formula. >> >>> >> >>> Please check if my understanding is correct: >> >>> >> >>> The result you saw is "rq_clock - sched_info->last_arrival == 87029229 >> >>> - 67983031 == 19046198" >> >>> The expected result you want is: "rq_clock - sched_info->last_queued >> >>> == 87029229 - 87029224 == 5" >> >>> >> >>> You think the 19046198 value is misleading and should be 5 which only >> >>> contains the waiting in queue duration, am I correct? >> >>> >> >>> I agree with Kazu's idea, that we shouldn't change the existing ps >> >>> cmd's behaviour, and runq is a better alternative for the >> >>> waiting-in-queue duration display. >> >>> >> >>> What do you think? Could you please improve your code as well as an >> >>> updated "help runq" doc for runq? >> >>> >> >>> Thanks, >> >>> Tao Liu >> >>> >> >>>>>> >> >>>>>> We can trust sched_info->last_queued as it is only set when the >> task enters the CPU run queue. >> >>>>>> Furthermore, when the task hits/runs on a CPU or dequeues the CPU >> run queue, it will be reset to 0. >> >>>>>> >> >>>>>> Therefore, my idea is simple: >> >>>>>> >> >>>>>> If a task in RU stat and sched_info->last_queued has value (!= 0), >> >>>>>> it means this task is waiting in the run queue, use "rq_clock - >> sched_info->last_queued". >> >>>>>> >> >>>>>> Otherwise, if a task in RU stat and sched_info->last_queued = 0 >> >>>>>> and sched_info->last_arrival has value (it must be), it means this >> task is running on the CPU, >> >>>>>> use "rq_clock - sched_info->last_arrival". >> >>>>>> >> >>>>>> Signed-off-by: Kenneth Yin <k...@redhat.com> >> >>>>>> --- >> >>>>>> defs.h | 1 + >> >>>>>> symbols.c | 2 ++ >> >>>>>> task.c | 21 +++++++++++++++------ >> >>>>>> 3 files changed, 18 insertions(+), 6 deletions(-) >> >>>>>> >> >>>>>> diff --git a/defs.h b/defs.h >> >>>>>> index 4cf169c..66f5ce4 100644 >> >>>>>> --- a/defs.h >> >>>>>> +++ b/defs.h >> >>>>>> @@ -1787,6 +1787,7 @@ struct offset_table { /* >> stash of commonly-used offsets */ >> >>>>>> long vcpu_struct_rq; >> >>>>>> long task_struct_sched_info; >> >>>>>> long sched_info_last_arrival; >> >>>>>> + long sched_info_last_queued; >> >>>>>> long page_objects; >> >>>>>> long kmem_cache_oo; >> >>>>>> long char_device_struct_cdev; >> >>>>>> diff --git a/symbols.c b/symbols.c >> >>>>>> index e30fafe..fb5035f 100644 >> >>>>>> --- a/symbols.c >> >>>>>> +++ b/symbols.c >> >>>>>> @@ -9930,6 +9930,8 @@ dump_offset_table(char *spec, ulong >> makestruct) >> >>>>>> OFFSET(sched_rt_entity_run_list)); >> >>>>>> fprintf(fp, " sched_info_last_arrival: %ld\n", >> >>>>>> OFFSET(sched_info_last_arrival)); >> >>>>>> + fprintf(fp, " sched_info_last_queued: %ld\n", >> >>>>>> + OFFSET(sched_info_last_queued)); >> >>>>>> fprintf(fp, " task_struct_thread_info: %ld\n", >> >>>>>> OFFSET(task_struct_thread_info)); >> >>>>>> fprintf(fp, " task_struct_stack: %ld\n", >> >>>>>> diff --git a/task.c b/task.c >> >>>>>> index 3bafe79..f5386ac 100644 >> >>>>>> --- a/task.c >> >>>>>> +++ b/task.c >> >>>>>> @@ -332,9 +332,12 @@ task_init(void) >> >>>>>> MEMBER_OFFSET_INIT(task_struct_last_run, >> "task_struct", "last_run"); >> >>>>>> MEMBER_OFFSET_INIT(task_struct_timestamp, >> "task_struct", "timestamp"); >> >>>>>> MEMBER_OFFSET_INIT(task_struct_sched_info, >> "task_struct", "sched_info"); >> >>>>>> - if (VALID_MEMBER(task_struct_sched_info)) >> >>>>>> + if (VALID_MEMBER(task_struct_sched_info)) { >> >>>>>> MEMBER_OFFSET_INIT(sched_info_last_arrival, >> >>>>>> "sched_info", "last_arrival"); >> >>>>>> + MEMBER_OFFSET_INIT(sched_info_last_queued, >> >>>>>> + "sched_info", "last_queued"); >> >>>>>> + } >> >>>>>> if (VALID_MEMBER(task_struct_last_run) || >> >>>>>> VALID_MEMBER(task_struct_timestamp) || >> >>>>>> VALID_MEMBER(sched_info_last_arrival)) { >> >>>>>> @@ -6035,7 +6038,7 @@ ulonglong >> >>>>>> task_last_run(ulong task) >> >>>>>> { >> >>>>>> ulong last_run; >> >>>>>> - ulonglong timestamp; >> >>>>>> + ulonglong timestamp,last_queued; >> >>>>>> >> >>>>>> timestamp = 0; >> >>>>>> fill_task_struct(task); >> >>>>>> @@ -6047,10 +6050,16 @@ task_last_run(ulong task) >> >>>>>> } else if (VALID_MEMBER(task_struct_timestamp)) >> >>>>>> timestamp = tt->last_task_read ? >> ULONGLONG(tt->task_struct + >> >>>>>> OFFSET(task_struct_timestamp)) : 0; >> >>>>>> - else if (VALID_MEMBER(sched_info_last_arrival)) >> >>>>>> - timestamp = tt->last_task_read ? >> ULONGLONG(tt->task_struct + >> >>>>>> - OFFSET(task_struct_sched_info) + >> >>>>>> - OFFSET(sched_info_last_arrival)) : 0; >> >>>>>> + else if (VALID_MEMBER(sched_info_last_queued)) >> >>>>>> + last_queued = ULONGLONG(tt->task_struct + >> >>>>>> + OFFSET(task_struct_sched_info) + >> >>>>>> + OFFSET(sched_info_last_queued)); >> >>>>>> + if (last_queued != 0) { >> >>>>>> + timestamp = tt->last_task_read ? last_queued >> : 0; >> >>>>>> + } else if (VALID_MEMBER(sched_info_last_arrival)) >> >>>>>> + timestamp = tt->last_task_read ? >> ULONGLONG(tt->task_struct + >> >>>>>> + OFFSET(task_struct_sched_info) + >> >>>>>> + OFFSET(sched_info_last_arrival)) : 0; >> >>>>>> >> >>>>>> return timestamp; >> >>>>>> } >> >>>>> -- >> >>>>> Crash-utility mailing list -- devel@lists.crash-utility.osci.io >> >>>>> To unsubscribe send an email to >> devel-le...@lists.crash-utility.osci.io >> >>>>> https://${domain_name}/admin/lists/ >> devel.lists.crash-utility.osci.io/ >> >>>>> Contribution Guidelines: >> https://github.com/crash-utility/crash/wiki >> >>>> -- >> >>>> Crash-utility mailing list -- devel@lists.crash-utility.osci.io >> >>>> To unsubscribe send an email to >> devel-le...@lists.crash-utility.osci.io >> >>>> https://${domain_name}/admin/lists/ >> devel.lists.crash-utility.osci.io/ >> >>>> Contribution Guidelines: https://github.com/crash-utility/crash/wiki >> >>> >> >> >> >> >> >> -- >> >> Kenneth Yin >> >> Senior Software Maintenance Engineer >> >> Customer Experience and Engagement >> >> Phone: +86-10-6533-9459 >> >> Red Hat China >> > -- Kenneth Yin Senior Software Maintenance Engineer Customer Experience and Engagement Phone: +86-10-6533-9459 Red Hat China
-- Crash-utility mailing list -- devel@lists.crash-utility.osci.io To unsubscribe send an email to devel-le...@lists.crash-utility.osci.io https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/ Contribution Guidelines: https://github.com/crash-utility/crash/wiki