Hi, On 2025/05/15 8:43, Tao Liu wrote: > Hi Ke Yin, > > On Wed, May 14, 2025 at 8:58 PM Ke Yin <k...@redhat.com> wrote: >> >> Hi Tao Liu & Kazu, >> >> Thanks for replying and sharing your thoughts. >> >> After a quick review of crash tool code, I found: >> >> runq -m will call dump_on_rq_milliseconds() to print the amount >> of time that the active task on each cpu has been running, >> but only for the current running task. >> >> runq -d will call dump_on_rq_tasks() to print all tasks in the run queue >> and the task running on cpu without calling translate_nanoseconds(). >> >> My preliminary idea is to combine these two functions and add a new >> parameter, for example -q, to print the tasks on each cpu that has >> been waiting in the run queue only. And as well as update doc of runq. >> >> In short: >> runq -q will call new_function which is the modified function based on >> dump_on_rq_tasks() (skip current + translate_nanoseconds). >> >> What do you think?
I didn't know the "runq -d" option because it's a kind of debugging option and has no description in the help page. Also it searches all tasks for ones that have on_rq = 1 and doesn't look very efficient (nr_tasks * nr_cpus). so ideally, maybe a new function should be based on dump_runq() than based on dump_on_rq_tasks(), if possible.. Thanks, Kazu > > I'm OK with your idea in general. Please check if I understood > correctly, your implementation is like: > cmd_runq() { > ... > if (-d option) { > dump_on_rq_tasks(old path); > } else if (-q option) { > dump_on_rq_tasks(new path); > } > } > > dump_on_rq_tasks(option) > { > ... > for (i = 0; i < RUNNING_TASKS(); i++, tc++) { > if (old path) // Old path stay unchanged > dump_task_runq_entry(tc, 0); > else // New path will output your time duration > your_new_function_with_translate_nanoseconds(); > } > } > > Thanks, > Tao Liu > >> >> Thanks >> Kenneth Yin >> >> >> >> >> On Mon, May 12, 2025 at 1:36 PM Tao Liu <l...@redhat.com> wrote: >>> >>> Hi Kazu & Kenneth, >>> >>> Sorry for the late reply, and thanks for your fix and comments! >>> >>> On Thu, May 8, 2025 at 12:20 PM HAGIO KAZUHITO(萩尾 一仁) >>> <k-hagio...@nec.com> wrote: >>>> >>>> On 2025/05/07 16:16, HAGIO KAZUHITO(萩尾 一仁) wrote: >>>>> Hi, >>>>> >>>>> On 2025/04/28 19:38, Kenneth Yin wrote: >>>>>> The RU/TASK_RUNNING stat means the task is runnable. >>>>>> It is either currently running or on a run queue waiting to run. >>>>>> >>>>>> Currently, the crash tool uses the "rq_clock - sched_info->last_arrival" >>>>>> formula to >>>>>> calculate the duration of task in RU state. This is for the scenario of >>>>>> a task running on a CPU. >>>>> >>>>> The "ps -l" and "ps -m" options display what their help text describes, >>>>> not the duration of task in RU state. Please see "help ps". >>>>> >>>>> Also, tasks are sorted by the value, using different values for it could >>>>> make another confusion. >>>>> >>>>> The options have been used for a long time with the current code, if we >>>>> change the semantics of the options, it would be better to be careful. >>>>> The change might lose a kind of information instead of getting another >>>>> kind of information. >>>>> >>>>> On the other hand, I think that the duration of waiting in queue might >>>>> also be useful information. I'm not sure how we should display them, >>>>> but for example, how about adding a new option or adding a column for >>>>> last_queued? >>>> >>>> I thought of that the "runq" command might be suitable to display the >>>> waiting duration, because only tasks in the run queues have it. For >>>> example, extending the "runq -m" option or adding a new option. just my >>>> thought. >>>> >>>> Thanks, >>>> Kazu >>>> >>>>> >>>>> What do you think, folks? >>>>> >>>>> Thanks, >>>>> Kazu >>>>> >>>>>> >>>>>> But for the scenario of a task waiting in the CPU run queue (due to >>>>>> some reason >>>>>> for example cfs/rt queue throttled), this formula could cause >>>>>> misunderstanding. >>>>>> >>>>>> For example: >>>>>> [ 220 10:36:38.026] [RU] PID: 12345 TASK: ffff8d674ab6b180 CPU: 1 >>>>>> COMMAND: "task" >>>>>> >>>>>> Looking closer: >>>>>> >>>>>> crash> rq.clock ffff8de438a5acc0 >>>>>> clock = 87029229985307234, >>>>>> >>>>>> crash> task -R sched_info,se.exec_start >>>>>> PID: 12345 TASK: ffff8d674ab6b180 CPU: 1 COMMAND: "task" >>>>>> sched_info = { >>>>>> pcount = 33, >>>>>> run_delay = 0, >>>>>> last_arrival = 67983031958439673, >>>>>> last_queued = 87029224561119369 >>>>>> }, >>>>>> se.exec_start = 67983031958476937, >>>>>> >>>>>> 67983031 67983031 87029224 87029229 >>>>>> |<- running on CPU ->| <- IN ->|<- waiting in queue ->| >>>>>> >>>>>> For this scenario, the "task" was waiting in the run queue of the CPU >>>>>> only for 5 seconds, >>>>>> we should use the "rq_clock - sched_info->last_queued" formula. >>> >>> Please check if my understanding is correct: >>> >>> The result you saw is "rq_clock - sched_info->last_arrival == 87029229 >>> - 67983031 == 19046198" >>> The expected result you want is: "rq_clock - sched_info->last_queued >>> == 87029229 - 87029224 == 5" >>> >>> You think the 19046198 value is misleading and should be 5 which only >>> contains the waiting in queue duration, am I correct? >>> >>> I agree with Kazu's idea, that we shouldn't change the existing ps >>> cmd's behaviour, and runq is a better alternative for the >>> waiting-in-queue duration display. >>> >>> What do you think? Could you please improve your code as well as an >>> updated "help runq" doc for runq? >>> >>> Thanks, >>> Tao Liu >>> >>>>>> >>>>>> We can trust sched_info->last_queued as it is only set when the task >>>>>> enters the CPU run queue. >>>>>> Furthermore, when the task hits/runs on a CPU or dequeues the CPU run >>>>>> queue, it will be reset to 0. >>>>>> >>>>>> Therefore, my idea is simple: >>>>>> >>>>>> If a task in RU stat and sched_info->last_queued has value (!= 0), >>>>>> it means this task is waiting in the run queue, use "rq_clock - >>>>>> sched_info->last_queued". >>>>>> >>>>>> Otherwise, if a task in RU stat and sched_info->last_queued = 0 >>>>>> and sched_info->last_arrival has value (it must be), it means this task >>>>>> is running on the CPU, >>>>>> use "rq_clock - sched_info->last_arrival". >>>>>> >>>>>> Signed-off-by: Kenneth Yin <k...@redhat.com> >>>>>> --- >>>>>> defs.h | 1 + >>>>>> symbols.c | 2 ++ >>>>>> task.c | 21 +++++++++++++++------ >>>>>> 3 files changed, 18 insertions(+), 6 deletions(-) >>>>>> >>>>>> diff --git a/defs.h b/defs.h >>>>>> index 4cf169c..66f5ce4 100644 >>>>>> --- a/defs.h >>>>>> +++ b/defs.h >>>>>> @@ -1787,6 +1787,7 @@ struct offset_table { /* stash >>>>>> of commonly-used offsets */ >>>>>> long vcpu_struct_rq; >>>>>> long task_struct_sched_info; >>>>>> long sched_info_last_arrival; >>>>>> + long sched_info_last_queued; >>>>>> long page_objects; >>>>>> long kmem_cache_oo; >>>>>> long char_device_struct_cdev; >>>>>> diff --git a/symbols.c b/symbols.c >>>>>> index e30fafe..fb5035f 100644 >>>>>> --- a/symbols.c >>>>>> +++ b/symbols.c >>>>>> @@ -9930,6 +9930,8 @@ dump_offset_table(char *spec, ulong makestruct) >>>>>> OFFSET(sched_rt_entity_run_list)); >>>>>> fprintf(fp, " sched_info_last_arrival: %ld\n", >>>>>> OFFSET(sched_info_last_arrival)); >>>>>> + fprintf(fp, " sched_info_last_queued: %ld\n", >>>>>> + OFFSET(sched_info_last_queued)); >>>>>> fprintf(fp, " task_struct_thread_info: %ld\n", >>>>>> OFFSET(task_struct_thread_info)); >>>>>> fprintf(fp, " task_struct_stack: %ld\n", >>>>>> diff --git a/task.c b/task.c >>>>>> index 3bafe79..f5386ac 100644 >>>>>> --- a/task.c >>>>>> +++ b/task.c >>>>>> @@ -332,9 +332,12 @@ task_init(void) >>>>>> MEMBER_OFFSET_INIT(task_struct_last_run, "task_struct", >>>>>> "last_run"); >>>>>> MEMBER_OFFSET_INIT(task_struct_timestamp, "task_struct", >>>>>> "timestamp"); >>>>>> MEMBER_OFFSET_INIT(task_struct_sched_info, "task_struct", >>>>>> "sched_info"); >>>>>> - if (VALID_MEMBER(task_struct_sched_info)) >>>>>> + if (VALID_MEMBER(task_struct_sched_info)) { >>>>>> MEMBER_OFFSET_INIT(sched_info_last_arrival, >>>>>> "sched_info", "last_arrival"); >>>>>> + MEMBER_OFFSET_INIT(sched_info_last_queued, >>>>>> + "sched_info", "last_queued"); >>>>>> + } >>>>>> if (VALID_MEMBER(task_struct_last_run) || >>>>>> VALID_MEMBER(task_struct_timestamp) || >>>>>> VALID_MEMBER(sched_info_last_arrival)) { >>>>>> @@ -6035,7 +6038,7 @@ ulonglong >>>>>> task_last_run(ulong task) >>>>>> { >>>>>> ulong last_run; >>>>>> - ulonglong timestamp; >>>>>> + ulonglong timestamp,last_queued; >>>>>> >>>>>> timestamp = 0; >>>>>> fill_task_struct(task); >>>>>> @@ -6047,10 +6050,16 @@ task_last_run(ulong task) >>>>>> } else if (VALID_MEMBER(task_struct_timestamp)) >>>>>> timestamp = tt->last_task_read ? >>>>>> ULONGLONG(tt->task_struct + >>>>>> OFFSET(task_struct_timestamp)) : 0; >>>>>> - else if (VALID_MEMBER(sched_info_last_arrival)) >>>>>> - timestamp = tt->last_task_read ? ULONGLONG(tt->task_struct >>>>>> + >>>>>> - OFFSET(task_struct_sched_info) + >>>>>> - OFFSET(sched_info_last_arrival)) : 0; >>>>>> + else if (VALID_MEMBER(sched_info_last_queued)) >>>>>> + last_queued = ULONGLONG(tt->task_struct + >>>>>> + OFFSET(task_struct_sched_info) + >>>>>> + OFFSET(sched_info_last_queued)); >>>>>> + if (last_queued != 0) { >>>>>> + timestamp = tt->last_task_read ? last_queued : 0; >>>>>> + } else if (VALID_MEMBER(sched_info_last_arrival)) >>>>>> + timestamp = tt->last_task_read ? >>>>>> ULONGLONG(tt->task_struct + >>>>>> + OFFSET(task_struct_sched_info) + >>>>>> + OFFSET(sched_info_last_arrival)) : 0; >>>>>> >>>>>> return timestamp; >>>>>> } >>>>> -- >>>>> Crash-utility mailing list -- devel@lists.crash-utility.osci.io >>>>> To unsubscribe send an email to devel-le...@lists.crash-utility.osci.io >>>>> https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/ >>>>> Contribution Guidelines: https://github.com/crash-utility/crash/wiki >>>> -- >>>> Crash-utility mailing list -- devel@lists.crash-utility.osci.io >>>> To unsubscribe send an email to devel-le...@lists.crash-utility.osci.io >>>> https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/ >>>> Contribution Guidelines: https://github.com/crash-utility/crash/wiki >>> >> >> >> -- >> Kenneth Yin >> Senior Software Maintenance Engineer >> Customer Experience and Engagement >> Phone: +86-10-6533-9459 >> Red Hat China -- Crash-utility mailing list -- devel@lists.crash-utility.osci.io To unsubscribe send an email to devel-le...@lists.crash-utility.osci.io https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/ Contribution Guidelines: https://github.com/crash-utility/crash/wiki