On Sun, Dec 21, 2025 at 12:02:11AM +0530, Garg, Shivank wrote: > > >On 12/20/2025 6:08 AM, Wei Yang wrote: >> On Fri, Dec 19, 2025 at 02:54:40PM +0530, Garg, Shivank wrote: >>> >>> >>> On 12/17/2025 8:18 AM, Wei Yang wrote: >>>> On Tue, Dec 16, 2025 at 11:11:38AM +0000, Shivank Garg wrote: >>>>> Replace 'goto skip' with actual logic for better code readability. >>>>> >>>>> No functional change. >>>>> >>>>> Signed-off-by: Shivank Garg <[email protected]> >>>>> --- >>>>> mm/khugepaged.c | 7 ++++--- >>>>> 1 file changed, 4 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c >>>>> index 6c8c35d3e0c9..107146f012b1 100644 >>>>> --- a/mm/khugepaged.c >>>>> +++ b/mm/khugepaged.c >>>>> @@ -2442,14 +2442,15 @@ static unsigned int >>>>> khugepaged_scan_mm_slot(unsigned int pages, int *result, >>>>> break; >>>>> } >>>>> if (!thp_vma_allowable_order(vma, vma->vm_flags, >>>>> TVA_KHUGEPAGED, PMD_ORDER)) { >>>>> -skip: >>>>> progress++; >>>>> continue; >>>>> } >>>>> hstart = round_up(vma->vm_start, HPAGE_PMD_SIZE); >>>>> hend = round_down(vma->vm_end, HPAGE_PMD_SIZE); >>>>> - if (khugepaged_scan.address > hend) >>>>> - goto skip; >>>>> + if (khugepaged_scan.address > hend) { >>>>> + progress++; >>>>> + continue; >>>>> + } >>>> >>>> Hi, Shivank >>>> >>>> The change here looks good, while I come up with an question. >>>> >>>> The @progress here seems record two things: >>>> >>>> * number of pages scaned >>>> * number of vma skipped >>>> >>> Three things: number of mm. It's incremented 1 for whole >>> khugepaged_scan_mm_slot(). >>> >> >> Agree. >> >>> >>>> While in very rare case, we may miss to count the second case. >>>> >>>> For example, we have following vmas in a process: >>>> >>>> vma1 vma2 >>>> +----------------+------------+ >>>> |2M |1M | >>>> +----------------+------------+ >>>> >>>> Let's assume vma1 is exactly HPAGE_PMD_SIZE and also HPAGE_PMD_SIZE >>>> aligned. >>>> But vma2 is only half of HPAGE_PMD_SIZE. >>>> >>>> When scan finish vma1 and start on vma2, we would have hstart = hend = >>>> address. So we continue here but would not do real scan, since address == >>>> hend. >>>> >>>> I am thinking whether this could handle it: >>>> >>>> if (khugepaged_scan.address > hend || hend <= hstart) { >>>> progress++; >>>> continue; >>>> } >>>> >>>> Do you thinks I am correct on this? >>> >>> I think you're correct. >>> IIUC, @progress acts as rate limiter here. >>> >>> It is increasing +1 for whole, and then increases by +1 per VMA (if >>> skipped), >>> or by +HPAGE_PMD_NR (if actually scanned). >>> >>> So, progress ensuring the hugepaged_do_scan run only until (progress >= >>> pages) >>> at which point it yields and sleeps (wait_event_freezable). >>> >>> Without your suggested fix, if a process contains a large number of small >>> VMAs (where >>> round_up hstart >= round_down(hend), it will unfairly consume more CPU >>> cycles before >>> yielding compared to a process with fewer or aligned VMAs. >> >> You are right. While I am not sure it exists in reality, but in theory it >> could be. >> >>> >>> I think your suggestion is ensuring fairness by charging 'progress' count >>> correctly. >>> >> >> Thanks for your confirmation. Would you mind add a cleanup in next version, >> or >> you prefer me to send it :-) > >Sure, I'll add this fix patch in the next version. >
Thanks >Thanks, >Shivank -- Wei Yang Help you, Help me
