On 12/21/25 10:35, Li Wang wrote:
David Hildenbrand (Red Hat) <[email protected]> wrote:
On 12/21/25 09:58, Li Wang wrote:
The hugetlb cgroup usage wait loops in charge_reserved_hugetlb.sh were
unbounded and could hang forever if the expected cgroup file value never
appears (e.g. due to bugs, timing issues, or unexpected behavior).
Did you actually hit that in practice? Just wondering.
Yes.
On an aarch64 64k setup with 512MB hugepages, the test failed earlier
(hugetlbfs got mounted with an effective size of 0 due to size=256M), so
write_to_hugetlbfs couldn’t allocate the expected pages. After that, the
script’s wait loops never observed the target value, so they spun forever.
Okay, so essentially what you fix in patch #3, correct?
It might make sense to reorder #2 and #3, and likely current #3 should
get a Fixes: tag.
Then you can just briefly describe here that this was previously hit due
to other test issues. Although I wonder how much value this patch still
has once #3 is in. But it looks like a cleanup, and the timeout of 60s
sounds reasonable.
I know the reservation of hugetlb folios can take a rather long time in
some environments, though.
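
For illustration only, a bounded wait could look roughly like the sketch
below. The helper name wait_for_value and its arguments are hypothetical
and not taken from the patch; only the 60s default comes from the
discussion above. It assumes bash and polls once per second.

```shell
#!/bin/bash
# Hypothetical helper: wait_for_value FILE EXPECTED [TIMEOUT_SECONDS]
# Polls FILE once per second until its content equals EXPECTED.
# Returns 0 on a match, 1 if TIMEOUT_SECONDS (default 60) elapse first.
wait_for_value() {
	local file="$1" expected="$2" timeout="${3:-60}"
	local elapsed=0 current=""

	while [ "$elapsed" -lt "$timeout" ]; do
		# A missing or unreadable file is treated as "not there yet".
		current="$(cat "$file" 2>/dev/null)"
		if [ "$current" = "$expected" ]; then
			return 0
		fi
		sleep 1
		elapsed=$((elapsed + 1))
	done

	echo "timed out after ${timeout}s waiting for $file to read $expected (last: $current)" >&2
	return 1
}
```

If reservation is slow in some environment, the caller could pass a
larger third argument instead of changing the default.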
--
Cheers
David