On 29.07.20 19:31, Mike Kravetz wrote:
> On 6/30/20 7:26 AM, David Hildenbrand wrote:
>> Right now, if we have two isolations racing, we might trigger the
>> WARN_ON_ONCE() and to dump_page(NULL), dereferencing NULL. Let's just
>> return directly.
> Just curious, what call path has the WARN_ON_ONCE()/dump_page(NULL)?
See below, two set_migratetype_isolate() caller racing.
>> In the future, we might want to report -EAGAIN to the caller instead, as
>> this could indicate a temporary isolation failure only.
>> Cc: Andrew Morton
>> Cc: Michal Hocko
>> Cc: Michael S. Tsirkin
>> Signed-off-by: David Hildenbrand
> Hi David,
> That 'return -EAGAIN' was added as a sort of synchronization mechanism.
> See commit message for 2c7452a075d4d. Before adding the 'return -EAGAIN',
> I could create races which would abandon isolated pageblocks. Repeating
> those races over and over would result in a good chunk of system memory
> being isolated and unusable.
It's actually -EBUSY, it should maybe later be changed to -EAGAIN (see
comment), so caller can decide to retry immediately. Other discussion.
> Admittedly, these races are rare and I had to work really hard to produce
> them. I'll try to find my testing mechanism. My concern is reintroducing
> this abandoning of pageblocks. I have not looked further in your series
> to see if this potentially addressed later. If not, then we should not
> remove the return code.
Memory offlining could race with alloc_contig_range(), e.g., called when
allocating gigantic pages, or when virtio-mem tries to unplug memory.
The latter two could also race.
We are getting more alloc_contig_range() users, which is why these races
will become more relevant.
I have no clue what you mean with "reintroducing this abandoning of
pageblocks". All this patch is changing is not doing the dump_page() -
or am I missing something important?
David / dhildenb
Virtualization mailing list