On Wed, Jan 14, 2026 at 10:44:08AM +0100, David Hildenbrand (Red Hat) wrote:
> On 1/14/26 09:52, Gregory Price wrote:
> > Add a memory notifier to prevent external operations from changing the
> > online/offline state of memory blocks managed by dax_kmem. This ensures
> > state changes only occur through the driver's hotplug sysfs interface,
> > providing consistent state tracking and preventing races with auto-online
> > policies or direct memory block sysfs manipulation.
> >
> > The notifier uses a transition protocol with memory barriers:
> > - Before initiating a state change, set target_state then in_transition
> > - Use a barrier to ensure target_state is visible before in_transition
> > - The notifier checks in_transition, then uses barrier before reading
> >   target_state to ensure proper ordering on weakly-ordered architectures
> >
> > The notifier callback:
> > - Returns NOTIFY_DONE for non-overlapping memory (not our concern)
> > - Returns NOTIFY_BAD if in_transition is false (block external ops)
> > - Validates the memory event matches target_state (MEM_GOING_ONLINE
> >   for online operations, MEM_GOING_OFFLINE for offline/unplug)
> > - Returns NOTIFY_OK only for driver-initiated operations with matching
> >   target_state
> >
> > This prevents scenarios where:
> > - Auto-online policies re-online memory the driver is trying to offline
>
> Is this still a problem when using offline_and_remove_memory() ?
>

I suppose this commit more than the others is actually an RFC.

DAX might not want it. Other drivers might. Now at least I have the
code to do that.

> > - Users manually change memory state via /sys/devices/system/memory/
>
> I don't see why we would want to care about that :)
>

Absolutely critical if we have something like a CXL DCD region that
wants to try to protect hot-unplug.

But that is probably an argument for implementing this in a cxl region
driver than DAX.

> > - Other kernel subsystems interfere with driver-managed memory state
>
> What do you have in mind?
>
> Not sure if this functionality here is really needed when the driver does
> add+online and offline+remove in a single operation. So please elaborate :)

See above - so yeah I'll probably drop this and come back to it in the
sysram_region driver in CXL.

~Gregory
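
For readers following the thread, the transition protocol and notifier
decision logic described in the quoted commit message can be modelled
roughly as below. This is only a userspace C11 sketch, not the code from
the patch: struct kmem_model, begin_transition(), end_transition() and
model_notifier() are hypothetical names, the enums are stand-ins for the
kernel's NOTIFY_* and MEM_GOING_* values (their numeric values here do not
match the kernel's), unplug is folded into the offline target, and C11
release/acquire ordering is assumed as the analogue of the barriers the
commit message mentions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-ins for kernel constants; values are NOT the kernel's. */
enum notify_ret { NOTIFY_DONE, NOTIFY_OK, NOTIFY_BAD };
enum mem_event  { MEM_GOING_ONLINE, MEM_GOING_OFFLINE };
enum target     { TARGET_ONLINE, TARGET_OFFLINE };

/* Hypothetical model of the driver state; range is [start, end). */
struct kmem_model {
	unsigned long start, end;
	_Atomic int target_state;
	atomic_bool in_transition;
};

/* Driver side: publish target_state, then in_transition. */
void begin_transition(struct kmem_model *m, enum target t)
{
	atomic_store_explicit(&m->target_state, (int)t, memory_order_relaxed);
	/* Release store: target_state is visible before in_transition. */
	atomic_store_explicit(&m->in_transition, true, memory_order_release);
}

/* Driver side: close the window once the operation has finished. */
void end_transition(struct kmem_model *m)
{
	atomic_store_explicit(&m->in_transition, false, memory_order_release);
}

/* Notifier side: decide whether a pending state change may proceed. */
enum notify_ret model_notifier(struct kmem_model *m, enum mem_event ev,
			       unsigned long start, unsigned long end)
{
	int t;

	assert(start < end);

	/* Non-overlapping memory is not our concern. */
	if (end <= m->start || start >= m->end)
		return NOTIFY_DONE;

	/* Acquire load pairs with the release store in begin_transition(). */
	if (!atomic_load_explicit(&m->in_transition, memory_order_acquire))
		return NOTIFY_BAD; /* external operation: block it */

	t = atomic_load_explicit(&m->target_state, memory_order_relaxed);

	/* Only a driver-initiated operation with a matching target passes. */
	if (ev == MEM_GOING_ONLINE && t == TARGET_ONLINE)
		return NOTIFY_OK;
	if (ev == MEM_GOING_OFFLINE && t == TARGET_OFFLINE)
		return NOTIFY_OK;
	return NOTIFY_BAD;
}
```

In this model, an offline request arriving while the driver has called
begin_transition(&m, TARGET_OFFLINE) is allowed, while the same request
arriving outside a transition window (e.g. from a direct write to the
memory block sysfs files) is rejected.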

