On Wed, Jan 14, 2026 at 10:44:08AM +0100, David Hildenbrand (Red Hat) wrote:
> On 1/14/26 09:52, Gregory Price wrote:
> > Add a memory notifier to prevent external operations from changing the
> > online/offline state of memory blocks managed by dax_kmem. This ensures
> > state changes only occur through the driver's hotplug sysfs interface,
> > providing consistent state tracking and preventing races with auto-online
> > policies or direct memory block sysfs manipulation.
> > 
> > The notifier uses a transition protocol with memory barriers:
> >    - Before initiating a state change, set target_state then in_transition
> >    - Use a barrier to ensure target_state is visible before in_transition
> >    - The notifier checks in_transition, then uses barrier before reading
> >      target_state to ensure proper ordering on weakly-ordered architectures
> > 
> > The notifier callback:
> >    - Returns NOTIFY_DONE for non-overlapping memory (not our concern)
> >    - Returns NOTIFY_BAD if in_transition is false (block external ops)
> >    - Validates the memory event matches target_state (MEM_GOING_ONLINE
> >      for online operations, MEM_GOING_OFFLINE for offline/unplug)
> >    - Returns NOTIFY_OK only for driver-initiated operations with matching
> >      target_state
> > 
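
A rough userspace model of the protocol and callback described above,
purely as a sketch. C11 release/acquire atomics stand in for the
barriers the commit message mentions (in the kernel this would
presumably be smp_wmb()/smp_rmb() or smp_store_release()/
smp_load_acquire()). struct kmem_state, begin_transition(),
end_transition(), mem_notify(), the TARGET_* values, and the constant
values are hypothetical names for illustration, not the patch's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-ins for the kernel constants; the values here are illustrative. */
enum { NOTIFY_DONE, NOTIFY_OK, NOTIFY_BAD };
enum mem_event { MEM_GOING_ONLINE, MEM_GOING_OFFLINE };
enum target { TARGET_ONLINE, TARGET_OFFLINE, TARGET_UNPLUG };

/* Hypothetical per-device state; range is [start, end). */
struct kmem_state {
	unsigned long start, end;
	_Atomic int target_state;
	atomic_bool in_transition;
};

/* Driver side: publish target_state, then raise in_transition. */
static void begin_transition(struct kmem_state *s, enum target t)
{
	/* Assumption: only one driver-initiated transition at a time. */
	assert(!atomic_load_explicit(&s->in_transition, memory_order_relaxed));
	atomic_store_explicit(&s->target_state, t, memory_order_relaxed);
	/* Release: target_state is visible before in_transition is. */
	atomic_store_explicit(&s->in_transition, true, memory_order_release);
}

static void end_transition(struct kmem_state *s)
{
	atomic_store_explicit(&s->in_transition, false, memory_order_release);
}

/* Notifier side: decide whether a memory event may proceed. */
static int mem_notify(struct kmem_state *s, enum mem_event ev,
		      unsigned long start, unsigned long end)
{
	int t;

	/* Non-overlapping memory is not our concern. */
	if (end <= s->start || start >= s->end)
		return NOTIFY_DONE;

	/* Acquire: pairs with the release store in begin_transition(). */
	if (!atomic_load_explicit(&s->in_transition, memory_order_acquire))
		return NOTIFY_BAD;	/* external operation, block it */

	t = atomic_load_explicit(&s->target_state, memory_order_relaxed);
	if (ev == MEM_GOING_ONLINE && t == TARGET_ONLINE)
		return NOTIFY_OK;
	if (ev == MEM_GOING_OFFLINE &&
	    (t == TARGET_OFFLINE || t == TARGET_UNPLUG))
		return NOTIFY_OK;

	return NOTIFY_BAD;		/* event does not match target_state */
}
```

In this model, after begin_transition(&s, TARGET_OFFLINE) a
MEM_GOING_OFFLINE event on an overlapping range gets NOTIFY_OK, while
MEM_GOING_ONLINE, or any event outside a transition, gets NOTIFY_BAD.
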
> > This prevents scenarios where:
> >    - Auto-online policies re-online memory the driver is trying to offline
> 
> Is this still a problem when using offline_and_remove_memory() ?
>

I suppose this commit, more than the others, is actually an RFC.

DAX might not want it.  Other drivers might.  Now at least I have the
code to do that.

> >    - Users manually change memory state via /sys/devices/system/memory/
> 
> I don't see why we would want to care about that :)
> 

Absolutely critical if we have something like a CXL DCD region that
wants to try to protect hot-unplug.  But that is probably an argument
for implementing this in a CXL region driver rather than in DAX.

> >    - Other kernel subsystems interfere with driver-managed memory state
> 
> What do you have in mind?
> 
> Not sure if this functionality here is really needed when the driver does
> add+online and offline+remove in a single operation. So please elaborate :)

See above - so yeah I'll probably drop this and come back to it in the
sysram_region driver in CXL.

~Gregory
