On Tue, 2023-11-28 at 14:35 +0800, Cao, Quanquan/曹 全全 wrote: > > > Add a check for memdev disable to see if there are active regions present > > before disabling the device. This is necessary now regions are present to > > fulfill the TODO that was left there. The best way to determine if a > > region is active is to see if there are decoders enabled for the mem > > device. This is also best effort as the state is only a snapshot the > > kernel provides and is not atomic WRT the memdev disable operation. The > > expectation is the admin issuing the command has full control of the mem > > device and there are no other agents also attempt to control the device. > > > > Signed-off-by: Dave Jiang <[email protected]> > > --- > > cxl/memdev.c | 14 ++++++++++++-- > > 1 file changed, 12 insertions(+), 2 deletions(-) > > > > diff --git a/cxl/memdev.c b/cxl/memdev.c > > index f6a2d3f1fdca..314bac082719 100644 > > --- a/cxl/memdev.c > > +++ b/cxl/memdev.c > > @@ -373,11 +373,21 @@ static int action_free_dpa(struct cxl_memdev *memdev, > > > > static int action_disable(struct cxl_memdev *memdev, struct > > action_context *actx) > > { > > + struct cxl_endpoint *ep; > > + struct cxl_port *port; > > + > > if (!cxl_memdev_is_enabled(memdev)) > > return 0; > > > > - if (!param.force) { > > - /* TODO: actually detect rather than assume active */ > > + ep = cxl_memdev_get_endpoint(memdev); > > + if (!ep) > > + return -ENODEV; > > + > > + port = cxl_endpoint_get_port(ep); > > + if (!port) > > + return -ENODEV; > > + > > + if (cxl_port_decoders_committed(port) && !param.force) { > > log_err(&ml, "%s is part of an active region\n", > > cxl_memdev_get_devname(memdev)); > > return -EBUSY; > > > > > Hi Dave, > > Based on my understanding of the logic in the "disable_region" and > "destroy_region" code, in the code logic of 'disable-region -f,' after > the check, it proceeds with the offline operation. In the code logic of > 'destroy-region -f,' after the check, it performs a disable operation on > the region. For the 'disable-memdev -f' operation, after completing the > check, is it also necessary to perform corresponding operations on the > region(such as disabling region/destroying region) before disabling memdev? > I think it is better for a disable-memdev to just perform the check and complain, instead of trying to unwind a _lot_ of things - i.e. offline memory, disable region, and then disable memdev. If there is an error in one of the intermediate steps, it can make it hard to trace down what happened, and what state the system is in.
I do think that the current -f behavior of just unbind the driver without the safety checks should come with a big loud warning in the man page at least about what -f will do, and how it will potentially break existing regions.
