Nathan Lynch <nath...@linux.ibm.com> writes: > Daniel Axtens <d...@axtens.net> writes: >>> On VMs with NX encryption, compression, and/or RNG offload, these >>> capabilities are described by nodes in the ibm,platform-facilities device >>> tree hierarchy: >>> >>> $ tree -d /sys/firmware/devicetree/base/ibm,platform-facilities/ >>> /sys/firmware/devicetree/base/ibm,platform-facilities/ >>> ├── ibm,compression-v1 >>> ├── ibm,random-v1 >>> └── ibm,sym-encryption-v1 >>> >>> 3 directories >>> >>> The acceleration functions that these nodes describe are not disrupted by >>> live migration, not even temporarily. >>> >>> But the post-migration ibm,update-nodes sequence firmware always sends >>> "delete" messages for this hierarchy, followed by an "add" directive to >>> reconstruct it via ibm,configure-connector (log with debugging statements >>> enabled in mobility.c): >>> >>> mobility: removing node /ibm,platform-facilities/ibm,random-v1:4294967285 >>> mobility: removing node >>> /ibm,platform-facilities/ibm,compression-v1:4294967284 >>> mobility: removing node >>> /ibm,platform-facilities/ibm,sym-encryption-v1:4294967283 >>> mobility: removing node /ibm,platform-facilities:4294967286 >>> ... >>> mobility: added node /ibm,platform-facilities:4294967286 >>> >>> Note we receive a single "add" message for the entire hierarchy, and what >>> we receive from the ibm,configure-connector sequence is the top-level >>> platform-facilities node along with its three children. The debug message >>> simply reports the parent node and not the whole subtree. >> >> If I understand correctly, (and again, this is not my area at all!) we >> still have to go out to the firmware and call the >> ibm,configure-connector sequence in order to figure out that the node >> we're supposed to add is the ibm,platform-facilites node, right? We >> can't save enough information at delete time to avoid the trip out to >> firmware? > > That is right... but maybe I don't understand your angle here. Unsure > what avoiding the configure-connector sequence for the nodes would buy > us.
It's not meant to be a tricky question, so the simple answer is probably the right one. Just wondering if there was a marginal efficiency gain - although I believe it's not really a hot path anyway. > > >>> Until that can be realized we have a confirmed use-after-free and the >>> possibility of memory corruption. So add a limited workaround that >>> discriminates on the node type, ignoring adds and removes. This should be >>> amenable to backporting in the meantime. >> >> Yeah it's an unpleasant situation to find ourselves in. It's a bit icky >> but as I think you said in a previous email, at least this isn't worse: >> in the common case it should now succeed and and if properties change >> significantly it will still fail. >> >> My one question (from more of a security point of view) is: >> 1) Say you start using the facilities with a particular set of >> parameters. >> >> 2) Say you then get migrated and the parameters change. >> >> 3) If you keep using the platform facilities as if the original >> properties are still valid, can this cause any Interesting, >> unexpected or otherwise Bad consequences? Are we going to end up >> (for example) scribbling over random memory somehow? > > If drivers are safely handling errors from H_COP_OP etc, then no. (I > know, this looks like a Well That Would Be a Driver Bug dismissal, but > that's not my attitude.) And again this is a case where the change > cannot make things worse. > > In the current design of the pseries LPM implementation, user space and > other normal system activity resume as soon as we return from the > stop_machine() call which suspends the partition, executing concurrently > with any device tree updates. So even if we had code in place to > correctly resolve the DT changes and the drivers were able to respond to > the changes, there would still be a window of exposure to the kind of > problem you describe: the changed characteristics, if any, of the > destination obtain as soon as execution resumes, regardless of when the > OS initiates the update-nodes sequence. > > The way out of that mess is to use the Linux suspend framework, or > otherwise prevent user space from executing until the destination > system's characteristics have been appropriately propagated out to the > necessary drivers etc. I'm trying to get there. Fair enough. I do appreciate the perfect not being the enemy of the good especially in areas of the codebase like this where there is scope to improve things but there is also a lot of complexity that we cannot really get away from because the underlying problem domain is itself just plain complex. (I think EEH is the other obvious example in arch/powerpc.) >> Apart from that, the code seems to do what it says, it seems to solve a >> real problem, the error and memory handling makes sense, you _put the DT >> nodes that you _get, the comments are helpful and descriptive, and it >> passes the automated tests on patchwork/snowpatch. > > I appreciate your review! With those questions answered, and with the caveats above and noting my complete inability to test the code: Reviewed-by: Daniel Axtens <d...@axtens.net> Kind regards, Daniel