Leonardo Brás <leobra...@gmail.com> writes:
> On Fri, 2022-09-09 at 09:04 -0500, Nathan Lynch wrote:
>> Leonardo Brás <leobra...@gmail.com> writes:
>> > On Wed, 2022-09-07 at 17:01 -0500, Nathan Lynch wrote:
>> > > At the time this was submitted by Leonardo, I confirmed -- or thought
>> > > I had confirmed -- with PowerVM partition firmware development that
>> > > the following RTAS functions:
>> > >
>> > > - ibm,get-xive
>> > > - ibm,int-off
>> > > - ibm,int-on
>> > > - ibm,set-xive
>> > >
>> > > were safe to call on multiple CPUs simultaneously, not only with
>> > > respect to themselves as indicated by PAPR, but with arbitrary other
>> > > RTAS calls:
>> > >
>> > > https://lore.kernel.org/linuxppc-dev/875zcy2v8o....@linux.ibm.com/
>> > >
>> > > Recent discussion with firmware development makes it clear that this
>> > > is not true, and that the code in commit b664db8e3f97 ("powerpc/rtas:
>> > > Implement reentrant rtas call") is unsafe, likely explaining several
>> > > strange bugs we've seen in internal testing involving DLPAR and
>> > > LPM. These scenarios use ibm,configure-connector, whose internal state
>> > > can be corrupted by the concurrent use of the "reentrant" functions,
>> > > leading to symptoms like endless busy statuses from RTAS.
>> >
>> > Oh, does not it means PowerVM is not compliant to the PAPR specs?
>>
>> No, it means the premise of commit b664db8e3f97 ("powerpc/rtas:
>> Implement reentrant rtas call") change is incorrect. The "reentrant"
>> property described in the spec applies only to the individual RTAS
>> functions. The OS can invoke (for example) ibm,set-xive on multiple CPUs
>> simultaneously, but it must adhere to the more general requirement to
>> serialize with other RTAS functions.
>>
>
> I see. Thanks for explaining that part!
> I agree: reentrant calls that way don't look as useful on Linux than I
> previously thought.
>
> OTOH, I think that instead of reverting the change, we could make use of the
> correct information and fix the current implementation. (This could help when
> we do the same rtas call in multiple cpus)

Hmm I'm happy to be mistaken here, but I doubt we ever really need to do
that. I'm not seeing the need.

> I have an idea of a patch to fix this.
> Do you think it would be ok if I sent that, to prospect being an alternative
> to this reversion?

It is my preference, and I believe it is more common, to revert to the
well-understood prior state, imperfect as it may be. The revert can be
backported to -stable and distros while development and review of
another approach proceeds.