Hi Reinette, Richard,

On 6/26/26 14:58, Ben Horgan wrote:
> Hi Reinette, Richard,
>
> On 6/26/26 04:26, Reinette Chatre wrote:
>> +Ben
>>
>> Hi Richard,
>>
>> On 5/28/26 7:23 PM, Richard Cheng wrote:
>>> cl_flush() and sb() in fill_buf.c only have implementations for i386
>>> and x86_64, so on aarch64 both compile to empty functions. mem_flush()
>>> then walks the buffer calling a no-op cl_flush() per cache line and
>>> finishes with a no-op sb(), leaving any caller that expects a flushed
>>> buffer (e.g. CMT, L3_CAT) operating on unflushed state with no warning.
>>>
>>> Add an aarch64 code block using the ARM equivalents:
>>> * "dc civac, %0" for cl_flush()
>>> * "dsb sy" for sb()
>>
>> Calling on Arm experts here since my superficial check found sfence to
>> be used for __wmb() on x86 and the Arm equivalent per
>> arch/arm64/include/asm/barrier.h appears to be "dsb st"?
>
> Referring to the arm reference manual (DDI0487 version M.a.a):
> D7.5.9.15 Ordering and completion of data and instruction cache
> instructions
> This talks about using dsb for the synchronization and also states:
> "In all cases, where the text in this section refers to a DMB or a DSB,
> this means a DMB or DSB whose required access type is both loads and
> stores."
>
> Hence, in this case a "dsb st" is insufficient as the required access
> type is stores but not loads. A full "dsb sy" would work to synchronize
> the "dc civac".
>
> However, I don't think "dc civac" fulfills the role of what is expected
> of cl_flush().
>
>>
>> Even so, it looks like the changes below were considered by Ben during
>> a previous submission but I am not able to tell if his feedback was taken
>> into account here.
>> Please see:
>> https://lore.kernel.org/lkml/[email protected]/
>> https://lore.kernel.org/lkml/[email protected]/
>
> My understanding is that the resctrl selftests want to use cl_flush(),
> to invalidate entries in a system level cache for testing the cache
> portion bitmaps. However, the mechanism to invalidate the system level
> cache is generally implementation defined.
>

I have also found out that cache maintenance operations on arm64 need
only do the work necessary to maintain coherency, and no performance
effects can be implied. On platforms that are known to be coherent, they
may be a NOP.

Perhaps you can just do some stores to fill the cache with junk data.

Thanks,

Ben
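
P.S. To make the junk-store suggestion concrete, here is a rough sketch
of what I have in mind. It is only an illustration: fill_cache_junk() is
a hypothetical helper name, not something that exists in fill_buf.c, and
the 64-byte CL_SIZE is an assumption that real code should not hardcode
blindly. Whether this actually displaces the earlier cache contents
depends on the buffer being large enough relative to the cache and on
the platform's allocation policy, so treat it as a starting point rather
than a guaranteed flush.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed cache line size; a real test should confirm this per platform. */
#define CL_SIZE 64

/*
 * Hypothetical helper: store one junk byte into every cache line of buf
 * so that those lines are allocated in the cache hierarchy, which may
 * displace whatever was cached before. Uses plain stores only, no cache
 * maintenance instructions.
 *
 * Returns the number of cache lines written.
 */
static size_t fill_cache_junk(unsigned char *buf, size_t buf_size)
{
	/* volatile keeps the compiler from optimising the stores away */
	volatile unsigned char *p = buf;
	size_t i, lines = 0;

	assert(buf != NULL);

	for (i = 0; i < buf_size; i += CL_SIZE) {
		p[i] = (unsigned char)rand();
		lines++;
	}

	return lines;
}
```

A caller would allocate a scratch buffer, ideally a few times the size
of the cache under test, and call fill_cache_junk(scratch, scratch_size)
at the point where mem_flush() is called today.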

