On Wed, Sep 17, 2025 at 11:51:20AM +0800, Yicong Yang wrote:
> On 2025/9/16 22:56, Catalin Marinas wrote:
> > On Mon, Sep 15, 2025 at 04:29:25PM +0800, Yicong Yang wrote:
> >> in my understanding the hwcap only describes the capabilities of the CPU but not
> >> the whole system. the users should make sure the function works as expected if the
> >> CPU supports it and they're going to use it. specifically the LS64 is intended for
> >> device memory only, so the user should take responsibility of using it on supported
> >> memory.
> >
> > We have other cases like MTE where we avoid exposing the HWCAP to user
> > if we know the memory system does not support MTE, though we intercepted
> > this early and asked the (micro)architects to tie the CPU ID field to
> > what the system supports.
>
> but we lack the same identification mechanism as CPU for the memory system, so it's just a
> restriction for the hardware vendor that if certain feature is not supported for the whole
> system (SoC) then do not advertise it in the CPU's ID field. otherwise i think we're currently
> doing in the manner that if capability mismatch or cannot work as expected together then a
> errata/workaround is used to disable the feature or add some workaround on this certain
> platform.
>
> this is also the case for LS64 but a bit more complex, since it involves the completer outside
> the SoC (the device) and could be a hotplug one (PCIe). from the SoC part we can restrict to
> advertise the feature only if it's fully supported (what we've already done on our hardware).

That's good to know. Hopefully other vendors do the same. I think the
ARM ARM would benefit from a note here that the system designers should
not advertise this if the interconnect does not support it. I can raise
this internally.

> > Arguably, the use of LD/ST64B* is fairly specialised and won't be used
> > on the general purpose RAM and by random applications. It needs a device
> > driver to create the NC/Device mapping and specific programs/libraries
> > to access it. I'm not sure the LS64 properties are guaranteed by the
> > device alone or the device together with the interconnect. I suspect the
> > latter and neither the kernel driver nor user space can tell. In the
> > best case, you get a fault and realise the system doesn't work as
> > expected. Worse is the non-atomicity with potentially silent corruption.
>
> will be the latter one, both interconnect and the target device need to
> support it. but I think the driver developer (kernel driver or userspace
> driver) must have knowledge about the support status, otherwise they
> should not use it.

[...]

> my thoughts is that the driver developer should have known whether their
> device support it or not if going to use this. the information in the
> firmware table should be fine for platform devices, but cannot describe
> information for hotpluggable ones like PCIe endpoint devices which may
> not be listed in a firmware table.

There's a risk of such instructions ending up in more generic
copy_to/from_io implementations but there's not much we can do other
than not enabling the feature at all.

So, I think a HWCAP bit is useful but we need (a) clarification that the
CPUID field won't be set if the system doesn't support it and (b) to
document for the Linux bit that it's a per-device capability even if the
CPU/system supports it (the HWCAP is only a prerequisite to be able to
use the instructions; the driver can fall back to non-atomic ops, maybe
with a DGH if it helps performance).

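A minimal sketch of the "HWCAP is only a prerequisite" rule above, for a
user-space driver. Everything named here is an assumption made for
illustration: the bit HYP_HWCAP3_LS64 and its placement in AT_HWCAP3 are
placeholders, not an ABI defined in this thread, and the
device_supports_ls64 flag stands in for whatever driver-specific
knowledge says the device and interconnect handle 64-byte accesses.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/auxv.h>

/* AT_HWCAP3 is 29 in recent Linux UAPI headers; define it for older libcs. */
#ifndef AT_HWCAP3
#define AT_HWCAP3 29
#endif

/* HYPOTHETICAL placeholder: the real auxv word and bit are not given here. */
#define HYP_HWCAP3_LS64 (1UL << 3)

enum store_path { STORE_PATH_LS64, STORE_PATH_FALLBACK };

/* Read the auxv word assumed to carry the bit; 0 if the kernel lacks it. */
static unsigned long read_hwcap_word(void)
{
    return getauxval(AT_HWCAP3);
}

/*
 * The CPU/system bit is necessary but not sufficient: the driver must
 * also know, per device, that 64-byte accesses are supported.
 */
static enum store_path pick_store_path(unsigned long hwcap_word,
                                       bool device_supports_ls64)
{
    if ((hwcap_word & HYP_HWCAP3_LS64) && device_supports_ls64)
        return STORE_PATH_LS64;
    return STORE_PATH_FALLBACK;
}

/*
 * Fallback: eight separate 64-bit stores. These are NOT atomic as a
 * 64-byte unit. The optional DGH hint is omitted to keep this portable.
 */
static void store64_fallback(volatile uint64_t *dst, const uint64_t src[8])
{
    for (int i = 0; i < 8; i++)
        dst[i] = src[i];
}
```

A caller would compute pick_store_path(read_hwcap_word(), dev_ok) once
per device and only issue ST64B on the STORE_PATH_LS64 path; the actual
instruction is left out so the sketch builds on any architecture.
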
An alternative would have been for the kernel driver to communicate to
the user that the device supports the 64-byte atomic accesses but I'm
not aware of any fairly generic way to do this.

-- 
Catalin
