On 11/08/2020 12:03, Andrew Dinn wrote:
> You ought to look at the pdf Ningsheng linked in the RFR that was posted
> with the SVE patch. The pdf is available here:
>
> https://developer.arm.com/docs/ddi0584/latest
>
> The relevant text is in section 4.4. Memory Ordering.

That looks better than I feared. The only relevant text here AFAIUI is

"If an address dependency exists between two memory reads, and an SVE
non-temporal vector load instruction generated the second read, then in
the absence of any other barrier mechanism to achieve order, the memory
accesses can be observed in any order by the other observers within the
shareability domain of the memory addresses being accessed."

... but this is only about non-temporal vector load instructions. It
does mean that if we want to use those we'll have to separate them from
loads of base addresses with fences. So it'll be

   Load base register
   load fence
   ... vector loop ...

-- 
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
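
For illustration only, the "load base, load fence, vector loop" shape can be sketched at the Java source level. This is an analogy, not the code HotSpot would generate: the class `FencedSum`, its `base` field, and `sumAfterFence` are hypothetical names, and nothing here guarantees that the loop is vectorized or that non-temporal loads are used. The only point is where the fence sits relative to the load of the base address.

```java
import java.lang.invoke.VarHandle;

public class FencedSum {
    // Hypothetical stand-in for the "base address": a plain field holding the array.
    static long[] base;

    static long sumAfterFence() {
        long[] b = base;            // load base register
        VarHandle.loadLoadFence();  // load fence: later loads are not reordered before the load above
        long sum = 0;
        // Stand-in for the vector loop; every element load depends on b.
        for (int i = 0; i < b.length; i++) {
            sum += b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        base = new long[] {1, 2, 3};
        System.out.println(sumAfterFence());
    }
}
```

The fence is outside the loop, so its cost is paid once per loop entry rather than once per vector load.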