On Mon, Oct 26, 2020 at 3:18 PM Mark Kettenis <[email protected]> wrote:
> > Date: Sun, 25 Oct 2020 10:42:38 +0100 (CET)
> > From: Mark Kettenis <[email protected]>
> >
> > While making radeondrm(4) work on powerpc64 I'm running into an
> > interesting unaligned access issue.
> >
> > Modern POWER CPUs generally support unaligned access.  Normal load and
> > store instructions work fine with addresses that aren't naturally
> > aligned when operating on cached memory.  As a result, clang will
> > optimize code by replacing two 32-bit store instructions with a single
> > 64-bit store instruction even if there is only 32-bit alignment.
> >
> > However, this doesn't work for memory that is mapped uncachable.  And
> > there is some code in radeondrm(4) (and also in amdgpu(4)) that
> > generates alignment exceptions because it is writing to bits of video
> > memory that are mapped through the graphics aperture.
> >
> > There are two ways to fix this.  The compiler won't apply this
> > optimization if memory is accessed through pointers that are marked
> > volatile.  Hence the fix below.  In my opinion that is the right fix
> > as rdev->uvd.cpu_addr is a volatile pointer and that aspect shouldn't
> > be dropped.  The downside of this approach is that we may need to
> > maintain some additional local fixes.
> >
> > The alternative is to emulate the access in the kernel.  I fear that
> > is what Linux does, which is why they don't notice this issue.  As
> > such, this issue may crop up in more places and the emulation would
> > catch them all.  But I'm a bit reluctant to add this emulation since
> > it may hide bugs in other parts of our kernel.
> >
> > Thoughts?  ok?
>
> There is more code in radeondrm(4) and amdgpu(4) that is affected by
> this issue and some of it isn't easy to "volatilize".
>
> There is an llvm option to enforce strict alignment, but it isn't
> exposed as a proper option by clang.  I'm still investigating the use
> of that option, but meanwhile I think I'll commit the attached diff
> such that the kernel side of things works and I can look at what needs
> to happen on the userland side.
>

Disclaimer: I'm not an expert in this area.

Seems odd that this problem doesn't affect loads too...

Would it be difficult to also validate that the destination address is
mapped uncached, so that normal userland alignment bugs will still get
caught?

--david
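
P.S. For anyone following along, here is a minimal sketch of the store
pattern described in the quoted mail.  The function names are
hypothetical and are not taken from radeondrm(4) or amdgpu(4).  Per the
quoted mail, the compiler may merge the two stores in the plain version
into one 64-bit store, while the volatile version keeps them as two
separate 32-bit stores.  Whether the merge actually happens depends on
the compiler, target and optimization level.

```c
#include <stdint.h>

/*
 * Hypothetical helper: two adjacent 32-bit stores through a plain
 * pointer.  The compiler is allowed to combine these into a single
 * 64-bit store, even though p is only known to be 32-bit aligned.
 */
static void
write_pair_plain(uint32_t *p, uint32_t lo, uint32_t hi)
{
	p[0] = lo;
	p[1] = hi;
}

/*
 * Same stores through a volatile-qualified pointer.  Each volatile
 * access has to be performed as written, so the two 32-bit stores
 * are not merged.
 */
static void
write_pair_volatile(volatile uint32_t *p, uint32_t lo, uint32_t hi)
{
	p[0] = lo;
	p[1] = hi;
}
```

On ordinary cached memory both versions leave the same values behind;
the difference only matters when the destination is mapped uncached
and is not 64-bit aligned.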
