On 10.11.2017 19:24, Connor Abbott wrote:
On Fri, Nov 10, 2017 at 1:19 PM, Marek Olšák <mar...@gmail.com> wrote:
On Fri, Nov 10, 2017 at 6:55 PM, Nicolai Hähnle <nhaeh...@gmail.com> wrote:
On 10.11.2017 18:43, Marek Olšák wrote:

On Fri, Nov 10, 2017 at 2:09 AM, Connor Abbott <cwabbo...@gmail.com>
wrote:

On Thu, Nov 9, 2017 at 7:17 PM, Marek Olšák <mar...@gmail.com> wrote:

On Fri, Nov 10, 2017 at 12:40 AM, Matt Arsenault <arse...@gmail.com>
wrote:


On Nov 10, 2017, at 07:41, Marek Olšák <mar...@gmail.com> wrote:

Hi,

This fixes the TCS gl_ClipDistance piglit failure that was uncovered
by a recent LLVM change. The solution is to set volatile on loads
and stores to enforce proper ordering.

Please review.



Every LDS access certainly should not be volatile. This kills all
optimizations, like formation of ds_read2_b32. What ordering issue are you
having?


It might be caused by inttoptr(NULL) that we do to declare LDS. There
is simply no ordering enforced, which is weird.


As soon as you do inttoptr(NULL), you've generated a poison value (in
LLVM legalese), so LLVM will assume that you never dereference it and
optimize accordingly. I think a GEP instruction without the inbounds
parameter set will get rid of the poison value, although I'm not sure
about the case where the offset is known to be zero. At least, that's
my reading of the langref text for the GEP instruction
(https://llvm.org/docs/LangRef.html#id215). If zero is a valid address
in LDS, then it sounds like LLVM needs to be fixed to disable this
optimization for certain address spaces. On the other hand, if you're
doing inttoptr(NULL) + offset, where "offset" is the result of a
ptrtoint somewhere, you should be doing inttoptr(offset) instead, and
then LLVM should never misbehave.


I don't think that using inttoptr before every load and store would be
better than volatile. The must be a better solution.


Can't we just allocate the required LDS memory explicitly like we did for
the LDS-based derivative computations?

It may require shuffling around a bit how/when we calculate the required
sizes, but it doesn't seem impossible.

We want to share the same declaration in TCS main and epilog parts.

Does LLVM know that LDS declarations are pre-initialized?
Do sized LDS declarations affect SIMD-occupancy-based optimizations?
Because Mesa always declares 64kB of LDS and the real value is
determined at runtime.

I don't know about the latter, but for the former, if you declare the
LDS variable as having external linkage, LLVM should assume that it
might be initialized beforehand -- exactly like a global non-static
variable in C.

Makes sense.

I don't think LLVM is really looking at LDS size too closely for anything, since LDS is per-thread group. But it's been a while since I checked.

So just declaring a 64/32 KB memory block and then potentially not using all of it is probably fine and is probably the best short-term solution (if it works).

It's a good point though that "shuffling around the computation of the required sizes" is potentially much more involved than I was thinking at first. It looks like if we wanted to be perfectly honest with LLVM about what's going on (and I believe we should, in the long run), we'd have to teach it a notion of "per-thread LDS memory". That requires more thought.

Cheers,
Nicolai
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to