[Mesa-dev] [Bug 107369] "volatile" in OpenCL code not recognized by POLARIS10 and KABINI

2018-07-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107369

--- Comment #9 from infini...@pwned.gg ---
I guess a third possibility is that stack protectors are actually relevant for
GPUs but Clang/LLVM is not generating correct code for those in this case.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 107369] "volatile" in OpenCL code not recognized by POLARIS10 and KABINI

2018-07-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107369

--- Comment #8 from infini...@pwned.gg ---
For background, -fstack-protector-strong is part of the CFLAGS that Debian
passes when compiling all C programs. From the GCC man page:

-fstack-protector
   Emit extra code to check for buffer overflows, such as stack smashing
   attacks.  This is done by adding a guard variable to functions with
   vulnerable objects.  This includes functions that call "alloca", and
   functions with buffers larger than 8 bytes.  The guards are initialized when
   a function is entered and then checked when the function exits.  If a guard
   check fails, an error message is printed and the program exits.

-fstack-protector-strong
   Like -fstack-protector but includes additional functions to be protected ---
   those that have local array definitions, or have references to local frame
   addresses.

Of course GPUs are different, so this may not be appropriate. I'm not familiar
with how OpenCL works, so please confirm what the most appropriate solution is
for Debian, either:

- append -fno-stack-protector to the Debian flags, if the other flags are
appropriate/relevant here

- omit all the flags completely and leave clang_bc_flags alone, if this is such
a special thing that Debian's system-policy CFLAGS should be completely ignored
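
The first option could be sketched roughly as follows. This is only an
illustration, not the actual Debian patch: the helper name sanitize_bc_flags is
hypothetical, and it assumes the flags arrive as a single whitespace-separated
string, as printed by dpkg-buildflags.

```python
import shlex


def sanitize_bc_flags(cflags):
    """Drop stack-protector options and explicitly disable the feature.

    Hypothetical helper for illustration; not part of libclc or Debian.
    """
    # Remove -fstack-protector, -fstack-protector-strong, and similar variants.
    kept = [flag for flag in shlex.split(cflags)
            if not flag.startswith("-fstack-protector")]
    # Make the intent explicit for the bitcode build.
    kept.append("-fno-stack-protector")
    return " ".join(kept)


if __name__ == "__main__":
    print(sanitize_bc_flags(
        "-g -O2 -fstack-protector-strong -Wformat -Werror=format-security"))
```

The second option would simply skip the dpkg-buildflags call for the bitcode
build, so no filtering would be needed.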



[Mesa-dev] [Bug 107369] "volatile" in OpenCL code not recognized by POLARIS10 and KABINI

2018-07-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107369

--- Comment #7 from Gian-Carlo Pascutto  ---
Nice debugging.

Is this LLVM miscompiling the library, or LLVM trying to do stack protection on
AMD GCN code?



[Mesa-dev] [Bug 107369] "volatile" in OpenCL code not recognized by POLARIS10 and KABINI

2018-07-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107369

--- Comment #6 from Aaron Watry  ---
Passing "-fstack-protector-strong" into clang_bc_flags is what's breaking
libclc for Debian.



[Mesa-dev] [Bug 107369] "volatile" in OpenCL code not recognized by POLARIS10 and KABINI

2018-07-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107369

--- Comment #5 from Aaron Watry  ---
I reverted LLVM (from my 7.0svn build) to the 6.0.0 version packaged in the
padoka PPA and libclc to the package manager's version, and managed to
reproduce the failure.

Failing versions:
  libclc: 0.2.0+git20180312-1
  llvm..: 1:6.0-41~exp5~ubuntu1
  mesa..: 18.2-dev (b21b38c46cd)

Since mesa hadn't changed and LLVM didn't seem to be the issue, I upgraded
libclc to the current upstream revision (still built using llvm 6.0.0), and it
started working.

Figuring that Debian's version string included the git checkout date, I checked
out the latest code as of both Mar 8 and Mar 12, but both of those worked as
well.

So I checked out the Debian sources and rebuilt their .debs on my system, and
those failed.

They've patched configure.py in libclc to add in some Debian-specific build
flags. When I copied those changes to upstream libclc as of 20180312, it started
failing in the same way. Those same changes also break the current upstream
libclc source.

The specific lines in Debian's configure.py patch that seem to break things
are:

# add in debian build flags
proc_cflags = Popen(("dpkg-buildflags","--get","CFLAGS"), stdout=PIPE)
clang_bc_flags = " " + proc_cflags.communicate()[0].strip("\n")

Where the output of 'dpkg-buildflags --get CFLAGS' for me is:
-g -O2 -fdebug-prefix-map=/home/awatry/src/libclc=. -fstack-protector-strong
-Wformat -Werror=format-security

Something in there is breaking the bitcode compilation for libclc in Debian.
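
One way to narrow this down would be to rebuild with each flag dropped in turn.
The sketch below only generates the candidate flag sets; the helper name
leave_one_out is hypothetical, and rebuilding and testing libclc with each set
is left out.

```python
import shlex


def leave_one_out(cflags):
    """Yield (dropped_flag, remaining_flags) pairs, one per flag.

    Hypothetical helper for illustration; each remaining_flags string is a
    candidate to rebuild and test with.
    """
    flags = shlex.split(cflags)
    for i, dropped in enumerate(flags):
        rest = flags[:i] + flags[i + 1:]
        yield dropped, " ".join(rest)


if __name__ == "__main__":
    # The dpkg-buildflags output quoted above, joined onto one line.
    reported = ("-g -O2 -fdebug-prefix-map=/home/awatry/src/libclc=. "
                "-fstack-protector-strong -Wformat -Werror=format-security")
    for dropped, rest in leave_one_out(reported):
        print("without %s: %s" % (dropped, rest))
```

If the build starts working with exactly one flag removed, that flag is the
likely culprit.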



[Mesa-dev] [Bug 107369] "volatile" in OpenCL code not recognized by POLARIS10 and KABINI

2018-07-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107369

--- Comment #4 from Aaron Watry  ---
My original theory (before I tested 6.0.1 myself) was that the spectre
mitigation changes that went into 6.0.1 broke something about compilation of CL
kernels, but that doesn't seem to have been the case (I did go through the
entire commit history of 6.0.0->6.0.1 to try to identify things that looked
suspicious).

I do run Ubuntu at home, so it would be feasible for me to at least install the
libclc version they've got and give it a spin. LLVM/Mesa might be a bit more
work, but if we run out of other options, I might give it a try. I'd have to
downgrade my whole stack, but it's possible they're compiling with different
flags/features which changes behavior somehow.

I think for now it's worth giving the original reporter a bit of time to try
to dump the bitcode and get us more info that might help us reproduce/diagnose
this, since there seems to be something version- or distro-specific at play.
It might also be useful to know an exact build/tag/revision of CLBlast that is
failing, just to eliminate that. I'm assuming that the latest git master is
broken, but feel free to tell me otherwise.

And to answer the implied question: libclc supports multiple LLVM versions, and
we've supported LLVM 6.0.x in libclc for a while now (and still build against
3.9 - 7.0.0svn).



[Mesa-dev] [Bug 107369] "volatile" in OpenCL code not recognized by POLARIS10 and KABINI

2018-07-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107369

--- Comment #3 from Gian-Carlo Pascutto  ---
You can see in my original comment that there are passes with LLVM 6.0.0 and
failures for LLVM 6.0.1. So indeed, it can't be LLVM either.

I'm not sure if it's possible for libclc to be different between those, but it
looks like it must be?

I asked one of the people suffering from this bug if he could provide the
requested output. The affected distro appears to be Debian, so maybe we're
looking at the version of this: https://packages.debian.org/sid/libclc-amdgcn



[Mesa-dev] [Bug 107369] "volatile" in OpenCL code not recognized by POLARIS10 and KABINI

2018-07-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107369

--- Comment #2 from Aaron Watry  ---
My home system has an RX580 (Polaris10).

I just cloned/built CLBlast and it appears to be running ./clblast_tuner_xgemm
(mentioned as the specific failing case in
https://github.com/CNugteren/CLBlast/issues/298) successfully, so I don't know
that the issue is polaris10 specific.

In my case, I've got Mesa 18.2.0-b21b38c46cd61d9 (latest upstream as of the
last hour or so), libclc r335280 (latest revision since late June), and llvm
7.0.0-svn as of r337934 (a few minutes ago).

I did notice that the LLVM version for both failing cases is different than the
passing ones, so I went and downgraded to llvm 6.0.1... but it still works.

w/ LLVM 6.0.1 (first section of 578 tests):
  Found best result 1.43 ms: 1505.0 GFLOPS

w/ LLVM 7.0.0svn:
  Found best result 1.50 ms: 1428.2 GFLOPS

I'd agree with Jan that a dump of the LLVM bitcode would be useful.
Also, it may be interesting to try upgrading libclc or Mesa to the latest
upstream code to see if one of those has an effect.



[Mesa-dev] [Bug 107369] "volatile" in OpenCL code not recognized by POLARIS10 and KABINI

2018-07-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107369

Jan Vesely  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |99553

--- Comment #1 from Jan Vesely  ---
Mesa/clover does not touch the CLC code; that is processed by clang/LLVM.
Can you run with CLOVER_DEBUG=llvm,native CLOVER_DEBUG_FILE=dump-file and post
the files?
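
For anyone scripting this, a minimal sketch of launching a program with those
two variables set. The variable names and values are the ones given above; the
helper name clover_debug_env is hypothetical, and the command to run is taken
from the script's own arguments.

```python
import os
import subprocess
import sys


def clover_debug_env(dump_file, base_env=None):
    """Return a copy of the environment with the Clover debug variables set.

    Hypothetical helper for illustration; only the two variable names and
    the "llvm,native" value come from the request above.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CLOVER_DEBUG"] = "llvm,native"
    env["CLOVER_DEBUG_FILE"] = dump_file
    return env


if __name__ == "__main__":
    # Usage: python run_with_dump.py <opencl-program> [args...]
    if len(sys.argv) > 1:
        sys.exit(subprocess.call(sys.argv[1:],
                                 env=clover_debug_env("dump-file")))
```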


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=99553
[Bug 99553] Tracker bug for running OpenCL applications on Clover


[Mesa-dev] [Bug 107369] "volatile" in OpenCL code not recognized by POLARIS10 and KABINI

2018-07-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107369

            Bug ID: 107369
           Summary: "volatile" in OpenCL code not recognized by POLARIS10
                    and KABINI
           Product: Mesa
           Version: 18.0
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Other
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: g...@sjeng.org
        QA Contact: mesa-dev@lists.freedesktop.org

https://github.com/CNugteren/CLBlast/issues/298
https://github.com/gcp/leela-zero/issues/1612

Using "volatile" in OpenCL kernels will generate an "unsupported initializer
for address space" error for the combinations of:

Mesa 18.1.3 (POLARIS10, DRM 3.23.0, 4.16.0-1-amd64, LLVM 6.0.1) 
Mesa 17.3.9 AMD KABINI (DRM 2.49.0 / 4.9.0-6-amd64, LLVM 5.0.1)

But works for:

Mesa 18.0.4 (POLARIS11 / DRM 3.23.0 / 4.16.13-1-MANJARO, LLVM 6.0.0)
Mesa 18.1.3 (POLARIS11, DRM 3.23.0, 4.16.18-1-MANJARO, LLVM 6.0.0)

So, curiously, this problem seems to be avoided by the POLARIS11 OpenCL support
and is not specific to the Mesa version itself.

volatile has been legal in OpenCL code at least since OpenCL 1.1 and is used in
the above projects to improve register allocation (without which AMD hardware
gets a performance penalty).
