Re: [Mesa-dev] Merging experimental r600/nir code

2020-02-13 Thread Dylan Baker
meson has builtins for both of these. -Db_lto=true turns on LTO; for PGO you
would run:

meson build -Db_pgo=generate
ninja -C build

meson configure build -Db_pgo=use
ninja -C build

Quoting Marek Olšák (2020-02-12 10:46:12)
> How do you enable LTO+PGO? Is it something we could enable by default for
> release builds?
> 
> Marek
> 
> On Wed, Feb 12, 2020 at 1:56 AM Dieter Nützel  wrote:
> 
> Hello Gert,
> 
> your merge 'broke' LTO and then later on PGO compilation/linking.
> 
> I generally compile with '-Dgallium-drivers=r600,radeonsi,swrast'
> for testing radeonsi and (your) r600 work. ;-)
> 
> After your merge I get several warnings in 'addrlib' with LTO and even a
> compiler error (gcc (SUSE Linux) 9.2.1 20200128).
> 
> I had to disable 'r600' ('swrast' is needed for 'nine') to get a working
> LTO and, even better, PGO radeonsi driver.
> I have been preparing GREAT LTO+PGO (the latter is the greater) numbers over
> the last 2 months. I'll send my results later today.
> 
> Summary
> radeonsi is ~40% smaller and 16-20% faster with PGO (!!!).
> 
> Honza and the GCC people (Intel's ICC folks) do GREAT things.
> 'glmark2' numbers are better than 'vkmark'. (Hello, Marek.).
> 
> Need some sleep.
> 
> See my log, below.
> 
> Greetings and GREAT work!
> 
> -Dieter
> 
> On 09.02.2020 15:46, Gert Wollny wrote:
> > On Thursday, 23.01.2020, 20:31 +0100, Gert Wollny wrote:
> >> has anybody any objections if I merge the r600/NIR code?
> >> Without explicitly setting the debug flag it doesn't change a
> >> thing, but it would be better to continue developing in-tree.
> > Okay, if nobody objects, I'll merge it Monday evening.
> >
> > Best,
> > Gert
> 
> [1425/1433] Linking target src/gallium/targets/dri/libgallium_dri.so.
> FAILED: src/gallium/targets/dri/libgallium_dri.so
> c++  -o src/gallium/targets/dri/libgallium_dri.so
> 'src/gallium/targets/dri/8381c20@@gallium_dri@sha/target.c.o' -flto
> -fprofile-generate -Wl,--as-needed -Wl,--no-undefined -Wl,-O1 -shared
> -fPIC -Wl,--start-group -Wl,-soname,libgallium_dri.so
> src/mesa/libmesa_gallium.a src/mesa/libmesa_common.a
> src/compiler/glsl/libglsl.a src/compiler/glsl/glcpp/libglcpp.a
> src/util/libmesa_util.a src/util/format/libmesa_format.a
> src/compiler/nir/libnir.a src/compiler/libcompiler.a
> src/mesa/libmesa_sse41.a src/mesa/drivers/dri/common/libdricommon.a
> src/mesa/drivers/dri/common/libmegadriver_stub.a
> src/gallium/state_trackers/dri/libdri.a
> src/gallium/auxiliary/libgalliumvl.a src/gallium/auxiliary/libgallium.a
> src/mapi/shared-glapi/libglapi.so.0.0.0
> src/gallium/auxiliary/pipe-loader/libpipe_loader_static.a
> src/loader/libloader.a src/util/libxmlconfig.a
> src/gallium/winsys/sw/null/libws_null.a
> src/gallium/winsys/sw/wrapper/libwsw.a
> src/gallium/winsys/sw/dri/libswdri.a
> src/gallium/winsys/sw/kms-dri/libswkmsdri.a
> src/gallium/drivers/llvmpipe/libllvmpipe.a
> src/gallium/drivers/softpipe/libsoftpipe.a
> src/gallium/drivers/r600/libr600.a
> src/gallium/winsys/radeon/drm/libradeonwinsys.a
> src/gallium/drivers/radeonsi/libradeonsi.a
> src/gallium/winsys/amdgpu/drm/libamdgpuwinsys.a
> src/amd/addrlib/libaddrlib.a src/amd/common/libamd_common.a
> src/amd/llvm/libamd_common_llvm.a -Wl,--build-id=sha1 -Wl,--gc-sections
> -Wl,--version-script /opt/mesa/src/gallium/targets/dri/dri.sym
> -Wl,--dynamic-list /opt/mesa/src/gallium/targets/dri/../dri-vdpau.dyn
> /usr/lib64/libdrm.so -L/usr/local/lib -lLLVM-10git -pthread
> /usr/lib64/libexpat.so
> /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libz.so -lm
> /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libzstd.so
> -L/usr/local/lib -lLLVM-10git /usr/lib64/libunwind.so -ldl -lsensors
> -L/usr/local/lib -lLLVM-10git /usr/lib64/libdrm_radeon.so
> /usr/lib64/libelf.so -L/usr/local/lib -lLLVM-10git -L/usr/local/lib
> -lLLVM-10git -L/usr/local/lib -lLLVM-10git /usr/lib64/libdrm_amdgpu.so
> -L/usr/local/lib -lLLVM-10git -Wl,--end-group
> 
> '-Wl,-rpath,$ORIGIN/../../../mesa:$ORIGIN/../../../compiler/glsl:$ORIGIN/../../../compiler/glsl/glcpp:$ORIGIN/../../../util:$ORIGIN/../../../util/format:$ORIGIN/../../../compiler/nir:$ORIGIN/../../../compiler:$ORIGIN/../../../mesa/drivers/dri/common:$ORIGIN/../../state_trackers/dri:$ORIGIN/../../auxiliary:$ORIGIN/../../../mapi/shared-glapi:$ORIGIN/../../auxiliary/pipe-loader:$ORIGIN/../../../loader:$ORIGIN/../../winsys/sw/null:$ORIGIN/../../winsys/sw/wrapper:$ORIGIN/../../winsys/sw/dri:$ORIGIN/../../winsys/sw/kms-dri:$ORIGIN/../../drivers/llvmpipe:$ORIGIN/../../drivers/softpipe:$ORIGIN/../../drivers/r600:$ORIGIN/../../winsys/radeon/drm:$ORIGIN/../../drivers/radeonsi:$ORIGIN/../../winsys/amdgpu/drm:$ORIGIN/../../../
>

Re: [Mesa-dev] Merging experimental r600/nir code

2020-02-13 Thread Michel Dänzer
On 2020-02-12 7:46 p.m., Marek Olšák wrote:
> How do you enable LTO+PGO? Is it something we could enable by default for
> release builds?

Enabling LTO for Mesa, I get a lot of warnings about issues affecting it
specifically, making me doubt that it's currently safe in general, in
particular for the radeonsi/RADV drivers (due to issues in addrlib). It
shouldn't be enabled by default before those issues are addressed (and
ideally CI coverage in place to prevent them from creeping back in).
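
One way such CI coverage might look is a scan of the LTO build log for link-time diagnostics. The following is only a minimal sketch in Python. It assumes the relevant warnings are GCC's -Wodr and -Wlto-type-mismatch (the thread does not name the warning classes, so this is a guess) and that the build output has been captured as text. The function name and the tuple of flags are made up for illustration.

```python
import re
from typing import Iterable, List

# Assumed warning classes; the thread does not say which ones addrlib triggers.
LTO_WARNING_FLAGS = ("-Wodr", "-Wlto-type-mismatch")

# GCC prints the enabling option in brackets at the end of a warning line.
_WARNING_RE = re.compile(r"warning: .*\[(-W[\w-]+)\]\s*$")


def lto_warnings(log_lines: Iterable[str]) -> List[str]:
    """Return the log lines that carry one of the assumed LTO warning flags."""
    hits = []
    for line in log_lines:
        match = _WARNING_RE.search(line)
        if match and match.group(1) in LTO_WARNING_FLAGS:
            hits.append(line.rstrip("\n"))
    return hits
```

A CI job could fail whenever lto_warnings() returns a non-empty list for the captured ninja output, which would keep already-fixed issues from creeping back in.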


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Merging experimental r600/nir code

2020-02-13 Thread Eero Tamminen

Hi,

On 13.2.2020 10.38, Timur Kristóf wrote:
> I think the question about PGO is this: are the profiles of the users'
> applications gonna be the same as the profile that is collected from
> the benchmarks?
>
> Eg. if the test benchmark uses different draw calls or triggers
> different shader compiler code paths than your favourite game, in
> theory PGO could harm the performance of your game.
>
> Also, how do we prevent it from making bad decisions based on the hw
> that the profile was made on?
>
> For example, if you collect the profiling data from a machine that has
> a Polaris 10 GPU, then the profile will show that chip_class is
> extremely likely to be GFX8 and thus the PGO build will be optimized
> for that case. If I then run the same build on my Navi 10, the PGO
> build might actually be slower, because the driver needs to take a
> different code path than what the PGO build was optimized for.
>
> What do you guys think about this?


How much HW-specific code can affect things depends on whether it is
executed constantly or only once. If the former, it may be useful to
(try to) design the driver so that it gets executed only once.

The most CPU-intensive part is shader compilation (with Intel, the linking
stage more than what is done before it), and the heavy part is AFAIK to a
large extent HW independent. In benchmarks, shader compilation is almost
always done at startup; in games it typically also happens afterwards.

As to how much PGO can make things worse, I think that depends on how
independent the non-executed part of the code is. If it's not mixed
with code that did get executed, I don't think there will be any visible
impact. But if it's badly mixed, hot/cold function identification will
group things wrongly.



- Eero



Re: [Mesa-dev] Merging experimental r600/nir code

2020-02-13 Thread Timur Kristóf
I think the question about PGO is this: are the profiles of the users'
applications gonna be the same as the profile that is collected from
the benchmarks?

Eg. if the test benchmark uses different draw calls or triggers
different shader compiler code paths than your favourite game, in
theory PGO could harm the performance of your game.

Also, how do we prevent it from making bad decisions based on the hw
that the profile was made on?

For example, if you collect the profiling data from a machine that has
a Polaris 10 GPU, then the profile will show that chip_class is
extremely likely to be GFX8 and thus the PGO build will be optimized
for that case. If I then run the same build on my Navi 10, the PGO
build might actually be slower, because the driver needs to take a
different code path than what the PGO build was optimized for.

What do you guys think about this?

Best regards,
Timur

On Thu, 2020-02-13 at 02:40 -0500, Marek Olšák wrote:
> Can we automate this?
> 
> Let's say we implement noop ioctls for radeonsi and iris, and then we
> run the drivers to collect pgo data on any hw.
> 
> Can meson execute this build sequence:
> build with pgo=generate
> run tests
> clean
> build with pgo=use
> 
> automated as buildtype=release-pgo.
> 
> Marek
> 
> On Wed., Feb. 12, 2020, 23:37 Dieter Nützel, 
> wrote:
> > Hello Marek,
> > 
> > I hoped you would ask this...
> > ...but first sorry for the delay of my announced numbers.
> > Our family is/was sick, my wife more than me and our children are
> > fine, 
> > again.
> > So be lenient with me somewhat.
> > 
> > Am 12.02.2020 19:46, schrieb Marek Olšák:
> > > How do you enable LTO+PGO? Is it something we could enable by
> > default
> > > for release builds?
> > > 
> > > Marek
> > 
> > I think we can achieve this.
> > 
> > I'm running with LTO+PGO 'release' since late December (around 
> > Christmas).
> > My KDE Plasma5 (OpenGL 3.0) system/desktop was never
> > agiler/fluider 
> > since then.
> > Even the numbers (glmark2) show it. The 'glmark2' numbers are the
> > best 
> > I've ever seen on this system.
> > LTO offer only some small space reduction and hardly any speedup.
> > But LTO+PGO is GREAT.
> > 
> > First I compile with '-Db_lto=true -Db_pgo=generate'.
> > 
> > mkdir build
> > cd build
> > meson ../ --strip --buildtype release -Ddri-drivers=
> > -Dplatforms=drm,x11 
> > -Dgallium-drivers=r600,radeonsi,swrast -Dvulkan-drivers=amd 
> > -Dgallium-nine=true -Dgallium-opencl=standalone -Dglvnd=true 
> > -Dgallium-va=true -Dgallium-xvmc=false -Dgallium-omx=disabled 
> > -Dgallium-xa=false -Db_lto=true -Db_pgo=generate
> > 
> > After that my 'build' dir looks like this:
> > 
> > drwxr-xr-x  8 dieter users4096 13. Feb 04:34 .
> > drwxr-xr-x 14 dieter users4096 13. Feb 04:33 ..
> > drwxr-xr-x  2 dieter users4096 13. Feb 04:34 bin
> > -rw-r--r--  1 dieter users 4369873 13. Feb 04:34 build.ninja
> > -rw-r--r--  1 dieter users 4236719 13. Feb 04:34
> > compile_commands.json
> > drwxr-xr-x  2 dieter users4096 13. Feb 04:34 include
> > drwxr-xr-x  2 dieter users4096 13. Feb 04:34 meson-info
> > drwxr-xr-x  2 dieter users4096 13. Feb 04:33 meson-logs
> > drwxr-xr-x  2 dieter users4096 13. Feb 04:34 meson-private
> > drwxr-xr-x 14 dieter users4096 13. Feb 04:34 src
> > 
> > time nice +19 ninja
> > 
> > Lasts ~15 minutes on my aging/'slow' Intel Xeon X3470 Nehalem,
> > 4c/8t, 
> > 2.93 GHz, 24 GB, Polaris 20.
> > Without LTO+PGO it is ~4-5 minutes. (AMD anyone?)
> > 
> > Then I remove all files/dirs except 'src'.
> > 
> > Next 'installing' the new built files under '/usr/local/' (mostly 
> > symlinked to /usr/lib64/).
> > 
> > Now run as much OpenGL/Vulkan progs as I can.
> > Normaly starting with glmark2 and vkmark.
> > 
> > Here comes my (whole) list:
> > Knights
> > Wireshark
> > K3b
> > Skanlite
> > Kdenlive
> > GIMP
> > Krita
> > FreeCAD
> > Blender 2.81x
> > digikam
> > K4DirStat
> > Discover
> > YaST
> > Do some 'movements'/work in/with every prog.
> > +
> > some LibreOffice work (OpenGL enabled)
> > one or two OpenGL games
> > and Vulkan games
> > +
> > run some WebGL stuff in my browsers (Konqi/FF).
> > 
> > After that I have the needed '*.gcda' files in 'src'.
> > 
> > Now second rebuild in 'src'.
> > Due to the deleted files/dirs I can do a second 'meson' config run
> > in my 
> > current 'build' dir.
> > 
> > meson ../ --strip --buildtype release -Ddri-drivers=
> > -Dplatforms=drm,x11 
> > -Dgallium-drivers=r600,radeonsi,swrast -Dvulkan-drivers=amd 
> > -Dgallium-nine=true -Dgallium-opencl=standalone -Dglvnd=true 
> > -Dgallium-va=true -Dgallium-xvmc=false -Dgallium-omx=disabled 
> > -Dgallium-xa=false -Db_lto=true -Db_pgo=use
> > 
> > After around 5-6 minutes (!!!) I can install the LTO+PGO 'release'
> > build 
> > driver files and enjoy next level of OpenGL speed.
> > Vulkan do NOT show such GREAT improvements.
> > 
> > Only '-Db_lto=true -Db_pgo=generate' need ~3 times compilation and 
> > mostly linking time.
> > 
> > Below 

Re: [Mesa-dev] Merging experimental r600/nir code

2020-02-12 Thread Marek Olšák
Can we automate this?

Let's say we implement noop ioctls for radeonsi and iris, and then we run
the drivers to collect pgo data on any hw.

Can meson execute this build sequence:
build with pgo=generate
run tests
clean
build with pgo=use

automated as buildtype=release-pgo.

Marek
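
The sequence asked about above can be sketched as a small Python driver around the meson and ninja commands given elsewhere in this thread. This is only a sketch under stated assumptions: as far as I can tell meson has no 'release-pgo' buildtype, the function names are made up, and the workload command is a placeholder for whatever exercises the instrumented driver. No explicit clean step is included, on the assumption that switching b_pgo with 'meson configure' changes the compiler flags so that ninja rebuilds the affected targets.

```python
import subprocess
from typing import Callable, List, Sequence


def pgo_build_steps(build_dir: str, workload: Sequence[str]) -> List[List[str]]:
    """Build the command list for a generate / run / use PGO cycle."""
    return [
        # Pass 1: configure and build an instrumented tree.
        ["meson", "setup", build_dir, "-Db_lto=true", "-Db_pgo=generate"],
        ["ninja", "-C", build_dir],
        # Run the (placeholder) workload so the instrumented code writes profiles.
        list(workload),
        # Pass 2: switch the same tree to use the collected profiles and rebuild.
        ["meson", "configure", build_dir, "-Db_pgo=use"],
        ["ninja", "-C", build_dir],
    ]


def run_steps(steps: Sequence[Sequence[str]],
              runner: Callable[..., object] = subprocess.run) -> None:
    """Run each command in order, stopping at the first failure."""
    for argv in steps:
        runner(list(argv), check=True)
```

Calling run_steps(pgo_build_steps("build", ["glmark2"])) would run the whole cycle, provided meson and ninja are installed and the workload really loads the freshly built driver rather than the system one.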

On Wed., Feb. 12, 2020, 23:37 Dieter Nützel,  wrote:

> Hello Marek,
>
> I hoped you would ask this...
> ...but first sorry for the delay of my announced numbers.
> Our family is/was sick, my wife more than me and our children are fine,
> again.
> So be lenient with me somewhat.
>
> Am 12.02.2020 19:46, schrieb Marek Olšák:
> > How do you enable LTO+PGO? Is it something we could enable by default
> > for release builds?
> >
> > Marek
>
> I think we can achieve this.
>
> I'm running with LTO+PGO 'release' since late December (around
> Christmas).
> My KDE Plasma5 (OpenGL 3.0) system/desktop was never agiler/fluider
> since then.
> Even the numbers (glmark2) show it. The 'glmark2' numbers are the best
> I've ever seen on this system.
> LTO offer only some small space reduction and hardly any speedup.
> But LTO+PGO is GREAT.
>
> First I compile with '-Db_lto=true -Db_pgo=generate'.
>
> mkdir build
> cd build
> meson ../ --strip --buildtype release -Ddri-drivers= -Dplatforms=drm,x11
> -Dgallium-drivers=r600,radeonsi,swrast -Dvulkan-drivers=amd
> -Dgallium-nine=true -Dgallium-opencl=standalone -Dglvnd=true
> -Dgallium-va=true -Dgallium-xvmc=false -Dgallium-omx=disabled
> -Dgallium-xa=false -Db_lto=true -Db_pgo=generate
>
> After that my 'build' dir looks like this:
>
> drwxr-xr-x  8 dieter users4096 13. Feb 04:34 .
> drwxr-xr-x 14 dieter users4096 13. Feb 04:33 ..
> drwxr-xr-x  2 dieter users4096 13. Feb 04:34 bin
> -rw-r--r--  1 dieter users 4369873 13. Feb 04:34 build.ninja
> -rw-r--r--  1 dieter users 4236719 13. Feb 04:34 compile_commands.json
> drwxr-xr-x  2 dieter users4096 13. Feb 04:34 include
> drwxr-xr-x  2 dieter users4096 13. Feb 04:34 meson-info
> drwxr-xr-x  2 dieter users4096 13. Feb 04:33 meson-logs
> drwxr-xr-x  2 dieter users4096 13. Feb 04:34 meson-private
> drwxr-xr-x 14 dieter users4096 13. Feb 04:34 src
>
> time nice +19 ninja
>
> Lasts ~15 minutes on my aging/'slow' Intel Xeon X3470 Nehalem, 4c/8t,
> 2.93 GHz, 24 GB, Polaris 20.
> Without LTO+PGO it is ~4-5 minutes. (AMD anyone?)
>
> Then I remove all files/dirs except 'src'.
>
> Next 'installing' the new built files under '/usr/local/' (mostly
> symlinked to /usr/lib64/).
>
> Now run as much OpenGL/Vulkan progs as I can.
> Normaly starting with glmark2 and vkmark.
>
> Here comes my (whole) list:
> Knights
> Wireshark
> K3b
> Skanlite
> Kdenlive
> GIMP
> Krita
> FreeCAD
> Blender 2.81x
> digikam
> K4DirStat
> Discover
> YaST
> Do some 'movements'/work in/with every prog.
> +
> some LibreOffice work (OpenGL enabled)
> one or two OpenGL games
> and Vulkan games
> +
> run some WebGL stuff in my browsers (Konqi/FF).
>
> After that I have the needed '*.gcda' files in 'src'.
>
> Now second rebuild in 'src'.
> Due to the deleted files/dirs I can do a second 'meson' config run in my
> current 'build' dir.
>
> meson ../ --strip --buildtype release -Ddri-drivers= -Dplatforms=drm,x11
> -Dgallium-drivers=r600,radeonsi,swrast -Dvulkan-drivers=amd
> -Dgallium-nine=true -Dgallium-opencl=standalone -Dglvnd=true
> -Dgallium-va=true -Dgallium-xvmc=false -Dgallium-omx=disabled
> -Dgallium-xa=false -Db_lto=true -Db_pgo=use
>
> After around 5-6 minutes (!!!) I can install the LTO+PGO 'release' build
> driver files and enjoy next level of OpenGL speed.
> Vulkan do NOT show such GREAT improvements.
>
> Only '-Db_lto=true -Db_pgo=generate' need ~3 times compilation and
> mostly linking time.
>
> Below are some memory and speed numbers.
> Should I send an additional post with a better title to the list?
> Hope this helps ;-)))
>
> -Dieter
>
>
> ***
>
> Mesa git 21bc16a723 (somewhat older)
>
> normal
>
> -rwxr-xr-x   4 root root 9525520 13. Jan 20:00
> libvdpau_radeonsi.so.1.0.0
> -rwxr-xr-x   4 root root 9525520 13. Jan 20:00 libvdpau_r600.so.1.0.0
>
> -rwxr-xr-x   8 root root 18444192 13. Jan 20:00 swrast_dri.so
> -rwxr-xr-x   8 root root 18444192 13. Jan 20:00 radeonsi_dri.so
> -rwxr-xr-x   8 root root 18444192 13. Jan 20:00 r600_dri.so
> -rwxr-xr-x   8 root root 18444192 13. Jan 20:00 kms_swrast_dri.so
> -rwxr-xr-x   4 root root  9505072 13. Jan 20:00 radeonsi_drv_video.so
> -rwxr-xr-x   4 root root  9505072 13. Jan 20:00 r600_drv_video.so
>
>
> -Db_lto=true
>
> -rwxr-xr-x 2 root root 8078368 13. Jan 21:24 libvdpau_r600.so.1.0.0
> -rwxr-xr-x 2 root root 8078368 13. Jan 21:24 libvdpau_radeonsi.so.1.0.0
>
> -rwxr-xr-x 4 root root 16878368 13. Jan 21:24 kms_swrast_dri.so
> -rwxr-xr-x 4 root root 16878368 13. Jan 21:24 r600_dri.so
> -rwxr-xr-x 2 root root  8074312 13. Jan 21:24 r600_drv_video.so
> -rwxr-xr-x 4 root root 16878368 13. Jan 21:24 radeonsi_dri.so

Re: [Mesa-dev] Merging experimental r600/nir code

2020-02-12 Thread Dieter Nützel

Hello Marek,

I hoped you would ask this...
...but first sorry for the delay of my announced numbers.
Our family is/was sick, my wife more than me, and our children are fine
again.

So please be a bit lenient with me.

On 12.02.2020 19:46, Marek Olšák wrote:
> How do you enable LTO+PGO? Is it something we could enable by default
> for release builds?
>
> Marek


I think we can achieve this.

I have been running an LTO+PGO 'release' build since late December (around
Christmas).
My KDE Plasma5 (OpenGL 3.0) desktop has never been more agile or fluid.
Even the numbers show it: the 'glmark2' results are the best
I've ever seen on this system.

LTO alone offers only a small size reduction and hardly any speedup.
But LTO+PGO is GREAT.

First I compile with '-Db_lto=true -Db_pgo=generate'.

mkdir build
cd build
meson ../ --strip --buildtype release -Ddri-drivers= -Dplatforms=drm,x11 \
  -Dgallium-drivers=r600,radeonsi,swrast -Dvulkan-drivers=amd \
  -Dgallium-nine=true -Dgallium-opencl=standalone -Dglvnd=true \
  -Dgallium-va=true -Dgallium-xvmc=false -Dgallium-omx=disabled \
  -Dgallium-xa=false -Db_lto=true -Db_pgo=generate


After that my 'build' dir looks like this:

drwxr-xr-x  8 dieter users    4096 13. Feb 04:34 .
drwxr-xr-x 14 dieter users    4096 13. Feb 04:33 ..
drwxr-xr-x  2 dieter users    4096 13. Feb 04:34 bin
-rw-r--r--  1 dieter users 4369873 13. Feb 04:34 build.ninja
-rw-r--r--  1 dieter users 4236719 13. Feb 04:34 compile_commands.json
drwxr-xr-x  2 dieter users    4096 13. Feb 04:34 include
drwxr-xr-x  2 dieter users    4096 13. Feb 04:34 meson-info
drwxr-xr-x  2 dieter users    4096 13. Feb 04:33 meson-logs
drwxr-xr-x  2 dieter users    4096 13. Feb 04:34 meson-private
drwxr-xr-x 14 dieter users    4096 13. Feb 04:34 src

time nice +19 ninja

This takes ~15 minutes on my aging/'slow' Intel Xeon X3470 (Nehalem, 4c/8t,
2.93 GHz, 24 GB) with a Polaris 20 GPU.

Without LTO+PGO it takes ~4-5 minutes. (AMD anyone?)

Then I remove all files/dirs in 'build' except 'src'.

Next I 'install' the newly built files under '/usr/local/' (mostly
symlinked to /usr/lib64/).

Now I run as many OpenGL/Vulkan programs as I can,
normally starting with glmark2 and vkmark.

Here comes my (whole) list:
Knights
Wireshark
K3b
Skanlite
Kdenlive
GIMP
Krita
FreeCAD
Blender 2.81x
digikam
K4DirStat
Discover
YaST
I do some 'movements'/work in every program, plus:
some LibreOffice work (OpenGL enabled)
one or two OpenGL games
and Vulkan games
some WebGL content in my browsers (Konqueror/Firefox).

After that I have the needed '*.gcda' files in 'src'.
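
Before the second configure run it may be worth checking that the workloads really produced profile data. A minimal Python sketch, assuming GCC's usual behaviour of writing one '.gcda' file per instrumented object file; the helper name and the 'build/src' path mentioned below are only illustrative:

```python
from pathlib import Path
from typing import List


def find_profile_data(tree: str) -> List[Path]:
    """Return all GCC profile data files (*.gcda) below the given directory."""
    return sorted(Path(tree).rglob("*.gcda"))
```

If find_profile_data("build/src") comes back empty after a profiling session, the instrumented driver was probably never loaded, and a '-Db_pgo=use' build would have nothing to work with.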

Now comes the second build, reusing 'src'.
Because of the deleted files/dirs I can do a second 'meson' configuration run
in my current 'build' dir.


meson ../ --strip --buildtype release -Ddri-drivers= -Dplatforms=drm,x11 \
  -Dgallium-drivers=r600,radeonsi,swrast -Dvulkan-drivers=amd \
  -Dgallium-nine=true -Dgallium-opencl=standalone -Dglvnd=true \
  -Dgallium-va=true -Dgallium-xvmc=false -Dgallium-omx=disabled \
  -Dgallium-xa=false -Db_lto=true -Db_pgo=use


After around 5-6 minutes (!!!) I can install the LTO+PGO 'release' build
driver files and enjoy the next level of OpenGL speed.

Vulkan does NOT show such GREAT improvements.

Only the '-Db_lto=true -Db_pgo=generate' build needs ~3 times the
compilation (mostly linking) time.


Below are some memory and speed numbers.
Should I send an additional post with a better title to the list?
Hope this helps ;-)))

-Dieter

***

Mesa git 21bc16a723 (somewhat older)

normal

-rwxr-xr-x   4 root root 9525520 13. Jan 20:00 libvdpau_radeonsi.so.1.0.0
-rwxr-xr-x   4 root root 9525520 13. Jan 20:00 libvdpau_r600.so.1.0.0

-rwxr-xr-x   8 root root 18444192 13. Jan 20:00 swrast_dri.so
-rwxr-xr-x   8 root root 18444192 13. Jan 20:00 radeonsi_dri.so
-rwxr-xr-x   8 root root 18444192 13. Jan 20:00 r600_dri.so
-rwxr-xr-x   8 root root 18444192 13. Jan 20:00 kms_swrast_dri.so
-rwxr-xr-x   4 root root  9505072 13. Jan 20:00 radeonsi_drv_video.so
-rwxr-xr-x   4 root root  9505072 13. Jan 20:00 r600_drv_video.so


-Db_lto=true

-rwxr-xr-x 2 root root 8078368 13. Jan 21:24 libvdpau_r600.so.1.0.0
-rwxr-xr-x 2 root root 8078368 13. Jan 21:24 libvdpau_radeonsi.so.1.0.0

-rwxr-xr-x 4 root root 16878368 13. Jan 21:24 kms_swrast_dri.so
-rwxr-xr-x 4 root root 16878368 13. Jan 21:24 r600_dri.so
-rwxr-xr-x 2 root root  8074312 13. Jan 21:24 r600_drv_video.so
-rwxr-xr-x 4 root root 16878368 13. Jan 21:24 radeonsi_dri.so
-rwxr-xr-x 2 root root  8074312 13. Jan 21:24 radeonsi_drv_video.so
-rwxr-xr-x 4 root root 16878368 13. Jan 21:24 swrast_dri.so


-Db_lto=true -Db_pgo=use

-rwxr-xr-x   4 root root 5600328 14. Jan 00:11 libvdpau_radeonsi.so.1.0.0
-rwxr-xr-x   4 root root 5600328 14. Jan 00:11 libvdpau_r600.so.1.0.0

-rwxr-xr-x   8 root root 11172768 14. Jan 00:11 swrast_dri.so
-rwxr-xr-x   8 root root 11172768 14. Jan 00:11 radeonsi_dri.so
-rwxr-xr-x   8 root root 11172768 14. Jan 00:11 r600_dri.so
-rwxr-xr-x   8 root root 11172768 14. Jan 00:11 kms_swrast_dri.so
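
The size reduction can be worked out directly from the listings above. A small Python sketch that uses only the byte sizes shown for radeonsi_dri.so; the function and constant names are illustrative:

```python
def percent_smaller(baseline: int, optimized: int) -> float:
    """Size reduction of `optimized` relative to `baseline`, in percent."""
    return 100.0 * (baseline - optimized) / baseline


# radeonsi_dri.so sizes in bytes, copied from the three listings.
NORMAL = 18444192
LTO = 16878368
LTO_PGO = 11172768

print(f"LTO only: {percent_smaller(NORMAL, LTO):.1f}% smaller")
print(f"LTO+PGO:  {percent_smaller(NORMAL, LTO_PGO):.1f}% smaller")
```

This gives roughly 8.5% for LTO alone and roughly 39.4% for LTO+PGO.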

Re: [Mesa-dev] Merging experimental r600/nir code

2020-02-12 Thread Ian Romanick
On 2/12/20 10:46 AM, Marek Olšák wrote:
> How do you enable LTO+PGO? Is it something we could enable by default
> for release builds?

I'm assuming PGO is "profile guided optimization."  That requires a
cycle of build, run workloads to collect data, rebuild with collected
data.  It would be awesome if there were a reasonable way to do that in
distro builds, but I think it will continue to be a dream. :(

> Marek
> 
> On Wed, Feb 12, 2020 at 1:56 AM Dieter Nützel wrote:
> 
> Hello Gert,
> 
> your merge 'broke' LTO and then later on PGO compilation/linking.
> 
> I generally compile with '-Dgallium-drivers=r600,radeonsi,swrast'
> for testing radeonsi and (your) r600 work. ;-)
> 
> After your merge I get several warnings in 'addrlib' with LTO and even a
> compiler error (gcc (SUSE Linux) 9.2.1 20200128).
> 
> I had to disable 'r600' ('swrast' is needed for 'nine') to get a working
> LTO and, even better, PGO radeonsi driver.
> I have been preparing GREAT LTO+PGO (the latter is the greater) numbers over
> the last 2 months. I'll send my results later today.
> 
> Summary
> radeonsi is ~40% smaller and 16-20% faster with PGO (!!!).
> 
> Honza and the GCC people (Intel's ICC folks) do GREAT things.
> 'glmark2' numbers are better than 'vkmark'. (Hello, Marek.).
> 
> Need some sleep.
> 
> See my log, below.
> 
> Greetings and GREAT work!
> 
> -Dieter
> 
> On 09.02.2020 15:46, Gert Wollny wrote:
> > On Thursday, 23.01.2020, 20:31 +0100, Gert Wollny wrote:
> >> has anybody any objections if I merge the r600/NIR code?
> >> Without explicitly setting the debug flag it doesn't change a
> >> thing, but it would be better to continue developing in-tree.
> > Okay, if nobody objects, I'll merge it Monday evening.
> >
> > Best,
> > Gert
> 
> [1425/1433] Linking target src/gallium/targets/dri/libgallium_dri.so.
> FAILED: src/gallium/targets/dri/libgallium_dri.so
> c++  -o src/gallium/targets/dri/libgallium_dri.so
> 'src/gallium/targets/dri/8381c20@@gallium_dri@sha/target.c.o' -flto
> -fprofile-generate -Wl,--as-needed -Wl,--no-undefined -Wl,-O1 -shared
> -fPIC -Wl,--start-group -Wl,-soname,libgallium_dri.so
> src/mesa/libmesa_gallium.a src/mesa/libmesa_common.a
> src/compiler/glsl/libglsl.a src/compiler/glsl/glcpp/libglcpp.a
> src/util/libmesa_util.a src/util/format/libmesa_format.a
> src/compiler/nir/libnir.a src/compiler/libcompiler.a
> src/mesa/libmesa_sse41.a src/mesa/drivers/dri/common/libdricommon.a
> src/mesa/drivers/dri/common/libmegadriver_stub.a
> src/gallium/state_trackers/dri/libdri.a
> src/gallium/auxiliary/libgalliumvl.a src/gallium/auxiliary/libgallium.a
> src/mapi/shared-glapi/libglapi.so.0.0.0
> src/gallium/auxiliary/pipe-loader/libpipe_loader_static.a
> src/loader/libloader.a src/util/libxmlconfig.a
> src/gallium/winsys/sw/null/libws_null.a
> src/gallium/winsys/sw/wrapper/libwsw.a
> src/gallium/winsys/sw/dri/libswdri.a
> src/gallium/winsys/sw/kms-dri/libswkmsdri.a
> src/gallium/drivers/llvmpipe/libllvmpipe.a
> src/gallium/drivers/softpipe/libsoftpipe.a
> src/gallium/drivers/r600/libr600.a
> src/gallium/winsys/radeon/drm/libradeonwinsys.a
> src/gallium/drivers/radeonsi/libradeonsi.a
> src/gallium/winsys/amdgpu/drm/libamdgpuwinsys.a
> src/amd/addrlib/libaddrlib.a src/amd/common/libamd_common.a
> src/amd/llvm/libamd_common_llvm.a -Wl,--build-id=sha1 -Wl,--gc-sections
> -Wl,--version-script /opt/mesa/src/gallium/targets/dri/dri.sym
> -Wl,--dynamic-list /opt/mesa/src/gallium/targets/dri/../dri-vdpau.dyn
> /usr/lib64/libdrm.so -L/usr/local/lib -lLLVM-10git -pthread
> /usr/lib64/libexpat.so
> /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libz.so -lm
> /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libzstd.so
> -L/usr/local/lib -lLLVM-10git /usr/lib64/libunwind.so -ldl -lsensors
> -L/usr/local/lib -lLLVM-10git /usr/lib64/libdrm_radeon.so
> /usr/lib64/libelf.so -L/usr/local/lib -lLLVM-10git -L/usr/local/lib
> -lLLVM-10git -L/usr/local/lib -lLLVM-10git /usr/lib64/libdrm_amdgpu.so
> -L/usr/local/lib -lLLVM-10git -Wl,--end-group
> 
> 

Re: [Mesa-dev] Merging experimental r600/nir code

2020-02-12 Thread Marek Olšák
How do you enable LTO+PGO? Is it something we could enable by default for
release builds?

Marek

On Wed, Feb 12, 2020 at 1:56 AM Dieter Nützel  wrote:

> Hello Gert,
>
> your merge 'broke' LTO and then later on PGO compilation/linking.
>
> I generally compile with '-Dgallium-drivers=r600,radeonsi,swrast'
> for testing radeonsi and (your) r600 work. ;-)
>
> After your merge I get several warnings in 'addrlib' with LTO and even a
> compiler error (gcc (SUSE Linux) 9.2.1 20200128).
>
> I had to disable 'r600' ('swrast' is needed for 'nine') to get a working
> LTO and, even better, PGO radeonsi driver.
> I have been preparing GREAT LTO+PGO (the latter is the greater) numbers over
> the last 2 months. I'll send my results later today.
>
> Summary
> radeonsi is ~40% smaller and 16-20% faster with PGO (!!!).
>
> Honza and the GCC people (Intel's ICC folks) do GREAT things.
> 'glmark2' numbers are better than 'vkmark'. (Hello, Marek.).
>
> Need some sleep.
>
> See my log, below.
>
> Greetings and GREAT work!
>
> -Dieter
>
> On 09.02.2020 15:46, Gert Wollny wrote:
> > On Thursday, 23.01.2020, 20:31 +0100, Gert Wollny wrote:
> >> has anybody any objections if I merge the r600/NIR code?
> >> Without explicitly setting the debug flag it doesn't change a
> >> thing, but it would be better to continue developing in-tree.
> > Okay, if nobody objects, I'll merge it Monday evening.
> >
> > Best,
> > Gert
>
> [1425/1433] Linking target src/gallium/targets/dri/libgallium_dri.so.
> FAILED: src/gallium/targets/dri/libgallium_dri.so
> c++  -o src/gallium/targets/dri/libgallium_dri.so
> 'src/gallium/targets/dri/8381c20@@gallium_dri@sha/target.c.o' -flto
> -fprofile-generate -Wl,--as-needed -Wl,--no-undefined -Wl,-O1 -shared
> -fPIC -Wl,--start-group -Wl,-soname,libgallium_dri.so
> src/mesa/libmesa_gallium.a src/mesa/libmesa_common.a
> src/compiler/glsl/libglsl.a src/compiler/glsl/glcpp/libglcpp.a
> src/util/libmesa_util.a src/util/format/libmesa_format.a
> src/compiler/nir/libnir.a src/compiler/libcompiler.a
> src/mesa/libmesa_sse41.a src/mesa/drivers/dri/common/libdricommon.a
> src/mesa/drivers/dri/common/libmegadriver_stub.a
> src/gallium/state_trackers/dri/libdri.a
> src/gallium/auxiliary/libgalliumvl.a src/gallium/auxiliary/libgallium.a
> src/mapi/shared-glapi/libglapi.so.0.0.0
> src/gallium/auxiliary/pipe-loader/libpipe_loader_static.a
> src/loader/libloader.a src/util/libxmlconfig.a
> src/gallium/winsys/sw/null/libws_null.a
> src/gallium/winsys/sw/wrapper/libwsw.a
> src/gallium/winsys/sw/dri/libswdri.a
> src/gallium/winsys/sw/kms-dri/libswkmsdri.a
> src/gallium/drivers/llvmpipe/libllvmpipe.a
> src/gallium/drivers/softpipe/libsoftpipe.a
> src/gallium/drivers/r600/libr600.a
> src/gallium/winsys/radeon/drm/libradeonwinsys.a
> src/gallium/drivers/radeonsi/libradeonsi.a
> src/gallium/winsys/amdgpu/drm/libamdgpuwinsys.a
> src/amd/addrlib/libaddrlib.a src/amd/common/libamd_common.a
> src/amd/llvm/libamd_common_llvm.a -Wl,--build-id=sha1 -Wl,--gc-sections
> -Wl,--version-script /opt/mesa/src/gallium/targets/dri/dri.sym
> -Wl,--dynamic-list /opt/mesa/src/gallium/targets/dri/../dri-vdpau.dyn
> /usr/lib64/libdrm.so -L/usr/local/lib -lLLVM-10git -pthread
> /usr/lib64/libexpat.so
> /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libz.so -lm
> /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libzstd.so
> -L/usr/local/lib -lLLVM-10git /usr/lib64/libunwind.so -ldl -lsensors
> -L/usr/local/lib -lLLVM-10git /usr/lib64/libdrm_radeon.so
> /usr/lib64/libelf.so -L/usr/local/lib -lLLVM-10git -L/usr/local/lib
> -lLLVM-10git -L/usr/local/lib -lLLVM-10git /usr/lib64/libdrm_amdgpu.so
> -L/usr/local/lib -lLLVM-10git -Wl,--end-group
> '-Wl,-rpath,$ORIGIN/../../../mesa:$ORIGIN/../../../compiler/glsl:$ORIGIN/../../../compiler/glsl/glcpp:$ORIGIN/../../../util:$ORIGIN/../../../util/format:$ORIGIN/../../../compiler/nir:$ORIGIN/../../../compiler:$ORIGIN/../../../mesa/drivers/dri/common:$ORIGIN/../../state_trackers/dri:$ORIGIN/../../auxiliary:$ORIGIN/../../../mapi/shared-glapi:$ORIGIN/../../auxiliary/pipe-loader:$ORIGIN/../../../loader:$ORIGIN/../../winsys/sw/null:$ORIGIN/../../winsys/sw/wrapper:$ORIGIN/../../winsys/sw/dri:$ORIGIN/../../winsys/sw/kms-dri:$ORIGIN/../../drivers/llvmpipe:$ORIGIN/../../drivers/softpipe:$ORIGIN/../../drivers/r600:$ORIGIN/../../winsys/radeon/drm:$ORIGIN/../../drivers/radeonsi:$ORIGIN/../../winsys/amdgpu/drm:$ORIGIN/../../../amd/addrlib:$ORIGIN/../../../amd/common:$ORIGIN/../../../amd/llvm'
>
> -Wl,-rpath-link,/opt/mesa/build/src/mesa
> -Wl,-rpath-link,/opt/mesa/build/src/compiler/glsl
> -Wl,-rpath-link,/opt/mesa/build/src/compiler/glsl/glcpp
> -Wl,-rpath-link,/opt/mesa/build/src/util
> -Wl,-rpath-link,/opt/mesa/build/src/util/format
> -Wl,-rpath-link,/opt/mesa/build/src/compiler/nir
> -Wl,-rpath-link,/opt/mesa/build/src/compiler
> -Wl,-rpath-link,/opt/mesa/build/src/mesa/drivers/dri/common
> -Wl,-rpath-link,/opt/mesa/build/src/gallium/state_trackers/dri
> -Wl,-rpath-link,/opt/mesa/build/src/gallium/auxiliary
> 

Re: [Mesa-dev] Merging experimental r600/nir code

2020-02-12 Thread Gert Wollny
Hello Dieter, 

On Wednesday, 12.02.2020, at 10:53 +0100, Gert Wollny wrote:
> 
> When I enable radeonsi, linking libgallium_dri.so seems to take
> forever; so far it has not finished, and it has been running for
> quite some time (>15 min). I'll get back to you when I have a
> result.

In the end, the current Mesa git tip of tree (TOT) builds fine for me with
r600,radeonsi,swrast enabled.

My compile flags are 

"-Wall, -Wextra, -Wdeprecated-declarations, -O2, -g, -funroll-loops,
-ftree-vectorize, -pthread, -march=native, -mtune=native, -mno-xop"
with "native" being an AMD FX-6300.
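
A minimal sketch of one way to apply the flag set listed above, assuming the flags are passed through the CFLAGS/CXXFLAGS environment variables that meson reads when a build directory is first configured. The build directory name `build`, the `DRY_RUN` switch and the helper name `configure_with_flags` are assumptions made for illustration, not part of the original setup:

```shell
#!/bin/sh
# Flags as listed in the message above, joined with spaces.
FLAGS="-Wall -Wextra -Wdeprecated-declarations -O2 -g -funroll-loops -ftree-vectorize -pthread -march=native -mtune=native -mno-xop"

BUILD_DIR="${BUILD_DIR:-build}"   # hypothetical build directory name
DRY_RUN="${DRY_RUN:-1}"           # 1 = only print the command (default)

configure_with_flags() {
    if [ "$DRY_RUN" = "1" ]; then
        # Print what would be executed, without touching the source tree.
        echo "CFLAGS='$FLAGS' CXXFLAGS='$FLAGS' meson setup $BUILD_DIR -Dgallium-drivers=r600,radeonsi,swrast"
    else
        # Environment flags are picked up on the first configuration only.
        CFLAGS="$FLAGS" CXXFLAGS="$FLAGS" \
            meson setup "$BUILD_DIR" -Dgallium-drivers=r600,radeonsi,swrast
    fi
}

configure_with_flags
```

Running it with DRY_RUN=0 from the top of a Mesa checkout would perform the actual configuration; the default only prints the command.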

So I guess the build failure on your side is triggered either by the
gcc version or by some specific optimization flag. Or did the error
occur in the b_pgo=use stage? I didn't test this.

Hope that helps, 
Gert 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Merging experimental r600/nir code

2020-02-12 Thread Gert Wollny
Hello Dieter, 

thanks for the report. 

On Wednesday, 12.02.2020, at 07:56 +0100, Dieter Nützel wrote:
> 
> After your merge I get several warnings in 'addrlib' with LTO 
As far as I can see, the offending structs are contained in a union,
GB_ADDR_CONFIG, which is only defined within radeonsi, so you should see
these warnings also when r600 is disabled, no?

> and even a  compiler error (gcc (SUSE Linux) 9.2.1 20200128).
I guess this is the real problem that breaks the build, but internal
compiler errors are usually the fault of the compiler ;) 
When compiling with b_lto=true and b_pgo=generate and only r600/swrast
enabled, I don't see any errors (gcc (Gentoo 9.2.0-r2 p3)).

When I enable radeonsi, linking libgallium_dri.so seems to take forever;
so far it has not finished, and it has been running for quite some time
(>15 min). I'll get back to you when I have a result.
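
The r600/swrast configuration mentioned above (b_lto=true, b_pgo=generate) can be sketched as follows. This is only an illustration: the build directory name `build`, the `DRY_RUN` switch and the helper names `run` and `configure_and_build` are assumptions, while `b_lto` and `b_pgo` are meson built-in options and `gallium-drivers` is the Mesa option already used in this thread:

```shell
#!/bin/sh
BUILD_DIR="${BUILD_DIR:-build}"     # hypothetical build directory name
DRIVERS="${DRIVERS:-r600,swrast}"   # driver set from the message above
DRY_RUN="${DRY_RUN:-1}"             # 1 = only print the commands (default)

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

configure_and_build() {
    # Instrumented LTO build: the first stage of a PGO cycle.
    run meson setup "$BUILD_DIR" "-Dgallium-drivers=$DRIVERS" \
        -Db_lto=true -Db_pgo=generate
    run ninja -C "$BUILD_DIR"
}

configure_and_build
```

Setting DRIVERS=r600,radeonsi,swrast reproduces the combination that takes so long to link here.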

Best, 
Gert 

> 
> I had to disable 'r600' ('swrast' is needed for 'nine') to get a
> working LTO and even better PGO radeonsi driver.
> I have been preparing GREAT LTO+PGO numbers (the latter is the greater)
> over the last 2 months. I'll send my results later today.
> 
> Summary
> radeonsi is ~40% smaller and 16-20% faster with PGO (!!!).
> 
> Honza and the GCC people (Intel's ICC folks) do GREAT things.
> 'glmark2' numbers are better than 'vkmark'. (Hello, Marek.).
> 
> Need some sleep.
> 
> See my log, below.
> 
> Greetings and GREAT work!
> 
> -Dieter
> 
> On 09.02.2020 15:46, Gert Wollny wrote:
> > On Thursday, 23.01.2020, at 20:31 +0100, Gert Wollny wrote:
> > > has anybody any objections if I merge the r600/NIR code?
> > > Without explicitly setting the debug flag it doesn't change a
> > > thing, but it would be better to continue developing in-tree.
> > Okay, if nobody objects, I'll merge it Monday evening.
> > 
> > Best,
> > Gert
> 
> [1425/1433] Linking target src/gallium/targets/dri/libgallium_dri.so.
> FAILED: src/gallium/targets/dri/libgallium_dri.so
> c++  -o src/gallium/targets/dri/libgallium_dri.so 
> 'src/gallium/targets/dri/8381c20@@gallium_dri@sha/target.c.o' -flto 
> -fprofile-generate -Wl,--as-needed -Wl,--no-undefined -Wl,-O1
> -shared 
> -fPIC -Wl,--start-group -Wl,-soname,libgallium_dri.so 
> src/mesa/libmesa_gallium.a src/mesa/libmesa_common.a 
> src/compiler/glsl/libglsl.a src/compiler/glsl/glcpp/libglcpp.a 
> src/util/libmesa_util.a src/util/format/libmesa_format.a 
> src/compiler/nir/libnir.a src/compiler/libcompiler.a 
> src/mesa/libmesa_sse41.a src/mesa/drivers/dri/common/libdricommon.a 
> src/mesa/drivers/dri/common/libmegadriver_stub.a 
> src/gallium/state_trackers/dri/libdri.a 
> src/gallium/auxiliary/libgalliumvl.a
> src/gallium/auxiliary/libgallium.a 
> src/mapi/shared-glapi/libglapi.so.0.0.0 
> src/gallium/auxiliary/pipe-loader/libpipe_loader_static.a 
> src/loader/libloader.a src/util/libxmlconfig.a 
> src/gallium/winsys/sw/null/libws_null.a 
> src/gallium/winsys/sw/wrapper/libwsw.a 
> src/gallium/winsys/sw/dri/libswdri.a 
> src/gallium/winsys/sw/kms-dri/libswkmsdri.a 
> src/gallium/drivers/llvmpipe/libllvmpipe.a 
> src/gallium/drivers/softpipe/libsoftpipe.a 
> src/gallium/drivers/r600/libr600.a 
> src/gallium/winsys/radeon/drm/libradeonwinsys.a 
> src/gallium/drivers/radeonsi/libradeonsi.a 
> src/gallium/winsys/amdgpu/drm/libamdgpuwinsys.a 
> src/amd/addrlib/libaddrlib.a src/amd/common/libamd_common.a 
> src/amd/llvm/libamd_common_llvm.a -Wl,--build-id=sha1 -Wl,--gc-sections
> -Wl,--version-script /opt/mesa/src/gallium/targets/dri/dri.sym
> -Wl,--dynamic-list /opt/mesa/src/gallium/targets/dri/../dri-vdpau.dyn
> /usr/lib64/libdrm.so -L/usr/local/lib -lLLVM-10git -pthread 
> /usr/lib64/libexpat.so 
> /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libz.so -lm 
> /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libzstd.so 
> -L/usr/local/lib -lLLVM-10git /usr/lib64/libunwind.so -ldl -lsensors 
> -L/usr/local/lib -lLLVM-10git /usr/lib64/libdrm_radeon.so 
> /usr/lib64/libelf.so -L/usr/local/lib -lLLVM-10git -L/usr/local/lib 
> -lLLVM-10git -L/usr/local/lib -lLLVM-10git
> /usr/lib64/libdrm_amdgpu.so 
> -L/usr/local/lib -lLLVM-10git -Wl,--end-group 
> '-Wl,-rpath,$ORIGIN/../../../mesa:$ORIGIN/../../../compiler/glsl:$ORIGIN/../../../compiler/glsl/glcpp:$ORIGIN/../../../util:$ORIGIN/../../../util/format:$ORIGIN/../../../compiler/nir:$ORIGIN/../../../compiler:$ORIGIN/../../../mesa/drivers/dri/common:$ORIGIN/../../state_trackers/dri:$ORIGIN/../../auxiliary:$ORIGIN/../../../mapi/shared-glapi:$ORIGIN/../../auxiliary/pipe-loader:$ORIGIN/../../../loader:$ORIGIN/../../winsys/sw/null:$ORIGIN/../../winsys/sw/wrapper:$ORIGIN/../../winsys/sw/dri:$ORIGIN/../../winsys/sw/kms-dri:$ORIGIN/../../drivers/llvmpipe:$ORIGIN/../../drivers/softpipe:$ORIGIN/../../drivers/r600:$ORIGIN/../../winsys/radeon/drm:$ORIGIN/../../drivers/radeonsi:$ORIGIN/../../winsys/amdgpu/drm:$ORIGIN/../../../amd/addrlib:$ORIGIN/../../../amd/common:$ORIGIN/../../../amd/llvm'
> 

Re: [Mesa-dev] Merging experimental r600/nir code

2020-02-11 Thread Dieter Nützel

Hello Gert,

your merge 'broke' LTO and then later on PGO compilation/linking.

I generally compile with '-Dgallium-drivers=r600,radeonsi,swrast'
for testing radeonsi and (your) r600 work. ;-)

After your merge I get several warnings in 'addrlib' with LTO and even a 
compiler error (gcc (SUSE Linux) 9.2.1 20200128).


I had to disable 'r600' ('swrast' is needed for 'nine') to get a working 
LTO and even better PGO radeonsi driver.
I have been preparing GREAT LTO+PGO numbers (the latter is the greater)
over the last 2 months. I'll send my results later today.


Summary
radeonsi is ~40% smaller and 16-20% faster with PGO (!!!).

Honza and the GCC people (Intel's ICC folks) do GREAT things.
'glmark2' numbers are better than 'vkmark'. (Hello, Marek.).

Need some sleep.

See my log, below.

Greetings and GREAT work!

-Dieter

On 09.02.2020 15:46, Gert Wollny wrote:
> On Thursday, 23.01.2020, at 20:31 +0100, Gert Wollny wrote:
>> has anybody any objections if I merge the r600/NIR code?
>> Without explicitly setting the debug flag it doesn't change a
>> thing, but it would be better to continue developing in-tree.
> Okay, if nobody objects, I'll merge it Monday evening.
>
> Best,
> Gert


[1425/1433] Linking target src/gallium/targets/dri/libgallium_dri.so.
FAILED: src/gallium/targets/dri/libgallium_dri.so
c++  -o src/gallium/targets/dri/libgallium_dri.so 
'src/gallium/targets/dri/8381c20@@gallium_dri@sha/target.c.o' -flto 
-fprofile-generate -Wl,--as-needed -Wl,--no-undefined -Wl,-O1 -shared 
-fPIC -Wl,--start-group -Wl,-soname,libgallium_dri.so 
src/mesa/libmesa_gallium.a src/mesa/libmesa_common.a 
src/compiler/glsl/libglsl.a src/compiler/glsl/glcpp/libglcpp.a 
src/util/libmesa_util.a src/util/format/libmesa_format.a 
src/compiler/nir/libnir.a src/compiler/libcompiler.a 
src/mesa/libmesa_sse41.a src/mesa/drivers/dri/common/libdricommon.a 
src/mesa/drivers/dri/common/libmegadriver_stub.a 
src/gallium/state_trackers/dri/libdri.a 
src/gallium/auxiliary/libgalliumvl.a src/gallium/auxiliary/libgallium.a 
src/mapi/shared-glapi/libglapi.so.0.0.0 
src/gallium/auxiliary/pipe-loader/libpipe_loader_static.a 
src/loader/libloader.a src/util/libxmlconfig.a 
src/gallium/winsys/sw/null/libws_null.a 
src/gallium/winsys/sw/wrapper/libwsw.a 
src/gallium/winsys/sw/dri/libswdri.a 
src/gallium/winsys/sw/kms-dri/libswkmsdri.a 
src/gallium/drivers/llvmpipe/libllvmpipe.a 
src/gallium/drivers/softpipe/libsoftpipe.a 
src/gallium/drivers/r600/libr600.a 
src/gallium/winsys/radeon/drm/libradeonwinsys.a 
src/gallium/drivers/radeonsi/libradeonsi.a 
src/gallium/winsys/amdgpu/drm/libamdgpuwinsys.a 
src/amd/addrlib/libaddrlib.a src/amd/common/libamd_common.a 
src/amd/llvm/libamd_common_llvm.a -Wl,--build-id=sha1 -Wl,--gc-sections 
-Wl,--version-script /opt/mesa/src/gallium/targets/dri/dri.sym 
-Wl,--dynamic-list /opt/mesa/src/gallium/targets/dri/../dri-vdpau.dyn 
/usr/lib64/libdrm.so -L/usr/local/lib -lLLVM-10git -pthread 
/usr/lib64/libexpat.so 
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libz.so -lm 
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libzstd.so 
-L/usr/local/lib -lLLVM-10git /usr/lib64/libunwind.so -ldl -lsensors 
-L/usr/local/lib -lLLVM-10git /usr/lib64/libdrm_radeon.so 
/usr/lib64/libelf.so -L/usr/local/lib -lLLVM-10git -L/usr/local/lib 
-lLLVM-10git -L/usr/local/lib -lLLVM-10git /usr/lib64/libdrm_amdgpu.so 
-L/usr/local/lib -lLLVM-10git -Wl,--end-group 
'-Wl,-rpath,$ORIGIN/../../../mesa:$ORIGIN/../../../compiler/glsl:$ORIGIN/../../../compiler/glsl/glcpp:$ORIGIN/../../../util:$ORIGIN/../../../util/format:$ORIGIN/../../../compiler/nir:$ORIGIN/../../../compiler:$ORIGIN/../../../mesa/drivers/dri/common:$ORIGIN/../../state_trackers/dri:$ORIGIN/../../auxiliary:$ORIGIN/../../../mapi/shared-glapi:$ORIGIN/../../auxiliary/pipe-loader:$ORIGIN/../../../loader:$ORIGIN/../../winsys/sw/null:$ORIGIN/../../winsys/sw/wrapper:$ORIGIN/../../winsys/sw/dri:$ORIGIN/../../winsys/sw/kms-dri:$ORIGIN/../../drivers/llvmpipe:$ORIGIN/../../drivers/softpipe:$ORIGIN/../../drivers/r600:$ORIGIN/../../winsys/radeon/drm:$ORIGIN/../../drivers/radeonsi:$ORIGIN/../../winsys/amdgpu/drm:$ORIGIN/../../../amd/addrlib:$ORIGIN/../../../amd/common:$ORIGIN/../../../amd/llvm' 
-Wl,-rpath-link,/opt/mesa/build/src/mesa 
-Wl,-rpath-link,/opt/mesa/build/src/compiler/glsl 
-Wl,-rpath-link,/opt/mesa/build/src/compiler/glsl/glcpp 
-Wl,-rpath-link,/opt/mesa/build/src/util 
-Wl,-rpath-link,/opt/mesa/build/src/util/format 
-Wl,-rpath-link,/opt/mesa/build/src/compiler/nir 
-Wl,-rpath-link,/opt/mesa/build/src/compiler 
-Wl,-rpath-link,/opt/mesa/build/src/mesa/drivers/dri/common 
-Wl,-rpath-link,/opt/mesa/build/src/gallium/state_trackers/dri 
-Wl,-rpath-link,/opt/mesa/build/src/gallium/auxiliary 
-Wl,-rpath-link,/opt/mesa/build/src/mapi/shared-glapi 
-Wl,-rpath-link,/opt/mesa/build/src/gallium/auxiliary/pipe-loader 
-Wl,-rpath-link,/opt/mesa/build/src/loader 
-Wl,-rpath-link,/opt/mesa/build/src/gallium/winsys/sw/null 
-Wl,-rpath-link,/opt/mesa/build/src/gallium/winsys/sw/wrapper 

Re: [Mesa-dev] Merging experimental r600/nir code

2020-02-09 Thread Gert Wollny
On Thursday, 23.01.2020, at 20:31 +0100, Gert Wollny wrote:
> has anybody any objections if I merge the r600/NIR code?
> Without explicitly setting the debug flag it doesn't change a
> thing, but it would be better to continue developing in-tree. 
Okay, if nobody objects, I'll merge it Monday evening. 

Best, 
Gert 
