Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-04-08 Thread Marek Olšák
Hi Dieter,

The latest version is on gitlab as a merge request.

Marek

On Mon, Apr 8, 2019 at 6:06 PM Dieter Nützel  wrote:

> After you've recuperated from your hopefully GREAT vacation,
>
> is it time? ;-)
>
> Greetings,
> Dieter
>
> On 26.02.2019 03:31, Dieter Nützel wrote:
> > Hello Marek,
> >
> > do you plan to commit or rebase both sets?
> >
> > Dieter
> >
> > On 14.02.2019 07:29, Marek Olšák wrote:
> >> I have some fixes for Sea Islands that improve Radeon 290X performance
> >> to 43 fps, moving it just below Radeon VII in the picture.
> >>
> >> Marek
> >>
> >> On Wed, Feb 13, 2019 at 12:16 AM Marek Olšák 
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> This patch series uses async compute to do primitive culling before
> >>> the vertex shader. It significantly improves performance for
> >>> applications
> >>> that use a lot of geometry that is invisible because primitives
> >>> don't
> >>> intersect sample points or there are a lot of back faces, etc.
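As a rough illustration (not the radeonsi compute shader itself, just the underlying geometry checks), the two culling criteria mentioned above — back faces and small primitives that miss every sample point — can be sketched as:

```python
import math

# Hypothetical sketch of the culling tests described in the cover letter;
# function names and the one-sample-per-pixel assumption are illustrative.

def signed_area(v0, v1, v2):
    """Twice the signed screen-space area; <= 0 means back-facing
    when front faces wind counter-clockwise."""
    return (v1[0] - v0[0]) * (v2[1] - v0[1]) - (v2[0] - v0[0]) * (v1[1] - v0[1])

def misses_all_samples(xmin, xmax, ymin, ymax):
    """Small-primitive test: with one sample per pixel at (n + 0.5),
    the bounding box contains no sample point if the snapped interval
    is empty on either axis."""
    return (math.ceil(xmin - 0.5) > math.floor(xmax - 0.5) or
            math.ceil(ymin - 0.5) > math.floor(ymax - 0.5))

def cull(v0, v1, v2):
    # A triangle can be discarded if it is back-facing or covers no sample.
    xs = (v0[0], v1[0], v2[0])
    ys = (v0[1], v1[1], v2[1])
    return (signed_area(v0, v1, v2) <= 0 or
            misses_all_samples(min(xs), max(xs), min(ys), max(ys)))
```

A real implementation runs tests like these in a compute shader over the index buffer and compacts the surviving indices before the graphics pipeline sees them.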
> >>>
> >>> It passes 99.% of all tests (GL CTS, dEQP, piglit) and is 100%
> >>> stable.
> >>> It supports all chips all the way from Sea Islands to Radeon VII.
> >>>
> >>> As you can see in the results marked (ENABLED) in the picture below,
> >>> it destroys our competition (the GeForce results are from a Phoronix
> >>> article from 2017, the latest ones I could find):
> >>>
> >>> Benchmark: ParaView - Many Spheres - 2560x1440
> >>> https://people.freedesktop.org/~mareko/prim-discard-cs-results.png
> >>>
> >>> The last patch describes the implementation and functional
> >>> limitations in a huge code comment (if you can find it), so I'm not
> >>> gonna repeat that here.
> >>>
> >>> I decided to enable this optimization on all Pro graphics cards.
> >>> The reason is that I haven't had time to benchmark games.
> >>> This decision may be changed based on community feedback, etc.
> >>>
> >>> People using the Pro graphics cards can disable this by setting
> >>> AMD_DEBUG=nopd, and people using consumer graphics cards can enable
> >>> this by setting AMD_DEBUG=pd. So you always have a choice.
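For example (a usage sketch; AMD_DEBUG takes a comma-separated list of radeonsi debug options, and the launched application is up to you):

```shell
# Force-enable primitive discard on a consumer card for this shell:
export AMD_DEBUG=pd
# ...then launch any GL application, e.g. glxgears.

# Or, on a Pro card, disable it instead:
# export AMD_DEBUG=nopd
```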
> >>>
> >>> Eventually we might also enable this on consumer graphics cards for
> >>> those
> >>> games that benefit. It might decrease performance if there is not
> >>> enough
> >>> invisible geometry.
> >>>
> >>> Branch:
> >>> https://cgit.freedesktop.org/~mareko/mesa/log/?h=prim-discard-cs
> >>>
> >>> Please review.
> >>>
> >>> Thanks,
> >>> Marek
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-04-08 Thread Dieter Nützel

After you've recuperated from your hopefully GREAT vacation,

is it time? ;-)

Greetings,
Dieter


Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-25 Thread Dieter Nützel

Hello Marek,

do you plan to commit or rebase both sets?

Dieter


Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-14 Thread Marek Olšák
On Thu, Feb 14, 2019 at 1:43 PM Dieter Nützel  wrote:

> ./pvbatch
> ../lib/python2.7/site-packages/paraview/benchmark/manyspheres.py -s 100
> -r 726 -v 1920,1080 -f 30
>

I don't know. I just used the phoronix test suite.

Marek

Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-14 Thread Dieter Nützel

For the whole series (the updated branch merged in)

Tested-by: Dieter Nützel 

on Polaris 20

FreeCAD, Blender, UH, UV, US, some VTK apps
No surprising speed-up, but also NO slowdown.

The Tested-by also stands for
[Mesa-dev] [PATCH 0/4] RadeonSI: Follow-up for the primitive culling
series (but no SI here).

mplayer / mpv works like a charm, again.

ParaView-5.6.0-MPI-Linux-64bit

1920x1080
pd off ~18 fps
pd on ~24 fps ! ;-)

2560x1440
pd off ~14 fps
pd on ~16 fps

./pvbatch 
../lib/python2.7/site-packages/paraview/benchmark/manyspheres.py -s 100 
-r 726 -v 1920,1080 -f 30


Is this right?

Poor
Intel Xeon X3470, 2.93 GHz, 3.2 GHz turbo, 4c/8t
24 GB
Polaris 20, 8 GB
PCIe 2.1 only (NO PCIe atomics)

Dieter


Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-13 Thread Marek Olšák
I have some fixes for Sea Islands that improve Radeon 290X performance to
43 fps, moving it just below Radeon VII in the picture.

Marek


Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-13 Thread Dieter Nützel

Got it (merged in), thanks.
But now I need some sleep.
Have to drive my wife to the hospital in a few hours.
No, not hyperacute.

Dieter

On 14.02.2019 03:07, Marek Olšák wrote:

I just updated the branch, fixing video players.

Marek


Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-13 Thread Marek Olšák
I just updated the branch, fixing video players.

Marek


Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-13 Thread Marek Olšák
Yes, you need amd-staging-drm-next.

Marek

On Wed, Feb 13, 2019 at 8:28 PM Dieter Nützel  wrote:

> Newer 'amd-staging-drm-next' needed? #0bf64b0a9f78 currently

Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-13 Thread Dieter Nützel

Now with LLVM 9.0 git;-)

Running, except mplayer/mpv (same as before).

mplayer: ../src/gallium/drivers/radeon/radeon_winsys.h:866: 
radeon_get_heap_index: Assertion `!"32BIT without WC is disallowed"' 
failed.

Aborted (core dumped)

mpv: ../src/gallium/drivers/radeon/radeon_winsys.h:866: 
radeon_get_heap_index: Assertion `!"32BIT without WC is disallowed"' 
failed.

Aborted (core dumped)

And this after glxgears, Blender, FreeCAD, UH and UV:

[38939.440950] [drm:amdgpu_ctx_mgr_entity_fini [amdgpu]] *ERROR* ctx 
679c61fd is still alive
[38939.440993] [drm:amdgpu_ctx_mgr_fini [amdgpu]] *ERROR* ctx 
679c61fd is still alive
[38964.901076] [drm:amdgpu_ctx_mgr_entity_fini [amdgpu]] *ERROR* ctx 
9c4b659b is still alive
[38964.901130] [drm:amdgpu_ctx_mgr_fini [amdgpu]] *ERROR* ctx 
9c4b659b is still alive
[38980.844577] [drm:amdgpu_ctx_mgr_entity_fini [amdgpu]] *ERROR* ctx 
1bee3a35 is still alive
[38980.844642] [drm:amdgpu_ctx_mgr_fini [amdgpu]] *ERROR* ctx 
1bee3a35 is still alive


Newer 'amd-staging-drm-next' needed? #0bf64b0a9f78 currently

If I only had some big triangle apps...;-)

Dieter

On 13.02.2019 17:36, Marek Olšák wrote:

Dieter, you need final LLVM 8.0.

Marek

On Wed, Feb 13, 2019 at 11:02 AM Dieter Nützel 
wrote:


GREAT stuff, Marek!

But sadly some crashes.
Is my LLVM git version too old?
7 Jan 2019 (shortly before the 8.0 cut)

LLVM (http://llvm.org/):
LLVM version 8.0.0svn
Optimized build.
Default target: x86_64-unknown-linux-gnu
Host CPU: nehalem

Registered Targets:
  amdgcn - AMD GCN GPUs
  r600   - AMD GPUs HD2XXX-HD6XXX
  x86    - 32-bit X86: Pentium-Pro and above
  x86-64 - 64-bit X86: EM64T and AMD64

Please have a look at my post @Phoronix:


https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1079916-radeonsi-picks-up-primitive-culling-with-async-compute-for-performance-wins?p=1079984#post1079984


Thanks,
Dieter


Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-13 Thread Marek Olšák
On Wed, Feb 13, 2019 at 11:51 AM Axel Davy  wrote:

> On 13/02/2019 17:42, Marek Olšák wrote:
>
> On Wed, Feb 13, 2019 at 2:28 AM Axel Davy  wrote:
>
>> On 13/02/2019 06:15, Marek Olšák wrote:
>> > I decided to enable this optimization on all Pro graphics cards.
>> > The reason is that I haven't had time to benchmark games.
>> > This decision may be changed based on community feedback, etc.
>>
>>
>> Could the decision to run the optimization be based on some perf
>> counters related to culling? If enough vertices are culled, you'd
>> enable the optimization.
>>
>
> No, that's not possible. When I enable this, all gfx counters and pipeline
> statistics report that (almost) no primitives are culled, because the
> compute shader culls them before the gfx pipeline.
>
> You would disable by default the optimization. The perf counters would
> then be meaningful. If the perf counter tells you enough primitives are
> culled, you'd switch to the optimization and would stop looking at the
> counters. No need to enable if only a few things are culled.
>
> The best of course is that if you detect at some point the optimization is
> worth it, it won't stop being worth it in a different game scene, but it
> should be already a good filter, as if you never go above the threshold,
> you definitely don't need the optimization.
>
I can actually read back the number of primitives not culled by the compute
shader and the driver also knows the total number of input primitives. And
when the compute shader is off, I can use pipeline statistics.


>
>
>
>>
>> There seems to be an AMD patent on the optimization, I failed to see it
>> mentioned, maybe it should be pointed out somewhere.
>>
>
> Unlikely. It's based on this:
>
> https://frostbite-wp-prd.s3.amazonaws.com/wp-content/uploads/2016/03/29204330/GDC_2016_Compute.pdf
>
> And this is pretty much a simpler version of what I implemented:
> https://gpuopen.com/gaming-product/geometryfx/
>
> Marek
>
>
> This is what I found:
>
> https://patents.google.com/patent/US20180033184A1/en
>
It looks similar (I'm not a lawyer), but I generate neither a transform
shader nor a fetch shader.

Marek

Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-13 Thread Axel Davy

On 13/02/2019 17:42, Marek Olšák wrote:
On Wed, Feb 13, 2019 at 2:28 AM Axel Davy wrote:


On 13/02/2019 06:15, Marek Olšák wrote:
> I decided to enable this optimization on all Pro graphics cards.
> The reason is that I haven't had time to benchmark games.
> This decision may be changed based on community feedback, etc.


Could the decision to run the optimization be based on some perf
counters related to culling? If enough vertices are culled, you'd
enable the optimization.


No, that's not possible. When I enable this, all gfx counters and 
pipeline statistics report that (almost) no primitives are culled, 
because the compute shader culls them before the gfx pipeline.


You would disable by default the optimization. The perf counters would 
then be meaningful. If the perf counter tells you enough primitives are 
culled, you'd switch to the optimization and would stop looking at the 
counters. No need to enable if only a few things are culled.


The best of course is that if you detect at some point the optimization 
is worth it, it won't stop being worth it in a different game scene, but 
it should be already a good filter, as if you never go above the 
threshold, you definitely don't need the optimization.





There seems to be an AMD patent on the optimization, I failed to
see it
mentioned, maybe it should be pointed out somewhere.


Unlikely. It's based on this:
https://frostbite-wp-prd.s3.amazonaws.com/wp-content/uploads/2016/03/29204330/GDC_2016_Compute.pdf

And this is pretty much a simpler version of what I implemented:
https://gpuopen.com/gaming-product/geometryfx/

Marek



This is what I found:

https://patents.google.com/patent/US20180033184A1/en

Axel


Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-13 Thread Marek Olšák
On Wed, Feb 13, 2019 at 2:28 AM Axel Davy  wrote:

> On 13/02/2019 06:15, Marek Olšák wrote:
> > I decided to enable this optimization on all Pro graphics cards.
> > The reason is that I haven't had time to benchmark games.
> > This decision may be changed based on community feedback, etc.
>
>
> Could the decision to run the optimization be based on some perf
> counters related to culling ? If enough vertices are culled, you'd
> enable the optimization.
>

No, that's not possible. When I enable this, all gfx counters and pipeline
statistics report that (almost) no primitives are culled, because the
compute shader culls them before the gfx pipeline.


>
> There seems to be an AMD patent on the optimization, I failed to see it
> mentioned, maybe it should be pointed out somewhere.
>

Unlikely. It's based on this:
https://frostbite-wp-prd.s3.amazonaws.com/wp-content/uploads/2016/03/29204330/GDC_2016_Compute.pdf

And this is pretty much a simpler version of what I implemented:
https://gpuopen.com/gaming-product/geometryfx/

Marek

Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-13 Thread Marek Olšák
Dieter, you need final LLVM 8.0.

Marek

On Wed, Feb 13, 2019 at 11:02 AM Dieter Nützel  wrote:

> GREAT stuff, Marek!
>
> But sadly some crashes.
> Is my LLVM git version too old?
> 7. Jan 2019 (shortly before the 8.0 cut)
>
> LLVM (http://llvm.org/):
>LLVM version 8.0.0svn
>Optimized build.
>Default target: x86_64-unknown-linux-gnu
>Host CPU: nehalem
>
>Registered Targets:
>  amdgcn - AMD GCN GPUs
>  r600   - AMD GPUs HD2XXX-HD6XXX
>  x86- 32-bit X86: Pentium-Pro and above
>  x86-64 - 64-bit X86: EM64T and AMD64
>
> Please have a look at my post @Phoronix:
>
> https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1079916-radeonsi-picks-up-primitive-culling-with-async-compute-for-performance-wins?p=1079984#post1079984
>
> Thanks,
> Dieter
>
> Am 13.02.2019 06:15, schrieb Marek Olšák:
> > Hi,
> >
> > This patch series uses async compute to do primitive culling before
> > the vertex shader. It significantly improves performance for
> > applications
> > that use a lot of geometry that is invisible because primitives don't
> > intersect sample points or there are a lot of back faces, etc.
> >
> > It passes 99.% of all tests (GL CTS, dEQP, piglit) and is 100%
> > stable.
> > It supports all chips all the way from Sea Islands to Radeon VII.
> >
> > As you can see in the results marked (ENABLED) in the picture below,
> > it destroys our competition (The GeForce results are from a Phoronix
> > article from 2017, the latest ones I could find):
> >
> > Benchmark: ParaView - Many Spheres - 2560x1440
> > https://people.freedesktop.org/~mareko/prim-discard-cs-results.png
> >
> >
> > The last patch describes the implementation and functional limitations
> > if you can find the huge code comment, so I'm not gonna do that here.
> >
> > I decided to enable this optimization on all Pro graphics cards.
> > The reason is that I haven't had time to benchmark games.
> > This decision may be changed based on community feedback, etc.
> >
> > People using the Pro graphics cards can disable this by setting
> > AMD_DEBUG=nopd, and people using consumer graphics cards can enable
> > this by setting AMD_DEBUG=pd. So you always have a choice.
> >
> > Eventually we might also enable this on consumer graphics cards for
> > those
> > games that benefit. It might decrease performance if there is not
> > enough
> > invisible geometry.
> >
> > Branch:
> > https://cgit.freedesktop.org/~mareko/mesa/log/?h=prim-discard-cs
> >
> > Please review.
> >
> > Thanks,
> > Marek
>

Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-13 Thread Dieter Nützel

GREAT stuff, Marek!

But sadly some crashes.
Is my LLVM git version too old?
7. Jan 2019 (shortly before the 8.0 cut)

LLVM (http://llvm.org/):
  LLVM version 8.0.0svn
  Optimized build.
  Default target: x86_64-unknown-linux-gnu
  Host CPU: nehalem

  Registered Targets:
amdgcn - AMD GCN GPUs
r600   - AMD GPUs HD2XXX-HD6XXX
x86- 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64

Please have a look at my post @Phoronix:
https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1079916-radeonsi-picks-up-primitive-culling-with-async-compute-for-performance-wins?p=1079984#post1079984

Thanks,
Dieter

Am 13.02.2019 06:15, schrieb Marek Olšák:

Hi,

This patch series uses async compute to do primitive culling before
the vertex shader. It significantly improves performance for 
applications

that use a lot of geometry that is invisible because primitives don't
intersect sample points or there are a lot of back faces, etc.

It passes 99.% of all tests (GL CTS, dEQP, piglit) and is 100% 
stable.

It supports all chips all the way from Sea Islands to Radeon VII.

As you can see in the results marked (ENABLED) in the picture below,
it destroys our competition (The GeForce results are from a Phoronix
article from 2017, the latest ones I could find):

Benchmark: ParaView - Many Spheres - 2560x1440
https://people.freedesktop.org/~mareko/prim-discard-cs-results.png


The last patch describes the implementation and functional limitations
if you can find the huge code comment, so I'm not gonna do that here.

I decided to enable this optimization on all Pro graphics cards.
The reason is that I haven't had time to benchmark games.
This decision may be changed based on community feedback, etc.

People using the Pro graphics cards can disable this by setting
AMD_DEBUG=nopd, and people using consumer graphics cards can enable
this by setting AMD_DEBUG=pd. So you always have a choice.

Eventually we might also enable this on consumer graphics cards for 
those
games that benefit. It might decrease performance if there is not 
enough

invisible geometry.

Branch:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=prim-discard-cs

Please review.

Thanks,
Marek


Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-12 Thread Axel Davy

On 13/02/2019 06:15, Marek Olšák wrote:

I decided to enable this optimization on all Pro graphics cards.
The reason is that I haven't had time to benchmark games.
This decision may be changed based on community feedback, etc.



Could the decision to run the optimization be based on some perf 
counters related to culling? If enough vertices are culled, you'd
enable the optimization.


There seems to be an AMD patent on the optimization, I failed to see it 
mentioned, maybe it should be pointed out somewhere.



Yours,

Axel



[Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

2019-02-12 Thread Marek Olšák
Hi,

This patch series uses async compute to do primitive culling before
the vertex shader. It significantly improves performance for applications
that use a lot of geometry that is invisible because primitives don't
intersect sample points or there are a lot of back faces, etc.

It passes 99.% of all tests (GL CTS, dEQP, piglit) and is 100% stable.
It supports all chips all the way from Sea Islands to Radeon VII.

As you can see in the results marked (ENABLED) in the picture below,
it destroys our competition (The GeForce results are from a Phoronix
article from 2017, the latest ones I could find):

Benchmark: ParaView - Many Spheres - 2560x1440
https://people.freedesktop.org/~mareko/prim-discard-cs-results.png


The last patch describes the implementation and functional limitations
if you can find the huge code comment, so I'm not gonna do that here.

I decided to enable this optimization on all Pro graphics cards.
The reason is that I haven't had time to benchmark games.
This decision may be changed based on community feedback, etc.

People using the Pro graphics cards can disable this by setting
AMD_DEBUG=nopd, and people using consumer graphics cards can enable
this by setting AMD_DEBUG=pd. So you always have a choice.
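[Editor's note: concretely, the toggles above are environment variables read by the radeonsi driver at startup, set per application run. The application name below is only a stand-in for any GL program.]

```shell
# Consumer card: opt in to compute-based primitive culling for one run.
AMD_DEBUG=pd glxgears

# Pro card: opt out of the default-on behavior.
AMD_DEBUG=nopd glxgears
```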

Eventually we might also enable this on consumer graphics cards for those
games that benefit. It might decrease performance if there is not enough
invisible geometry.

Branch:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=prim-discard-cs

Please review.

Thanks,
Marek