from:"Samuel Pitoiset"

Re: [ANNOUNCE] mesa 22.0.0-rc2

2022-02-09 Thread Samuel Pitoiset




On 2/9/22 20:56, Bas Nieuwenhuizen wrote:

On Wed, Feb 9, 2022 at 7:10 PM Dylan Baker  wrote:

Hi all,

I'd like to announce the availability of Mesa 22.0.0-rc2, the second
release candidate for mesa 22.0.0. We have lots of fixes here, including
a good deal of zink fixes, and some changes for shared microsoft, egl,
core mesa, crocus, broadcom, iris, core intel, anv, llvmpipe, svga,
radeonsi, aco, and radv.

Cheers,
Dylan

shortlog



Charmaine Lee (1):
   mesa: fix misaligned pointer returned by dlist_alloc

Daniel Stone (1):
   egl/wayland: Reset buffer age when destroying buffers

Danylo Piliaiev (1):
   turnip: Unconditionaly remove descriptor set from pool's list on free

Dave Airlie (1):
   crocus: find correct relocation target for the bo.

Dylan Baker (4):
   .pick_status.json: Update to 0447a2303fb06d6ad1f64e5f079a74bf2cf540da
   .pick_status.json: Update to 8335fdfeafbe1fd14cb65f9088bbba15d9eb00dc
   .pick_status.json: Update to 5e9df85b1a4504c5b4162e77e139056dc80accc6
   VERSION: bump version for 22.0.0-rc2

Iago Toral Quiroga (1):
   broadcom/compiler: fix offset alignment for ldunifa when skipping

Jesse Natalie (2):
   microsoft/compiler: Only prep phis for the current function
   microsoft/compiler: Only treat tess level location as special if it's a patch constant

Kenneth Graunke (1):
   iris: Make an iris_foreach_batch macro that skips unsupported batches

Lionel Landwerlin (3):
   intel/fs: don't set allow_sample_mask for CS intrinsics
   intel/nir: fix shader call lowering
   anv: fix conditional render for vkCmdDrawIndirectByteCountEXT

Mike Blumenkrantz (7):
   zink: disable PIPE_SHADER_CAP_FP16_CONST_BUFFERS
   llvmpipe: disable PIPE_SHADER_CAP_FP16_CONST_BUFFERS
   zink: add VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT for query binds
   zink: use scanout obj when returning resource param info
   zink: fix PIPE_CAP_TGSI_BALLOT export conditional
   zink: reject invalid draws
   zink: min/max blit region in coverage functions

Neha Bhende (1):
   svga: store shared_mem_size in svga_compute_shader instead of svga_context

Pierre-Eric Pelloux-Prayer (1):
   radeonsi: limit loop unrolling for LLVM < 13

Rhys Perry (2):
   aco: don't encode src2 for v_writelane_b32_e64
   radv: fix R_02881C_PA_CL_VS_OUT_CNTL with mixed cull/clip distances

Samuel Pitoiset (1):
   Revert "radv: re-apply "Do not access set layout during vkCmdBindDescriptorSets.""

Hi Dylan, can we add

commit 66f7289d568db8711adb885acc56622e2aff252a
Author: Samuel Pitoiset 
Date:   Wed Jan 19 16:15:33 2022 +0100

radv: add reference counting for descriptor set layouts


If we take that revert? The revert wasn't because the patch was bad
but because we had a better patch.

Yes, we need that.



git tag: mesa-22.0.0-rc2

https://mesa.freedesktop.org/archive/mesa-22.0.0-rc2.tar.xz
SHA256: 14d6478ad367b22fbb24251f3282d98ba9b8c7758dcd416b33353e1387fd57f7  mesa-22.0.0-rc2.tar.xz
SHA512: 9e05355a31f1640df6e800ccdf3150720d1a54aa21d9eb748d567b2b64090b09b6bc54318f2f72644b48c8d08f9db0f7ab3d35c9e1b629ded932fd9ed2e87630  mesa-22.0.0-rc2.tar.xz
PGP:  https://mesa.freedesktop.org/archive/mesa-22.0.0-rc2.tar.xz.sig

Re: [Mesa-dev] Outstanding Mesa 21.0 patches

2021-04-12 Thread Samuel Pitoiset


For RADV: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10174

On 4/9/21 6:39 PM, Dylan Baker wrote:

Hi all,

I've been a little behind on release work recently, and I'm trying to
clean up the backlog of patches against the 21.0 branch that haven't been
applied but have been nominated. Below is the list of outstanding
patches that either don't apply, or cause regressions. If you'd like to
have these applied please provide a backport, otherwise I'm going to
mark all of these as "denominated".

Cheers,
Dylan


6a29632dd2 Revert "glcpp: disable 'windows' tests"
bd1705a480 vulkan: Make vk_debug_report_callback derive from vk_object_base
366fb28dac ci: Fix MESA_TEMPLATES_COMMIT value
0464117ad9 ci: remove nouveau from shader-db runs
0f7379e308 ci: tracie dashboard URLs only in the failure after the testcase
cff5c40fc3 pan/bi: Fix blend shaders using LD_TILE with MRT
c0c03f29e0 lavapipe: implement physical device group enumeration
b6b3b38434 turnip: consider HW limit on number of views when apply multipos opt
5a340c0929 vulkan/util: add api to reset object magic + private data.
226c7ae2a8 lavapipe: reset object base on recycled command buffers
8b44e45347 intel/perf: fix roll over PERF_CNT counter accumulation
3d3f21f0be ci: add libdrm to the x86_test-vk container
0a939e788f lavapipe: reorder descriptor set stages to get correct binding
abc724e440 lavapipe: sort bindings before creating descriptor set
3436e5295b pan/bi: Treat +DISCARD.f32 as message-passing
2c02740a8c intel/mi_builder: Use AddCSMMIOStartOffset for LRI
d4f21b53f2 nir/range_analysis: Add "is finite" range analysis tracking
aa5d38decd nir/range_analysis: Add "is a number" range analysis tracking
f4a7dbc58f nir/range_analysis: Fix analysis of fmin, fmax, or fsat with NaN source
30cf07cc8a lavapipe: fix primitive-restart for uint8 indices
32eb74e1e1 ac/gpu_info: fix more non-coherent RB and GL2 combinations
799a931d12 anv/apply_pipeline_layout: Rework the early pass index/offset helpers
3257ab9f23 radv: Dedupe winsyses per device.
90632ae7b3 lavapipe: stop tracking draw start/count on rendering state
f7acdb1d1d st/glthread: allow for invalid L3 cache id.
a5d5cbdf08 freedreno: Fix file descriptor leak.
61cf77583a lavapipe: Free sorted descriptor array.
33d87eeb5a gallium: add PIPE_CAP_ALLOW_DYNAMIC_VAO_FASTPATH
1df3a00dcd iris: disable dynamic VAO fastpath on GFX version 9
9413c6aec3 mesa: Add anything dynamically indexed before any non-dynamically indexed
fe53c22294 lavapipe: fix only clearing depth or stencil paths.
fe5349f70c freedreno/a6xx: Fix alpha tests.
e4ef5f0433 mesa/st: ignore texture_index if tex_instr has deref src
363c1ef0c0 gallium/u_threaded: split draws that don't fit in a batch
961361cdc9 aco: ensure loops nested in a WQM loop are in WQM
0845cabc72 vulkan: Track dependencies of Python imports

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Request for developer access for Tony Wasserka

2020-11-20 Thread Samuel Pitoiset


+1

On 11/20/20 10:44 AM, Daniel Schürmann wrote:

Hi,

I would like to request developer access for our new team member, Tony 
Wasserka.


He has proven himself capable with a number of MRs and works actively 
on the ACO backend.


His gitlab nick is neobrain.

Thanks in advance,
Daniel


Re: [Mesa-dev] mediump support: future work

2020-05-05 Thread Samuel Pitoiset

On RADV, we already support fp16 with the LLVM backend (not saying it's
always optimized though). With ACO it should be mostly working but is
not yet enabled, because I think we would like fast packed math support
first, and I'm not sure whether fp16 I/O is implemented.

There are also some missing fp16 optimizations in the ACO backend, which
make some games (e.g. Youngblood) a bit slower if fp16
features/extensions are exposed, but the gap should hopefully be filled soon.

I think the most important thing missing for us is an SLP vectorizer in NIR.

On 5/4/20 8:43 PM, Marek Olšák wrote:

Hi,

This is the status of mediump support in Mesa. What I listed is what 
AMD GPUs can do. "Yes" means what Mesa supports.


Feature                                               FP16 support  Int16 support
ALU                                                   Yes           No
Uniforms                                              No            No
VS in                                                 No            No
VS out / FS in                                        No            No
FS out                                                No            No
TCS, TES, GS out / in                                 No            No
Sampler coordinates (only coord, derivs, lod, bias;
  not offset and compare)                             No            ---
Image coordinates                                     ---           No
Return value from samplers (incl. sampler buffers)    Yes           No
Return value from image loads (incl. image buffers)   No            No
Data source for image stores (incl. image buffers)    No            No
If 16-bit sampler/image instructions are surrounded
  by conversions, promote them to 32 bits             No            No



Please let me know if you don't see the table correctly.

I'd like to know if I can enable some of them using the existing FP16 
CAP. The only drivers supporting FP16 are currently Freedreno and 
Panfrost.


Thanks,
Marek


Re: [Mesa-dev] [ANNOUNCE] mesa 20.0.3

2020-04-02 Thread Samuel Pitoiset


Good catch!

Yes, please revert it asap, it breaks a bunch of things ... :(

On 4/2/20 11:11 AM, Danylo Piliaiev wrote:

"spirv: Implement OpCopyObject and OpCopyLogical as blind copies" was reverted 
yesterday due to the failures in several dEQP-VK tests, see:
  https://gitlab.freedesktop.org/mesa/mesa/-/commit/68f325b256d96dca923f6c7d84bc6faf43911245
  https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4375
I'm not sure if it's already known or how important it is, but I'd better say
it than not.

On 02.04.20 00:52, Eric Engestrom wrote:

Hi all,

I'd like to announce the release of Mesa 20.0.3.

Quite a busy cycle again, with fixes all over the tree, but nothing
extraordinary; mostly AMD (radv, aco), NIR and Intel (isl, anv), as
expected.

Cheers,
   Eric

---

Git shortlog


Caio Marcelo de Oliveira Filho (1):
   mesa/main: Fix overflow in validation of DispatchComputeGroupSizeARB

Dylan Baker (6):
   docs/relnotes: Add sha256 sums for 20.0.2
   .pick_status.json: Update to cf62c2b2ac69637785f55b790fdd601c17e7e9d5
   .pick_status.json: Mark 672d10619980687acec329742f055f7f3796c1b8 as backported
   .pick_status.json: Mark c923de68dd0ab10a5a5fb3196f539707d046d897 as backported
   .pick_status.json: Mark 56de6f698e3f164d97f132203e8159ef0b8e9bb8 as denominated
   .pick_status.json: Update to aee004a7c8900938d1c17f0ac299d40001b383b0

Eric Engestrom (8):
   .pick_status.json: Update to 3252041a7872c49e53bb02ffe8b079b5fc43f15e
   .pick_status.json: Update to 12711939320e4fcd3a0d86af22da1042ad92035f
   .pick_status.json: Update to 05069e1f0794aadd40ce9269f858e50c64254388
   .pick_status.json: Update to 8970b7839aebefa7207c9535ac34ab4e8cc0ae25
   .pick_status.json: Update to 5f4d9b419a1c931ad468b8b22b8a95b1216891e4
   .pick_status.json: Update to 70ac7f5b0c46370075a35067c9f7dfe78e84b16d
   docs: add release notes for 20.0.3
   VERSION: bump to 20.0.3

Erik Faye-Lund (3):
   rbug: do not return void-value
   pipebuffer: clean up cast-warnings
   vtn/opencl: fully enable OpenCLstd_Clz

Francisco Jerez (1):
   intel/fs/gen12: Fix interaction of SWSB dependency combination with EU fusion workaround.

Greg V (1):
   amd/addrlib: fix build on non-x86 platforms

Ian Romanick (2):
   soft-fp64/fsat: Correctly handle NaN
   soft-fp64: Split a block that was missing a cast on a comparison

Jason Ekstrand (5):
   intel/blorp: Add support for swizzling fast-clear colors
   anv: Swizzle fast-clear values
   nir/lower_int64: Lower 8 and 16-bit downcasts with nir_lower_mov64
   anv: Account for the header in anv_state_stream_alloc
   spirv: Implement OpCopyObject and OpCopyLogical as blind copies

John Stultz (2):
   gallium: hud_context: Fix scalar initializer warning.
   vc4_bufmgr: Remove duplicative VC definition

Jordan Justen (2):
   intel: Update TGL PCI strings
   intel: Add TGL PCI ID

Lionel Landwerlin (5):
   isl: implement linear tiling row pitch requirement for display
   isl: properly filter supported display modifiers on Gen9+
   isl: only apply main surface ccs pitch constraint with CCS
   isl: drop min row pitch alignment when set by the driver
   intel: add new TGL pci ids

Marek Olšák (3):
   nir: fix clip/cull_distance_array_size in nir_lower_clip_cull_distance_arrays
   ac: fix fast division
   st/mesa: fix use of uninitialized memory due to st_nir_lower_builtin

Marek Vasut (1):
   etnaviv: Emit PE.ALPHA_COLOR_EXT* on GPUs with half-float support

Neil Armstrong (1):
   Revert "ci: Remove T820 from CI temporarily"

Pierre-Eric Pelloux-Prayer (1):
   st/mesa: disallow deferred flush if there are multiple contexts

Rhys Perry (11):
   nir/gather_info: handle emit_vertex_with_counter
   aco: set has_divergent_branch for discards in loops
   aco: handle missing second predecessors at merge block phis
   aco: skip NIR in unreachable merge blocks
   aco: improve check for unreachable loop continue blocks
   aco: emit IR in IF's merge block instead if the other side ends in a jump
   aco: fix boolean undef regclass
   nir/gather_info: fix per-vertex handling in try_mask_partial_io
   aco: implement 64-bit VGPR constant copies in handle_operands()
   glsl: fix race in instance getters
   util/u_queue: fix race in total_jobs_size access

Rob Clark (2):
   freedreno/ir3/ra: fix array liveranges
   util: fix u_fifo_pop()

Samuel Pitoiset (7):
   radv/gfx10: fix required subgroup size with VK_EXT_subgroup_size_control
   radv/gfx10: fix required ballot size with VK_EXT_subgroup_size_control
   radv: fix optional pSizes parameter when binding streamout buffers
   radv: enable VK_KHR_8bit_storage on GFX6-GFX7
   ac/nir: use llvm.amdgcn.rcp for nir_op_frcp
   ac/nir: use llvm.amdgcn.rsq for nir_op_frsq
   ac/nir:

Re: [Mesa-dev] [ANNOUNCE] Mesa 20.0 branchpoint planned for 2020/01/29, Milestone opened

2020-01-29 Thread Samuel Pitoiset



On 1/28/20 8:46 PM, Dylan Baker wrote:

Quoting Dylan Baker (2020-01-22 10:27:05)

Hi list, due to some last minute changes in plan I'll be managing the 20.0
release. The release calendar has been updated, but the gitlab milestone wasn't
opened. That has been corrected, and is here
https://gitlab.freedesktop.org/mesa/mesa/-/milestones/9, please add any issues
or MRs you would like to land before the branchpoint to the milestone.

Thanks,
Dylan


Hi list,

There are still a fair number of issues and MRs opened for the 20.0 branch
point, should we postpone the branch point?
Everything we wanted to merge to 20.0 is now upstream. No deadline 
extension request from the RADV team, at least. :-)


Dylan


Re: [Mesa-dev] 12.0 release?

2020-01-15 Thread Samuel Pitoiset


I should have said that in my previous reply:

We would like to have ACO/GFX6 support and
VK_AMD_shader_explicit_vertex_parameter in Mesa 20.0.

I think it's doable if the branchpoint is in two weeks or something like that.

FWIW, 19.0 branchpoint was on January 29 last year.

On 1/15/20 5:50 PM, Samuel Pitoiset wrote:


If we can wait until the end of January, that would be highly appreciated :-)

On 1/15/20 5:13 PM, Jason Ekstrand wrote:
When were we planning to cut the 20.0 release?  We just landed Vulkan 
1.2 support for ANV and RADV this morning so it seems like a good 
time to me.  The release calendar has nothing for 2020:


https://www.mesa3d.org/release-calendar.html

--Jason


Re: [Mesa-dev] 12.0 release?

2020-01-15 Thread Samuel Pitoiset


If we can wait until the end of January, that would be highly appreciated :-)

On 1/15/20 5:13 PM, Jason Ekstrand wrote:
When were we planning to cut the 20.0 release?  We just landed Vulkan 
1.2 support for ANV and RADV this morning so it seems like a good time 
to me.  The release calendar has nothing for 2020:


https://www.mesa3d.org/release-calendar.html

--Jason


Re: [Mesa-dev] libdrm versioning - switch to 19.0.0?

2019-10-11 Thread Samuel Pitoiset


LGTM

It's easier to figure out how old a release is with year-based versioning.

On 10/10/19 10:14 PM, Marek Olšák wrote:

Hi,

I expect to make a new libdrm release soon. Any objections to changing 
the versioning scheme?


Current: 2.4.n
n = starts from 0, incremented per release

New proposals:
year.n.0 (19.0.0)
year.month.n (19.10.0)
year.month.day (19.10.10)

Marek
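
To make the three proposals concrete, here is a minimal sketch that formats each scheme from a release date. The helper names are hypothetical and are not part of libdrm; the two-digit year is an assumption read off the examples in the mail (19.0.0, 19.10.0, 19.10.10).

```c
#include <stdio.h>

/* Hypothetical helpers, one per proposed scheme. Each writes the version
 * string into buf and returns snprintf()'s result. */

/* year.n.0: n counts releases within the year. */
int version_year_n(char *buf, size_t size, int year, int n)
{
    return snprintf(buf, size, "%d.%d.0", year % 100, n);
}

/* year.month.n: n counts releases within the month. */
int version_year_month_n(char *buf, size_t size, int year, int month, int n)
{
    return snprintf(buf, size, "%d.%d.%d", year % 100, month, n);
}

/* year.month.day: the release date itself. */
int version_year_month_day(char *buf, size_t size, int year, int month, int day)
{
    return snprintf(buf, size, "%d.%d.%d", year % 100, month, day);
}
```

With a release on 2019-10-10, the three helpers produce the strings given in the proposal list. Only the third scheme encodes the full date; the other two need a counter kept somewhere outside the date.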


Re: [Mesa-dev] Switching to Gitlab Issues instead of Bugzilla?

2019-08-30 Thread Samuel Pitoiset


+1

On 8/29/19 8:52 PM, Kenneth Graunke wrote:

Hi all,

As a lot of you have probably noticed, Bugzilla seems to be getting a
lot of spam these days - several of us have been disabling a bunch of
accounts per day, sweeping new reports under the rug, hiding comments,
etc.  This bug spam causes emails to be sent (more spam!) and then us
to have to look at ancient bugs that suddenly have updates.

I think it's probably time to consider switching away from Bugzilla.
We are one of the few projects remaining - Mesa, DRM, and a few DDX
drivers are still there, but almost all other projects are gone:

https://bugs.freedesktop.org/enter_bug.cgi

Originally, I was in favor of retaining Bugzilla just to not change too
many processes all at once.  But we've been using Gitlab a while now,
and several of us have been using Gitlab issues in our personal repos;
it's actually pretty nice.

Some niceties:

- Bug reporters don't necessarily need to sign up for an account
   anymore.  They can sign in with their Gitlab.com, Github, Google,
   or Twitter accounts.  Or make one as before.  This may be nicer for
   reporters that don't want to open yet another account just to report
   an issue to us.

- Anti-spam support is actually maintained.  Bugzilla makes it near
   impossible to actually delete garbage; Gitlab makes it easier.  It
   has a better account creation hurdle than Bugzilla's ancient captcha,
   and Akismet plug-ins for handling spam.

- The search interface is more modern and easier to use IMO.

- Permissions & accounts are easier - it's the same unified system.

- Easy linking between issues and MRs - mention one in the other, and
   both get updated with cross-links so you don't miss any discussion.

- Milestone tracking

   - This could be handy for release trackers - both features people
     want to land, and bugs blocking the release.

   - We could also use it for big efforts like direct state access,
     getting feature parity with fglrx, or whatnot.

- Khronos switched a while ago as well, so a number of us are already
   familiar with using it there.

Some cons:

- Moving bug reports between the kernel and Mesa would be harder.
   We would have to open a bug in the other system.  (Then again,
   moving bugs between Mesa and X or Wayland would be easier...)

What do people think?  If folks are in favor, Daniel can migrate
everything for us, like he did with the other projects.  If not,
I'd like to hear what people's concerns are.

--Ken


Re: [Mesa-dev] Navi14 for 19.2

2019-08-27 Thread Samuel Pitoiset


ACK.

On 8/26/19 9:05 PM, Marek Olšák wrote:

Hi,

I'd like to push the Navi14 merge request to 19.2 no later than 
Tuesday August 27.


https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1726

Please ack if it's OK with you,

Thanks,
Marek


[Mesa-dev] [PATCH v2] ac: fix exclusive scans on GFX8-GFX9

2019-08-21 Thread Samuel Pitoiset

This fixes a regression introduced with scan operations
on GFX10. Note that some subgroups CTS still fail on GFX10 but
I assume it's a different issue.

This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.

v2: - move the logic back to ac_build_scan()

Fixes: 227c29a80de "amd/common/gfx10: implement scan & reduce operations"
Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_llvm_build.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 05871f5ea98..5abae00d8f6 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -4221,10 +4221,9 @@ ac_build_scan(struct ac_llvm_context *ctx, nir_op op, LLVMValueRef src, LLVMValu
     if (ctx->chip_class >= GFX10) {
         result = inclusive ? src : identity;
     } else {
-        if (inclusive)
-            result = src;
-        else
-            result = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
+        if (!inclusive)
+            src = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
+        result = src;
     }
     if (maxprefix <= 1)
         return result;
-- 
2.22.1
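
The idea behind the GFX8-GFX9 branch can be modelled on the CPU: an exclusive scan is an inclusive scan over the input shifted right by one lane, with the identity placed in lane 0. The sketch below is only a scalar illustration of that relationship for integer addition; the function names are hypothetical, and it does not model DPP, wave sizes, or the LLVM code that `ac_build_scan()` emits.

```c
#include <stddef.h>

/* Inclusive prefix sum: dst[i] = src[0] + ... + src[i]. Safe in place. */
void inclusive_scan_add(const int *src, int *dst, size_t n)
{
    int acc = 0; /* identity for addition */
    for (size_t i = 0; i < n; i++) {
        acc += src[i];
        dst[i] = acc;
    }
}

/* Shift every lane right by one; lane 0 receives the identity.
 * Walks downwards so it also works when src == dst. */
void shift_right_1(const int *src, int *dst, size_t n, int identity)
{
    if (n == 0)
        return;
    for (size_t i = n - 1; i > 0; i--)
        dst[i] = src[i - 1];
    dst[0] = identity;
}

/* Exclusive prefix sum: dst[i] = src[0] + ... + src[i - 1], dst[0] = 0.
 * Built as "shift first, then run the inclusive scan". */
void exclusive_scan_add(const int *src, int *dst, size_t n)
{
    shift_right_1(src, dst, n, 0);
    inclusive_scan_add(dst, dst, n);
}
```

For the input {1, 2, 3, 4} the inclusive scan gives {1, 3, 6, 10} and the exclusive scan gives {0, 1, 3, 6}. Doing the shift on the source before the scan is the same ordering the v2 patch uses when it assigns the shifted value back to `src`.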


Re: [Mesa-dev] [PATCH] ac: fix exclusive scans on GFX8-GFX9

2019-08-21 Thread Samuel Pitoiset



On 8/21/19 3:59 PM, Bas Nieuwenhuizen wrote:

On Wed, Aug 21, 2019 at 3:45 PM Samuel Pitoiset
 wrote:

This fixes a regression introduced with scan operations
on GFX10. Note that some subgroups CTS still fail on GFX10 but
I assume it's a different issue.

This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.

Fixes: 227c29a80de "amd/common/gfx10: implement scan & reduce operations"
Signed-off-by: Samuel Pitoiset 
---
  src/amd/common/ac_llvm_build.c | 7 +++----
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 05871f5ea98..d72eaa2db46 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -4221,10 +4221,7 @@ ac_build_scan(struct ac_llvm_context *ctx, nir_op op, LLVMValueRef src, LLVMValu
     if (ctx->chip_class >= GFX10) {
         result = inclusive ? src : identity;
     } else {
-        if (inclusive)
-            result = src;
-        else
-            result = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
+        result = src;
     }
     if (maxprefix <= 1)
         return result;
@@ -4333,6 +4330,8 @@ ac_build_exclusive_scan(struct ac_llvm_context *ctx, LLVMValueRef src, nir_op op
         get_reduction_identity(ctx, op, ac_get_type_size(LLVMTypeOf(src)));
     result = LLVMBuildBitCast(ctx->builder, ac_build_set_inactive(ctx, src, identity),
                               LLVMTypeOf(identity), "");
+    if (ctx->chip_class <= GFX9)
+        result = ac_build_dpp(ctx, identity, result, dpp_wf_sr1, 0xf, 0xf, false);

Kinda annoying that we still do the inclusive/exclusive logic for
gfx10 inside ac_build_scan. Can we keep this inside the function by
using a intermediate src?

Looks better indeed, I will make that change.



     result = ac_build_scan(ctx, op, result, identity, ctx->wave_size, false);

     return ac_build_wwm(ctx, result);
--
2.22.1


[Mesa-dev] [PATCH] ac: fix exclusive scans on GFX8-GFX9

2019-08-21 Thread Samuel Pitoiset

This fixes a regression introduced with scan operations
on GFX10. Note that some subgroups CTS still fail on GFX10 but
I assume it's a different issue.

This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.

Fixes: 227c29a80de "amd/common/gfx10: implement scan & reduce operations"
Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_llvm_build.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 05871f5ea98..d72eaa2db46 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -4221,10 +4221,7 @@ ac_build_scan(struct ac_llvm_context *ctx, nir_op op, LLVMValueRef src, LLVMValu
     if (ctx->chip_class >= GFX10) {
         result = inclusive ? src : identity;
     } else {
-        if (inclusive)
-            result = src;
-        else
-            result = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
+        result = src;
     }
     if (maxprefix <= 1)
         return result;
@@ -4333,6 +4330,8 @@ ac_build_exclusive_scan(struct ac_llvm_context *ctx, LLVMValueRef src, nir_op op
         get_reduction_identity(ctx, op, ac_get_type_size(LLVMTypeOf(src)));
     result = LLVMBuildBitCast(ctx->builder, ac_build_set_inactive(ctx, src, identity),
                               LLVMTypeOf(identity), "");
+    if (ctx->chip_class <= GFX9)
+        result = ac_build_dpp(ctx, identity, result, dpp_wf_sr1, 0xf, 0xf, false);
     result = ac_build_scan(ctx, op, result, identity, ctx->wave_size, false);
 
     return ac_build_wwm(ctx, result);
-- 
2.22.1


[Mesa-dev] [PATCH 2/2] radv/gfx10: do not use NGG with NAVI14

2019-08-21 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 64bd0d64401..c049a2844b8 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -2320,6 +2320,7 @@ radv_fill_shader_keys(struct radv_device *device,
     }
 
     if (device->physical_device->rad_info.chip_class >= GFX10 &&
+        device->physical_device->rad_info.family != CHIP_NAVI14 &&
         !(device->instance->debug_flags & RADV_DEBUG_NO_NGG)) {
         if (nir[MESA_SHADER_TESS_CTRL]) {
             keys[MESA_SHADER_TESS_EVAL].vs_common_out.as_ngg = true;
-- 
2.22.1


[Mesa-dev] [PATCH 1/2] radv/gfx10: don't initialize VGT_INSTANCE_STEP_RATE_0

2019-08-21 Thread Samuel Pitoiset

Only gfx9 and older use it to get InstanceID in VGPR1.
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/si_cmd_buffer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index e37b9498a71..32674d38bb9 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -192,7 +192,8 @@ si_emit_graphics(struct radv_physical_device *physical_device,
         radeon_set_context_reg(cs, R_028B98_VGT_STRMOUT_BUFFER_CONFIG, 0x0);
     }
 
-    radeon_set_context_reg(cs, R_028AA0_VGT_INSTANCE_STEP_RATE_0, 1);
+    if (physical_device->rad_info.chip_class <= GFX9)
+        radeon_set_context_reg(cs, R_028AA0_VGT_INSTANCE_STEP_RATE_0, 1);
     if (!physical_device->has_clear_state)
         radeon_set_context_reg(cs, R_028AB8_VGT_VTX_CNT_EN, 0x0);
     if (physical_device->rad_info.chip_class < GFX7)
-- 
2.22.1


[Mesa-dev] [PATCH] radv: implement VK_AMD_shader_core_properties2

2019-08-21 Thread Samuel Pitoiset

Trivial extension that matches PAL.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c      | 9 +++++++++
 src/amd/vulkan/radv_extensions.py | 1 +
 2 files changed, 10 insertions(+)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index cc45ac95c08..5fde4577e4e 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -1300,6 +1300,15 @@ void radv_GetPhysicalDeviceProperties2(
             properties->vgprAllocationGranularity = 4;
             break;
         }
+        case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_CORE_PROPERTIES_2_AMD: {
+            VkPhysicalDeviceShaderCoreProperties2AMD *properties =
+                (VkPhysicalDeviceShaderCoreProperties2AMD *)ext;
+
+            properties->shaderCoreFeatures = 0;
+            properties->activeComputeUnitCount =
+                pdevice->rad_info.num_good_compute_units;
+            break;
+        }
         case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VERTEX_ATTRIBUTE_DIVISOR_PROPERTIES_EXT: {
             VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT *properties =
                 (VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT *)ext;
diff --git a/src/amd/vulkan/radv_extensions.py b/src/amd/vulkan/radv_extensions.py
index 3624970dd37..b28d74f5746 100644
--- a/src/amd/vulkan/radv_extensions.py
+++ b/src/amd/vulkan/radv_extensions.py
@@ -143,6 +143,7 @@ EXTENSIONS = [
     Extension('VK_AMD_rasterization_order',               1, 'device->has_out_of_order_rast'),
     Extension('VK_AMD_shader_ballot',                     1, 'device->use_shader_ballot'),
     Extension('VK_AMD_shader_core_properties',            1, True),
+    Extension('VK_AMD_shader_core_properties2',           1, True),
     Extension('VK_AMD_shader_info',                       1, True),
     Extension('VK_AMD_shader_trinary_minmax',             1, True),
     Extension('VK_GOOGLE_decorate_string',                1, True),
-- 
2.22.1
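
The new `case` relies on the usual Vulkan pattern of walking a chain of extension structs and dispatching on a type tag. Below is a minimal, self-contained sketch of that pattern. All `ex_` types, enum values and function names are hypothetical stand-ins, not the Vulkan headers or RADV code; real Vulkan structs repeat `sType`/`pNext` as their first two fields rather than embedding a base struct as done here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags standing in for VkStructureType values. */
enum ex_stype {
    EX_STYPE_OTHER = 0,
    EX_STYPE_CORE_PROPERTIES_2 = 1
};

/* Common header of every struct in the chain. */
struct ex_base {
    enum ex_stype sType;
    void *pNext;
};

/* Stand-in for VkPhysicalDeviceShaderCoreProperties2AMD. */
struct ex_core_properties2 {
    struct ex_base base; /* must stay the first member */
    uint32_t shaderCoreFeatures;
    uint32_t activeComputeUnitCount;
};

/* Walk the chain and fill in every struct whose tag we recognise;
 * unknown tags are skipped. */
void ex_fill_properties(void *chain, uint32_t num_good_compute_units)
{
    for (struct ex_base *ext = chain; ext != NULL; ext = ext->pNext) {
        switch (ext->sType) {
        case EX_STYPE_CORE_PROPERTIES_2: {
            struct ex_core_properties2 *props =
                (struct ex_core_properties2 *)ext;

            props->shaderCoreFeatures = 0;
            props->activeComputeUnitCount = num_good_compute_units;
            break;
        }
        default:
            break;
        }
    }
}
```

A caller links its structs through `pNext` and passes the head of the chain. Structs with unrecognised tags are left untouched, which is what lets a driver add support for a new struct without affecting callers that never ask for it.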


[Mesa-dev] [PATCH] radv: allow to enable VK_AMD_shader_ballot only on GFX8+

2019-08-21 Thread Samuel Pitoiset

Scans aren't implemented on SI/CIK.

Cc: 19.2 
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c | 3 ++-
 src/amd/vulkan/radv_shader.c | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index cc45ac95c08..4aafe6e78aa 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -383,7 +383,8 @@ radv_physical_device_init(struct radv_physical_device *device,
                                      device->rad_info.family == CHIP_RENOIR ||
                                      device->rad_info.chip_class >= GFX10;
 
-    device->use_shader_ballot = device->instance->perftest_flags & RADV_PERFTEST_SHADER_BALLOT;
+    device->use_shader_ballot = device->rad_info.chip_class >= GFX8 &&
+                                device->instance->perftest_flags & RADV_PERFTEST_SHADER_BALLOT;
 
     /* Determine the number of threads per wave for all stages. */
     device->cs_wave_size = 64;
diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 1e6a9a950d8..f2a8ac8abe3 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -297,7 +297,7 @@ radv_shader_compile_to_nir(struct radv_device *device,
             .lower_ubo_ssbo_access_to_offsets = true,
             .caps = {
                 .amd_gcn_shader = true,
-                .amd_shader_ballot = device->instance->perftest_flags & RADV_PERFTEST_SHADER_BALLOT,
+                .amd_shader_ballot = device->physical_device->use_shader_ballot,
                 .amd_trinary_minmax = true,
                 .derivative_group = true,
                 .descriptor_array_dynamic_indexing = true,
-- 
2.22.1


[Mesa-dev] [PATCH 1/2] radv: add a new debug option called RADV_DEBUG=noshaderballot

2019-08-20 Thread Samuel Pitoiset

Shader ballot will be enabled by default for Wolfenstein
Youngblood. This follows what we did for sisched.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_debug.h  | 1 +
 src/amd/vulkan/radv_device.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/amd/vulkan/radv_debug.h b/src/amd/vulkan/radv_debug.h
index ef5b331d188..1a8b9a42c20 100644
--- a/src/amd/vulkan/radv_debug.h
+++ b/src/amd/vulkan/radv_debug.h
@@ -53,6 +53,7 @@ enum {
     RADV_DEBUG_NOBINNING         = 0x80,
     RADV_DEBUG_NO_LOAD_STORE_OPT = 0x100,
     RADV_DEBUG_NO_NGG            = 0x200,
+    RADV_DEBUG_NO_SHADER_BALLOT  = 0x400,
 };
 
 enum {
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index cc45ac95c08..49518d43218 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -495,6 +495,7 @@ static const struct debug_control radv_debug_options[] = {
{"nobinning", RADV_DEBUG_NOBINNING},
{"noloadstoreopt", RADV_DEBUG_NO_LOAD_STORE_OPT},
{"nongg", RADV_DEBUG_NO_NGG},
+   {"noshaderballot", RADV_DEBUG_NO_SHADER_BALLOT},
{NULL, 0}
 };
 
-- 
2.22.1


[Mesa-dev] [PATCH 2/2] radv: force enable VK_AMD_shader_ballot for Wolfenstein Youngblood

2019-08-20 Thread Samuel Pitoiset

This gives a nice boost, +20% at this time on my Vega 56. Shader
ballot should be enabled by default at some point, but it reduces
performance a bit (-6%) with Wolfenstein II. Enable it only for
Youngblood at the moment, as we did for Talos in the past.

As a bonus, it gets rid of some minor artifacts that only
happen when ballot is disabled, for some reason.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 49518d43218..c04f6a27e82 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -554,6 +554,14 @@ radv_handle_per_app_options(struct radv_instance *instance,
 */
if (HAVE_LLVM < 0x900)
instance->debug_flags |= RADV_DEBUG_NO_LOAD_STORE_OPT;
+   } else if (!strcmp(name, "Wolfenstein: Youngblood")) {
+   if (!(instance->debug_flags & RADV_DEBUG_NO_SHADER_BALLOT)) {
+   /* Force enable VK_AMD_shader_ballot because it looks
+* safe and it gives a nice boost (+20% on Vega 56 at
+* this time).
+*/
+   instance->perftest_flags |= RADV_PERFTEST_SHADER_BALLOT;
+   }
}
 }
 
-- 
2.22.1
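
Taken together, the two patches above give a per-application default that RADV_DEBUG=noshaderballot can veto. The sketch below isolates that ordering. It is not the driver code: the macro names and the helper `sketch_per_app_options` are hypothetical, and only the application name string, the flag values and the order of the checks follow the patches.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names; the values mirror the bits shown in the patches. */
#define SK_DEBUG_NO_SHADER_BALLOT 0x400u
#define SK_PERFTEST_SHADER_BALLOT 0x40u

/* Returns the perftest flags after applying the per-application default. */
static uint64_t
sketch_per_app_options(const char *name, uint64_t debug_flags,
                       uint64_t perftest_flags)
{
    if (name && !strcmp(name, "Wolfenstein: Youngblood")) {
        /* The user's opt-out wins over the per-app force-enable. */
        if (!(debug_flags & SK_DEBUG_NO_SHADER_BALLOT))
            perftest_flags |= SK_PERFTEST_SHADER_BALLOT;
    }
    return perftest_flags;
}
```

Checking the debug flag inside the application branch keeps the opt-out local to this one override, so other applications are unaffected by it.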


[Mesa-dev] [PATCH 2/2] radv/gfx10: hardcode some depth+stencil formats in the format table

2019-08-20 Thread Samuel Pitoiset

The script doesn't handle them correctly, and D16_UNORM_S8_UINT
isn't supported by the hardware, so mark it as invalid.

This fixes a warning when generating gfx10_format_table.h.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111393
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/gfx10_format_table.py | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/amd/vulkan/gfx10_format_table.py 
b/src/amd/vulkan/gfx10_format_table.py
index 81b0bed92aa..f55b302bf82 100644
--- a/src/amd/vulkan/gfx10_format_table.py
+++ b/src/amd/vulkan/gfx10_format_table.py
@@ -66,6 +66,11 @@ HARDCODED = {
 'VK_FORMAT_BC6H_SFLOAT_BLOCK': hardcoded_format('BC6_SFLOAT'),
 'VK_FORMAT_BC7_UNORM_BLOCK': hardcoded_format('BC7_UNORM'),
 'VK_FORMAT_BC7_SRGB_BLOCK': hardcoded_format('BC7_SRGB'),
+
+# DS
+'VK_FORMAT_D16_UNORM_S8_UINT': hardcoded_format('INVALID'),
+'VK_FORMAT_D24_UNORM_S8_UINT': hardcoded_format('8_24_UNORM'),
+'VK_FORMAT_D32_SFLOAT_S8_UINT': hardcoded_format('X24_8_32_FLOAT'),
 }
 
 
-- 
2.22.1
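
The idea of the patch above, a small override table that is consulted before any generic format matching, can be sketched in C as follows. This is an assumption-laden illustration rather than the generator's logic: the struct, the table name and `sketch_lookup_hardcoded_ds` are hypothetical. Only the three format pairs come from the diff.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pair type: Vulkan format name -> hardware format name. */
struct sketch_hardcoded_format {
    const char *vk_name;
    const char *img_name;
};

/* The three depth+stencil entries added by the patch. */
static const struct sketch_hardcoded_format sketch_hardcoded_ds[] = {
    {"VK_FORMAT_D16_UNORM_S8_UINT", "INVALID"},
    {"VK_FORMAT_D24_UNORM_S8_UINT", "8_24_UNORM"},
    {"VK_FORMAT_D32_SFLOAT_S8_UINT", "X24_8_32_FLOAT"},
};

/* Returns the hard-coded hardware format name, or NULL when no override
 * exists and the caller should fall back to generic matching. */
static const char *
sketch_lookup_hardcoded_ds(const char *vk_name)
{
    size_t count = sizeof(sketch_hardcoded_ds) / sizeof(sketch_hardcoded_ds[0]);

    for (size_t i = 0; i < count; i++) {
        if (!strcmp(sketch_hardcoded_ds[i].vk_name, vk_name))
            return sketch_hardcoded_ds[i].img_name;
    }
    return NULL;
}
```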


[Mesa-dev] [PATCH 1/2] radv/gfx10: tidy up gfx10_format_table.py

2019-08-20 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/gfx10_format_table.py | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/src/amd/vulkan/gfx10_format_table.py 
b/src/amd/vulkan/gfx10_format_table.py
index 34ad5f6cdf2..81b0bed92aa 100644
--- a/src/amd/vulkan/gfx10_format_table.py
+++ b/src/amd/vulkan/gfx10_format_table.py
@@ -21,7 +21,7 @@
 # USE OR OTHER DEALINGS IN THE SOFTWARE.
 #
 """
-Script that generates the mapping from Gallium PIPE_FORMAT_xxx to gfx10
+Script that generates the mapping from Vulkan VK_FORMAT_xxx to gfx10
 IMG_FORMAT_xxx enums.
 """
 
@@ -34,12 +34,10 @@ import re
 import sys
 
 AMD_REGISTERS = os.path.abspath(os.path.join(os.path.dirname(sys.argv[0]), 
"../registers"))
-#GALLIUM_UTIL = os.path.abspath(os.path.join(os.path.dirname(sys.argv[0]), 
"../../auxiliary/util"))
 sys.path.extend([AMD_REGISTERS])
 
 from regdb import Object, RegisterDatabase
 from vk_format_parse import *
-#from u_format_parse import *
 
 # 
 # Hard-coded mappings
@@ -82,11 +80,11 @@ header_template = mako.template.Template("""\
  ##__VA_ARGS__ }
 
 static const struct gfx10_format gfx10_format_table[VK_FORMAT_RANGE_SIZE] = {
-% for pipe_format, args in formats:
+% for vk_format, args in formats:
  % if args is not None:
-  [${pipe_format}] = FMT(${args}),
+  [${vk_format}] = FMT(${args}),
  % else:
-/* ${pipe_format} is not supported */
+/* ${vk_format} is not supported */
  % endif
 % endfor
 };
@@ -114,8 +112,8 @@ class Gfx10Format(object):
 
 
 class Gfx10FormatMapping(object):
-def __init__(self, pipe_formats, gfx10_formats):
-self.pipe_formats = pipe_formats
+def __init__(self, vk_formats, gfx10_formats):
+self.vk_formats = vk_formats
 self.gfx10_formats = gfx10_formats
 
 self.plain_gfx10_formats = dict(
@@ -219,17 +217,17 @@ class Gfx10FormatMapping(object):
 
 
 if __name__ == '__main__':
-pipe_formats = parse(sys.argv[1])
+vk_formats = parse(sys.argv[1])
 
 with open(sys.argv[2], 'r') as filp:
 db = RegisterDatabase.from_json(json.load(filp))
 
 gfx10_formats = [Gfx10Format(entry) for entry in 
db.enum('IMG_FORMAT').entries]
 
-mapping = Gfx10FormatMapping(pipe_formats, gfx10_formats)
+mapping = Gfx10FormatMapping(vk_formats, gfx10_formats)
 
 formats = []
-for fmt in pipe_formats:
+for fmt in vk_formats:
 if fmt.name in HARDCODED:
 obj = HARDCODED[fmt.name]
 else:
-- 
2.22.1


[Mesa-dev] [PATCH 2/2] radv/gfx10: do not emit PA_SC_TILE_STEERING_OVERRIDE twice

2019-08-19 Thread Samuel Pitoiset

CLEAR_STATE emits it for us.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/si_cmd_buffer.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index a5057fe25a2..68ec925f2b5 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -366,8 +366,6 @@ si_emit_graphics(struct radv_physical_device 
*physical_device,
radeon_set_context_reg(cs, R_028C50_PA_SC_NGG_MODE_CNTL,
   S_028C50_MAX_DEALLOCS_IN_WAVE(512));
radeon_set_context_reg(cs, 
R_028C58_VGT_VERTEX_REUSE_BLOCK_CNTL, 14);
-   radeon_set_context_reg(cs, 
R_02835C_PA_SC_TILE_STEERING_OVERRIDE,
-  
physical_device->rad_info.pa_sc_tile_steering_override);
radeon_set_context_reg(cs, R_02807C_DB_RMI_L2_CACHE_CONTROL,
   
S_02807C_Z_WR_POLICY(V_02807C_CACHE_STREAM_WR) |
   
S_02807C_S_WR_POLICY(V_02807C_CACHE_STREAM_WR) |
-- 
2.22.1


[Mesa-dev] [PATCH 1/2] radv: do not emit PKT3_CONTEXT_CONTROL with AMDGPU 3.6.0+

2019-08-19 Thread Samuel Pitoiset

It's emitted by the kernel.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c   | 9 ++---
 src/amd/vulkan/si_cmd_buffer.c | 9 ++---
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 05d09bb08eb..110808fb98d 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -2008,9 +2008,12 @@ VkResult radv_CreateDevice(
device->empty_cs[family] = device->ws->cs_create(device->ws, 
family);
switch (family) {
case RADV_QUEUE_GENERAL:
-   radeon_emit(device->empty_cs[family], 
PKT3(PKT3_CONTEXT_CONTROL, 1, 0));
-   radeon_emit(device->empty_cs[family], 
CONTEXT_CONTROL_LOAD_ENABLE(1));
-   radeon_emit(device->empty_cs[family], 
CONTEXT_CONTROL_SHADOW_ENABLE(1));
+ /* Since amdgpu version 3.6.0, CONTEXT_CONTROL is emitted 
by the kernel */
+   if (device->physical_device->rad_info.drm_minor < 6) {
+   radeon_emit(device->empty_cs[family], 
PKT3(PKT3_CONTEXT_CONTROL, 1, 0));
+   radeon_emit(device->empty_cs[family], 
CONTEXT_CONTROL_LOAD_ENABLE(1));
+   radeon_emit(device->empty_cs[family], 
CONTEXT_CONTROL_SHADOW_ENABLE(1));
+   }
break;
case RADV_QUEUE_COMPUTE:
radeon_emit(device->empty_cs[family], PKT3(PKT3_NOP, 0, 
0));
diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 701b2398b50..a5057fe25a2 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -161,9 +161,12 @@ si_emit_graphics(struct radv_physical_device 
*physical_device,
 {
int i;
 
-   radeon_emit(cs, PKT3(PKT3_CONTEXT_CONTROL, 1, 0));
-   radeon_emit(cs, CONTEXT_CONTROL_LOAD_ENABLE(1));
-   radeon_emit(cs, CONTEXT_CONTROL_SHADOW_ENABLE(1));
+   /* Since amdgpu version 3.6.0, CONTEXT_CONTROL is emitted by the kernel 
*/
+   if (physical_device->rad_info.drm_minor < 6) {
+   radeon_emit(cs, PKT3(PKT3_CONTEXT_CONTROL, 1, 0));
+   radeon_emit(cs, CONTEXT_CONTROL_LOAD_ENABLE(1));
+   radeon_emit(cs, CONTEXT_CONTROL_SHADOW_ENABLE(1));
+   }
 
if (physical_device->has_clear_state) {
radeon_emit(cs, PKT3(PKT3_CLEAR_STATE, 0, 0));
-- 
2.22.1
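
The version gate used in the patch above can be expressed as a small predicate. The sketch below is hypothetical and deliberately a little more defensive than the diff: the patch only compares `drm_minor` against 6, whereas this helper also takes the major version, on the assumption that amdgpu reports major version 3.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when userspace still has to emit CONTEXT_CONTROL itself,
 * i.e. when the kernel driver is older than amdgpu 3.6.0.
 * The major-version comparison is an addition for illustration; the patch
 * itself only checks the minor version. */
static bool
sketch_needs_context_control(uint32_t drm_major, uint32_t drm_minor)
{
    if (drm_major != 3)
        return drm_major < 3;
    return drm_minor < 6;
}
```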


[Mesa-dev] [PATCH] radv: fix image_has_{cmask,fmask}() helpers

2019-08-02 Thread Samuel Pitoiset

The driver should now rely on cmask_offset because CMASK can be
disabled by the driver in some cases (e.g. mipmaps). Apply the
same change to FMASK, although it should make no difference there.

Fixes: ad1bc8621df ("radv: remove radv_get_image_fmask_info()")
Fixes: 10d08da52c6 ("radv/gfx10: add missing dcc_tile_swizzle tweak")
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_private.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 49d3c78db98..ee0761e69fe 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -1633,7 +1633,7 @@ bool radv_layout_dcc_compressed(const struct radv_image 
*image,
 static inline bool
 radv_image_has_cmask(const struct radv_image *image)
 {
-   return image->planes[0].surface.cmask_size;
+   return image->cmask_offset;
 }
 
 /**
@@ -1642,7 +1642,7 @@ radv_image_has_cmask(const struct radv_image *image)
 static inline bool
 radv_image_has_fmask(const struct radv_image *image)
 {
-   return image->planes[0].surface.fmask_size;
+   return image->fmask_offset;
 }
 
 /**
-- 
2.22.0
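
Why the offset is a better presence test than the size can be shown with a toy model. Everything in this sketch is hypothetical (struct, field and function names). It assumes, as the commit message suggests, that the surface layout may report a non-zero CMASK size even when the driver decides not to allocate CMASK, and that a real allocation always lands at a non-zero offset because the main surface comes first.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy image: the surface layout reports a CMASK size, but the offset is
 * only assigned if the driver actually allocates the metadata. */
struct sketch_image {
    uint64_t surface_cmask_size; /* from the surface computation */
    uint64_t cmask_offset;       /* 0 means "not allocated" */
    uint64_t size;               /* total size so far, non-zero */
};

static void
sketch_alloc_cmask(struct sketch_image *img, bool driver_wants_cmask)
{
    if (!driver_wants_cmask || !img->surface_cmask_size)
        return; /* the size stays non-zero, the offset stays 0 */

    img->cmask_offset = img->size;
    img->size += img->surface_cmask_size;
}

/* Presence check based on the offset, as in the fixed helper. */
static bool
sketch_has_cmask(const struct sketch_image *img)
{
    return img->cmask_offset != 0;
}
```

A check based on `surface_cmask_size` would report CMASK for the skipped image as well, which is the mismatch the fix addresses.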


[Mesa-dev] [PATCH 2/2] radv: remove radv_get_image_fmask_info()

2019-08-01 Thread Samuel Pitoiset

It's unnecessary to duplicate fields in another struct.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c | 12 -
 src/amd/vulkan/radv_image.c  | 44 +++-
 src/amd/vulkan/radv_meta_clear.c | 12 ++---
 src/amd/vulkan/radv_private.h| 16 ++--
 4 files changed, 25 insertions(+), 59 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 9aa731a252c..b9db931c309 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -4406,9 +4406,9 @@ radv_initialise_color_surface(struct radv_device *device,
 
if (radv_image_has_fmask(iview->image)) {
if (device->physical_device->rad_info.chip_class >= 
GFX7)
-   cb->cb_color_pitch |= 
S_028C64_FMASK_TILE_MAX(iview->image->fmask.pitch_in_pixels / 8 - 1);
-   cb->cb_color_attrib |= 
S_028C74_FMASK_TILE_MODE_INDEX(iview->image->fmask.tile_mode_index);
-   cb->cb_color_fmask_slice = 
S_028C88_TILE_MAX(iview->image->fmask.slice_tile_max);
+   cb->cb_color_pitch |= 
S_028C64_FMASK_TILE_MAX(surf->u.legacy.fmask.pitch_in_pixels / 8 - 1);
+   cb->cb_color_attrib |= 
S_028C74_FMASK_TILE_MODE_INDEX(surf->u.legacy.fmask.tiling_index);
+   cb->cb_color_fmask_slice = 
S_028C88_TILE_MAX(surf->u.legacy.fmask.slice_tile_max);
} else {
/* This must be set for fast clear to work without 
FMASK. */
if (device->physical_device->rad_info.chip_class >= 
GFX7)
@@ -4449,9 +4449,9 @@ radv_initialise_color_surface(struct radv_device *device,
}
 
if (radv_image_has_fmask(iview->image)) {
-   va = radv_buffer_get_va(iview->bo) + iview->image->offset + 
iview->image->fmask.offset;
+   va = radv_buffer_get_va(iview->bo) + iview->image->offset + 
iview->image->fmask_offset;
cb->cb_color_fmask = va >> 8;
-   cb->cb_color_fmask |= iview->image->fmask.tile_swizzle;
+   cb->cb_color_fmask |= surf->fmask_tile_swizzle;
} else {
cb->cb_color_fmask = cb->cb_color_base;
}
@@ -4501,7 +4501,7 @@ radv_initialise_color_surface(struct radv_device *device,
if (radv_image_has_fmask(iview->image)) {
cb->cb_color_info |= S_028C70_COMPRESSION(1);
if (device->physical_device->rad_info.chip_class == GFX6) {
-   unsigned fmask_bankh = 
util_logbase2(iview->image->fmask.bank_height);
+   unsigned fmask_bankh = 
util_logbase2(surf->u.legacy.fmask.bankh);
cb->cb_color_attrib |= 
S_028C74_FMASK_BANK_HEIGHT(fmask_bankh);
}
 
diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index aaaf15ec8dc..efbb9de96b7 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -715,7 +715,7 @@ gfx10_make_texture_descriptor(struct radv_device *device,
 
assert(image->plane_count == 1);
 
-   va = gpu_address + image->offset + image->fmask.offset;
+   va = gpu_address + image->offset + image->fmask_offset;
 
switch (image->info.samples) {
case 2:
@@ -879,7 +879,7 @@ si_make_texture_descriptor(struct radv_device *device,
 
assert(image->plane_count == 1);
 
-   va = gpu_address + image->offset + image->fmask.offset;
+   va = gpu_address + image->offset + image->fmask_offset;
 
if (device->physical_device->rad_info.chip_class == GFX9) {
fmask_format = V_008F14_IMG_DATA_FORMAT_FMASK;
@@ -915,7 +915,7 @@ si_make_texture_descriptor(struct radv_device *device,
}
 
fmask_state[0] = va >> 8;
-   fmask_state[0] |= image->fmask.tile_swizzle;
+   fmask_state[0] |= image->planes[0].surface.fmask_tile_swizzle;
fmask_state[1] = S_008F14_BASE_ADDRESS_HI(va >> 40) |
S_008F14_DATA_FORMAT(fmask_format) |
S_008F14_NUM_FORMAT(num_format);
@@ -946,9 +946,9 @@ si_make_texture_descriptor(struct radv_device *device,
fmask_state[7] |= va >> 8;
}
} else {
-   fmask_state[3] |= 
S_008F1C_TILING_INDEX(image->fmask.tile_mode_index);
+   fmask_state[3] |= 
S_008F1C_TILING_INDEX(image->planes[0].surface.u.legacy.fmask.tiling_index);
fmask_state[4] |= S_008F20_DEPTH(depth - 1) |
-   S_008F20_PI

[Mesa-dev] [PATCH 1/2] radv: remove radv_get_image_cmask_info()

2019-08-01 Thread Samuel Pitoiset

It's unnecessary to duplicate fields in another struct.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c |  4 ++--
 src/amd/vulkan/radv_image.c  | 38 +---
 src/amd/vulkan/radv_meta_clear.c | 11 +
 src/amd/vulkan/radv_private.h| 13 ++-
 4 files changed, 21 insertions(+), 45 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 29be192443a..9aa731a252c 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -4400,7 +4400,7 @@ radv_initialise_color_surface(struct radv_device *device,
 
cb->cb_color_pitch = S_028C64_TILE_MAX(pitch_tile_max);
cb->cb_color_slice = S_028C68_TILE_MAX(slice_tile_max);
-   cb->cb_color_cmask_slice = iview->image->cmask.slice_tile_max;
+   cb->cb_color_cmask_slice = surf->u.legacy.cmask_slice_tile_max;
 
cb->cb_color_attrib |= 
S_028C74_TILE_MODE_INDEX(tile_mode_index);
 
@@ -4420,7 +4420,7 @@ radv_initialise_color_surface(struct radv_device *device,
 
/* CMASK variables */
va = radv_buffer_get_va(iview->bo) + iview->image->offset;
-   va += iview->image->cmask.offset;
+   va += iview->image->cmask_offset;
cb->cb_color_cmask = va >> 8;
 
va = radv_buffer_get_va(iview->bo) + iview->image->offset;
diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 8ff93e4344c..aaaf15ec8dc 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -939,7 +939,7 @@ si_make_texture_descriptor(struct radv_device *device,
  
S_008F24_META_RB_ALIGNED(image->planes[0].surface.u.gfx9.cmask.rb_aligned);
 
if (radv_image_is_tc_compat_cmask(image)) {
-   va = gpu_address + image->offset + 
image->cmask.offset;
+   va = gpu_address + image->offset + 
image->cmask_offset;
 
fmask_state[5] |= S_008F24_META_DATA_ADDRESS(va 
>> 40);
fmask_state[6] |= S_008F28_COMPRESSION_EN(1);
@@ -952,7 +952,7 @@ si_make_texture_descriptor(struct radv_device *device,
fmask_state[5] |= S_008F24_LAST_ARRAY(last_layer);
 
if (radv_image_is_tc_compat_cmask(image)) {
-   va = gpu_address + image->offset + 
image->cmask.offset;
+   va = gpu_address + image->offset + 
image->cmask_offset;
 
fmask_state[6] |= S_008F28_COMPRESSION_EN(1);
fmask_state[7] |= va >> 8;
@@ -1138,45 +1138,27 @@ radv_image_alloc_fmask(struct radv_device *device,
image->alignment = MAX2(image->alignment, image->fmask.alignment);
 }
 
-static void
-radv_image_get_cmask_info(struct radv_device *device,
- struct radv_image *image,
- struct radv_cmask_info *out)
-{
-   assert(image->plane_count == 1);
-
-   if (device->physical_device->rad_info.chip_class >= GFX9) {
-   out->alignment = image->planes[0].surface.cmask_alignment;
-   out->size = image->planes[0].surface.cmask_size;
-   return;
-   }
-
-   out->slice_tile_max = 
image->planes[0].surface.u.legacy.cmask_slice_tile_max;
-   out->alignment = image->planes[0].surface.cmask_alignment;
-   out->slice_size = image->planes[0].surface.cmask_slice_size;
-   out->size = image->planes[0].surface.cmask_size;
-}
-
 static void
 radv_image_alloc_cmask(struct radv_device *device,
   struct radv_image *image)
 {
+   unsigned cmask_alignment = image->planes[0].surface.cmask_alignment;
+   unsigned cmask_size = image->planes[0].surface.cmask_size;
uint32_t clear_value_size = 0;
-   radv_image_get_cmask_info(device, image, &image->cmask);
 
-   if (!image->cmask.size)
+   if (!cmask_size)
return;
 
-   assert(image->cmask.alignment);
+   assert(cmask_alignment);
 
-   image->cmask.offset = align64(image->size, image->cmask.alignment);
+   image->cmask_offset = align64(image->size, cmask_alignment);
/* + 8 for storing the clear values */
if (!image->clear_value_offset) {
-   image->clear_value_offset = image->cmask.offset + 
image->cmask.size;
+   image->clear_value_offset = image->cmask_offset + cmask_size;
clear_value_size = 8;
}
-   image->size = image->cmask.offset + image->cmask.size + 
clear_value_size;
-   image->alignment = MAX2(image->alignment, image->cmask.alignment);
+   image->size = image->

[Mesa-dev] [PATCH 1/2] radv: only account for tile_swizzle for color surfaces with DCC

2019-08-01 Thread Samuel Pitoiset

It's 0 for depth surfaces with TC compat HTILE enabled.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_image.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index f3237dd5985..221b554e73e 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -483,6 +483,8 @@ si_set_mutable_tex_desc_fields(struct radv_device *device,
meta_va = gpu_address + image->dcc_offset;
if (chip_class <= GFX8)
meta_va += base_level_info->dcc_offset;
+
+   meta_va |= (uint32_t)plane->surface.tile_swizzle << 8;
} else if (!is_storage_image &&
   radv_image_is_tc_compat_htile(image)) {
meta_va = gpu_address + image->htile_offset;
@@ -490,10 +492,8 @@ si_set_mutable_tex_desc_fields(struct radv_device *device,
 
if (meta_va) {
state[6] |= S_008F28_COMPRESSION_EN(1);
-   if (chip_class <= GFX9) {
+   if (chip_class <= GFX9)
state[7] = meta_va >> 8;
-   state[7] |= plane->surface.tile_swizzle;
-   }
}
}
 
-- 
2.22.0


[Mesa-dev] [PATCH 2/2] radv/gfx10: add missing dcc_tile_swizzle tweak

2019-08-01 Thread Samuel Pitoiset

Fixes: c90f46700dd ("radv/gfx10: mask DCC tile swizzle by alignment")
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_image.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 221b554e73e..8ff93e4344c 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -484,7 +484,9 @@ si_set_mutable_tex_desc_fields(struct radv_device *device,
if (chip_class <= GFX8)
meta_va += base_level_info->dcc_offset;
 
-   meta_va |= (uint32_t)plane->surface.tile_swizzle << 8;
+   unsigned dcc_tile_swizzle = plane->surface.tile_swizzle 
<< 8;
+   dcc_tile_swizzle &= plane->surface.dcc_alignment - 1;
+   meta_va |= dcc_tile_swizzle;
} else if (!is_storage_image &&
   radv_image_is_tc_compat_htile(image)) {
meta_va = gpu_address + image->htile_offset;
-- 
2.22.0
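
The masking step in the patch above is easiest to follow with numbers. The helper below is a hypothetical, standalone restatement of the three added lines. It assumes `dcc_alignment` is a power of two, so that `dcc_alignment - 1` is a valid bit mask.

```c
#include <stdint.h>

/* Shift the tile swizzle into place, keep only the bits below the DCC
 * alignment, then OR the result into the metadata address.
 * Assumes dcc_alignment is a power of two. */
static uint64_t
sketch_apply_dcc_tile_swizzle(uint64_t meta_va, uint32_t tile_swizzle,
                              uint32_t dcc_alignment)
{
    uint32_t dcc_tile_swizzle = tile_swizzle << 8;

    dcc_tile_swizzle &= dcc_alignment - 1;
    return meta_va | dcc_tile_swizzle;
}
```

For example, with `tile_swizzle = 0x13` the shifted value is `0x1300`. With a 4 KiB alignment (`0x1000`) the mask `0xFFF` keeps only `0x300`, so bits at or above the alignment never leak into the address. With a 64 KiB alignment (`0x10000`) the full `0x1300` survives.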


[Mesa-dev] [PATCH 3/4] radv/gfx10: determine correct wave size when lowering subgroups

2019-08-01 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_shader.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 97fa80b348c..f0ab2d5e467 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -124,6 +124,17 @@ unsigned shader_io_get_unique_index(gl_varying_slot slot)
unreachable("illegal slot in get unique index\n");
 }
 
+static uint8_t
+radv_get_shader_wave_size(const struct radv_physical_device *pdevice,
+ gl_shader_stage stage)
+{
+   if (stage == MESA_SHADER_COMPUTE)
+   return pdevice->cs_wave_size;
+   else if (stage == MESA_SHADER_FRAGMENT)
+   return pdevice->ps_wave_size;
+   return pdevice->ge_wave_size;
+}
+
 VkResult radv_CreateShaderModule(
VkDevice _device,
const VkShaderModuleCreateInfo* pCreateInfo,
@@ -422,9 +433,13 @@ radv_shader_compile_to_nir(struct radv_device *device,
 
nir_lower_global_vars_to_local(nir);
nir_remove_dead_variables(nir, nir_var_function_temp);
+
+   uint8_t wave_size = radv_get_shader_wave_size(device->physical_device,
+ nir->info.stage);
+
nir_lower_subgroups(nir, &(struct nir_lower_subgroups_options) {
-   .subgroup_size = 64,
-   .ballot_bit_size = 64,
+   .subgroup_size = wave_size,
+   .ballot_bit_size = wave_size,
.lower_to_scalar = 1,
.lower_subgroup_masks = 1,
.lower_shuffle = 1,
@@ -667,17 +682,6 @@ radv_get_shader_binary_size(size_t code_size)
return code_size + DEBUGGER_NUM_MARKERS * 4;
 }
 
-static uint8_t
-radv_get_shader_wave_size(const struct radv_physical_device *pdevice,
- gl_shader_stage stage)
-{
-   if (stage == MESA_SHADER_COMPUTE)
-   return pdevice->cs_wave_size;
-   else if (stage == MESA_SHADER_FRAGMENT)
-   return pdevice->ps_wave_size;
-   return pdevice->ge_wave_size;
-}
-
 static void radv_postprocess_config(const struct radv_physical_device *pdevice,
const struct ac_shader_config *config_in,
const struct radv_shader_variant_info *info,
-- 
2.22.0
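
The per-stage selection that the patch above moves and then feeds into the subgroup lowering options can be sketched without any Mesa types. The enum, struct and function names below are hypothetical; only the three-way split (compute, fragment, everything else) mirrors the helper in the diff.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the shader stage enum and physical device. */
enum sketch_stage { SK_STAGE_VERTEX, SK_STAGE_GEOMETRY, SK_STAGE_FRAGMENT, SK_STAGE_COMPUTE };

struct sketch_pdevice {
    uint8_t cs_wave_size; /* compute */
    uint8_t ps_wave_size; /* pixel/fragment */
    uint8_t ge_wave_size; /* vertex, tessellation, geometry */
};

static uint8_t
sketch_get_wave_size(const struct sketch_pdevice *pdevice, enum sketch_stage stage)
{
    if (stage == SK_STAGE_COMPUTE)
        return pdevice->cs_wave_size;
    else if (stage == SK_STAGE_FRAGMENT)
        return pdevice->ps_wave_size;
    return pdevice->ge_wave_size;
}
```

In the patch, the returned value replaces the hard-coded 64 for both `subgroup_size` and `ballot_bit_size`, so a Wave32 stage is lowered with 32-bit ballots.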


[Mesa-dev] [PATCH 4/4] radv/gfx10: use the correct target machine for Wave32

2019-08-01 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_llvm_helper.cpp | 30 +
 src/amd/vulkan/radv_shader.c|  3 ++-
 src/amd/vulkan/radv_shader_helper.h |  3 ++-
 3 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/src/amd/vulkan/radv_llvm_helper.cpp 
b/src/amd/vulkan/radv_llvm_helper.cpp
index 2b14ddcf184..612548e4219 100644
--- a/src/amd/vulkan/radv_llvm_helper.cpp
+++ b/src/amd/vulkan/radv_llvm_helper.cpp
@@ -28,8 +28,10 @@
 class radv_llvm_per_thread_info {
 public:
radv_llvm_per_thread_info(enum radeon_family arg_family,
-   enum ac_target_machine_options arg_tm_options)
-   : family(arg_family), tm_options(arg_tm_options), passes(NULL) 
{}
+   enum ac_target_machine_options arg_tm_options,
+   unsigned arg_wave_size)
+   : family(arg_family), tm_options(arg_tm_options),
+ wave_size(arg_wave_size), passes(NULL), passes_wave32(NULL) {}
 
~radv_llvm_per_thread_info()
{
@@ -47,19 +49,28 @@ public:
if (!passes)
return false;
 
+   if (llvm_info.tm_wave32) {
+   passes_wave32 = 
ac_create_llvm_passes(llvm_info.tm_wave32);
+   if (!passes_wave32)
+   return false;
+   }
+
return true;
}
 
bool compile_to_memory_buffer(LLVMModuleRef module,
  char **pelf_buffer, size_t *pelf_size)
{
-   return ac_compile_module_to_elf(passes, module, pelf_buffer, 
pelf_size);
+   struct ac_compiler_passes *p = wave_size == 32 ? passes_wave32 
: passes;
+   return ac_compile_module_to_elf(p, module, pelf_buffer, 
pelf_size);
}
 
bool is_same(enum radeon_family arg_family,
-enum ac_target_machine_options arg_tm_options) {
+enum ac_target_machine_options arg_tm_options,
+unsigned arg_wave_size) {
if (arg_family == family &&
-   arg_tm_options == tm_options)
+   arg_tm_options == tm_options &&
+   arg_wave_size == wave_size)
return true;
return false;
}
@@ -67,7 +78,9 @@ public:
 private:
enum radeon_family family;
enum ac_target_machine_options tm_options;
+   unsigned wave_size;
struct ac_compiler_passes *passes;
+   struct ac_compiler_passes *passes_wave32;
 };
 
 /* we have to store a linked list per thread due to the possiblity of multiple 
gpus being required */
@@ -99,17 +112,18 @@ bool radv_compile_to_elf(struct ac_llvm_compiler *info,
 bool radv_init_llvm_compiler(struct ac_llvm_compiler *info,
 bool thread_compiler,
 enum radeon_family family,
-enum ac_target_machine_options tm_options)
+enum ac_target_machine_options tm_options,
+unsigned wave_size)
 {
if (thread_compiler) {
for (auto &I : radv_llvm_per_thread_list) {
-   if (I.is_same(family, tm_options)) {
+   if (I.is_same(family, tm_options, wave_size)) {
*info = I.llvm_info;
return true;
}
}
 
-   radv_llvm_per_thread_list.emplace_back(family, tm_options);
+   radv_llvm_per_thread_list.emplace_back(family, tm_options, 
wave_size);
radv_llvm_per_thread_info &tinfo = radv_llvm_per_thread_list.back();
 
if (!tinfo.init()) {
diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index f0ab2d5e467..5e3b1378a14 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -1163,7 +1163,8 @@ shader_variant_compile(struct radv_device *device,
radv_init_llvm_once();
radv_init_llvm_compiler(&ac_llvm,
thread_compiler,
-   chip_family, tm_options);
+   chip_family, tm_options,
+   
radv_get_shader_wave_size(device->physical_device, stage));
if (gs_copy_shader) {
assert(shader_count == 1);
radv_compile_gs_copy_shader(&ac_llvm, *shaders, &binary,
diff --git a/src/amd/vulkan/radv_shader_helper.h 
b/src/amd/vulkan/radv_shader_helper.h
index d9dace0b495..c64d2df676b 100644
--- a/src/amd/vulkan/radv_shader_helper.h
+++ b/src/amd/vulkan/radv_shader_helper.h
@@ -29,7 +29,8 @@ extern "C" {
 bool radv_init_llvm_compiler(struct ac_llvm_compiler *info,
 bool thread_compiler,
 enum radeon_family family,
-

[Mesa-dev] [PATCH 1/4] radv/gfx10: add Wave32 support for fragment shaders

2019-08-01 Thread Samuel Pitoiset

It can be enabled with RADV_PERFTEST=pswave32.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_debug.h   | 1 +
 src/amd/vulkan/radv_device.c  | 6 ++
 src/amd/vulkan/radv_nir_to_llvm.c | 2 ++
 src/amd/vulkan/radv_pipeline.c| 3 ++-
 src/amd/vulkan/radv_private.h | 1 +
 src/amd/vulkan/radv_shader.c  | 4 +++-
 src/amd/vulkan/radv_shader.h  | 1 +
 7 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_debug.h b/src/amd/vulkan/radv_debug.h
index 6414e882676..65dbec6e90d 100644
--- a/src/amd/vulkan/radv_debug.h
+++ b/src/amd/vulkan/radv_debug.h
@@ -65,6 +65,7 @@ enum {
RADV_PERFTEST_SHADER_BALLOT  =  0x40,
RADV_PERFTEST_TC_COMPAT_CMASK = 0x80,
RADV_PERFTEST_CS_WAVE_32 = 0x100,
+   RADV_PERFTEST_PS_WAVE_32 = 0x200,
 };
 
 bool
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 29be192443a..b66b15edf73 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -385,10 +385,15 @@ radv_physical_device_init(struct radv_physical_device 
*device,
 
/* Determine the number of threads per wave for all stages. */
device->cs_wave_size = 64;
+   device->ps_wave_size = 64;
 
if (device->rad_info.chip_class >= GFX10) {
if (device->instance->perftest_flags & RADV_PERFTEST_CS_WAVE_32)
device->cs_wave_size = 32;
+
+   /* For pixel shaders, wave64 is recommanded. */
+   if (device->instance->perftest_flags & RADV_PERFTEST_PS_WAVE_32)
+   device->ps_wave_size = 32;
}
 
radv_physical_device_init_mem_types(device);
@@ -503,6 +508,7 @@ static const struct debug_control radv_perftest_options[] = 
{
{"shader_ballot", RADV_PERFTEST_SHADER_BALLOT},
{"tccompatcmask", RADV_PERFTEST_TC_COMPAT_CMASK},
{"cswave32", RADV_PERFTEST_CS_WAVE_32},
+   {"pswave32", RADV_PERFTEST_PS_WAVE_32},
{NULL, 0}
 };
 
diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index bb78bcccf0e..bba5849b152 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -4323,6 +4323,8 @@ radv_nir_shader_wave_size(struct nir_shader *const 
*shaders, int shader_count,
 {
if (shaders[0]->info.stage == MESA_SHADER_COMPUTE)
return options->cs_wave_size;
+   else if (shaders[0]->info.stage == MESA_SHADER_FRAGMENT)
+   return options->ps_wave_size;
return 64;
 }
 
diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index d62066cbee4..dbfe261c982 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -4060,7 +4060,8 @@ radv_pipeline_generate_fragment_shader(struct 
radeon_cmdbuf *ctx_cs,
   ps->config.spi_ps_input_addr);
 
radeon_set_context_reg(ctx_cs, R_0286D8_SPI_PS_IN_CONTROL,
-  S_0286D8_NUM_INTERP(ps->info.fs.num_interp));
+  S_0286D8_NUM_INTERP(ps->info.fs.num_interp) |
+  
S_0286D8_PS_W32_EN(pipeline->device->physical_device->ps_wave_size == 32));
 
radeon_set_context_reg(ctx_cs, R_0286E0_SPI_BARYC_CNTL, 
pipeline->graphics.spi_baryc_cntl);
 
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 143c09811c8..a1347060190 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -302,6 +302,7 @@ struct radv_physical_device {
bool has_dcc_constant_encode;
 
/* Number of threads per wave. */
+   uint8_t ps_wave_size;
uint8_t cs_wave_size;
 
/* This is the drivers on-disk cache used as a fallback as opposed to
diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 9c88ab551bb..48ed86c99b1 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -673,7 +673,8 @@ radv_get_shader_wave_size(const struct radv_physical_device 
*pdevice,
 {
if (stage == MESA_SHADER_COMPUTE)
return pdevice->cs_wave_size;
-
+   else if (stage == MESA_SHADER_FRAGMENT)
+   return pdevice->ps_wave_size;
return 64;
 }
 
@@ -1142,6 +1143,7 @@ shader_variant_compile(struct radv_device *device,
options->tess_offchip_block_dw_size = 
device->tess_offchip_block_dw_size;
options->address32_hi = device->physical_device->rad_info.address32_hi;
options->cs_wave_size = device->physical_device->cs_wave_size;
+   options->ps_wave_size = device->physical_device->ps_wave_size;
 
if (options->supports_spill)
tm_options |= AC_TM_SUPPORTS_SPILL;
diff --git a/src/amd/vulkan/radv_shader.h b/src/amd/vulkan/radv_shader.h
index 92ae2a7259d..0ef4962

[Mesa-dev] [PATCH 2/4] radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders

2019-08-01 Thread Samuel Pitoiset

It can be enabled with RADV_PERFTEST=gewave32.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_debug.h   |  1 +
 src/amd/vulkan/radv_device.c  |  5 +
 src/amd/vulkan/radv_nir_to_llvm.c | 13 +++--
 src/amd/vulkan/radv_pipeline.c| 10 +-
 src/amd/vulkan/radv_private.h |  1 +
 src/amd/vulkan/radv_shader.c  |  3 ++-
 src/amd/vulkan/radv_shader.h  |  1 +
 7 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/src/amd/vulkan/radv_debug.h b/src/amd/vulkan/radv_debug.h
index 65dbec6e90d..ef5b331d188 100644
--- a/src/amd/vulkan/radv_debug.h
+++ b/src/amd/vulkan/radv_debug.h
@@ -66,6 +66,7 @@ enum {
RADV_PERFTEST_TC_COMPAT_CMASK = 0x80,
RADV_PERFTEST_CS_WAVE_32 = 0x100,
RADV_PERFTEST_PS_WAVE_32 = 0x200,
+   RADV_PERFTEST_GE_WAVE_32 = 0x400,
 };
 
 bool
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index b66b15edf73..fc961040b6e 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -386,6 +386,7 @@ radv_physical_device_init(struct radv_physical_device 
*device,
/* Determine the number of threads per wave for all stages. */
device->cs_wave_size = 64;
device->ps_wave_size = 64;
+   device->ge_wave_size = 64;
 
if (device->rad_info.chip_class >= GFX10) {
if (device->instance->perftest_flags & RADV_PERFTEST_CS_WAVE_32)
@@ -394,6 +395,9 @@ radv_physical_device_init(struct radv_physical_device 
*device,
/* For pixel shaders, wave64 is recommanded. */
if (device->instance->perftest_flags & RADV_PERFTEST_PS_WAVE_32)
device->ps_wave_size = 32;
+
+   if (device->instance->perftest_flags & RADV_PERFTEST_GE_WAVE_32)
+   device->ge_wave_size = 32;
}
 
radv_physical_device_init_mem_types(device);
@@ -509,6 +513,7 @@ static const struct debug_control radv_perftest_options[] = 
{
{"tccompatcmask", RADV_PERFTEST_TC_COMPAT_CMASK},
{"cswave32", RADV_PERFTEST_CS_WAVE_32},
{"pswave32", RADV_PERFTEST_PS_WAVE_32},
+   {"gewave32", RADV_PERFTEST_GE_WAVE_32},
{NULL, 0}
 };
 
diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index bba5849b152..91251aa69bd 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -295,7 +295,7 @@ get_tcs_num_patches(struct radv_shader_context *ctx)
 
/* GFX6 bug workaround - limit LS-HS threadgroups to only one wave. */
if (ctx->options->chip_class == GFX6) {
-   unsigned one_wave = 64 / MAX2(num_tcs_input_cp, 
num_tcs_output_cp);
+   unsigned one_wave = ctx->options->ge_wave_size / 
MAX2(num_tcs_input_cp, num_tcs_output_cp);
num_patches = MIN2(num_patches, one_wave);
}
return num_patches;
@@ -3038,7 +3038,8 @@ handle_es_outputs_post(struct radv_shader_context *ctx,
LLVMValueRef wave_idx = ac_unpack_param(&ctx->ac, 
ctx->merged_wave_info, 24, 4);
vertex_idx = LLVMBuildOr(ctx->ac.builder, vertex_idx,
 LLVMBuildMul(ctx->ac.builder, wave_idx,
- LLVMConstInt(ctx->ac.i32, 
64, false), ""), "");
+ LLVMConstInt(ctx->ac.i32,
+  
ctx->ac.wave_size, false), ""), "");
lds_base = LLVMBuildMul(ctx->ac.builder, vertex_idx,
LLVMConstInt(ctx->ac.i32, itemsize_dw, 
0), "");
}
@@ -3140,7 +3141,7 @@ static LLVMValueRef get_thread_id_in_tg(struct 
radv_shader_context *ctx)
LLVMBuilderRef builder = ctx->ac.builder;
LLVMValueRef tmp;
tmp = LLVMBuildMul(builder, get_wave_id_in_tg(ctx),
-  LLVMConstInt(ctx->ac.i32, 64, false), "");
+  LLVMConstInt(ctx->ac.i32, ctx->ac.wave_size, false), 
"");
return LLVMBuildAdd(builder, tmp, ac_get_thread_id(&ctx->ac), "");
 }
 
@@ -4190,7 +4191,7 @@ ac_setup_rings(struct radv_shader_context *ctx)
 */
LLVMTypeRef v2i64 = LLVMVectorType(ctx->ac.i64, 2);
uint64_t stream_offset = 0;
-   unsigned num_records = 64;
+   unsigned num_records = ctx->ac.wave_size;
LLVMValueRef base_ring;
 
base_ring =
@@ -4223,7 +4224,7 @@ ac_setup_rings(struct radv_shader_context *ctx)
ring = LLVMBuildInsertElement(ctx->ac.builder,

[Mesa-dev] [PATCH 2/6] radv/gfx10: implement a bug workaround for NGG -> legacy transitions

2019-07-31 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 14 ++
 src/amd/vulkan/si_cmd_buffer.c   |  9 +++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 37026246aa9..cf3f81b2031 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -3626,6 +3626,20 @@ void radv_CmdBindPipeline(
/* Prefetch all pipeline shaders at first draw time. */
cmd_buffer->state.prefetch_L2_mask |= RADV_PREFETCH_SHADERS;
 
+   if ((cmd_buffer->device->physical_device->rad_info.family == 
CHIP_NAVI10 ||
+cmd_buffer->device->physical_device->rad_info.family == 
CHIP_NAVI12 ||
+cmd_buffer->device->physical_device->rad_info.family == 
CHIP_NAVI14) &&
+   cmd_buffer->state.emitted_pipeline &&
+   radv_pipeline_has_ngg(cmd_buffer->state.emitted_pipeline) &&
+   !radv_pipeline_has_ngg(cmd_buffer->state.pipeline)) {
+   /* Transitioning from NGG to legacy GS requires
+* VGT_FLUSH on Navi10-14.  VGT_FLUSH is also emitted
+* at the beginning of IBs when legacy GS ring pointers
+* are set.
+*/
+   cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_VGT_FLUSH;
+   }
+
radv_bind_dynamic_state(cmd_buffer, &pipeline->dynamic_state);
radv_bind_streamout_state(cmd_buffer, pipeline);
 
diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 94f759139ee..18b2236e54b 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -878,8 +878,7 @@ gfx10_cs_emit_cache_flush(struct radeon_cmdbuf *cs,
unsigned cb_db_event = 0;
 
/* We don't need these. */
-   assert(!(flush_bits & (RADV_CMD_FLAG_VGT_FLUSH |
-  RADV_CMD_FLAG_VGT_STREAMOUT_SYNC)));
+   assert(!(flush_bits & (RADV_CMD_FLAG_VGT_STREAMOUT_SYNC)));
 
if (flush_bits & RADV_CMD_FLAG_INV_ICACHE)
gcr_cntl |= S_586_GLI_INV(V_586_GLI_ALL);
@@ -998,6 +997,12 @@ gfx10_cs_emit_cache_flush(struct radeon_cmdbuf *cs,
 *flush_cnt, 0x);
}
 
+   /* VGT state sync */
+   if (flush_bits & RADV_CMD_FLAG_VGT_FLUSH) {
+   radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
+   radeon_emit(cs, EVENT_TYPE(V_028A90_VGT_FLUSH) | 
EVENT_INDEX(0));
+   }
+
/* Ignore fields that only modify the behavior of other fields. */
if (gcr_cntl & C_586_GL1_RANGE & C_586_GL2_RANGE & C_586_SEQ) {
/* Flush caches and wait for the caches to assert idle.
-- 
2.22.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
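
The condition added to radv_CmdBindPipeline in the patch above can be read as a small predicate. The sketch below restates it with simplified stand-in types; the enum, the struct and the flag value are hypothetical and exist only to make the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the driver's types (illustrative only). */
enum chip_family { CHIP_VEGA10, CHIP_NAVI10, CHIP_NAVI12, CHIP_NAVI14 };

#define CMD_FLAG_VGT_FLUSH (1u << 0) /* bit value chosen for the sketch */

struct pipeline {
    bool uses_ngg;
};

/*
 * Return the extra flush bits needed when binding `next` after `emitted`.
 * `emitted` may be NULL when no pipeline has been emitted yet.
 */
static uint32_t bind_pipeline_flush_bits(enum chip_family family,
                                         const struct pipeline *emitted,
                                         const struct pipeline *next)
{
    bool affected = family == CHIP_NAVI10 ||
                    family == CHIP_NAVI12 ||
                    family == CHIP_NAVI14;

    assert(next != NULL);

    /* Only the NGG -> legacy direction needs the flush. */
    if (affected && emitted && emitted->uses_ngg && !next->uses_ngg)
        return CMD_FLAG_VGT_FLUSH;
    return 0;
}
```

Note that the transition is directional: going from legacy to NGG, or binding the first pipeline of a command buffer, does not request the flush in this sketch.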

[Mesa-dev] [PATCH 5/6] radv/gfx10: remove an obsolete VGT_REUSE_OFF workaround

2019-07-31 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 4d4f86a7e24..b3952846f43 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -3641,12 +3641,6 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
*ctx_cs,
   S_028A84_PRIMITIVEID_EN(es_enable_prim_id) |
   
S_028A84_NGG_DISABLE_PROVOK_REUSE(es_enable_prim_id));
 
-   bool vgt_reuse_off = pipeline->device->physical_device->rad_info.family 
== CHIP_NAVI10 &&
-
pipeline->device->physical_device->rad_info.chip_external_rev == 0x1 &&
-es_type == MESA_SHADER_TESS_EVAL;
-
-   radeon_set_context_reg(ctx_cs, R_028AB4_VGT_REUSE_OFF,
-  S_028AB4_REUSE_OFF(vgt_reuse_off));
radeon_set_context_reg(ctx_cs, R_028AAC_VGT_ESGS_RING_ITEMSIZE,
   ngg_state->vgt_esgs_ring_itemsize);
 
-- 
2.22.0

[Mesa-dev] [PATCH 4/6] radv/gfx10: disable LATE_ALLOC_GS on Navi14

2019-07-31 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/si_cmd_buffer.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 3d6c672dd0f..d48ed804e63 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -311,6 +311,7 @@ si_emit_graphics(struct radv_physical_device 
*physical_device,
late_alloc_limit = (num_cu_per_sh - 2) * 4;
}
 
+   unsigned late_alloc_limit_gs = late_alloc_limit;
unsigned cu_mask_vs = 0x;
unsigned cu_mask_gs = 0x;
 
@@ -324,6 +325,12 @@ si_emit_graphics(struct radv_physical_device 
*physical_device,
}
}
 
+   /* Don't use late alloc for NGG on Navi14 due to a hw bug. */
+   if (physical_device->rad_info.family == CHIP_NAVI14) {
+   late_alloc_limit_gs = 0;
+   cu_mask_gs = 0x;
+   }
+
radeon_set_sh_reg_idx(physical_device, cs, 
R_00B118_SPI_SHADER_PGM_RSRC3_VS,
  3, S_00B118_CU_EN(cu_mask_vs) |
  S_00B118_WAVE_LIMIT(0x3F));
@@ -336,7 +343,7 @@ si_emit_graphics(struct radv_physical_device 
*physical_device,
if (physical_device->rad_info.chip_class >= GFX10) {
radeon_set_sh_reg_idx(physical_device, cs, 
R_00B204_SPI_SHADER_PGM_RSRC4_GS,
  3, S_00B204_CU_EN(0x) |
- 
S_00B204_SPI_SHADER_LATE_ALLOC_GS_GFX10(late_alloc_limit));
+ 
S_00B204_SPI_SHADER_LATE_ALLOC_GS_GFX10(late_alloc_limit_gs));
}
 
radeon_set_sh_reg_idx(physical_device, cs, 
R_00B01C_SPI_SHADER_PGM_RSRC3_PS,
-- 
2.22.0

[Mesa-dev] [PATCH 3/6] radv/gfx10: implement a bug workaround for GE_PC_ALLOC

2019-07-31 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 17 -
 src/amd/vulkan/si_cmd_buffer.c | 13 +
 2 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 5c913f29a5a..4d4f86a7e24 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -3456,18 +3456,6 @@ radv_pipeline_generate_vgt_gs_mode(struct radeon_cmdbuf 
*ctx_cs,
radeon_set_context_reg(ctx_cs, R_028A40_VGT_GS_MODE, vgt_gs_mode);
 }
 
-static void
-gfx10_set_ge_pc_alloc(struct radeon_cmdbuf *ctx_cs,
- struct radv_pipeline *pipeline,
- bool culling)
-{
-   struct radeon_info *info = &pipeline->device->physical_device->rad_info;
-
-   radeon_set_uconfig_reg(ctx_cs, R_030980_GE_PC_ALLOC,
-  S_030980_OVERSUB_EN(1) |
-  S_030980_NUM_PC_LINES((culling ? 256 : 128) * 
info->max_se - 1));
-}
-
 static void
 radv_pipeline_generate_hw_vs(struct radeon_cmdbuf *ctx_cs,
 struct radeon_cmdbuf *cs,
@@ -3534,9 +3522,6 @@ radv_pipeline_generate_hw_vs(struct radeon_cmdbuf *ctx_cs,
if (pipeline->device->physical_device->rad_info.chip_class <= GFX8)
radeon_set_context_reg(ctx_cs, R_028AB4_VGT_REUSE_OFF,
   outinfo->writes_viewport_index);
-
-   if (pipeline->device->physical_device->rad_info.chip_class >= GFX10)
-   gfx10_set_ge_pc_alloc(ctx_cs, pipeline, false);
 }
 
 static void
@@ -3699,8 +3684,6 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
*ctx_cs,
   S_03096C_PRIM_GRP_SIZE(ngg_state->max_gsprims) |
   
S_03096C_VERT_GRP_SIZE(ngg_state->hw_max_esverts) |
   S_03096C_BREAK_WAVE_AT_EOI(break_wave_at_eoi));
-
-   gfx10_set_ge_pc_alloc(ctx_cs, pipeline, false);
 }
 
 static void
diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 18b2236e54b..3d6c672dd0f 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -382,6 +382,19 @@ si_emit_graphics(struct radv_physical_device 
*physical_device,
  S_00B0C0_SOFT_GROUPING_EN(1) |
  S_00B0C0_NUMBER_OF_REQUESTS_PER_CU(4 - 1));
radeon_set_sh_reg(cs, R_00B1C0_SPI_SHADER_REQ_CTRL_VS, 0);
+
+   if (physical_device->rad_info.family == CHIP_NAVI10 ||
+   physical_device->rad_info.family == CHIP_NAVI12 ||
+   physical_device->rad_info.family == CHIP_NAVI14) {
+   /* SQ_NON_EVENT must be emitted before GE_PC_ALLOC is 
written. */
+   radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
+   radeon_emit(cs, EVENT_TYPE(V_028A90_SQ_NON_EVENT) | 
EVENT_INDEX(0));
+   }
+
+   /* TODO: For culling, replace 128 with 256. */
+   radeon_set_uconfig_reg(cs, R_030980_GE_PC_ALLOC,
+  S_030980_OVERSUB_EN(1) |
+  S_030980_NUM_PC_LINES(128 * 
physical_device->rad_info.max_se - 1));
}
 
if (physical_device->rad_info.chip_class >= GFX8) {
-- 
2.22.0
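
As a worked example of the NUM_PC_LINES arithmetic in this patch: the value written is a per-shader-engine line count times the number of shader engines, minus one. The helper below is a hypothetical restatement of that formula (the removed gfx10_set_ge_pc_alloc used 256 lines when culling and 128 otherwise); it is not a function from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Value for the NUM_PC_LINES field: lines per SE times SE count, minus one. */
static unsigned ge_pc_alloc_num_lines(unsigned max_se, bool culling)
{
    assert(max_se > 0);
    return (culling ? 256u : 128u) * max_se - 1;
}
```

With 4 shader engines and no culling this gives 128 * 4 - 1 = 511.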

[Mesa-dev] [PATCH 1/6] radv: skip draw calls with 0-sized index buffers

2019-07-31 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index e0ea47b5745..37026246aa9 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -4323,6 +4323,12 @@ radv_emit_draw_packets(struct radv_cmd_buffer 
*cmd_buffer,
int index_size = 
radv_get_vgt_index_size(state->index_type);
uint64_t index_va;
 
+   /* Skip draw calls with 0-sized index buffers. They
+* cause a hang on some chips, like Navi10-14.
+*/
+   if (!cmd_buffer->state.max_index_count)
+   return;
+
index_va = state->index_va;
index_va += info->first_index * index_size;
 
-- 
2.22.0
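
To illustrate when max_index_count ends up being zero, here is a small sketch of the computation done at bind time and the check done at draw time. Both helpers are hypothetical names for this example; only the arithmetic follows the driver code quoted in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of whole indices that fit between `offset` and the buffer end. */
static uint32_t compute_max_index_count(uint64_t buffer_size, uint64_t offset,
                                        unsigned index_size)
{
    assert(index_size == 1 || index_size == 2 || index_size == 4);
    assert(offset <= buffer_size);
    return (uint32_t)((buffer_size - offset) / index_size);
}

/* An indexed draw is only emitted when at least one index is available. */
static bool can_emit_indexed_draw(uint32_t count)
{
    return count != 0;
}
```

Binding an index buffer at an offset equal to its size is one way to reach the zero case that the patch skips.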

[Mesa-dev] [PATCH 6/6] radv/gfx10: implement a GE bug workaround

2019-07-31 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 27 +++
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index b3952846f43..d62066cbee4 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -3592,6 +3592,7 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
*ctx_cs,
bool es_enable_prim_id = outinfo->export_prim_id ||
 (es && es->info.info.uses_prim_id);
bool break_wave_at_eoi = false;
+   unsigned ge_cntl;
unsigned nparams;
 
if (es_type == MESA_SHADER_TESS_EVAL) {
@@ -3674,10 +3675,28 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
*ctx_cs,
   
S_028838_INDEX_BUF_EDGE_FLAG_ENA(!radv_pipeline_has_tess(pipeline) &&

!radv_pipeline_has_gs(pipeline)));
 
-   radeon_set_uconfig_reg(ctx_cs, R_03096C_GE_CNTL,
-  S_03096C_PRIM_GRP_SIZE(ngg_state->max_gsprims) |
-  
S_03096C_VERT_GRP_SIZE(ngg_state->hw_max_esverts) |
-  S_03096C_BREAK_WAVE_AT_EOI(break_wave_at_eoi));
+   ge_cntl = S_03096C_PRIM_GRP_SIZE(ngg_state->max_gsprims) |
+ S_03096C_VERT_GRP_SIZE(ngg_state->hw_max_esverts) |
+ S_03096C_BREAK_WAVE_AT_EOI(break_wave_at_eoi);
+
+   /* Bug workaround for a possible hang with non-tessellation cases.
+* Tessellation always sets GE_CNTL.VERT_GRP_SIZE = 0
+*
+* Requirement: GE_CNTL.VERT_GRP_SIZE = 
VGT_GS_ONCHIP_CNTL.ES_VERTS_PER_SUBGRP - 5
+*/
+   if ((pipeline->device->physical_device->rad_info.family == CHIP_NAVI10 
||
+pipeline->device->physical_device->rad_info.family == CHIP_NAVI12 
||
+pipeline->device->physical_device->rad_info.family == CHIP_NAVI14) 
&&
+   !radv_pipeline_has_tess(pipeline) &&
+   ngg_state->hw_max_esverts != 256) {
+   ge_cntl &= C_03096C_VERT_GRP_SIZE;
+
+   if (ngg_state->hw_max_esverts > 5) {
+   ge_cntl |= 
S_03096C_VERT_GRP_SIZE(ngg_state->hw_max_esverts - 5);
+   }
+   }
+
+   radeon_set_uconfig_reg(ctx_cs, R_03096C_GE_CNTL, ge_cntl);
 }
 
 static void
-- 
2.22.0
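
The VERT_GRP_SIZE adjustment above boils down to a small function of three inputs. The sketch below restates it outside the driver; `affected_chip` stands for the Navi10/12/14 family check, and the function name is made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Value for GE_CNTL.VERT_GRP_SIZE.
 * On affected chips, for non-tessellation pipelines with
 * hw_max_esverts != 256, the field is cleared and then set to
 * hw_max_esverts - 5 when that is positive; otherwise it stays 0.
 */
static unsigned ge_vert_grp_size(bool affected_chip, bool has_tess,
                                 unsigned hw_max_esverts)
{
    if (affected_chip && !has_tess && hw_max_esverts != 256)
        return hw_max_esverts > 5 ? hw_max_esverts - 5 : 0;
    return hw_max_esverts;
}
```

For example, hw_max_esverts = 128 becomes 123 on an affected chip without tessellation, while 256 is passed through unchanged.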

[Mesa-dev] [PATCH] radv/gfx10: add Wave32 support for compute shaders

2019-07-30 Thread Samuel Pitoiset

It can be enabled with RADV_PERFTEST=cswave32.
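
One visible effect of the wave size is the number of waves per threadgroup, which the radv_pipeline.c hunk below computes with DIV_ROUND_UP. A minimal sketch of that arithmetic, with a locally defined macro and a hypothetical helper name:

```c
#include <assert.h>

#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Waves needed to cover an x*y*z workgroup at the given wave size. */
static unsigned waves_per_threadgroup(unsigned x, unsigned y, unsigned z,
                                      unsigned wave_size)
{
    assert(wave_size == 32 || wave_size == 64);
    return DIV_ROUND_UP(x * y * z, wave_size);
}
```

An 8x8x1 workgroup (64 threads) needs one wave in Wave64 mode and two in Wave32 mode, which is why the hardcoded 64 had to go.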

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_debug.h   |  1 +
 src/amd/vulkan/radv_device.c  | 12 +++-
 src/amd/vulkan/radv_nir_to_llvm.c | 14 +-
 src/amd/vulkan/radv_pipeline.c|  3 ++-
 src/amd/vulkan/radv_private.h |  3 +++
 src/amd/vulkan/radv_shader.c  | 25 ++---
 src/amd/vulkan/radv_shader.h  |  1 +
 7 files changed, 53 insertions(+), 6 deletions(-)

diff --git a/src/amd/vulkan/radv_debug.h b/src/amd/vulkan/radv_debug.h
index 723fabda57f..6414e882676 100644
--- a/src/amd/vulkan/radv_debug.h
+++ b/src/amd/vulkan/radv_debug.h
@@ -64,6 +64,7 @@ enum {
RADV_PERFTEST_BO_LIST=  0x20,
RADV_PERFTEST_SHADER_BALLOT  =  0x40,
RADV_PERFTEST_TC_COMPAT_CMASK = 0x80,
+   RADV_PERFTEST_CS_WAVE_32 = 0x100,
 };
 
 bool
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 65e3ccf91ad..29be192443a 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -383,6 +383,14 @@ radv_physical_device_init(struct radv_physical_device 
*device,
 
device->use_shader_ballot = device->instance->perftest_flags & 
RADV_PERFTEST_SHADER_BALLOT;
 
+   /* Determine the number of threads per wave for all stages. */
+   device->cs_wave_size = 64;
+
+   if (device->rad_info.chip_class >= GFX10) {
+   if (device->instance->perftest_flags & RADV_PERFTEST_CS_WAVE_32)
+   device->cs_wave_size = 32;
+   }
+
radv_physical_device_init_mem_types(device);
radv_fill_device_extension_table(device, &device->supported_extensions);
 
@@ -494,6 +502,7 @@ static const struct debug_control radv_perftest_options[] = 
{
{"bolist", RADV_PERFTEST_BO_LIST},
{"shader_ballot", RADV_PERFTEST_SHADER_BALLOT},
{"tccompatcmask", RADV_PERFTEST_TC_COMPAT_CMASK},
+   {"cswave32", RADV_PERFTEST_CS_WAVE_32},
{NULL, 0}
 };
 
@@ -1930,7 +1939,8 @@ VkResult radv_CreateDevice(
device->scratch_waves = MAX2(32 * 
physical_device->rad_info.num_good_compute_units,
 max_threads_per_block / 64);
 
-   device->dispatch_initiator = S_00B800_COMPUTE_SHADER_EN(1);
+   device->dispatch_initiator = S_00B800_COMPUTE_SHADER_EN(1) |
+
S_00B800_CS_W32_EN(device->physical_device->cs_wave_size == 32);
 
if (device->physical_device->rad_info.chip_class >= GFX7) {
/* If the KMD allows it (there is a KMD hw register for it),
diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 020c6d17771..feaab8f6370 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -4317,6 +4317,15 @@ static void declare_esgs_ring(struct radv_shader_context 
*ctx)
LLVMSetAlignment(ctx->esgs_ring, 64 * 1024);
 }
 
+static uint8_t
+radv_nir_shader_wave_size(struct nir_shader *const *shaders, int shader_count,
+ const struct radv_nir_compiler_options *options)
+{
+   if (shaders[0]->info.stage == MESA_SHADER_COMPUTE)
+   return options->cs_wave_size;
+   return 64;
+}
+
 static
 LLVMModuleRef ac_translate_nir_to_llvm(struct ac_llvm_compiler *ac_llvm,
struct nir_shader *const *shaders,
@@ -4333,8 +4342,11 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct 
ac_llvm_compiler *ac_llvm,
options->unsafe_math ? AC_FLOAT_MODE_UNSAFE_FP_MATH :
   AC_FLOAT_MODE_DEFAULT;
 
+   uint8_t wave_size = radv_nir_shader_wave_size(shaders,
+ shader_count, options);
+
ac_llvm_context_init(&ctx.ac, ac_llvm, options->chip_class,
-options->family, float_mode, 64);
+options->family, float_mode, wave_size);
ctx.context = ctx.ac.context;
 
radv_nir_shader_info_init(&shader_info->info);
diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 583b600dfdd..6b8b7bbe25a 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -4648,7 +4648,8 @@ radv_compute_generate_pm4(struct radv_pipeline *pipeline)
threads_per_threadgroup = compute_shader->info.cs.block_size[0] *
  compute_shader->info.cs.block_size[1] *
  compute_shader->info.cs.block_size[2];
-   waves_per_threadgroup = DIV_ROUND_UP(threads_per_threadgroup, 64);
+   waves_per_threadgroup = DIV_ROUND_UP(threads_per_threadgroup,
+
device->physical_device->cs_wave_size);
 
if (device->physical_device->rad_info.chip_class >= G

[Mesa-dev] [PATCH] radv/gfx10: only compile the GS copy shader on-demand

2019-07-30 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 583b600dfdd..e11196bd82e 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -2626,7 +2626,8 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
 
if(modules[MESA_SHADER_GEOMETRY]) {
struct radv_shader_binary *gs_copy_binary = NULL;
-   if (!pipeline->gs_copy_shader) {
+   if (!pipeline->gs_copy_shader &&
+   !radv_pipeline_has_ngg(pipeline)) {
pipeline->gs_copy_shader = radv_create_gs_copy_shader(
device, nir[MESA_SHADER_GEOMETRY], 
&gs_copy_binary,

keys[MESA_SHADER_GEOMETRY].has_multiview_view_index);
-- 
2.22.0

Re: [Mesa-dev] [PATCH] radv/gfx10: do not use the fast depth or stencil clear bytes path

2019-07-29 Thread Samuel Pitoiset



On 7/29/19 2:30 PM, Bas Nieuwenhuizen wrote:

On Mon, Jul 29, 2019 at 2:20 PM Samuel Pitoiset
 wrote:


On 7/29/19 2:15 PM, Bas Nieuwenhuizen wrote:

On Mon, Jul 29, 2019 at 2:11 PM Samuel Pitoiset
 wrote:

The HTILE masks seem to be different, so we need to rework that
path. Just disable it for now and implement it later.

The HTILE masks are not different per amdvlk?

Can you at least rework the commit message to reflect that?

"It needs to be reworked on GFX10, so just disable it for now." ?

How about just "It causes issues on GFX10"? We don't know it needs to
be reworked either?

Looks like it does need a rework, but whatever, I'm fine with that. So, Rb?




This fixes rendering issues with vkmark and Wreckfest at least.

Signed-off-by: Samuel Pitoiset 
---
   src/amd/vulkan/radv_meta_clear.c | 5 +++--
   1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index b93ba3e0b29..8ddc2e38cd4 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -1005,7 +1005,7 @@ radv_can_fast_clear_depth(struct radv_cmd_buffer 
*cmd_buffer,
  if (!view_mask && clear_rect->layerCount != 
iview->image->info.array_size)
  return false;

-   if (cmd_buffer->device->physical_device->rad_info.chip_class < GFX9 &&
+   if (cmd_buffer->device->physical_device->rad_info.chip_class != GFX9 &&
  (!(aspects & VK_IMAGE_ASPECT_DEPTH_BIT) ||
  ((vk_format_aspects(iview->image->vk_format) & 
VK_IMAGE_ASPECT_STENCIL_BIT) &&
   !(aspects & VK_IMAGE_ASPECT_STENCIL_BIT
@@ -1048,7 +1048,8 @@ radv_fast_clear_depth(struct radv_cmd_buffer *cmd_buffer,

iview->image->planes[0].surface.htile_size, clear_word);
  } else {
  /* Only clear depth or stencil bytes in the HTILE buffer. */
-   assert(cmd_buffer->device->physical_device->rad_info.chip_class 
>= GFX9);
+   /* TODO: Implement that path for GFX10. */
+   assert(cmd_buffer->device->physical_device->rad_info.chip_class 
== GFX9);
  flush_bits = clear_htile_mask(cmd_buffer, iview->image->bo,
iview->image->offset + 
iview->image->htile_offset,

iview->image->planes[0].surface.htile_size, clear_word,
--
2.22.0

Re: [Mesa-dev] [PATCH] radv/gfx10: do not use the fast depth or stencil clear bytes path

2019-07-29 Thread Samuel Pitoiset



On 7/29/19 2:15 PM, Bas Nieuwenhuizen wrote:

On Mon, Jul 29, 2019 at 2:11 PM Samuel Pitoiset
 wrote:

The HTILE masks seem to be different, so we need to rework that
path. Just disable it for now and implement it later.

The HTILE masks are not different per amdvlk?

Can you at least rework the commit message to reflect that?

"It needs to be reworked on GFX10, so just disable it for now." ?

This fixes rendering issues with vkmark and Wreckfest at least.

Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/radv_meta_clear.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index b93ba3e0b29..8ddc2e38cd4 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -1005,7 +1005,7 @@ radv_can_fast_clear_depth(struct radv_cmd_buffer 
*cmd_buffer,
 if (!view_mask && clear_rect->layerCount != 
iview->image->info.array_size)
 return false;

-   if (cmd_buffer->device->physical_device->rad_info.chip_class < GFX9 &&
+   if (cmd_buffer->device->physical_device->rad_info.chip_class != GFX9 &&
 (!(aspects & VK_IMAGE_ASPECT_DEPTH_BIT) ||
 ((vk_format_aspects(iview->image->vk_format) & 
VK_IMAGE_ASPECT_STENCIL_BIT) &&
  !(aspects & VK_IMAGE_ASPECT_STENCIL_BIT
@@ -1048,7 +1048,8 @@ radv_fast_clear_depth(struct radv_cmd_buffer *cmd_buffer,
   
iview->image->planes[0].surface.htile_size, clear_word);
 } else {
 /* Only clear depth or stencil bytes in the HTILE buffer. */
-   assert(cmd_buffer->device->physical_device->rad_info.chip_class 
>= GFX9);
+   /* TODO: Implement that path for GFX10. */
+   assert(cmd_buffer->device->physical_device->rad_info.chip_class 
== GFX9);
 flush_bits = clear_htile_mask(cmd_buffer, iview->image->bo,
   iview->image->offset + 
iview->image->htile_offset,
   
iview->image->planes[0].surface.htile_size, clear_word,
--
2.22.0

[Mesa-dev] [PATCH] radv/gfx10: do not use the fast depth or stencil clear bytes path

2019-07-29 Thread Samuel Pitoiset

The HTILE masks seem to be different, so we need to rework that
path. Just disable it for now and implement it later.

This fixes rendering issues with vkmark and Wreckfest at least.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_meta_clear.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index b93ba3e0b29..8ddc2e38cd4 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -1005,7 +1005,7 @@ radv_can_fast_clear_depth(struct radv_cmd_buffer 
*cmd_buffer,
if (!view_mask && clear_rect->layerCount != 
iview->image->info.array_size)
return false;
 
-   if (cmd_buffer->device->physical_device->rad_info.chip_class < GFX9 &&
+   if (cmd_buffer->device->physical_device->rad_info.chip_class != GFX9 &&
(!(aspects & VK_IMAGE_ASPECT_DEPTH_BIT) ||
((vk_format_aspects(iview->image->vk_format) & 
VK_IMAGE_ASPECT_STENCIL_BIT) &&
 !(aspects & VK_IMAGE_ASPECT_STENCIL_BIT
@@ -1048,7 +1048,8 @@ radv_fast_clear_depth(struct radv_cmd_buffer *cmd_buffer,
  
iview->image->planes[0].surface.htile_size, clear_word);
} else {
/* Only clear depth or stencil bytes in the HTILE buffer. */
-   assert(cmd_buffer->device->physical_device->rad_info.chip_class 
>= GFX9);
+   /* TODO: Implement that path for GFX10. */
+   assert(cmd_buffer->device->physical_device->rad_info.chip_class 
== GFX9);
flush_bits = clear_htile_mask(cmd_buffer, iview->image->bo,
  iview->image->offset + 
iview->image->htile_offset,
  
iview->image->planes[0].surface.htile_size, clear_word,
-- 
2.22.0
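
The effect of changing the comparison from "< GFX9" to "!= GFX9" is easiest to see with a truth table. The sketch below models the check with plain booleans; the enum values and the function are illustrative, and it assumes the truncated condition in the diff ends by rejecting the fast clear (returning false), which the hunk does not show.

```c
#include <assert.h>
#include <stdbool.h>

enum chip_class { GFX8 = 8, GFX9 = 9, GFX10 = 10 };

/*
 * A "partial" clear touches only depth or only stencil of a combined
 * depth/stencil image. After the patch, only GFX9 accepts partial clears
 * on the fast path; every other generation falls back to the slow path.
 */
static bool can_fast_clear_depth(enum chip_class chip, bool clears_depth,
                                 bool format_has_stencil, bool clears_stencil)
{
    bool partial = !clears_depth || (format_has_stencil && !clears_stencil);

    if (chip != GFX9 && partial)
        return false; /* assumed outcome of the truncated condition */
    return true;
}
```

Full depth and stencil clears are unaffected: they still take the fast path on GFX10 in this model.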

[Mesa-dev] [PATCH] ac: do not crash when the buffer data format is invalid

2019-07-29 Thread Samuel Pitoiset

This might happen when a pipeline doesn't define the vertex input
state, so the buffer data format is 0 (aka INVALID).

This fixes crashes when compiling some shaders on GFX10.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_llvm_build.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 250bfc5229e..278f8893432 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1508,6 +1508,7 @@ ac_get_tbuffer_format(struct ac_llvm_context *ctx,
unsigned format;
switch (dfmt) {
default: unreachable("bad dfmt");
+   case V_008F0C_BUF_DATA_FORMAT_INVALID: format = 
V_008F0C_IMG_FORMAT_INVALID; break;
case V_008F0C_BUF_DATA_FORMAT_8: format = 
V_008F0C_IMG_FORMAT_8_UINT; break;
case V_008F0C_BUF_DATA_FORMAT_8_8: format = 
V_008F0C_IMG_FORMAT_8_8_UINT; break;
case V_008F0C_BUF_DATA_FORMAT_8_8_8_8: format = 
V_008F0C_IMG_FORMAT_8_8_8_8_UINT; break;
-- 
2.22.0

[Mesa-dev] [PATCH] radv: implement VK_EXT_index_type_uint8

2019-07-29 Thread Samuel Pitoiset

Natively supported on VI+.
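
As background for the helpers added below: with three index widths, the old "index_type ? a : b" ternaries no longer work, so both the index size and the primitive restart index have to be derived from the type. Here is a self-contained sketch using a local enum instead of the V_028A7C_* hardware values; the restart index is the all-ones value for the index width, as the Vulkan specification defines it.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the hardware index type (illustrative only). */
enum index_type { INDEX_TYPE_8, INDEX_TYPE_16, INDEX_TYPE_32 };

/* Size of one index in bytes. */
static unsigned index_size_bytes(enum index_type type)
{
    switch (type) {
    case INDEX_TYPE_8:  return 1;
    case INDEX_TYPE_16: return 2;
    case INDEX_TYPE_32: return 4;
    }
    assert(!"invalid index type");
    return 0;
}

/* All-ones value for the index width: 0xff, 0xffff or 0xffffffff. */
static uint32_t primitive_restart_index(enum index_type type)
{
    unsigned bits = 8 * index_size_bytes(type);
    return (uint32_t)(UINT64_C(0xffffffff) >> (32 - bits));
}
```

Deriving the restart index from the size keeps the two helpers from drifting apart if another width is ever added.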

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c  | 60 +++
 src/amd/vulkan/radv_device.c  |  6 
 src/amd/vulkan/radv_extensions.py |  1 +
 3 files changed, 61 insertions(+), 6 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index d9783e6ca8a..e0ea47b5745 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2541,6 +2541,21 @@ struct radv_draw_info {
uint64_t strmout_buffer_offset;
 };
 
+static uint32_t
+radv_get_primitive_reset_index(struct radv_cmd_buffer *cmd_buffer)
+{
+   switch (cmd_buffer->state.index_type) {
+   case V_028A7C_VGT_INDEX_8:
+   return 0xffu;
+   case V_028A7C_VGT_INDEX_16:
+   return 0xffffu;
+   case V_028A7C_VGT_INDEX_32:
+   return 0xffffffffu;
+   default:
+   unreachable("invalid index type");
+   }
+}
+
 static void
 si_emit_ia_multi_vgt_param(struct radv_cmd_buffer *cmd_buffer,
   bool instanced_draw, bool indirect_draw,
@@ -2612,7 +2627,7 @@ radv_emit_draw_registers(struct radv_cmd_buffer 
*cmd_buffer,
 
if (primitive_reset_en) {
uint32_t primitive_reset_index =
-   state->index_type ? 0xffffffffu : 0xffffu;
+   radv_get_primitive_reset_index(cmd_buffer);
 
if (primitive_reset_index != state->last_primitive_reset_index) 
{
radeon_set_context_reg(cs,
@@ -3233,6 +3248,36 @@ void radv_CmdBindVertexBuffers(
cmd_buffer->state.dirty |= RADV_CMD_DIRTY_VERTEX_BUFFER;
 }
 
+static uint32_t
+vk_to_index_type(VkIndexType type)
+{
+   switch (type) {
+   case VK_INDEX_TYPE_UINT8_EXT:
+   return V_028A7C_VGT_INDEX_8;
+   case VK_INDEX_TYPE_UINT16:
+   return V_028A7C_VGT_INDEX_16;
+   case VK_INDEX_TYPE_UINT32:
+   return V_028A7C_VGT_INDEX_32;
+   default:
+   unreachable("invalid index type");
+   }
+}
+
+static uint32_t
+radv_get_vgt_index_size(uint32_t type)
+{
+   switch (type) {
+   case V_028A7C_VGT_INDEX_8:
+   return 1;
+   case V_028A7C_VGT_INDEX_16:
+   return 2;
+   case V_028A7C_VGT_INDEX_32:
+   return 4;
+   default:
+   unreachable("invalid index type");
+   }
+}
+
 void radv_CmdBindIndexBuffer(
VkCommandBuffer commandBuffer,
VkBuffer buffer,
@@ -3251,12 +3296,12 @@ void radv_CmdBindIndexBuffer(
 
cmd_buffer->state.index_buffer = index_buffer;
cmd_buffer->state.index_offset = offset;
-   cmd_buffer->state.index_type = indexType; /* vk matches hw */
+   cmd_buffer->state.index_type = vk_to_index_type(indexType);
cmd_buffer->state.index_va = radv_buffer_get_va(index_buffer->bo);
cmd_buffer->state.index_va += index_buffer->offset + offset;
 
-   int index_size_shift = cmd_buffer->state.index_type ? 2 : 1;
-   cmd_buffer->state.max_index_count = (index_buffer->size - offset) >> 
index_size_shift;
+   int index_size = radv_get_vgt_index_size(cmd_buffer->state.index_type);
+   cmd_buffer->state.max_index_count = (index_buffer->size - offset) / 
index_size;
cmd_buffer->state.dirty |= RADV_CMD_DIRTY_INDEX_BUFFER;
radv_cs_add_buffer(cmd_buffer->device->ws, cmd_buffer->cs, 
index_buffer->bo);
 }
@@ -4275,7 +4320,7 @@ radv_emit_draw_packets(struct radv_cmd_buffer *cmd_buffer,
}
 
if (info->indexed) {
-   int index_size = state->index_type ? 4 : 2;
+   int index_size = 
radv_get_vgt_index_size(state->index_type);
uint64_t index_va;
 
index_va = state->index_va;
@@ -4354,8 +4399,11 @@ static bool radv_need_late_scissor_emission(struct 
radv_cmd_buffer *cmd_buffer,
if (cmd_buffer->state.dirty & used_states)
return true;
 
+   uint32_t primitive_reset_index =
+   radv_get_primitive_reset_index(cmd_buffer);
+
if (info->indexed && state->pipeline->graphics.prim_restart_enable &&
-   (state->index_type ? 0xffffffffu : 0xffffu) != 
state->last_primitive_reset_index)
+   primitive_reset_index != state->last_primitive_reset_index)
return true;
 
return false;
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 9ba100df6e8..65e3ccf91ad 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -987,6 +987,12 @@ void radv_GetPhysicalDeviceFeatures2(
features->uniformBufferStandardLayout = true;
break;
}
+

Re: [Mesa-dev] [PATCH] radv: Set correct metadata size for GFX9+.

2019-07-25 Thread Samuel Pitoiset


Reviewed-by: Samuel Pitoiset 

On 7/25/19 4:55 PM, Bas Nieuwenhuizen wrote:

Without correct size, radeonsi assumes the metadata is incorrect,
which can and will cause issues.

Since the metadata is really incorrect without the size, let us
fix that.
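
To make the size computation concrete, here is a sketch of the two cases as the diff below expresses them: the legacy layout stores 10 fixed dwords plus one offset dword per mip level, while GFX9+ stores only the 10 fixed dwords. The function name and the boolean parameter are assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size in bytes of the opaque metadata blob (4 bytes per dword). */
static uint32_t opaque_metadata_size(bool is_gfx9_plus, unsigned levels)
{
    assert(levels >= 1);
    unsigned dwords = is_gfx9_plus ? 10 : 11 + levels - 1;
    return dwords * 4;
}
```

For a single-level legacy image this gives 44 bytes; on GFX9+ it is always 40 bytes regardless of the level count.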

Fixes: e43cc3e3afc "radv/gfx9: handle GFX9 opaque metadata"
---
  src/amd/vulkan/radv_image.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 0941cbb..541ff4086f4 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -1034,7 +1034,8 @@ radv_query_opaque_metadata(struct radv_device *device,
for (i = 0; i <= image->info.levels - 1; i++)
md->metadata[10+i] = 
image->planes[0].surface.u.legacy.level[i].offset >> 8;
md->size_metadata = (11 + image->info.levels - 1) * 4;
-   }
+   } else
+   md->size_metadata = 10 * 4;
  }
  
  void

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
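
For reference, the size computation touched by the metadata patch above can be sketched as below. This is only an illustration: the helper name and the `per_level_offsets` parameter are hypothetical stand-ins for the driver's branch on the legacy layout, and only the dword counts (10 fixed dwords, plus one offset dword per mip level on the legacy path) are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns the opaque metadata size in bytes.
 * per_level_offsets stands in for the legacy branch that appends one
 * offset dword per mip level after the 10 fixed dwords. */
static uint32_t opaque_metadata_size(bool per_level_offsets, uint32_t levels)
{
    assert(levels >= 1);

    if (per_level_offsets)
        return (11 + levels - 1) * 4; /* same expression as in the diff */

    return 10 * 4; /* the case the patch adds: fixed dwords only */
}
```

Before the patch, the second case left the size unset, which is what the commit message describes as incorrect metadata.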

Re: [Mesa-dev] [PATCH] radv/gfx10: Disable DCC with scanout.

2019-07-25 Thread Samuel Pitoiset


It's already disabled later in this function?

On 7/25/19 4:34 PM, Bas Nieuwenhuizen wrote:

(a) radv does not set the DCC fields required yet.
(b) radeonsi just broke their DCC metadata.

Fixes: f8b6c5a1a63 "radeonsi: rewrite si_get_opaque_metadata, also for gfx10 
support"
---
  src/amd/vulkan/radv_image.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 0941cbb..4bcdb70214a 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -161,6 +161,9 @@ radv_use_dcc_for_image(struct radv_device *device,
if (image->shareable)
return false;
  
+   if (radv_surface_has_scanout(device, create_info))
+   return false;
+
/* TODO: Enable DCC for storage images. */
if ((pCreateInfo->usage & VK_IMAGE_USAGE_STORAGE_BIT) ||
(pCreateInfo->flags & VK_IMAGE_CREATE_EXTENDED_USAGE_BIT))


Re: [Mesa-dev] [PATCH] radv/gfx10: use L2 for DMA copy/fill operations

2019-07-25 Thread Samuel Pitoiset



On 7/25/19 3:39 PM, Bas Nieuwenhuizen wrote:

r-b

though it sounds like some of our cache flushes might not be ideal.

Yes.


On Thu, Jul 25, 2019 at 3:35 PM Samuel Pitoiset
 wrote:

It's coherent and faster. GFX7-GFX9 should also support this, but
for now L2 is only used on GFX10 because it's untested on previous gens.

This fixes dEQP-VK.memory.pipeline_barrier.transfer_*

This also fixes some missing geometry in Dawn Of War III because
VBOs weren't updated correctly.

Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/si_cmd_buffer.c | 16 
  1 file changed, 16 insertions(+)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 21a90cb2514..94f759139ee 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -1501,6 +1501,14 @@ void si_cp_dma_buffer_copy(struct radv_cmd_buffer 
*cmd_buffer,
 unsigned dma_flags = 0;
 unsigned byte_count = MIN2(size, 
cp_dma_max_byte_count(cmd_buffer));

+   if (cmd_buffer->device->physical_device->rad_info.chip_class >= 
GFX10) {
+   /* DMA operations via L2 are coherent and faster.
+* TODO: GFX7-GFX9 should also support this but it
+* requires tests/benchmarks.
+*/
+   dma_flags |= CP_DMA_USE_L2;
+   }
+
 si_cp_dma_prepare(cmd_buffer, byte_count,
   size + skipped_size + realign_size,
   &dma_flags);
@@ -1545,6 +1553,14 @@ void si_cp_dma_clear_buffer(struct radv_cmd_buffer 
*cmd_buffer, uint64_t va,
 unsigned byte_count = MIN2(size, 
cp_dma_max_byte_count(cmd_buffer));
 unsigned dma_flags = CP_DMA_CLEAR;

+   if (cmd_buffer->device->physical_device->rad_info.chip_class >= 
GFX10) {
+   /* DMA operations via L2 are coherent and faster.
+* TODO: GFX7-GFX9 should also support this but it
+* requires tests/benchmarks.
+*/
+   dma_flags |= CP_DMA_USE_L2;
+   }
+
 si_cp_dma_prepare(cmd_buffer, byte_count, size, &dma_flags);

 /* Emit the clear packet. */
--
2.22.0


[Mesa-dev] [PATCH] radv/gfx10: use L2 for DMA copy/fill operations

2019-07-25 Thread Samuel Pitoiset

It's coherent and faster. GFX7-GFX9 should also support this, but
for now L2 is only used on GFX10 because it's untested on previous gens.

This fixes dEQP-VK.memory.pipeline_barrier.transfer_*

This also fixes some missing geometry in Dawn Of War III because
VBOs weren't updated correctly.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/si_cmd_buffer.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 21a90cb2514..94f759139ee 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -1501,6 +1501,14 @@ void si_cp_dma_buffer_copy(struct radv_cmd_buffer 
*cmd_buffer,
unsigned dma_flags = 0;
unsigned byte_count = MIN2(size, 
cp_dma_max_byte_count(cmd_buffer));
 
+   if (cmd_buffer->device->physical_device->rad_info.chip_class >= 
GFX10) {
+   /* DMA operations via L2 are coherent and faster.
+* TODO: GFX7-GFX9 should also support this but it
+* requires tests/benchmarks.
+*/
+   dma_flags |= CP_DMA_USE_L2;
+   }
+
si_cp_dma_prepare(cmd_buffer, byte_count,
  size + skipped_size + realign_size,
  &dma_flags);
@@ -1545,6 +1553,14 @@ void si_cp_dma_clear_buffer(struct radv_cmd_buffer 
*cmd_buffer, uint64_t va,
unsigned byte_count = MIN2(size, 
cp_dma_max_byte_count(cmd_buffer));
unsigned dma_flags = CP_DMA_CLEAR;
 
+   if (cmd_buffer->device->physical_device->rad_info.chip_class >= 
GFX10) {
+   /* DMA operations via L2 are coherent and faster.
+* TODO: GFX7-GFX9 should also support this but it
+* requires tests/benchmarks.
+*/
+   dma_flags |= CP_DMA_USE_L2;
+   }
+
si_cp_dma_prepare(cmd_buffer, byte_count, size, &dma_flags);
 
/* Emit the clear packet. */
-- 
2.22.0
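
The flag selection that both hunks of this patch add can be sketched in isolation as follows. This is a sketch under stated assumptions: the enum and the flag bit values are placeholders rather than the driver's real definitions, and `cp_dma_flags` is a hypothetical helper name. Only the rule itself (set CP_DMA_USE_L2 on GFX10 and newer) comes from the diff.

```c
#include <assert.h>

/* Placeholder stand-ins; the real definitions live in the driver. */
enum chip_class { GFX6, GFX7, GFX8, GFX9, GFX10 };
#define CP_DMA_CLEAR  (1u << 0)
#define CP_DMA_USE_L2 (1u << 1)

/* Hypothetical helper: adds the L2 flag on GFX10+, leaves other
 * generations untouched because they are untested per the commit message. */
static unsigned cp_dma_flags(enum chip_class chip, unsigned base_flags)
{
    assert(chip <= GFX10);

    if (chip >= GFX10)
        base_flags |= CP_DMA_USE_L2;

    return base_flags;
}
```

The copy path would start from 0 and the clear path from CP_DMA_CLEAR, matching the two initial values of dma_flags in the diff.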


[Mesa-dev] [PATCH v2] radv/gfx10: fix intensity formats by setting ALPHA_IS_ON_MSB

2019-07-24 Thread Samuel Pitoiset

This fixes
dEQP-VK.rasterization.primitive_size.points.point_size_*

This also fixes some black squares with the Sascha SSAO demo.

v2: - do not set for multiple channels
- call vi_alpha_is_on_msb() for pre-GFX10
- remove unused 'swap'

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_image.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 0941cbb..d46946269e6 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -617,6 +617,15 @@ static unsigned gfx9_border_color_swizzle(const enum 
vk_swizzle swizzle[4])
return bc_swizzle;
 }
 
+static bool vi_alpha_is_on_msb(struct radv_device *device, VkFormat format)
+{
+   const struct vk_format_description *desc = 
vk_format_description(format);
+
+   if (device->physical_device->rad_info.chip_class >= GFX10 && 
desc->nr_channels == 1)
+   return desc->swizzle[3] == VK_SWIZZLE_X;
+
+   return radv_translate_colorswap(format, false) <= 1;
+}
 /**
  * Build the sampler view descriptor for a texture (GFX10).
  */
@@ -691,11 +700,9 @@ gfx10_make_texture_descriptor(struct radv_device *device,
state[7] = 0;
 
if (radv_dcc_enabled(image, first_level)) {
-   unsigned swap = radv_translate_colorswap(vk_format, FALSE);
-
state[6] |= 
S_00A018_MAX_UNCOMPRESSED_BLOCK_SIZE(V_028C78_MAX_BLOCK_SIZE_256B) |

S_00A018_MAX_COMPRESSED_BLOCK_SIZE(V_028C78_MAX_BLOCK_SIZE_128B) |
-   S_00A018_ALPHA_IS_ON_MSB(swap <= 1);
+   S_00A018_ALPHA_IS_ON_MSB(vi_alpha_is_on_msb(device, 
vk_format));
}
 
/* Initialize the sampler view for FMASK. */
@@ -849,9 +856,7 @@ si_make_texture_descriptor(struct radv_device *device,
state[5] |= S_008F24_LAST_ARRAY(last_layer);
}
if (image->dcc_offset) {
-   unsigned swap = radv_translate_colorswap(vk_format, FALSE);
-
-   state[6] = S_008F28_ALPHA_IS_ON_MSB(swap <= 1);
+   state[6] = S_008F28_ALPHA_IS_ON_MSB(vi_alpha_is_on_msb(device, 
vk_format));
} else {
/* The last dword is unused by hw. The shader uses it to clear
 * bits in the first dword of sampler state.
-- 
2.22.0
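
The decision made by vi_alpha_is_on_msb() in this v2 can be sketched without the driver types as below. The struct, the enum and the `colorswap` parameter are hypothetical stand-ins (for vk_format_description, vk_swizzle and the result of radv_translate_colorswap respectively); the two rules are the ones in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's format description. */
enum swizzle { SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W, SWIZZLE_0, SWIZZLE_1 };

struct format_desc {
    unsigned nr_channels;
    enum swizzle swizzle[4];
};

/* colorswap stands in for the value the driver derives from the format. */
static bool alpha_is_on_msb(bool is_gfx10_plus, const struct format_desc *desc,
                            unsigned colorswap)
{
    assert(desc != NULL);

    /* GFX10+, single channel: alpha is on the MSB only when the alpha
     * swizzle reads the X channel (intensity-style formats). */
    if (is_gfx10_plus && desc->nr_channels == 1)
        return desc->swizzle[3] == SWIZZLE_X;

    /* Everything else keeps the pre-existing colorswap rule. */
    return colorswap <= 1;
}
```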


[Mesa-dev] [PATCH] radv/gfx10: fix intensity formats by setting ALPHA_IS_ON_MSB

2019-07-24 Thread Samuel Pitoiset

This fixes
dEQP-VK.rasterization.primitive_size.points.point_size_*

This also fixes some black squares with the Sascha SSAO demo.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_image.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 0941cbb..59d6d0ced78 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -617,6 +617,19 @@ static unsigned gfx9_border_color_swizzle(const enum 
vk_swizzle swizzle[4])
return bc_swizzle;
 }
 
+static bool vi_alpha_is_on_msb(struct radv_device *device, VkFormat format)
+{
+   const struct vk_format_description *desc = 
vk_format_description(format);
+
+   /* Formats with 3 channels can't have alpha. */
+   if (desc->nr_channels == 3)
+   return true; /* same as xxxA; is any value OK here? */
+
+   if (device->physical_device->rad_info.chip_class >= GFX10 && 
desc->nr_channels == 1)
+   return desc->swizzle[3] == VK_SWIZZLE_X;
+
+   return radv_translate_colorswap(format, false) <= 1;
+}
 /**
  * Build the sampler view descriptor for a texture (GFX10).
  */
@@ -695,7 +708,7 @@ gfx10_make_texture_descriptor(struct radv_device *device,
 
state[6] |= 
S_00A018_MAX_UNCOMPRESSED_BLOCK_SIZE(V_028C78_MAX_BLOCK_SIZE_256B) |

S_00A018_MAX_COMPRESSED_BLOCK_SIZE(V_028C78_MAX_BLOCK_SIZE_128B) |
-   S_00A018_ALPHA_IS_ON_MSB(swap <= 1);
+   S_00A018_ALPHA_IS_ON_MSB(vi_alpha_is_on_msb(device, 
vk_format));
}
 
/* Initialize the sampler view for FMASK. */
-- 
2.22.0


Re: [Mesa-dev] [PATCH 4/5] radv/gfx10: do not enable NGG if a pipeline uses XFB

2019-07-24 Thread Samuel Pitoiset



On 7/23/19 9:31 PM, Bas Nieuwenhuizen wrote:

On Tue, Jul 23, 2019 at 3:21 PM Samuel Pitoiset
 wrote:

NGG GS for streamout requires a bunch of work, so enable it with
the legacy path only for now.

Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/radv_pipeline.c | 28 
  1 file changed, 28 insertions(+)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index a7ff0e2d139..0903e5abf37 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -33,6 +33,7 @@
  #include "radv_shader.h"
  #include "nir/nir.h"
  #include "nir/nir_builder.h"
+#include "nir/nir_xfb_info.h"
  #include "spirv/nir_spirv.h"
  #include "vk_util.h"

@@ -2269,6 +2270,16 @@ radv_generate_graphics_pipeline_key(struct radv_pipeline 
*pipeline,
 return key;
  }

+static bool
+radv_nir_stage_uses_xfb(const nir_shader *nir)
+{
+   nir_xfb_info *xfb = nir_gather_xfb_info(nir, NULL);
+   bool uses_xfb = !!xfb;
+
+   ralloc_free(xfb);
+   return uses_xfb;
+}
+
  static void
  radv_fill_shader_keys(struct radv_device *device,
   struct radv_shader_variant_key *keys,
@@ -2321,6 +2332,23 @@ radv_fill_shader_keys(struct radv_device *device,
  */
 keys[MESA_SHADER_TESS_EVAL].vs_common_out.as_ngg = 
false;
 }
+
+   /* TODO: Implement streamout support for NGG. */
+   bool uses_xfb = false;
+   if ((nir[MESA_SHADER_VERTEX] &&
+radv_nir_stage_uses_xfb(nir[MESA_SHADER_VERTEX])) ||
+   (nir[MESA_SHADER_TESS_EVAL] &&
+radv_nir_stage_uses_xfb(nir[MESA_SHADER_TESS_EVAL])) ||
+   (nir[MESA_SHADER_GEOMETRY] &&
+radv_nir_stage_uses_xfb(nir[MESA_SHADER_GEOMETRY])))
+   uses_xfb = true;

Transform feedback can only happen on the last stage before PS, right?
Can we first determine what the last shader is and only then check for
xfb? That way we don't have to scan 3 shaders.

Yes. Pushed with this slightly improved.

+
+   if (uses_xfb) {
+   if (nir[MESA_SHADER_TESS_CTRL])
+   
keys[MESA_SHADER_TESS_EVAL].vs_common_out.as_ngg = false;
+   else
+   keys[MESA_SHADER_VERTEX].vs_common_out.as_ngg = 
false;
+   }
 }

 for(int i = 0; i < MESA_SHADER_STAGES; ++i)
--
2.22.0
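
The review suggestion above (find the last stage before PS first, then check only that stage for XFB) could look roughly like the sketch below. It is not the code that was pushed: the stage enum, `struct shader` and its `has_xfb_outputs` field are hypothetical stand-ins for the NIR shaders and the nir_gather_xfb_info() check in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the shader stage array in the patch. */
enum stage { STAGE_VERTEX, STAGE_TESS_CTRL, STAGE_TESS_EVAL, STAGE_GEOMETRY, STAGE_COUNT };

struct shader {
    bool has_xfb_outputs; /* stands in for "XFB info was found" */
};

/* Last stage before the pixel shader: GS if present, else TES, else VS. */
static const struct shader *last_pre_ps_stage(const struct shader *const *stages)
{
    assert(stages != NULL);

    if (stages[STAGE_GEOMETRY])
        return stages[STAGE_GEOMETRY];
    if (stages[STAGE_TESS_EVAL])
        return stages[STAGE_TESS_EVAL];
    return stages[STAGE_VERTEX];
}

/* Only one shader is inspected instead of up to three. */
static bool pipeline_uses_xfb(const struct shader *const *stages)
{
    const struct shader *last = last_pre_ps_stage(stages);

    return last != NULL && last->has_xfb_outputs;
}
```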


[Mesa-dev] [PATCH 1/5] radv/gfx10: update streamout descriptors

2019-07-23 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 84d627340e9..c2e3f3b5fd0 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2419,8 +2419,15 @@ radv_flush_streamout_descriptors(struct radv_cmd_buffer 
*cmd_buffer)
desc[3] = S_008F0C_DST_SEL_X(V_008F0C_SQ_SEL_X) |
  S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
  S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
- S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W) |
- 
S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
+ S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W);
+
+   if 
(cmd_buffer->device->physical_device->rad_info.chip_class >= GFX10) {
+   desc[3] |= 
S_008F0C_FORMAT(V_008F0C_IMG_FORMAT_32_FLOAT) |
+  S_008F0C_OOB_SELECT(3) |
+  S_008F0C_RESOURCE_LEVEL(1);
+   } else {
+   desc[3] |= 
S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
+   }
}
 
va = radv_buffer_get_va(cmd_buffer->upload.upload_bo);
-- 
2.22.0


[Mesa-dev] [PATCH 3/5] radv/gfx10: emit streamout shader config

2019-07-23 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_shader.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 5fd1022b05a..56f421026b7 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -690,7 +690,12 @@ static void radv_postprocess_config(const struct 
radv_physical_device *pdevice,
config_out->float_mode |= V_00B028_FP_64_DENORMS;
 
config_out->rsrc2 = S_00B12C_USER_SGPR(info->num_user_sgprs) |
-   S_00B12C_SCRATCH_EN(scratch_enabled);
+   S_00B12C_SCRATCH_EN(scratch_enabled) |
+   S_00B12C_SO_BASE0_EN(!!info->info.so.strides[0]) |
+   S_00B12C_SO_BASE1_EN(!!info->info.so.strides[1]) |
+   S_00B12C_SO_BASE2_EN(!!info->info.so.strides[2]) |
+   S_00B12C_SO_BASE3_EN(!!info->info.so.strides[3]) |
+   S_00B12C_SO_EN(!!info->info.so.num_outputs);
 
config_out->rsrc1 = S_00B848_VGPRS((num_vgprs - 1) / 4) |
S_00B848_DX10_CLAMP(1) |
@@ -700,12 +705,7 @@ static void radv_postprocess_config(const struct 
radv_physical_device *pdevice,
config_out->rsrc2 |= 
S_00B22C_USER_SGPR_MSB_GFX10(info->num_user_sgprs >> 5);
} else {
config_out->rsrc1 |= S_00B228_SGPRS((num_sgprs - 1) / 8);
-   config_out->rsrc2 |= 
S_00B22C_USER_SGPR_MSB_GFX9(info->num_user_sgprs >> 5)  |
-
S_00B12C_SO_BASE0_EN(!!info->info.so.strides[0]) |
-
S_00B12C_SO_BASE1_EN(!!info->info.so.strides[1]) |
-
S_00B12C_SO_BASE2_EN(!!info->info.so.strides[2]) |
-
S_00B12C_SO_BASE3_EN(!!info->info.so.strides[3]) |
-
S_00B12C_SO_EN(!!info->info.so.num_outputs);
+   config_out->rsrc2 |= 
S_00B22C_USER_SGPR_MSB_GFX9(info->num_user_sgprs >> 5);
}
 
switch (stage) {
-- 
2.22.0


[Mesa-dev] [PATCH 4/5] radv/gfx10: do not enable NGG if a pipeline uses XFB

2019-07-23 Thread Samuel Pitoiset

NGG GS for streamout requires a bunch of work, so enable it with
the legacy path only for now.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index a7ff0e2d139..0903e5abf37 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -33,6 +33,7 @@
 #include "radv_shader.h"
 #include "nir/nir.h"
 #include "nir/nir_builder.h"
+#include "nir/nir_xfb_info.h"
 #include "spirv/nir_spirv.h"
 #include "vk_util.h"
 
@@ -2269,6 +2270,16 @@ radv_generate_graphics_pipeline_key(struct radv_pipeline 
*pipeline,
return key;
 }
 
+static bool
+radv_nir_stage_uses_xfb(const nir_shader *nir)
+{
+   nir_xfb_info *xfb = nir_gather_xfb_info(nir, NULL);
+   bool uses_xfb = !!xfb;
+
+   ralloc_free(xfb);
+   return uses_xfb;
+}
+
 static void
 radv_fill_shader_keys(struct radv_device *device,
  struct radv_shader_variant_key *keys,
@@ -2321,6 +2332,23 @@ radv_fill_shader_keys(struct radv_device *device,
 */
keys[MESA_SHADER_TESS_EVAL].vs_common_out.as_ngg = 
false;
}
+
+   /* TODO: Implement streamout support for NGG. */
+   bool uses_xfb = false;
+   if ((nir[MESA_SHADER_VERTEX] &&
+radv_nir_stage_uses_xfb(nir[MESA_SHADER_VERTEX])) ||
+   (nir[MESA_SHADER_TESS_EVAL] &&
+radv_nir_stage_uses_xfb(nir[MESA_SHADER_TESS_EVAL])) ||
+   (nir[MESA_SHADER_GEOMETRY] &&
+radv_nir_stage_uses_xfb(nir[MESA_SHADER_GEOMETRY])))
+   uses_xfb = true;
+
+   if (uses_xfb) {
+   if (nir[MESA_SHADER_TESS_CTRL])
+   
keys[MESA_SHADER_TESS_EVAL].vs_common_out.as_ngg = false;
+   else
+   keys[MESA_SHADER_VERTEX].vs_common_out.as_ngg = 
false;
+   }
}
 
for(int i = 0; i < MESA_SHADER_STAGES; ++i)
-- 
2.22.0


[Mesa-dev] [PATCH 5/5] radv/gfx10: enable VK_EXT_transform_feedback

2019-07-23 Thread Samuel Pitoiset

When a pipeline uses transform feedback, the driver falls back to
the legacy path because NGG support for streamout is a non-trivial
amount of work.

AMDVLK also uses the legacy path for streamout, while RadeonSI
uses the new NGG path.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_extensions.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_extensions.py 
b/src/amd/vulkan/radv_extensions.py
index e9addad0035..8e1d61dfaaf 100644
--- a/src/amd/vulkan/radv_extensions.py
+++ b/src/amd/vulkan/radv_extensions.py
@@ -129,7 +129,7 @@ EXTENSIONS = [
 Extension('VK_EXT_shader_stencil_export', 1, True),
 Extension('VK_EXT_shader_subgroup_ballot',1, True),
 Extension('VK_EXT_shader_subgroup_vote',  1, True),
-Extension('VK_EXT_transform_feedback',1, 
'device->rad_info.chip_class < GFX10'),
+Extension('VK_EXT_transform_feedback',1, True),
 Extension('VK_EXT_vertex_attribute_divisor',  3, True),
 Extension('VK_EXT_ycbcr_image_arrays',1, True),
 Extension('VK_AMD_buffer_marker', 1, True),
-- 
2.22.0


[Mesa-dev] [PATCH 2/5] radv/gfx10: declare streamout user SGPRs

2019-07-23 Thread Samuel Pitoiset

Required for legacy streamout.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_nir_to_llvm.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index cf73cdc692b..020c6d17771 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -876,9 +876,6 @@ declare_streamout_sgprs(struct radv_shader_context *ctx, 
gl_shader_stage stage,
 {
int i;
 
-   if (ctx->ac.chip_class >= GFX10)
-   return;
-
/* Streamout SGPRs. */
if (ctx->shader_info->info.so.num_outputs) {
assert(stage == MESA_SHADER_VERTEX ||
-- 
2.22.0


[Mesa-dev] [PATCH v3] radv/gfx10: fix VS input VGPRs with the legacy path

2019-07-23 Thread Samuel Pitoiset

For some reason, InstanceID is VGPR3 although StepRate0 is set to 1.

v3: fix instanceID input VGPR for geometry
v2: fix instanceID

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_nir_to_llvm.c | 12 +---
 src/amd/vulkan/radv_shader.c  |  8 ++--
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 336bae28614..cf73cdc692b 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -852,9 +852,15 @@ declare_vs_input_vgprs(struct radv_shader_context *ctx, 
struct arg_info *args)
}
} else {
if (ctx->ac.chip_class >= GFX10) {
-   add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* user vgpr */
-   add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* user vgpr */
-   add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->abi.instance_id);
+   if (ctx->options->key.vs_common_out.as_ngg) {
+   add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* user vgpr */
+   add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* user vgpr */
+   add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->abi.instance_id);
+   } else {
+   add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* unused */
+   add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->vs_prim_id);
+   add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->abi.instance_id);
+   }
 } else {
 add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->abi.instance_id);
 add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->vs_prim_id);
diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 3adaf52e152..06122664a13 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -765,7 +765,7 @@ static void radv_postprocess_config(const struct 
radv_physical_device *pdevice,
if (info->vs.export_prim_id) {
vgpr_comp_cnt = 2;
} else if (info->info.vs.needs_instance_id) {
-   vgpr_comp_cnt = 1;
+   vgpr_comp_cnt = pdevice->rad_info.chip_class >= 
GFX10 ? 3 : 1;
} else {
vgpr_comp_cnt = 0;
}
@@ -837,7 +837,11 @@ static void radv_postprocess_config(const struct 
radv_physical_device *pdevice,
 
if (es_type == MESA_SHADER_VERTEX) {
/* VGPR0-3: (VertexID, InstanceID / StepRate0, ...) */
-   es_vgpr_comp_cnt = info->info.vs.needs_instance_id ? 1 
: 0;
+   if (info->info.vs.needs_instance_id) {
+   es_vgpr_comp_cnt = pdevice->rad_info.chip_class 
>= GFX10 ? 3 : 1;
+   } else {
+   es_vgpr_comp_cnt = 0;
+   }
} else if (es_type == MESA_SHADER_TESS_EVAL) {
es_vgpr_comp_cnt = info->info.uses_prim_id ? 3 : 2;
} else {
-- 
2.22.0
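
The VGPR count selection for the legacy VS path in this v3 can be summarized by the sketch below. The function name and the boolean parameters are hypothetical; the returned values mirror the first radv_shader.c hunk as posted and are not an independent statement about the hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper mirroring the vgpr_comp_cnt selection in the diff. */
static unsigned vs_vgpr_comp_cnt(bool is_gfx10_plus, bool export_prim_id,
                                 bool needs_instance_id)
{
    unsigned cnt;

    if (export_prim_id)
        cnt = 2;
    else if (needs_instance_id)
        cnt = is_gfx10_plus ? 3 : 1; /* InstanceID sits in VGPR3 on GFX10 */
    else
        cnt = 0;

    /* The patch comment only names VGPR0-3, so larger values make no sense. */
    assert(cnt <= 3);
    return cnt;
}
```

The ES hunk applies the same GFX10 rule to es_vgpr_comp_cnt when the ES stage is a vertex shader.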


[Mesa-dev] [PATCH v2] radv/gfx10: fix VS input VGPRs with the legacy path

2019-07-23 Thread Samuel Pitoiset

For some reason, InstanceID is VGPR3 although StepRate0 is set to 1.

v2: fix instanceID

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_nir_to_llvm.c | 12 +---
 src/amd/vulkan/radv_shader.c  |  2 +-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 336bae28614..cf73cdc692b 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -852,9 +852,15 @@ declare_vs_input_vgprs(struct radv_shader_context *ctx, 
struct arg_info *args)
}
} else {
if (ctx->ac.chip_class >= GFX10) {
-   add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* user vgpr */
-   add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* user vgpr */
-   add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->abi.instance_id);
+   if (ctx->options->key.vs_common_out.as_ngg) {
+   add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* user vgpr */
+   add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* user vgpr */
+   add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->abi.instance_id);
+   } else {
+   add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* unused */
+   add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->vs_prim_id);
+   add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->abi.instance_id);
+   }
 } else {
 add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->abi.instance_id);
 add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->vs_prim_id);
diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 3adaf52e152..3d1b56e7f60 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -765,7 +765,7 @@ static void radv_postprocess_config(const struct 
radv_physical_device *pdevice,
if (info->vs.export_prim_id) {
vgpr_comp_cnt = 2;
} else if (info->info.vs.needs_instance_id) {
-   vgpr_comp_cnt = 1;
+   vgpr_comp_cnt = pdevice->rad_info.chip_class >= 
GFX10 ? 3 : 1;
} else {
vgpr_comp_cnt = 0;
}
-- 
2.22.0


Re: [Mesa-dev] [PATCH] radv/gfx10: fix VS input VGPRs with the legacy path

2019-07-23 Thread Samuel Pitoiset



On 7/23/19 1:37 PM, Bas Nieuwenhuizen wrote:

So does this work with tests that use multiple instances?

Apparently not.


If so, r-b.

On Tue, Jul 23, 2019 at 1:29 PM Samuel Pitoiset
 wrote:

Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/radv_nir_to_llvm.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 336bae28614..9cea92e8a69 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -851,7 +851,8 @@ declare_vs_input_vgprs(struct radv_shader_context *ctx, 
struct arg_info *args)
 add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* 
unused */
 }
 } else {
-   if (ctx->ac.chip_class >= GFX10) {
+   if (ctx->ac.chip_class >= GFX10 &&
+   ctx->options->key.vs_common_out.as_ngg) {
 add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* 
user vgpr */
 add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* 
user vgpr */
 add_arg(args, ARG_VGPR, ctx->ac.i32, 
&ctx->abi.instance_id);
--
2.22.0


[Mesa-dev] [PATCH] radv/gfx10: fix VS input VGPRs with the legacy path

2019-07-23 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_nir_to_llvm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 336bae28614..9cea92e8a69 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -851,7 +851,8 @@ declare_vs_input_vgprs(struct radv_shader_context *ctx, 
struct arg_info *args)
add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* 
unused */
}
} else {
-   if (ctx->ac.chip_class >= GFX10) {
+   if (ctx->ac.chip_class >= GFX10 &&
+   ctx->options->key.vs_common_out.as_ngg) {
add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* 
user vgpr */
add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); /* 
user vgpr */
add_arg(args, ARG_VGPR, ctx->ac.i32, 
&ctx->abi.instance_id);
-- 
2.22.0


[Mesa-dev] [PATCH] radv: fix dumping disassembly with RADV_DEBUG=shaders

2019-07-23 Thread Samuel Pitoiset

Fixes: a20a9d0c5e7 ("radv: dont store disasm string unless keep_shader_info 
flag set")
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_shader.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 3adaf52e152..736388c555c 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -1013,7 +1013,8 @@ radv_shader_variant_create(struct radv_device *device,
return NULL;
}
 
-   if (device->keep_shader_info) {
+   if (device->keep_shader_info ||
+   (device->instance->debug_flags & RADV_DEBUG_DUMP_SHADERS)) {
const char *disasm_data;
size_t disasm_size;
if (!ac_rtld_get_section_by_name(&rtld_binary, 
".AMDGPU.disasm", &disasm_data, &disasm_size)) {
-- 
2.22.0


[Mesa-dev] [PATCH] radv/gfx10: enable CLEAR_STATE

2019-07-23 Thread Samuel Pitoiset

It actually works.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 992e12840f7..93b03afda22 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -354,8 +354,7 @@ radv_physical_device_init(struct radv_physical_device 
*device,
/* The mere presence of CLEAR_STATE in the IB causes random GPU hangs
 * on GFX6.
 */
-   device->has_clear_state = device->rad_info.chip_class >= GFX7 &&
- device->rad_info.chip_class <= GFX9;
+   device->has_clear_state = device->rad_info.chip_class >= GFX7;
 
device->cpdma_prefetch_writes_memory = device->rad_info.chip_class <= 
GFX8;
 
-- 
2.22.0
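
The effect of this one-line change can be sketched as a plain predicate. The enum is a hypothetical stand-in for the driver's chip generation values; the rule (GFX7 and newer, with the previous upper bound at GFX9 removed) is what the diff shows.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's chip generation enum. */
enum chip_class { GFX6, GFX7, GFX8, GFX9, GFX10 };

static bool has_clear_state(enum chip_class chip)
{
    assert(chip <= GFX10);

    /* GFX6 stays excluded because of the random hangs noted in the
     * existing comment; GFX10 is no longer excluded. */
    return chip >= GFX7;
}
```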


Re: [Mesa-dev] [PATCH 2/2] radv/gfx10: correctly determine the number of vertices per primitive

2019-07-22 Thread Samuel Pitoiset



On 7/22/19 6:01 PM, Ilia Mirkin wrote:

On Mon, Jul 22, 2019 at 11:49 AM Samuel Pitoiset
 wrote:

For TES as NGG.

Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/radv_nir_to_llvm.c | 17 -
  1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 336bae28614..6e5a283f923 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -112,6 +112,7 @@ struct radv_shader_context {
 unsigned gs_max_out_vertices;
 unsigned gs_output_prim;

+   unsigned tes_point_mode;
 unsigned tes_primitive_mode;

 uint32_t tcs_patch_outputs_read;
@@ -3304,7 +3305,6 @@ handle_ngg_outputs_post(struct radv_shader_context *ctx)
  {
 LLVMBuilderRef builder = ctx->ac.builder;
 struct ac_build_if_state if_state;
-   unsigned num_vertices = 3;
 LLVMValueRef tmp;

 assert((ctx->stage == MESA_SHADER_VERTEX ||
@@ -3322,6 +3322,20 @@ handle_ngg_outputs_post(struct radv_shader_context *ctx)
 ac_unpack_param(&ctx->ac, ctx->gs_vtx_offset[2], 0, 16),
 };

+   /* Determine the number of vertices per primitive. */
+   unsigned num_vertices;
+
+   if (ctx->stage == MESA_SHADER_VERTEX) {
+   num_vertices = 3; /* TODO: optimize for points & lines */
+   } else {
+   if (ctx->tes_point_mode)
+   num_vertices = 1;
+   else if (ctx->tes_primitive_mode == GL_LINES)
+   num_vertices = 2;
+   else
+   num_vertices = 3;
+   }
+
 /* TODO: streamout */

 /* Copy Primitive IDs from GS threads to the LDS address corresponding
@@ -4435,6 +4449,7 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct 
ac_llvm_compiler *ac_llvm,
 ctx.tcs_num_inputs = 
util_last_bit64(shader_info->info.vs.ls_outputs_written);
 ctx.tcs_num_patches = get_tcs_num_patches();
 } else if (shaders[i]->info.stage == MESA_SHADER_TESS_EVAL) {
+   ctx.tes_point_mode = shaders[i]->info.tess.point_mode;

Drive-by-comment without reading the full context...

What if there's e.g. a GS which produces not-points? This bool will be
set, and the logic above will say num_vertices = 1, which presumably
is bad.

With GS, TES is emitted as ES and this function isn't called because
it's for NGG only.


   -ilia


 ctx.tes_primitive_mode = 
shaders[i]->info.tess.primitive_mode;
 ctx.abi.load_tess_varyings = load_tes_input;
 ctx.abi.load_tess_coord = load_tess_coord;
--
2.22.0


[Mesa-dev] [PATCH 2/2] radv/gfx10: correctly determine the number of vertices per primitive

2019-07-22 Thread Samuel Pitoiset

For TES as NGG.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_nir_to_llvm.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 336bae28614..6e5a283f923 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -112,6 +112,7 @@ struct radv_shader_context {
unsigned gs_max_out_vertices;
unsigned gs_output_prim;
 
+   unsigned tes_point_mode;
unsigned tes_primitive_mode;
 
uint32_t tcs_patch_outputs_read;
@@ -3304,7 +3305,6 @@ handle_ngg_outputs_post(struct radv_shader_context *ctx)
 {
LLVMBuilderRef builder = ctx->ac.builder;
struct ac_build_if_state if_state;
-   unsigned num_vertices = 3;
LLVMValueRef tmp;
 
assert((ctx->stage == MESA_SHADER_VERTEX ||
@@ -3322,6 +3322,20 @@ handle_ngg_outputs_post(struct radv_shader_context *ctx)
ac_unpack_param(&ctx->ac, ctx->gs_vtx_offset[2], 0, 16),
};
 
+   /* Determine the number of vertices per primitive. */
+   unsigned num_vertices;
+
+   if (ctx->stage == MESA_SHADER_VERTEX) {
+   num_vertices = 3; /* TODO: optimize for points & lines */
+   } else {
+   if (ctx->tes_point_mode)
+   num_vertices = 1;
+   else if (ctx->tes_primitive_mode == GL_LINES)
+   num_vertices = 2;
+   else
+   num_vertices = 3;
+   }
+
/* TODO: streamout */
 
/* Copy Primitive IDs from GS threads to the LDS address corresponding
@@ -4435,6 +4449,7 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct 
ac_llvm_compiler *ac_llvm,
ctx.tcs_num_inputs = 
util_last_bit64(shader_info->info.vs.ls_outputs_written);
ctx.tcs_num_patches = get_tcs_num_patches();
} else if (shaders[i]->info.stage == MESA_SHADER_TESS_EVAL) {
+   ctx.tes_point_mode = shaders[i]->info.tess.point_mode;
ctx.tes_primitive_mode = 
shaders[i]->info.tess.primitive_mode;
ctx.abi.load_tess_varyings = load_tes_input;
ctx.abi.load_tess_coord = load_tess_coord;
-- 
2.22.0
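
The selection logic in this patch can be sketched in isolation as below. This is only an illustration: the enums and the function name `sketch_ngg_num_vertices_per_prim` are hypothetical stand-ins, not RADV identifiers, and the real code reads the values from `ctx`.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the shader stage and the TES primitive mode. */
enum sketch_ngg_stage { SKETCH_NGG_VERTEX, SKETCH_NGG_TESS_EVAL };
enum sketch_tes_prim { SKETCH_TES_TRIANGLES, SKETCH_TES_LINES };

/* Number of vertices per primitive, mirroring the branches in the patch. */
static unsigned
sketch_ngg_num_vertices_per_prim(enum sketch_ngg_stage stage,
                                 bool tes_point_mode,
                                 enum sketch_tes_prim tes_prim)
{
    if (stage == SKETCH_NGG_VERTEX)
        return 3; /* the patch leaves points and lines as a TODO */
    if (tes_point_mode)
        return 1; /* point mode is checked before the primitive mode */
    if (tes_prim == SKETCH_TES_LINES)
        return 2;
    return 3;     /* everything else is treated as triangles */
}
```

Point mode is tested first because it is a separate flag from the primitive mode, so a TES can declare lines or triangles and still emit points.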

[Mesa-dev] [PATCH 1/2] radv/gfx10: reduce max_esverts_base to 128

2019-07-22 Thread Samuel Pitoiset

Same as RadeonSI.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index a7ff0e2d139..fce60a62ee9 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1704,7 +1704,7 @@ calculate_ngg_info(const VkGraphicsPipelineCreateInfo 
*pCreateInfo,
 
/* All these are per subgroup: */
bool max_vert_out_per_gs_instance = false;
-   unsigned max_esverts_base = 256;
+   unsigned max_esverts_base = 128;
unsigned max_gsprims_base = 128; /* default prim group size clamp */
 
/* Hardware has the following non-natural restrictions on the value
-- 
2.22.0

[Mesa-dev] [PATCH] radv: fix crash in vkCmdClearAttachments with unused attachment

2019-07-22 Thread Samuel Pitoiset

depth_stencil_attachment and/or ds_resolve attachment can be NULL.

This fixes crashes with
dEQP-VK.renderpass.suballocation.unused_clear_attachments.*

Cc: 19.1 
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_meta_clear.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index dd2ba402f40..b93ba3e0b29 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -1688,7 +1688,7 @@ emit_clear(struct radv_cmd_buffer *cmd_buffer,
if (ds_resolve_clear)
ds_att = subpass->ds_resolve_attachment;
 
-   if (ds_att->attachment == VK_ATTACHMENT_UNUSED)
+   if (!ds_att || ds_att->attachment == VK_ATTACHMENT_UNUSED)
return;
 
VkImageLayout image_layout = ds_att->layout;
-- 
2.22.0
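
A minimal sketch of the guard added above, assuming a simplified attachment reference. The struct and function names are hypothetical; only the value of the unused marker matches `VK_ATTACHMENT_UNUSED`, which is `~0U`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as VK_ATTACHMENT_UNUSED in the Vulkan headers. */
#define SKETCH_ATTACHMENT_UNUSED (~0u)

/* Hypothetical, reduced form of a subpass attachment reference. */
struct sketch_attachment_ref {
    uint32_t attachment;
};

/* Returns true when the depth/stencil clear has nothing to do:
 * either the subpass has no such attachment at all (NULL pointer),
 * or the reference exists but is marked unused. */
static bool
sketch_skip_ds_clear(const struct sketch_attachment_ref *ds_att)
{
    return !ds_att || ds_att->attachment == SKETCH_ATTACHMENT_UNUSED;
}
```

The NULL test has to come first: the short-circuit `||` is what keeps the dereference from running on a missing attachment.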

Re: [Mesa-dev] [PATCH] ac/nir: fix txf_ms with an offset

2019-07-22 Thread Samuel Pitoiset


Reviewed-by: Samuel Pitoiset 

On 7/19/19 9:17 PM, Rhys Perry wrote:

Seems to fix some hair artifacts in Max Payne 3:
https://github.com/daniel-schuermann/mesa/issues/76

Signed-off-by: Rhys Perry 
Fixes: f4e499ec791 ('radv: add initial non-conformant radv vulkan driver')
---
  src/amd/common/ac_nir_to_llvm.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 96bf89a8bf9..549a26ea243 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -3784,7 +3784,7 @@ static void visit_tex(struct ac_nir_context *ctx, 
nir_tex_instr *instr)
goto write_result;
}
  
-	if (args.offset && instr->op != nir_texop_txf) {
+   if (args.offset && instr->op != nir_texop_txf && instr->op != nir_texop_txf_ms) {
LLVMValueRef offset[3], pack;
for (unsigned chan = 0; chan < 3; ++chan)
offset[chan] = ctx->ac.i32_0;
@@ -3919,7 +3919,7 @@ static void visit_tex(struct ac_nir_context *ctx, 
nir_tex_instr *instr)
args.coords[sample_chan], fmask_ptr);
}
  
-	if (args.offset && instr->op == nir_texop_txf) {
+   if (args.offset && (instr->op == nir_texop_txf || instr->op == nir_texop_txf_ms)) {
int num_offsets = 
instr->src[offset_src].src.ssa->num_components;
num_offsets = MIN2(num_offsets, instr->coord_components);
for (unsigned i = 0; i < num_offsets; ++i) {
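
For readers unfamiliar with this path, here is a rough sketch of what the second hunk switches on for txf_ms. It assumes the loop body (not shown in the hunk) adds each offset component to the matching integer texel coordinate; the function name and the plain `int` arrays are hypothetical simplifications of the LLVM values used in the driver.

```c
#define SKETCH_MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Fold an integer texel offset into the fetch coordinates.
 * Only as many components as both arrays provide are touched,
 * matching the MIN2(num_offsets, coord_components) clamp above. */
static void
sketch_apply_texel_offset(int *coords, unsigned coord_components,
                          const int *offsets, unsigned num_offsets)
{
    unsigned n = SKETCH_MIN2(num_offsets, coord_components);

    for (unsigned i = 0; i < n; ++i)
        coords[i] += offsets[i];
}
```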

[Mesa-dev] [PATCH 4/7] radv/gfx10: do not set ELEMENT_SIZE for buffer descriptors

2019-07-18 Thread Samuel Pitoiset

This field doesn't exist.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 6e313aa9aa1..3c553cb93e7 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -2164,7 +2164,6 @@ fill_geom_tess_rings(struct radv_queue *queue,
  S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
  S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
  S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W) |
- S_008F0C_ELEMENT_SIZE(1) |
  S_008F0C_INDEX_STRIDE(3) |
  S_008F0C_ADD_TID_ENABLE(1);
 
@@ -2174,7 +2173,8 @@ fill_geom_tess_rings(struct radv_queue *queue,
   S_008F0C_RESOURCE_LEVEL(1);
} else {
desc[3] |= 
S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_FLOAT) |
-  
S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
+  
S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32) |
+  S_008F0C_ELEMENT_SIZE(1);
}
 
/* GS entry for ES->GS ring */
@@ -2234,7 +2234,6 @@ fill_geom_tess_rings(struct radv_queue *queue,
  S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
  S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
  S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W) |
- S_008F0C_ELEMENT_SIZE(1) |
  S_008F0C_INDEX_STRIDE(1) |
  S_008F0C_ADD_TID_ENABLE(true);
 
@@ -2244,7 +2243,8 @@ fill_geom_tess_rings(struct radv_queue *queue,
   S_008F0C_RESOURCE_LEVEL(1);
} else {
desc[7] |= 
S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_FLOAT) |
-  
S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
+  
S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32) |
+  S_008F0C_ELEMENT_SIZE(1);
}
 
}
-- 
2.22.0

[Mesa-dev] [PATCH 6/7] radv/gfx10: emit the GS NGG prologue before the nested barrier

2019-07-18 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_nir_to_llvm.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 7e623414adc..6feb55e3916 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -4453,6 +4453,7 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct 
ac_llvm_compiler *ac_llvm,
if (i) {
if (shaders[i]->info.stage == MESA_SHADER_GEOMETRY &&
ctx.options->key.vs_common_out.as_ngg) {
+   gfx10_ngg_gs_emit_prologue(&ctx);
nested_barrier = false;
} else {
nested_barrier = true;
@@ -4495,12 +4496,6 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct 
ac_llvm_compiler *ac_llvm,
 
LLVMBasicBlockRef merge_block;
if (shader_count >= 2 || is_ngg) {
-
-   if (shaders[i]->info.stage == MESA_SHADER_GEOMETRY &&
-   ctx.options->key.vs_common_out.as_ngg) {
-   gfx10_ngg_gs_emit_prologue(&ctx);
-   }
-
LLVMValueRef fn = 
LLVMGetBasicBlockParent(LLVMGetInsertBlock(ctx.ac.builder));
LLVMBasicBlockRef then_block = 
LLVMAppendBasicBlockInContext(ctx.ac.context, fn, "");
merge_block = 
LLVMAppendBasicBlockInContext(ctx.ac.context, fn, "");
-- 
2.22.0

[Mesa-dev] [PATCH 5/7] radv/gfx10: do not allocate space for the ZPASS_DONE bug

2019-07-18 Thread Samuel Pitoiset

GFX10 isn't affected.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index b6ac14f63a9..84d627340e9 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -364,12 +364,14 @@ radv_reset_cmd_buffer(struct radv_cmd_buffer *cmd_buffer)
radv_buffer_get_va(cmd_buffer->upload.upload_bo);
cmd_buffer->gfx9_fence_va += fence_offset;
 
-   /* Allocate a buffer for the EOP bug on GFX9. */
-   radv_cmd_buffer_upload_alloc(cmd_buffer, 16 * num_db, 8,
-&eop_bug_offset, &fence_ptr);
-   cmd_buffer->gfx9_eop_bug_va =
-   radv_buffer_get_va(cmd_buffer->upload.upload_bo);
-   cmd_buffer->gfx9_eop_bug_va += eop_bug_offset;
+   if (cmd_buffer->device->physical_device->rad_info.chip_class == 
GFX9) {
+   /* Allocate a buffer for the EOP bug on GFX9. */
+   radv_cmd_buffer_upload_alloc(cmd_buffer, 16 * num_db, 8,
+&eop_bug_offset, &fence_ptr);
+   cmd_buffer->gfx9_eop_bug_va =
+   
radv_buffer_get_va(cmd_buffer->upload.upload_bo);
+   cmd_buffer->gfx9_eop_bug_va += eop_bug_offset;
+   }
}
 
cmd_buffer->status = RADV_CMD_BUFFER_STATUS_INITIAL;
-- 
2.22.0

[Mesa-dev] [PATCH 2/7] radv: change a bunch of >= GFX9 to == GFX9

2019-07-18 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c |  6 +++---
 src/amd/vulkan/radv_device.c |  2 +-
 src/amd/vulkan/radv_image.c  | 10 +-
 src/amd/vulkan/si_cmd_buffer.c   | 12 ++--
 4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index b4301c0da15..b6ac14f63a9 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1294,7 +1294,7 @@ radv_emit_fb_color_state(struct radv_cmd_buffer 
*cmd_buffer,
   cb->cb_color_attrib2);
radeon_set_context_reg(cmd_buffer->cs, 
R_028EE0_CB_COLOR0_ATTRIB3 + index * 4,
   cb->cb_color_attrib3);
-   } else if (cmd_buffer->device->physical_device->rad_info.chip_class >= 
GFX9) {
+   } else if (cmd_buffer->device->physical_device->rad_info.chip_class == 
GFX9) {
radeon_set_context_reg_seq(cmd_buffer->cs, 
R_028C60_CB_COLOR0_BASE + index * 0x3c, 11);
radeon_emit(cmd_buffer->cs, cb->cb_color_base);
radeon_emit(cmd_buffer->cs, 
S_028C64_BASE_256B(cb->cb_color_base >> 32));
@@ -1432,7 +1432,7 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
radeon_emit(cmd_buffer->cs, ds->db_z_read_base >> 32);
radeon_emit(cmd_buffer->cs, ds->db_stencil_read_base >> 32);
radeon_emit(cmd_buffer->cs, ds->db_htile_data_base >> 32);
-   } else if (cmd_buffer->device->physical_device->rad_info.chip_class >= 
GFX9) {
+   } else if (cmd_buffer->device->physical_device->rad_info.chip_class == 
GFX9) {
radeon_set_context_reg_seq(cmd_buffer->cs, 
R_028014_DB_HTILE_DATA_BASE, 3);
radeon_emit(cmd_buffer->cs, ds->db_htile_data_base);
radeon_emit(cmd_buffer->cs, 
S_028018_BASE_HI(ds->db_htile_data_base >> 32));
@@ -2508,7 +2508,7 @@ si_emit_ia_multi_vgt_param(struct radv_cmd_buffer 
*cmd_buffer,
  draw_vertex_count);
 
if (state->last_ia_multi_vgt_param != ia_multi_vgt_param) {
-   if (info->chip_class >= GFX9) {
+   if (info->chip_class == GFX9) {

radeon_set_uconfig_reg_idx(cmd_buffer->device->physical_device,
   cs,
   R_030960_IA_MULTI_VGT_PARAM,
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 8dd24cb8192..15bda6822e8 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -2502,7 +2502,7 @@ radv_emit_global_shader_pointers(struct radv_queue *queue,
radv_emit_shader_pointer(queue->device, cs, regs[i],
 va, true);
}
-   } else if (queue->device->physical_device->rad_info.chip_class >= GFX9) 
{
+   } else if (queue->device->physical_device->rad_info.chip_class == GFX9) 
{
uint32_t regs[] = {R_00B030_SPI_SHADER_USER_DATA_PS_0,
   R_00B130_SPI_SHADER_USER_DATA_VS_0,
   R_00B208_SPI_SHADER_USER_DATA_ADDR_LO_GS,
diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 4d3ed71c23c..0941cbb 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -522,7 +522,7 @@ si_set_mutable_tex_desc_fields(struct radv_device *device,
}
 
state[7] = meta_va >> 16;
-   } else if (chip_class >= GFX9) {
+   } else if (chip_class == GFX9) {
state[3] &= C_008F1C_SW_MODE;
state[4] &= C_008F20_PITCH;
 
@@ -787,7 +787,7 @@ si_make_texture_descriptor(struct radv_device *device,
}
 
/* S8 with either Z16 or Z32 HTILE need a special format. */
-   if (device->physical_device->rad_info.chip_class >= GFX9 &&
+   if (device->physical_device->rad_info.chip_class == GFX9 &&
vk_format == VK_FORMAT_S8_UINT &&
radv_image_is_tc_compat_htile(image)) {
if (image->vk_format == VK_FORMAT_D32_SFLOAT_S8_UINT)
@@ -828,7 +828,7 @@ si_make_texture_descriptor(struct radv_device *device,
state[6] = 0;
state[7] = 0;
 
-   if (device->physical_device->rad_info.chip_class >= GFX9) {
+   if (device->physical_device->rad_info.chip_class == GFX9) {
unsigned bc_swizzle = gfx9_border_color_swizzle(swizzle);
 
/* Depth is the last accessible layer on Gfx9.
@@ -874,7 +874,7 @@ si_make_texture_descriptor(struct

[Mesa-dev] [PATCH 3/7] radv: clean up fill_geom_tess_rings()

2019-07-18 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c | 34 +-
 1 file changed, 9 insertions(+), 25 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 15bda6822e8..6e313aa9aa1 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -2158,7 +2158,6 @@ fill_geom_tess_rings(struct radv_queue *queue,
   index stride 64 */
desc[0] = esgs_va;
desc[1] = S_008F04_BASE_ADDRESS_HI(esgs_va >> 32) |
- S_008F04_STRIDE(0) |
  S_008F04_SWIZZLE_ENABLE(true);
desc[2] = esgs_ring_size;
desc[3] = S_008F0C_DST_SEL_X(V_008F0C_SQ_SEL_X) |
@@ -2167,7 +2166,7 @@ fill_geom_tess_rings(struct radv_queue *queue,
  S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W) |
  S_008F0C_ELEMENT_SIZE(1) |
  S_008F0C_INDEX_STRIDE(3) |
- S_008F0C_ADD_TID_ENABLE(true);
+ S_008F0C_ADD_TID_ENABLE(1);
 
if (queue->device->physical_device->rad_info.chip_class >= 
GFX10) {
desc[3] |= 
S_008F0C_FORMAT(V_008F0C_IMG_FORMAT_32_FLOAT) |
@@ -2182,17 +2181,12 @@ fill_geom_tess_rings(struct radv_queue *queue,
/* stride 0, num records - size, elsize0,
   index stride 0 */
desc[4] = esgs_va;
-   desc[5] = S_008F04_BASE_ADDRESS_HI(esgs_va >> 32)|
- S_008F04_STRIDE(0) |
- S_008F04_SWIZZLE_ENABLE(false);
+   desc[5] = S_008F04_BASE_ADDRESS_HI(esgs_va >> 32);
desc[6] = esgs_ring_size;
desc[7] = S_008F0C_DST_SEL_X(V_008F0C_SQ_SEL_X) |
  S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
  S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
- S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W) |
- S_008F0C_ELEMENT_SIZE(0) |
- S_008F0C_INDEX_STRIDE(0) |
- S_008F0C_ADD_TID_ENABLE(false);
+ S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W);
 
if (queue->device->physical_device->rad_info.chip_class >= 
GFX10) {
desc[7] |= 
S_008F0C_FORMAT(V_008F0C_IMG_FORMAT_32_FLOAT) |
@@ -2213,17 +2207,12 @@ fill_geom_tess_rings(struct radv_queue *queue,
/* stride 0, num records - size, elsize0,
   index stride 0 */
desc[0] = gsvs_va;
-   desc[1] = S_008F04_BASE_ADDRESS_HI(gsvs_va >> 32)|
- S_008F04_STRIDE(0) |
- S_008F04_SWIZZLE_ENABLE(false);
+   desc[1] = S_008F04_BASE_ADDRESS_HI(gsvs_va >> 32);
desc[2] = gsvs_ring_size;
desc[3] = S_008F0C_DST_SEL_X(V_008F0C_SQ_SEL_X) |
  S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
  S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
- S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W) |
- S_008F0C_ELEMENT_SIZE(0) |
- S_008F0C_INDEX_STRIDE(0) |
- S_008F0C_ADD_TID_ENABLE(false);
+ S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W);
 
if (queue->device->physical_device->rad_info.chip_class >= 
GFX10) {
desc[3] |= 
S_008F0C_FORMAT(V_008F0C_IMG_FORMAT_32_FLOAT) |
@@ -2238,9 +2227,8 @@ fill_geom_tess_rings(struct radv_queue *queue,
   elsize 4, index stride 16 */
/* shader will patch stride and desc[2] */
desc[4] = gsvs_va;
-   desc[5] = S_008F04_BASE_ADDRESS_HI(gsvs_va >> 32)|
- S_008F04_STRIDE(0) |
- S_008F04_SWIZZLE_ENABLE(true);
+   desc[5] = S_008F04_BASE_ADDRESS_HI(gsvs_va >> 32) |
+ S_008F04_SWIZZLE_ENABLE(1);
desc[6] = 0;
desc[7] = S_008F0C_DST_SEL_X(V_008F0C_SQ_SEL_X) |
  S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
@@ -2268,9 +2256,7 @@ fill_geom_tess_rings(struct radv_queue *queue,
uint64_t tess_offchip_va = tess_va + tess_offchip_ring_offset;
 
desc[0] = tess_va;
-   desc[1] = S_008F04_BASE_ADDRESS_HI(tess_va >> 32) |
- S_008F04_STRIDE(0) |
- S_008F04_SWIZZLE_ENABLE(false);
+   desc[1] = S_008F04_BASE_ADDRESS_HI(tess_va >> 32);
desc[2] = tess_factor_ring_size;
desc[3] = S_008F0C_DST_SEL_X(V_008F0C_SQ_SEL_X) |
  S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
@@ -2287,9 +2273,7 @@ fill_geom_tess_rings(struct radv_queue *

[Mesa-dev] [PATCH 7/7] radv/gfx10: update descriptors for inline uniform blocks

2019-07-18 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_nir_to_llvm.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 6feb55e3916..19dcae3a476 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -1373,9 +1373,16 @@ radv_load_resource(struct ac_shader_abi *abi, 
LLVMValueRef index,
uint32_t desc_type = S_008F0C_DST_SEL_X(V_008F0C_SQ_SEL_X) |
S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
-   S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W) |
-   S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_FLOAT) |
-   S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
+   S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W);
+
+   if (ctx->ac.chip_class >= GFX10) {
+   desc_type |= 
S_008F0C_FORMAT(V_008F0C_IMG_FORMAT_32_FLOAT) |
+S_008F0C_OOB_SELECT(3) |
+S_008F0C_RESOURCE_LEVEL(1);
+   } else {
+   desc_type |= 
S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_FLOAT) |
+
S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
+   }
 
LLVMValueRef desc_components[4] = {
LLVMBuildPtrToInt(ctx->ac.builder, desc_ptr, 
ctx->ac.intptr, ""),
-- 
2.22.0

[Mesa-dev] [PATCH 1/7] ac/nir: do not clamp shadow reference on GFX10

2019-07-18 Thread Samuel Pitoiset

RadeonSI only uses Z32_FLOAT_CLAMP for upgraded depth textures
on GFX10 and RADV doesn't promote Z16 or Z24.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_nir_to_llvm.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 96bf89a8bf9..75ee534eb3e 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -3805,12 +3805,16 @@ static void visit_tex(struct ac_nir_context *ctx, 
nir_tex_instr *instr)
 
/* TC-compatible HTILE on radeonsi promotes Z16 and Z24 to Z32_FLOAT,
 * so the depth comparison value isn't clamped for Z16 and
-* Z24 anymore. Do it manually here.
+* Z24 anymore. Do it manually here for GFX8-9; GFX10 has an explicitly
+* clamped 32-bit float format.
 *
 * It's unnecessary if the original texture format was
 * Z32_FLOAT, but we don't know that here.
 */
-   if (args.compare && ctx->ac.chip_class >= GFX8 && 
ctx->abi->clamp_shadow_reference)
+   if (args.compare &&
+   ctx->ac.chip_class >= GFX8 &&
+   ctx->ac.chip_class <= GFX9 &&
+   ctx->abi->clamp_shadow_reference)
args.compare = ac_build_clamp(&ctx->ac, ac_to_float(&ctx->ac, args.compare));
 
/* pack derivatives */
-- 
2.22.0
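
The new condition can be sketched on plain floats as follows. The enum, its ordering, and the function names are hypothetical stand-ins for `chip_class` and the ABI flag, and the sketch assumes the clamp is to the range [0, 1].

```c
#include <stdbool.h>

/* Hypothetical chip generations, ordered oldest to newest so that
 * range comparisons behave like the chip_class checks in the patch. */
enum sketch_clamp_chip {
    SKETCH_CC_GFX6,
    SKETCH_CC_GFX7,
    SKETCH_CC_GFX8,
    SKETCH_CC_GFX9,
    SKETCH_CC_GFX10,
};

static float
sketch_clamp01(float v)
{
    if (v < 0.0f)
        return 0.0f;
    if (v > 1.0f)
        return 1.0f;
    return v;
}

/* Clamp the shadow comparison value manually only on GFX8-GFX9. */
static float
sketch_shadow_reference(enum sketch_clamp_chip chip, bool clamp_requested,
                        float ref)
{
    if (chip >= SKETCH_CC_GFX8 && chip <= SKETCH_CC_GFX9 && clamp_requested)
        return sketch_clamp01(ref);
    return ref;
}
```

With the upper bound in place, a GFX10 chip passes the reference value through untouched even when the ABI flag is set.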

[Mesa-dev] [PATCH v2 1/3] radv/gfx10: move emitting VGT_PRIMITIVEID_EN into the NGG path

2019-07-18 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 9338fcd550a..bcb7ccc803d 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -3280,12 +3280,6 @@ radv_pipeline_generate_vgt_gs_mode(struct radeon_cmdbuf 
*ctx_cs,
 
vgt_gs_mode = ac_vgt_gs_mode(gs->info.gs.vertices_out,
 
pipeline->device->physical_device->rad_info.chip_class);
-   } else if (radv_pipeline_has_ngg(pipeline)) {
-   bool enable_prim_id =
-   outinfo->export_prim_id || vs->info.info.uses_prim_id;
-
-   vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(enable_prim_id) |
- 
S_028A84_NGG_DISABLE_PROVOK_REUSE(enable_prim_id);
} else if (outinfo->export_prim_id || vs->info.info.uses_prim_id) {
vgt_gs_mode = S_028A40_MODE(V_028A40_GS_SCENARIO_A);
vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(1);
@@ -3425,6 +3419,8 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
*ctx_cs,
uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
gl_shader_stage es_type =
radv_pipeline_has_tess(pipeline) ? MESA_SHADER_TESS_EVAL : 
MESA_SHADER_VERTEX;
+   struct radv_shader_variant *es =
+   es_type == MESA_SHADER_TESS_EVAL ? 
pipeline->shaders[MESA_SHADER_TESS_EVAL] : 
pipeline->shaders[MESA_SHADER_VERTEX];
 
radeon_set_sh_reg_seq(cs, R_00B320_SPI_SHADER_PGM_LO_ES, 2);
radeon_emit(cs, va >> 8);
@@ -3441,6 +3437,8 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
*ctx_cs,
bool misc_vec_ena = outinfo->writes_pointsize ||
outinfo->writes_layer ||
outinfo->writes_viewport_index;
+   bool es_enable_prim_id = outinfo->export_prim_id ||
+(es && es->info.info.uses_prim_id);
bool break_wave_at_eoi = false;
unsigned nparams;
 
@@ -3479,6 +3477,10 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
*ctx_cs,
   cull_dist_mask << 8 |
   clip_dist_mask);
 
+   radeon_set_context_reg(ctx_cs, R_028A84_VGT_PRIMITIVEID_EN,
+  S_028A84_PRIMITIVEID_EN(es_enable_prim_id) |
+  
S_028A84_NGG_DISABLE_PROVOK_REUSE(es_enable_prim_id));
+
bool vgt_reuse_off = pipeline->device->physical_device->rad_info.family 
== CHIP_NAVI10 &&
 
pipeline->device->physical_device->rad_info.chip_external_rev == 0x1 &&
 es_type == MESA_SHADER_TESS_EVAL;
-- 
2.22.0

[Mesa-dev] [PATCH v2 3/3] radv/gfx10: set BREAK_WAVE_AT_EOI if TES or GS enable the primitive ID

2019-07-18 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index b11d79f4811..a7ff0e2d139 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -3445,6 +3445,14 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
*ctx_cs,
bool break_wave_at_eoi = false;
unsigned nparams;
 
+   if (es_type == MESA_SHADER_TESS_EVAL) {
+   struct radv_shader_variant *gs =
+   pipeline->shaders[MESA_SHADER_GEOMETRY];
+
+   if (es_enable_prim_id || (gs && gs->info.info.uses_prim_id))
+   break_wave_at_eoi = true;
+   }
+
nparams = MAX2(outinfo->param_exports, 1);
radeon_set_context_reg(ctx_cs, R_0286C4_SPI_VS_OUT_CONFIG,
   S_0286C4_VS_EXPORT_COUNT(nparams - 1) |
-- 
2.22.0
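
Reduced to booleans, the condition added by this patch looks roughly like the sketch below. The function and parameter names are hypothetical; `has_gs` stands in for the NULL check on the geometry shader variant.

```c
#include <stdbool.h>

/* BREAK_WAVE_AT_EOI is only considered when the ES stage is a TES.
 * In that case it is set if either the ES or an attached GS enables
 * the primitive ID. */
static bool
sketch_break_wave_at_eoi(bool es_is_tess_eval, bool es_enable_prim_id,
                         bool has_gs, bool gs_uses_prim_id)
{
    if (!es_is_tess_eval)
        return false;

    return es_enable_prim_id || (has_gs && gs_uses_prim_id);
}
```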

[Mesa-dev] [PATCH v2 2/3] radv/gfx10: do not emit VGT_GS_MODE

2019-07-18 Thread Samuel Pitoiset

Unnecessary.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index bcb7ccc803d..b11d79f4811 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -3274,6 +3274,9 @@ radv_pipeline_generate_vgt_gs_mode(struct radeon_cmdbuf 
*ctx_cs,
unsigned vgt_primitiveid_en = 0;
uint32_t vgt_gs_mode = 0;
 
+   if (radv_pipeline_has_ngg(pipeline))
+   return;
+
if (radv_pipeline_has_gs(pipeline)) {
const struct radv_shader_variant *gs =
pipeline->shaders[MESA_SHADER_GEOMETRY];
-- 
2.22.0

Re: [Mesa-dev] [PATCH 4/4] radv/gfx10: do not always execute a barrier before the second shader

2019-07-18 Thread Samuel Pitoiset



On 7/18/19 2:29 AM, Bas Nieuwenhuizen wrote:

On Wed, Jul 17, 2019 at 3:44 PM Samuel Pitoiset
 wrote:

With NGG, empty waves may still be required to export data.

This fixes dEQP-VK.ycbcr.format.*_unorm.geometry_*.

Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/radv_nir_to_llvm.c | 31 ++-
  1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 3e18303879e..7e623414adc 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -4448,8 +4448,37 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct 
ac_llvm_compiler *ac_llvm,
 declare_esgs_ring();
 }

-   if (i)
+   bool nested_barrier = false;
+
+   if (i) {
+   if (shaders[i]->info.stage == MESA_SHADER_GEOMETRY &&
+   ctx.options->key.vs_common_out.as_ngg) {
+   nested_barrier = false;
+   } else {
+   nested_barrier = true;
+   }
+   }

We can simplify this to

nested_barrier = i && (shaders[i]->info.stage != MESA_SHADER_GEOMETRY
|| !ctx.options->key.vs_common_out.as_ngg);

Otherwise r-b, I'm just surprised an s_barrier is okay.

I'm going to move the NGG GS prologue into that inner if, so I would
prefer to keep it this way.

+
+   if (nested_barrier) {
+   /* Execute a barrier before the second shader in
+* a merged shader.
+*
+* Execute the barrier inside the conditional block,
+* so that empty waves can jump directly to s_endpgm,
+* which will also signal the barrier.
+*
+* This is possible in gfx9, because an empty wave
+* for the second shader does not participate in
+* the epilogue. With NGG, empty waves may still
+* be required to export data (e.g. GS output vertices),
+* so we cannot let them exit early.
+*
+* If the shader is TCS and the TCS epilog is present
+* and contains a barrier, it will wait there and then
+* reach s_endpgm.
+   */
 ac_emit_barrier(&ctx.ac, ctx.stage);
+   }

 nir_foreach_variable(variable, &shaders[i]->outputs)
 scan_shader_output_decl(&ctx, variable, shaders[i], shaders[i]->info.stage);
--
2.22.0

Re: [Mesa-dev] [PATCH 1/4] radv: move emitting VGT_GS_MODE into the HW VS path

2019-07-18 Thread Samuel Pitoiset



On 7/18/19 2:10 AM, Bas Nieuwenhuizen wrote:

On Thu, Jul 18, 2019 at 2:05 AM Bas Nieuwenhuizen
 wrote:

On Wed, Jul 17, 2019 at 3:44 PM Samuel Pitoiset
 wrote:

It's useless for NGG anyways.

Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/radv_pipeline.c | 43 ++
  1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index fdeb31c453e..686fd371f0f 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -3272,27 +3272,18 @@ radv_pipeline_generate_vgt_gs_mode(struct radeon_cmdbuf 
*ctx_cs,

Can you rename the function?

Actually now that I see your later patches, how about we keep this
function, return immediately if ngg, and then move the primitive id
stuff for ngg to ngg?

Yes, looks better.






 pipeline->shaders[MESA_SHADER_TESS_EVAL] :
 pipeline->shaders[MESA_SHADER_VERTEX];
 unsigned vgt_primitiveid_en = 0;
-   uint32_t vgt_gs_mode = 0;

-   if (radv_pipeline_has_gs(pipeline)) {
-   const struct radv_shader_variant *gs =
-   pipeline->shaders[MESA_SHADER_GEOMETRY];
-
-   vgt_gs_mode = ac_vgt_gs_mode(gs->info.gs.vertices_out,
-
pipeline->device->physical_device->rad_info.chip_class);
-   } else if (radv_pipeline_has_ngg(pipeline)) {
+   if (radv_pipeline_has_ngg(pipeline)) {
 bool enable_prim_id =
 outinfo->export_prim_id || vs->info.info.uses_prim_id;

 vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(enable_prim_id) |
   
S_028A84_NGG_DISABLE_PROVOK_REUSE(enable_prim_id);
 } else if (outinfo->export_prim_id || vs->info.info.uses_prim_id) {
-   vgt_gs_mode = S_028A40_MODE(V_028A40_GS_SCENARIO_A);
 vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(1);
 }

 radeon_set_context_reg(ctx_cs, R_028A84_VGT_PRIMITIVEID_EN, 
vgt_primitiveid_en);
-   radeon_set_context_reg(ctx_cs, R_028A40_VGT_GS_MODE, vgt_gs_mode);
  }

  static void
@@ -3370,6 +3361,38 @@ radv_pipeline_generate_hw_vs(struct radeon_cmdbuf 
*ctx_cs,
cull_dist_mask << 8 |
clip_dist_mask);

+   /* We always write VGT_GS_MODE in the VS state, because every switch
+* between different shader pipelines involving a different GS or no GS
+* at all involves a switch of the VS (different GS use different copy
+* shaders). On the other hand, when the API switches from a GS to no
+* GS and then back to the same GS used originally, the GS state is not
+* sent again.
+*/
+   unsigned vgt_gs_mode;
+   if (!radv_pipeline_has_gs(pipeline)) {
+   const struct radv_vs_output_info *outinfo =
+   get_vs_output_info(pipeline);
+   const struct radv_shader_variant *vs =
+   pipeline->shaders[MESA_SHADER_TESS_EVAL] ?
+   pipeline->shaders[MESA_SHADER_TESS_EVAL] :
+   pipeline->shaders[MESA_SHADER_VERTEX];
+   unsigned mode = V_028A40_GS_OFF;
+
+   /* PrimID needs GS scenario A. */
+   if (outinfo->export_prim_id || vs->info.info.uses_prim_id)
+   mode = V_028A40_GS_SCENARIO_A;
+
+   vgt_gs_mode = S_028A40_MODE(mode);
+   } else {
+   const struct radv_shader_variant *gs =
+   pipeline->shaders[MESA_SHADER_GEOMETRY];
+
+   vgt_gs_mode = ac_vgt_gs_mode(gs->info.gs.vertices_out,
+
pipeline->device->physical_device->rad_info.chip_class);
+   }
+
+   radeon_set_context_reg(ctx_cs, R_028A40_VGT_GS_MODE, vgt_gs_mode);
+

Can you keep this in a separate function (possibly with the name
radv_pipeline_generate_vgt_gs_mode)?

 if (pipeline->device->physical_device->rad_info.chip_class <= GFX8)
 radeon_set_context_reg(ctx_cs, R_028AB4_VGT_REUSE_OFF,
outinfo->writes_viewport_index);
--
2.22.0

Re: [Mesa-dev] [PATCH] radv: fix crash in shader tracing.

2019-07-18 Thread Samuel Pitoiset


Reviewed-by: Samuel Pitoiset 

On 7/18/19 2:51 AM, Dave Airlie wrote:

From: Dave Airlie 

Enabling tracing, and then having a vmfault, can lead to a segfault
before we print out the traces, as if a meta shader is executing
and we don't have the NIR for it.

Just pass the stage and give back a default.

Fixes: 9b9ccee4d64 ("radv: take LDS into account for compute shader occupancy stats")
---
  src/amd/vulkan/radv_nir_to_llvm.c | 8 ++--
  src/amd/vulkan/radv_private.h | 1 +
  src/amd/vulkan/radv_shader.c  | 2 +-
  3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 3e18303879e..c08789a4361 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -4244,9 +4244,10 @@ ac_setup_rings(struct radv_shader_context *ctx)
  
  unsigned

  radv_nir_get_max_workgroup_size(enum chip_class chip_class,
+   gl_shader_stage stage,
const struct nir_shader *nir)
  {
-   switch (nir->info.stage) {
+   switch (stage) {
case MESA_SHADER_TESS_CTRL:
return chip_class >= GFX7 ? 128 : 64;
case MESA_SHADER_GEOMETRY:
@@ -4257,6 +4258,8 @@ radv_nir_get_max_workgroup_size(enum chip_class 
chip_class,
return 0;
}
  
+	if (!nir)
+   return chip_class >= GFX9 ? 128 : 64;
unsigned max_workgroup_size = nir->info.cs.local_size[0] *
nir->info.cs.local_size[1] *
nir->info.cs.local_size[2];
@@ -4340,7 +4343,8 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct 
ac_llvm_compiler *ac_llvm,
for (int i = 0; i < shader_count; ++i) {
ctx.max_workgroup_size = MAX2(ctx.max_workgroup_size,
  
radv_nir_get_max_workgroup_size(ctx.options->chip_class,
-   
shaders[i]));
+ 
shaders[i]->info.stage,
+ 
shaders[i]));
}
  
  	if (ctx.ac.chip_class >= GFX10) {

diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 931d4039397..f1f30887e01 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -2138,6 +2138,7 @@ void radv_compile_nir_shader(struct ac_llvm_compiler 
*ac_llvm,
 const struct radv_nir_compiler_options *options);
  
  unsigned radv_nir_get_max_workgroup_size(enum chip_class chip_class,

+gl_shader_stage stage,
 const struct nir_shader *nir);
  
  /* radv_shader_info.h */

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index bcc050a86cc..8f24a6d72b0 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -1232,7 +1232,7 @@ generate_shader_stats(struct radv_device *device,
 lds_increment);
} else if (stage == MESA_SHADER_COMPUTE) {
unsigned max_workgroup_size =
-   radv_nir_get_max_workgroup_size(chip_class, 
variant->nir);
+   radv_nir_get_max_workgroup_size(chip_class, stage, 
variant->nir);
lds_per_wave = (conf->lds_size * lds_increment) /
   DIV_ROUND_UP(max_workgroup_size, 64);
}
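
The fallback described in the commit message can be sketched in isolation. This is a minimal approximation under stated assumptions, not the RADV code: the enums, `struct fake_nir`, and `max_workgroup_size` are hypothetical stand-ins, the geometry case of the original switch is left out because its body is not visible in the hunk, and the real function may post-process the product further.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the Mesa enums; only the ordering matters. */
enum chip_class { GFX6 = 6, GFX7, GFX8, GFX9, GFX10 };
enum shader_stage { STAGE_VERTEX, STAGE_TESS_CTRL, STAGE_COMPUTE };

/* Stand-in for the part of nir_shader that is read here. */
struct fake_nir {
    unsigned local_size[3];
};

static unsigned
max_workgroup_size(enum chip_class chip, enum shader_stage stage,
                   const struct fake_nir *nir)
{
    /* Switch on the stage passed by the caller, never on nir->info.stage,
     * so that a NULL shader cannot be dereferenced here. */
    switch (stage) {
    case STAGE_TESS_CTRL:
        return chip >= GFX7 ? 128 : 64;
    case STAGE_COMPUTE:
        break;
    default:
        return 0;
    }

    /* No NIR available (e.g. a meta shader): give back a default. */
    if (!nir)
        return chip >= GFX9 ? 128 : 64;

    return nir->local_size[0] * nir->local_size[1] * nir->local_size[2];
}
```

With a NULL shader, which is the case that crashed during tracing, the sketch returns the per-generation default instead of reading the local size.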

Re: [Mesa-dev] [PATCH] radv: reset the window scissor with no clear state.

2019-07-18 Thread Samuel Pitoiset


Reviewed-by: Samuel Pitoiset 

On 7/18/19 3:20 AM, Dave Airlie wrote:

From: Dave Airlie 

If we don't have clear state (which gfx10 doesn't currently),
we will fail to reset the scissor. AMDVLK will leave it set
to something else.

Marek also has this fix for radeonsi pending.
---
  src/amd/vulkan/si_cmd_buffer.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 6fe447ef2e9..0efa169d674 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -202,7 +202,7 @@ si_emit_graphics(struct radv_physical_device 
*physical_device,
/* CLEAR_STATE doesn't clear these correctly on certain generations.
 * I don't know why. Deduced by trial and error.
 */
-   if (physical_device->rad_info.chip_class <= GFX7) {
+   if (physical_device->rad_info.chip_class <= GFX7 || 
!physical_device->has_clear_state) {
radeon_set_context_reg(cs, 
R_028B28_VGT_STRMOUT_DRAW_OPAQUE_OFFSET, 0);
radeon_set_context_reg(cs, R_028204_PA_SC_WINDOW_SCISSOR_TL,
   S_028204_WINDOW_OFFSET_DISABLE(1));

Re: [Mesa-dev] [PATCH] radv: put back VGT_FLUSH at ring init on gfx10

2019-07-18 Thread Samuel Pitoiset


Reviewed-by: Samuel Pitoiset 

On 7/18/19 8:14 AM, Dave Airlie wrote:

From: Dave Airlie 

I can find no evidence that removing this is a good idea.

Fixes: 9b116173b6a ("radv: do not emit VGT_FLUSH on GFX10")
---
  src/amd/vulkan/radv_device.c | 6 ++
  1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index b397a9a8aa0..8dd24cb8192 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -2753,10 +2753,8 @@ radv_get_preamble_cs(struct radv_queue *queue,
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_VS_PARTIAL_FLUSH) | 
EVENT_INDEX(4));
  
-			if (queue->device->physical_device->rad_info.chip_class < GFX10) {

-   radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
-   radeon_emit(cs, EVENT_TYPE(V_028A90_VGT_FLUSH) 
| EVENT_INDEX(0));
-   }
+   radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
+   radeon_emit(cs, EVENT_TYPE(V_028A90_VGT_FLUSH) | 
EVENT_INDEX(0));
}
  
  		radv_emit_gs_ring_sizes(queue, cs, esgs_ring_bo, esgs_ring_size,

[Mesa-dev] [PATCH 4/4] radv/gfx10: do not always execute a barrier before the second shader

2019-07-17 Thread Samuel Pitoiset

With NGG, empty waves may still be required to export data.

This fixes dEQP-VK.ycbcr.format.*_unorm.geometry_*.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_nir_to_llvm.c | 31 ++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 3e18303879e..7e623414adc 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -4448,8 +4448,37 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct 
ac_llvm_compiler *ac_llvm,
declare_esgs_ring();
}
 
-   if (i)
+   bool nested_barrier = false;
+
+   if (i) {
+   if (shaders[i]->info.stage == MESA_SHADER_GEOMETRY &&
+   ctx.options->key.vs_common_out.as_ngg) {
+   nested_barrier = false;
+   } else {
+   nested_barrier = true;
+   }
+   }
+
+   if (nested_barrier) {
+   /* Execute a barrier before the second shader in
+* a merged shader.
+*
+* Execute the barrier inside the conditional block,
+* so that empty waves can jump directly to s_endpgm,
+* which will also signal the barrier.
+*
+* This is possible in gfx9, because an empty wave
+* for the second shader does not participate in
+* the epilogue. With NGG, empty waves may still
+* be required to export data (e.g. GS output vertices),
+* so we cannot let them exit early.
+*
+* If the shader is TCS and the TCS epilog is present
+* and contains a barrier, it will wait there and then
+* reach s_endpgm.
+   */
ac_emit_barrier(&ctx.ac, ctx.stage);
+   }
 
nir_foreach_variable(variable, &shaders[i]->outputs)
scan_shader_output_decl(&ctx, variable, shaders[i], 
shaders[i]->info.stage);
-- 
2.22.0
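
The barrier decision introduced by this patch reduces to a small predicate. The following is only a sketch of that logic, assuming a hypothetical `needs_barrier` helper and a simplified stage enum; in the patch the same condition is computed inline as `nested_barrier`.

```c
#include <stdbool.h>

/* Simplified, hypothetical stage enum for illustration. */
enum shader_stage { STAGE_VERTEX, STAGE_TESS_CTRL, STAGE_TESS_EVAL, STAGE_GEOMETRY };

/* Returns true when a barrier should be emitted before shader number i
 * of a merged shader. The first shader (i == 0) never needs one. With
 * NGG, a geometry shader in second position must not use the barrier,
 * because empty waves may still have to export data. */
static bool
needs_barrier(int i, enum shader_stage stage, bool as_ngg)
{
    if (i == 0)
        return false;

    return !(stage == STAGE_GEOMETRY && as_ngg);
}
```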

[Mesa-dev] [PATCH 1/4] radv: move emitting VGT_GS_MODE into the HW VS path

2019-07-17 Thread Samuel Pitoiset

It's useless for NGG anyways.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 43 ++
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index fdeb31c453e..686fd371f0f 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -3272,27 +3272,18 @@ radv_pipeline_generate_vgt_gs_mode(struct radeon_cmdbuf 
*ctx_cs,
pipeline->shaders[MESA_SHADER_TESS_EVAL] :
pipeline->shaders[MESA_SHADER_VERTEX];
unsigned vgt_primitiveid_en = 0;
-   uint32_t vgt_gs_mode = 0;
 
-   if (radv_pipeline_has_gs(pipeline)) {
-   const struct radv_shader_variant *gs =
-   pipeline->shaders[MESA_SHADER_GEOMETRY];
-
-   vgt_gs_mode = ac_vgt_gs_mode(gs->info.gs.vertices_out,
-
pipeline->device->physical_device->rad_info.chip_class);
-   } else if (radv_pipeline_has_ngg(pipeline)) {
+   if (radv_pipeline_has_ngg(pipeline)) {
bool enable_prim_id =
outinfo->export_prim_id || vs->info.info.uses_prim_id;
 
vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(enable_prim_id) |
  
S_028A84_NGG_DISABLE_PROVOK_REUSE(enable_prim_id);
} else if (outinfo->export_prim_id || vs->info.info.uses_prim_id) {
-   vgt_gs_mode = S_028A40_MODE(V_028A40_GS_SCENARIO_A);
vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(1);
}
 
radeon_set_context_reg(ctx_cs, R_028A84_VGT_PRIMITIVEID_EN, 
vgt_primitiveid_en);
-   radeon_set_context_reg(ctx_cs, R_028A40_VGT_GS_MODE, vgt_gs_mode);
 }
 
 static void
@@ -3370,6 +3361,38 @@ radv_pipeline_generate_hw_vs(struct radeon_cmdbuf 
*ctx_cs,
   cull_dist_mask << 8 |
   clip_dist_mask);
 
+   /* We always write VGT_GS_MODE in the VS state, because every switch
+* between different shader pipelines involving a different GS or no GS
+* at all involves a switch of the VS (different GS use different copy
+* shaders). On the other hand, when the API switches from a GS to no
+* GS and then back to the same GS used originally, the GS state is not
+* sent again.
+*/
+   unsigned vgt_gs_mode;
+   if (!radv_pipeline_has_gs(pipeline)) {
+   const struct radv_vs_output_info *outinfo =
+   get_vs_output_info(pipeline);
+   const struct radv_shader_variant *vs =
+   pipeline->shaders[MESA_SHADER_TESS_EVAL] ?
+   pipeline->shaders[MESA_SHADER_TESS_EVAL] :
+   pipeline->shaders[MESA_SHADER_VERTEX];
+   unsigned mode = V_028A40_GS_OFF;
+
+   /* PrimID needs GS scenario A. */
+   if (outinfo->export_prim_id || vs->info.info.uses_prim_id)
+   mode = V_028A40_GS_SCENARIO_A;
+
+   vgt_gs_mode = S_028A40_MODE(mode);
+   } else {
+   const struct radv_shader_variant *gs =
+   pipeline->shaders[MESA_SHADER_GEOMETRY];
+
+   vgt_gs_mode = ac_vgt_gs_mode(gs->info.gs.vertices_out,
+
pipeline->device->physical_device->rad_info.chip_class);
+   }
+
+   radeon_set_context_reg(ctx_cs, R_028A40_VGT_GS_MODE, vgt_gs_mode);
+
if (pipeline->device->physical_device->rad_info.chip_class <= GFX8)
radeon_set_context_reg(ctx_cs, R_028AB4_VGT_REUSE_OFF,
   outinfo->writes_viewport_index);
-- 
2.22.0
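
The no-GS branch moved by this patch can be illustrated on its own. This is a rough sketch, not the driver code: the field encodings below are assumptions made for the example (the real definitions live in Mesa's register headers), and `vgt_gs_mode_without_gs` is a hypothetical helper name. The GS branch is omitted because `ac_vgt_gs_mode()` is not shown in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encodings, for illustration only. */
#define GS_OFF              0u
#define GS_SCENARIO_A       1u
#define VGT_GS_MODE_MODE(x) ((uint32_t)(x) & 0x3u)

/* VGT_GS_MODE value for a pipeline without a geometry shader:
 * GS is off unless the primitive ID is needed, in which case
 * GS scenario A has to be selected. */
static uint32_t
vgt_gs_mode_without_gs(bool export_prim_id, bool vs_uses_prim_id)
{
    unsigned mode = GS_OFF;

    /* PrimID needs GS scenario A. */
    if (export_prim_id || vs_uses_prim_id)
        mode = GS_SCENARIO_A;

    return VGT_GS_MODE_MODE(mode);
}
```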

[Mesa-dev] [PATCH 3/4] radv/gfx10: set BREAK_WAVE_AT_EOI if TES or GS enable the primitive ID

2019-07-17 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index de933937f03..8b6e62a75f5 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -3452,6 +3452,14 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
*ctx_cs,
bool break_wave_at_eoi = false;
unsigned nparams;
 
+   if (es_type == MESA_SHADER_TESS_EVAL) {
+   struct radv_shader_variant *gs =
+   pipeline->shaders[MESA_SHADER_GEOMETRY];
+
+   if (es_enable_prim_id || (gs && gs->info.info.uses_prim_id))
+   break_wave_at_eoi = true;
+   }
+
nparams = MAX2(outinfo->param_exports, 1);
radeon_set_context_reg(ctx_cs, R_0286C4_SPI_VS_OUT_CONFIG,
   S_0286C4_VS_EXPORT_COUNT(nparams - 1) |
-- 
2.22.0

[Mesa-dev] [PATCH 2/4] radv: move emitting VGT_PRIMITIVEID_EN into the HW VS and NGG paths

2019-07-17 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 42 --
 1 file changed, 15 insertions(+), 27 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 686fd371f0f..de933937f03 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -3262,30 +3262,6 @@ radv_pipeline_generate_multisample_state(struct 
radeon_cmdbuf *ctx_cs,
   S_02882C_YMAX_BOTTOM_EXCLUSION(exclusion));
 }
 
-static void
-radv_pipeline_generate_vgt_gs_mode(struct radeon_cmdbuf *ctx_cs,
-   struct radv_pipeline *pipeline)
-{
-   const struct radv_vs_output_info *outinfo = 
get_vs_output_info(pipeline);
-   const struct radv_shader_variant *vs =
-   pipeline->shaders[MESA_SHADER_TESS_EVAL] ?
-   pipeline->shaders[MESA_SHADER_TESS_EVAL] :
-   pipeline->shaders[MESA_SHADER_VERTEX];
-   unsigned vgt_primitiveid_en = 0;
-
-   if (radv_pipeline_has_ngg(pipeline)) {
-   bool enable_prim_id =
-   outinfo->export_prim_id || vs->info.info.uses_prim_id;
-
-   vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(enable_prim_id) |
- 
S_028A84_NGG_DISABLE_PROVOK_REUSE(enable_prim_id);
-   } else if (outinfo->export_prim_id || vs->info.info.uses_prim_id) {
-   vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(1);
-   }
-
-   radeon_set_context_reg(ctx_cs, R_028A84_VGT_PRIMITIVEID_EN, 
vgt_primitiveid_en);
-}
-
 static void
 gfx10_set_ge_pc_alloc(struct radeon_cmdbuf *ctx_cs,
  struct radv_pipeline *pipeline,
@@ -3368,7 +3344,7 @@ radv_pipeline_generate_hw_vs(struct radeon_cmdbuf *ctx_cs,
 * GS and then back to the same GS used originally, the GS state is not
 * sent again.
 */
-   unsigned vgt_gs_mode;
+   unsigned vgt_primitiveid_en, vgt_gs_mode;
if (!radv_pipeline_has_gs(pipeline)) {
const struct radv_vs_output_info *outinfo =
get_vs_output_info(pipeline);
@@ -3376,22 +3352,27 @@ radv_pipeline_generate_hw_vs(struct radeon_cmdbuf 
*ctx_cs,
pipeline->shaders[MESA_SHADER_TESS_EVAL] ?
pipeline->shaders[MESA_SHADER_TESS_EVAL] :
pipeline->shaders[MESA_SHADER_VERTEX];
+   bool enable_prim_id = outinfo->export_prim_id ||
+ vs->info.info.uses_prim_id;
unsigned mode = V_028A40_GS_OFF;
 
/* PrimID needs GS scenario A. */
-   if (outinfo->export_prim_id || vs->info.info.uses_prim_id)
+   if (enable_prim_id)
mode = V_028A40_GS_SCENARIO_A;
 
vgt_gs_mode = S_028A40_MODE(mode);
+   vgt_primitiveid_en = enable_prim_id;
} else {
const struct radv_shader_variant *gs =
pipeline->shaders[MESA_SHADER_GEOMETRY];
 
vgt_gs_mode = ac_vgt_gs_mode(gs->info.gs.vertices_out,
 
pipeline->device->physical_device->rad_info.chip_class);
+   vgt_primitiveid_en = 0;
}
 
radeon_set_context_reg(ctx_cs, R_028A40_VGT_GS_MODE, vgt_gs_mode);
+   radeon_set_context_reg(ctx_cs, R_028A84_VGT_PRIMITIVEID_EN, 
vgt_primitiveid_en);
 
if (pipeline->device->physical_device->rad_info.chip_class <= GFX8)
radeon_set_context_reg(ctx_cs, R_028AB4_VGT_REUSE_OFF,
@@ -3448,6 +3429,8 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
*ctx_cs,
uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
gl_shader_stage es_type =
radv_pipeline_has_tess(pipeline) ? MESA_SHADER_TESS_EVAL : 
MESA_SHADER_VERTEX;
+   struct radv_shader_variant *es =
+   es_type == MESA_SHADER_TESS_EVAL ? 
pipeline->shaders[MESA_SHADER_TESS_EVAL] : 
pipeline->shaders[MESA_SHADER_VERTEX];
 
radeon_set_sh_reg_seq(cs, R_00B320_SPI_SHADER_PGM_LO_ES, 2);
radeon_emit(cs, va >> 8);
@@ -3464,6 +3447,8 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
*ctx_cs,
bool misc_vec_ena = outinfo->writes_pointsize ||
outinfo->writes_layer ||
outinfo->writes_viewport_index;
+   bool es_enable_prim_id = outinfo->export_prim_id ||
+(es && es->info.info.uses_prim_id);
bool break_wave_at_eoi = false;
unsigned nparams;
 
@@ -3502,6 +3487,10 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
*ctx_cs,
   cull_dist_mask << 8 |
   clip_dist_mask);
 
+   radeon_set_conte

[Mesa-dev] [PATCH] radv: fix VGT_GS_MODE if VS uses the primitive ID

2019-07-17 Thread Samuel Pitoiset

Found by inspection.

Cc: 
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index a3323ae8135..f6cb3611c9d 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -3264,6 +3264,10 @@ radv_pipeline_generate_vgt_gs_mode(struct radeon_cmdbuf 
*ctx_cs,
struct radv_pipeline *pipeline)
 {
const struct radv_vs_output_info *outinfo = 
get_vs_output_info(pipeline);
+   const struct radv_shader_variant *vs =
+   pipeline->shaders[MESA_SHADER_TESS_EVAL] ?
+   pipeline->shaders[MESA_SHADER_TESS_EVAL] :
+   pipeline->shaders[MESA_SHADER_VERTEX];
unsigned vgt_primitiveid_en = 0;
uint32_t vgt_gs_mode = 0;
 
@@ -3274,16 +3278,12 @@ radv_pipeline_generate_vgt_gs_mode(struct radeon_cmdbuf 
*ctx_cs,
vgt_gs_mode = ac_vgt_gs_mode(gs->info.gs.vertices_out,
 
pipeline->device->physical_device->rad_info.chip_class);
} else if (radv_pipeline_has_ngg(pipeline)) {
-   const struct radv_shader_variant *vs =
-   pipeline->shaders[MESA_SHADER_TESS_EVAL] ?
-   pipeline->shaders[MESA_SHADER_TESS_EVAL] :
-   pipeline->shaders[MESA_SHADER_VERTEX];
bool enable_prim_id =
outinfo->export_prim_id || vs->info.info.uses_prim_id;
 
vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(enable_prim_id) |
  
S_028A84_NGG_DISABLE_PROVOK_REUSE(enable_prim_id);
-   } else if (outinfo->export_prim_id) {
+   } else if (outinfo->export_prim_id || vs->info.info.uses_prim_id) {
vgt_gs_mode = S_028A40_MODE(V_028A40_GS_SCENARIO_A);
vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(1);
}
-- 
2.22.0

Re: [Mesa-dev] [PATCH] radv: fix gathering clip/cull distance masks for GS

2019-07-17 Thread Samuel Pitoiset



On 7/17/19 10:25 AM, Juan A. Suarez Romero wrote:

On Tue, 2019-07-16 at 08:37 +0200, Samuel Pitoiset wrote:

For NGG, the driver relies on the VS outinfo struct.

This fixes
dEQP-VK.clipping.user_defined.clip_*_vert_tess_geom_*


Should this be included in 19.1 stable branch?

No, it's GFX10 specific.




Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/radv_nir_to_llvm.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 76d784b3374..b890ce56f16 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -2407,6 +2407,11 @@ scan_shader_output_decl(struct radv_shader_context *ctx,
ctx->shader_info->tes.outinfo.cull_dist_mask = (1 
<< shader->info.cull_distance_array_size) - 1;
ctx->shader_info->tes.outinfo.cull_dist_mask <<= 
shader->info.clip_distance_array_size;
}
+   if (stage == MESA_SHADER_GEOMETRY) {
+   ctx->shader_info->vs.outinfo.clip_dist_mask = (1 
<< shader->info.clip_distance_array_size) - 1;
+   ctx->shader_info->vs.outinfo.cull_dist_mask = (1 
<< shader->info.cull_distance_array_size) - 1;
+   ctx->shader_info->vs.outinfo.cull_dist_mask <<= 
shader->info.clip_distance_array_size;
+   }
}
}
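
As a worked example of the mask arithmetic in the hunk above: the clip mask occupies the low bits and the cull mask is shifted up by the number of clip distances. The sketch below assumes a hypothetical `gather_dist_masks` helper and `struct dist_masks`; only the two shift expressions come from the patch.

```c
#include <stdint.h>

struct dist_masks {
    uint32_t clip;
    uint32_t cull;
};

/* Build the clip/cull distance masks from the array sizes declared by
 * the shader. Cull distances are packed directly above clip distances. */
static struct dist_masks
gather_dist_masks(unsigned clip_size, unsigned cull_size)
{
    struct dist_masks m;

    m.clip = (1u << clip_size) - 1;
    m.cull = (1u << cull_size) - 1;
    m.cull <<= clip_size;
    return m;
}
```

For two clip distances and three cull distances this gives a clip mask of 0x3 and a cull mask of 0x1c.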
  

Re: [Mesa-dev] [PATCH] radv: use correct register setter for ngg hw addr

2019-07-17 Thread Samuel Pitoiset


Reviewed-by: Samuel Pitoiset 

On 7/17/19 6:59 AM, Dave Airlie wrote:

From: Dave Airlie 

This shouldn't matter, but it's good to be correct.
---
  src/amd/vulkan/radv_pipeline.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 5cdfe6d24eb..c7660c2900c 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -3408,7 +3408,7 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
*ctx_cs,
  
  	radeon_set_sh_reg_seq(cs, R_00B320_SPI_SHADER_PGM_LO_ES, 2);
radeon_emit(cs, va >> 8);
-   radeon_emit(cs, va >> 40);
+   radeon_emit(cs, S_00B324_MEM_BASE(va >> 40));
radeon_set_sh_reg_seq(cs, R_00B228_SPI_SHADER_PGM_RSRC1_GS, 2);
radeon_emit(cs, shader->config.rsrc1);
radeon_emit(cs, shader->config.rsrc2);
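
To see why the change "shouldn't matter" for valid addresses, here is a small sketch of the address split. It assumes the high register field is 8 bits wide (the `MEM_BASE` macro below is a stand-in written for this example, not copied from Mesa) and uses hypothetical names `struct pgm_addr` and `split_shader_va`.

```c
#include <stdint.h>

/* Assumed 8-bit field; stand-in for the real register setter macro. */
#define MEM_BASE(x) ((uint32_t)(x) & 0xFFu)

struct pgm_addr {
    uint32_t lo; /* bits 8..39 of the virtual address */
    uint32_t hi; /* bits 40..47 of the virtual address */
};

/* Split a 256-byte-aligned shader virtual address into the two dwords
 * written to the PGM_LO/PGM_HI register pair. */
static struct pgm_addr
split_shader_va(uint64_t va)
{
    struct pgm_addr a;

    a.lo = (uint32_t)(va >> 8);
    a.hi = MEM_BASE(va >> 40);
    return a;
}
```

For a 48-bit address, `va >> 40` already fits in 8 bits, so masking it through the field setter yields the same dword; the setter only makes the intent explicit.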

[Mesa-dev] [PATCH] radv/gfx10: disable the TC compat zrange workaround

2019-07-16 Thread Samuel Pitoiset

Unnecessary.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 7 ++-
 src/amd/vulkan/radv_device.c | 2 ++
 src/amd/vulkan/radv_image.c  | 7 ---
 src/amd/vulkan/radv_private.h| 1 +
 4 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index a6d4e0d0e21..b4301c0da15 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1356,7 +1356,8 @@ radv_update_zrange_precision(struct radv_cmd_buffer 
*cmd_buffer,
uint32_t db_z_info = ds->db_z_info;
uint32_t db_z_info_reg;
 
-   if (!radv_image_is_tc_compat_htile(image))
+   if (!cmd_buffer->device->physical_device->has_tc_compat_zrange_bug ||
+   !radv_image_is_tc_compat_htile(image))
return;
 
if (!radv_layout_has_htile(image, layout,
@@ -1566,6 +1567,10 @@ radv_set_tc_compat_zrange_metadata(struct 
radv_cmd_buffer *cmd_buffer,
 {
struct radeon_cmdbuf *cs = cmd_buffer->cs;
uint64_t va = radv_buffer_get_va(image->bo);
+
+   if (!cmd_buffer->device->physical_device->has_tc_compat_zrange_bug)
+   return;
+
va += image->offset + image->tc_compat_zrange_offset;
 
radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, 
cmd_buffer->state.predicating));
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 9d75305fc2b..b397a9a8aa0 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -363,6 +363,8 @@ radv_physical_device_init(struct radv_physical_device 
*device,
device->has_scissor_bug = device->rad_info.family == CHIP_VEGA10 ||
  device->rad_info.family == CHIP_RAVEN;
 
+   device->has_tc_compat_zrange_bug = device->rad_info.chip_class < GFX10;
+
/* Out-of-order primitive rasterization. */
device->has_out_of_order_rast = device->rad_info.chip_class >= GFX8 &&
device->rad_info.max_se >= 2;
diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index ccbec36849e..4d3ed71c23c 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -1186,14 +1186,15 @@ radv_image_alloc_dcc(struct radv_image *image)
 }
 
 static void
-radv_image_alloc_htile(struct radv_image *image)
+radv_image_alloc_htile(struct radv_device *device, struct radv_image *image)
 {
image->htile_offset = align64(image->size, 
image->planes[0].surface.htile_alignment);
 
/* + 8 for storing the clear values */
image->clear_value_offset = image->htile_offset + 
image->planes[0].surface.htile_size;
image->size = image->clear_value_offset + 8;
-   if (radv_image_is_tc_compat_htile(image)) {
+   if (radv_image_is_tc_compat_htile(image) &&
+   device->physical_device->has_tc_compat_zrange_bug) {
/* Metadata for the TC-compatible HTILE hardware bug which
 * have to be fixed by updating ZRANGE_PRECISION when doing
 * fast depth clears to 0.0f.
@@ -1402,7 +1403,7 @@ radv_image_create(VkDevice _device,
if (radv_image_can_enable_htile(image) &&
!(device->instance->debug_flags & 
RADV_DEBUG_NO_HIZ)) {
image->tc_compatible_htile = 
image->planes[0].surface.flags & RADEON_SURF_TC_COMPATIBLE_HTILE;
-   radv_image_alloc_htile(image);
+   radv_image_alloc_htile(device, image);
} else {
radv_image_disable_htile(image);
}
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index e1b5b456ef3..931d4039397 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -317,6 +317,7 @@ struct radv_physical_device {
bool has_clear_state;
bool cpdma_prefetch_writes_memory;
bool has_scissor_bug;
+   bool has_tc_compat_zrange_bug;
 
bool has_out_of_order_rast;
bool out_of_order_rast_allowed;
-- 
2.22.0
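
The gating added by this patch can be summarized as follows. This is a simplified sketch with hypothetical names (`has_tc_compat_zrange_bug` as a function and `needs_zrange_workaround`); in the patch the flag is a field of `radv_physical_device` set once at init and then checked at each site.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the Mesa enum; only the ordering matters. */
enum chip_class { GFX6 = 6, GFX7, GFX8, GFX9, GFX10 };

/* Per the patch, only generations before GFX10 are flagged as needing
 * the TC-compat ZRANGE_PRECISION workaround. */
static bool
has_tc_compat_zrange_bug(enum chip_class chip)
{
    return chip < GFX10;
}

/* The workaround applies only when the hardware has the bug and the
 * image actually uses TC-compatible HTILE. */
static bool
needs_zrange_workaround(enum chip_class chip, bool tc_compat_htile)
{
    return has_tc_compat_zrange_bug(chip) && tc_compat_htile;
}
```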

[Mesa-dev] [PATCH] radv/gfx10: implement VK_EXT_post_depth_coverage

2019-07-16 Thread Samuel Pitoiset

I implemented this extension a while ago, but it didn't work
on pre-GFX10 chips for some reason. All CTS tests pass now.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_extensions.py | 1 +
 src/amd/vulkan/radv_nir_to_llvm.c | 1 +
 src/amd/vulkan/radv_pipeline.c| 1 +
 src/amd/vulkan/radv_shader.c  | 1 +
 src/amd/vulkan/radv_shader.h  | 1 +
 5 files changed, 5 insertions(+)

diff --git a/src/amd/vulkan/radv_extensions.py 
b/src/amd/vulkan/radv_extensions.py
index 8b6ba6a4df0..e9addad0035 100644
--- a/src/amd/vulkan/radv_extensions.py
+++ b/src/amd/vulkan/radv_extensions.py
@@ -120,6 +120,7 @@ EXTENSIONS = [
 Extension('VK_EXT_memory_priority',   1, True),
 Extension('VK_EXT_pci_bus_info',  2, True),
 Extension('VK_EXT_pipeline_creation_feedback',1, True),
+Extension('VK_EXT_post_depth_coverage',   1, 
'device->rad_info.chip_class >= GFX10'),
 Extension('VK_EXT_queue_family_foreign',  1, True),
 Extension('VK_EXT_sample_locations',  1, True),
 Extension('VK_EXT_sampler_filter_minmax', 1, 
'device->rad_info.chip_class >= GFX7'),
diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index a689003d473..3e18303879e 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -4637,6 +4637,7 @@ ac_fill_shader_info(struct radv_shader_variant_info 
*shader_info, struct nir_sha
 break;
 case MESA_SHADER_FRAGMENT:
 shader_info->fs.early_fragment_test = 
nir->info.fs.early_fragment_tests;
+shader_info->fs.post_depth_coverage = 
nir->info.fs.post_depth_coverage;
 break;
 case MESA_SHADER_GEOMETRY:
 shader_info->gs.vertices_in = nir->info.gs.vertices_in;
diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 31495ec078d..7056ac8ca60 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -3822,6 +3822,7 @@ radv_compute_db_shader_control(const struct radv_device 
*device,
S_02880C_MASK_EXPORT_ENABLE(mask_export_enable) |
S_02880C_Z_ORDER(z_order) |
S_02880C_DEPTH_BEFORE_SHADER(ps->info.fs.early_fragment_test) |
+   
S_02880C_PRE_SHADER_DEPTH_COVERAGE_ENABLE(ps->info.fs.post_depth_coverage) |
S_02880C_EXEC_ON_HIER_FAIL(ps->info.info.ps.writes_memory) |
S_02880C_EXEC_ON_NOOP(ps->info.info.ps.writes_memory) |
S_02880C_DUAL_QUAD_DISABLE(disable_rbplus);
diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 1e9399de193..75f1ce3e869 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -270,6 +270,7 @@ radv_shader_compile_to_nir(struct radv_device *device,
.int64_atomics = true,
.multiview = true,
.physical_storage_buffer_address = true,
+   .post_depth_coverage = true,
.runtime_descriptor_array = true,
.shader_viewport_index_layer = true,
.stencil_export = true,
diff --git a/src/amd/vulkan/radv_shader.h b/src/amd/vulkan/radv_shader.h
index 360591349a8..fea0d1c8df1 100644
--- a/src/amd/vulkan/radv_shader.h
+++ b/src/amd/vulkan/radv_shader.h
@@ -283,6 +283,7 @@ struct radv_shader_variant_info {
uint32_t float16_shaded_mask;
bool can_discard;
bool early_fragment_test;
+   bool post_depth_coverage;
} fs;
struct {
unsigned block_size[3];
-- 
2.22.0

[Mesa-dev] [PATCH 2/2] radv/gfx10: fallback to the legacy path if tess and extreme geometry

2019-07-16 Thread Samuel Pitoiset

This is unsupported and hangs.

This fixes GPU hangs with
dEQP-VK.tessellation.geometry_interaction.limits.output_required_*.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 12 
 src/amd/vulkan/radv_shader.c   |  2 +-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index d1eede172dc..a22e605ca1c 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -2306,6 +2306,18 @@ radv_fill_shader_keys(struct radv_device *device,
} else {
keys[MESA_SHADER_VERTEX].vs_common_out.as_ngg = true;
}
+
+   if (nir[MESA_SHADER_TESS_CTRL] &&
+   nir[MESA_SHADER_GEOMETRY] &&
+   nir[MESA_SHADER_GEOMETRY]->info.gs.invocations *
+   nir[MESA_SHADER_GEOMETRY]->info.gs.vertices_out > 256) {
+   /* Fallback to the legacy path if tessellation is
+* enabled with extreme geometry because
+* EN_MAX_VERT_OUT_PER_GS_INSTANCE doesn't work and it
+* might hang.
+*/
+   keys[MESA_SHADER_TESS_EVAL].vs_common_out.as_ngg = 
false;
+   }
}
 
for(int i = 0; i < MESA_SHADER_STAGES; ++i)
diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 1e9399de193..6bafcb2f869 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -796,7 +796,7 @@ static void radv_postprocess_config(const struct 
radv_physical_device *pdevice,
break;
}
 
-   if (pdevice->rad_info.chip_class >= GFX10 &&
+   if (pdevice->rad_info.chip_class >= GFX10 && info->is_ngg &&
(stage == MESA_SHADER_VERTEX || stage == MESA_SHADER_TESS_EVAL || 
stage == MESA_SHADER_GEOMETRY)) {
unsigned gs_vgpr_comp_cnt, es_vgpr_comp_cnt;
gl_shader_stage es_stage = stage;
-- 
2.22.0
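
The fallback condition in `radv_fill_shader_keys` boils down to one product and one threshold. A minimal sketch, assuming a hypothetical `tess_gs_can_use_ngg` helper that takes the relevant shader properties as plain arguments:

```c
#include <stdbool.h>

/* Returns false when the pipeline has to fall back to the legacy
 * (non-NGG) path: tessellation combined with a geometry shader whose
 * invocations * vertices_out exceeds 256. */
static bool
tess_gs_can_use_ngg(bool has_tess, bool has_gs,
                    unsigned gs_invocations, unsigned gs_vertices_out)
{
    if (has_tess && has_gs && gs_invocations * gs_vertices_out > 256)
        return false;

    return true;
}
```

For example, 32 invocations with 9 output vertices each (288 in total) would take the legacy path, while a single invocation with 256 output vertices would not.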

[Mesa-dev] [PATCH 1/2] radv/gfx10: always build the GS copy shader but use it on-demand

2019-07-16 Thread Samuel Pitoiset

It should be possible to build it on-demand too but it requires
more work. On GFX10, the GS copy shader is required when tess
is enabled with extreme geometry.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c |  8 
 src/amd/vulkan/radv_pipeline.c   | 21 ++---
 src/amd/vulkan/radv_private.h|  2 ++
 3 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 6a0db2b67e9..a6d4e0d0e21 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -929,7 +929,7 @@ radv_emit_prefetch_L2(struct radv_cmd_buffer *cmd_buffer,
if (mask & RADV_PREFETCH_GS) {
radv_emit_shader_prefetch(cmd_buffer,
  
pipeline->shaders[MESA_SHADER_GEOMETRY]);
-   if (pipeline->gs_copy_shader)
+   if (radv_pipeline_has_gs_copy_shader(pipeline))
radv_emit_shader_prefetch(cmd_buffer, 
pipeline->gs_copy_shader);
}
 
@@ -1124,7 +1124,7 @@ radv_emit_graphics_pipeline(struct radv_cmd_buffer 
*cmd_buffer)
   pipeline->shaders[i]->bo);
}
 
-   if (radv_pipeline_has_gs(pipeline) && pipeline->gs_copy_shader)
+   if (radv_pipeline_has_gs_copy_shader(pipeline))
radv_cs_add_buffer(cmd_buffer->device->ws, cmd_buffer->cs,
   pipeline->gs_copy_shader->bo);
 
@@ -2362,7 +2362,7 @@ radv_emit_streamout_buffers(struct radv_cmd_buffer 
*cmd_buffer, uint64_t va)
 base_reg + loc->sgpr_idx * 4, va, 
false);
}
 
-   if (pipeline->gs_copy_shader) {
+   if (radv_pipeline_has_gs_copy_shader(pipeline)) {
loc = 
&pipeline->gs_copy_shader->info.user_sgprs_locs.shader_data[AC_UD_STREAMOUT_BUFFERS];
if (loc->sgpr_idx != -1) {
base_reg = R_00B130_SPI_SHADER_USER_DATA_VS_0;
@@ -4071,7 +4071,7 @@ static void radv_emit_view_index(struct radv_cmd_buffer 
*cmd_buffer, unsigned in
radeon_set_sh_reg(cmd_buffer->cs, base_reg + loc->sgpr_idx * 4, 
index);
 
}
-   if (pipeline->gs_copy_shader) {
+   if (radv_pipeline_has_gs_copy_shader(pipeline)) {
struct radv_userdata_info *loc = 
&pipeline->gs_copy_shader->info.user_sgprs_locs.shader_data[AC_UD_VIEW_INDEX];
if (loc->sgpr_idx != -1) {
uint32_t base_reg = R_00B130_SPI_SHADER_USER_DATA_VS_0;
diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 31495ec078d..d1eede172dc 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -120,6 +120,22 @@ bool radv_pipeline_has_ngg(const struct radv_pipeline 
*pipeline)
return variant->info.is_ngg;
 }
 
+bool radv_pipeline_has_gs_copy_shader(const struct radv_pipeline *pipeline)
+{
+   if (!radv_pipeline_has_gs(pipeline))
+   return false;
+
+   /* The GS copy shader is required if the pipeline has GS on GFX6-GFX9.
+* On GFX10, it might be required in rare cases if it's not possible to
+* enable NGG.
+*/
+   if (radv_pipeline_has_ngg(pipeline))
+   return false;
+
+   assert(pipeline->gs_copy_shader);
+   return true;
+}
+
 static void
 radv_pipeline_destroy(struct radv_device *device,
   struct radv_pipeline *pipeline,
@@ -2395,7 +2411,6 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
struct radv_shader_binary *binaries[MESA_SHADER_STAGES] = {NULL};
struct radv_shader_variant_key keys[MESA_SHADER_STAGES] = {0};
unsigned char hash[20], gs_copy_hash[20];
-   bool use_ngg = device->physical_device->rad_info.chip_class >= GFX10;
 
radv_start_feedback(pipeline_feedback);
 
@@ -2416,7 +2431,7 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
gs_copy_hash[0] ^= 1;
 
bool found_in_application_cache = true;
-   if (modules[MESA_SHADER_GEOMETRY] && !use_ngg) {
+   if (modules[MESA_SHADER_GEOMETRY]) {
struct radv_shader_variant *variants[MESA_SHADER_STAGES] = {0};
radv_create_shader_variants_from_pipeline_cache(device, cache, 
gs_copy_hash, variants,

&found_in_application_cache);
@@ -2567,7 +2582,7 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
}
}
 
-   if(modules[MESA_SHADER_GEOMETRY] && !use_ngg) {
+   if(modules[MESA_SHADER_GEOMETRY]) {
struct radv_shader_binary *gs_copy_binary = NULL;
if (!pipeline->gs_copy_shader) {
pipeline->gs_copy_shader = radv_create_gs_copy_shader(
diff --git a/src/amd/vulkan/radv_private.h b/sr

Re: [Mesa-dev] [PATCH] android: radv/gfx10: generate gfx10_format_table.h

2019-07-16 Thread Samuel Pitoiset


Acked-by: Samuel Pitoiset 

On 7/10/19 9:13 AM, Mauro Rossi wrote:

This patch adds gfx10_format_table.h to Makefile.sources
and adds the Android build rules, fixing the following build errors:

In file included from external/mesa/src/amd/vulkan/radv_debug.c:35:
In file included from external/mesa/src/amd/vulkan/radv_debug.h:27:
external/mesa/src/amd/vulkan/radv_private.h:95:10:
fatal error: 'gfx10_format_table.h' file not found
  ^~
1 error generated.

In file included from external/mesa/src/amd/vulkan/radv_android.c:31:
external/mesa/src/amd/vulkan/radv_private.h:95:10:
fatal error: 'gfx10_format_table.h' file not found
  ^~
1 error generated.

Fixes: 3dc5ec5d16 ("radv/gfx10: generate gfx10_format_table.h")
Signed-off-by: Mauro Rossi 
---
  src/amd/vulkan/Android.mk   | 15 +++
  src/amd/vulkan/Makefile.sources |  3 ++-
  2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/Android.mk b/src/amd/vulkan/Android.mk
index 0725feacb5..23cebb1ec8 100644
--- a/src/amd/vulkan/Android.mk
+++ b/src/amd/vulkan/Android.mk
@@ -83,6 +83,7 @@ LOCAL_GENERATED_SOURCES += $(intermediates)/radv_entrypoints.h
  LOCAL_GENERATED_SOURCES += $(intermediates)/radv_extensions.c
  LOCAL_GENERATED_SOURCES += $(intermediates)/radv_extensions.h
  LOCAL_GENERATED_SOURCES += $(intermediates)/vk_format_table.c
+LOCAL_GENERATED_SOURCES += $(intermediates)/gfx10_format_table.h
  
  RADV_ENTRYPOINTS_SCRIPT := $(MESA_TOP)/src/amd/vulkan/radv_entrypoints_gen.py
  RADV_EXTENSIONS_SCRIPT := $(MESA_TOP)/src/amd/vulkan/radv_extensions.py
@@ -117,6 +118,20 @@ $(intermediates)/vk_format_table.c: $(VK_FORMAT_TABLE_SCRIPT) \
@mkdir -p $(dir $@)
$(MESA_PYTHON2) $(VK_FORMAT_TABLE_SCRIPT) $(vk_format_layout_csv) > $@
  
+RADV_GEN10_FORMAT_TABLE_INPUTS := \
+   $(MESA_TOP)/src/amd/vulkan/vk_format_layout.csv \
+   $(MESA_TOP)/src/amd/registers/gfx10-rsrc.json
+
+RADV_GEN10_FORMAT_TABLE_DEP := \
+   $(MESA_TOP)/src/amd/registers/regdb.py
+
+RADV_GEN10_FORMAT_TABLE := $(LOCAL_PATH)/gfx10_format_table.py
+
+$(intermediates)/gfx10_format_table.h: $(RADV_GEN10_FORMAT_TABLE) $(RADV_GEN10_FORMAT_TABLE_INPUTS) $(RADV_GEN10_FORMAT_TABLE_DEP)
+   @mkdir -p $(dir $@)
+   @echo "Gen Header: $(PRIVATE_MODULE) <= $(notdir $(@))"
+   $(hide) $(MESA_PYTHON2) $(RADV_GEN10_FORMAT_TABLE) $(RADV_GEN10_FORMAT_TABLE_INPUTS) > $@ || ($(RM) $@; false)
+
  LOCAL_SHARED_LIBRARIES += $(RADV_SHARED_LIBRARIES)
  
  LOCAL_EXPORT_C_INCLUDE_DIRS := \

diff --git a/src/amd/vulkan/Makefile.sources b/src/amd/vulkan/Makefile.sources
index df90c1150a..312cd0b1e9 100644
--- a/src/amd/vulkan/Makefile.sources
+++ b/src/amd/vulkan/Makefile.sources
@@ -91,5 +91,6 @@ VULKAN_GENERATED_FILES := \
radv_entrypoints.h \
radv_extensions.c \
radv_extensions.h \
-   vk_format_table.c
+   vk_format_table.c \
+   gfx10_format_table.h
  

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] radv: update LATE_ALLOC_VS.LIMIT

2019-07-16 Thread Samuel Pitoiset

Mirror RadeonSI.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/si_cmd_buffer.c | 60 --
 1 file changed, 42 insertions(+), 18 deletions(-)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index a832dbd89eb..e996fa250a9 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -264,9 +264,6 @@ si_emit_graphics(struct radv_physical_device *physical_device,
/* Logical CUs 16 - 31 */
radeon_set_sh_reg(cs, R_00B404_SPI_SHADER_PGM_RSRC4_HS,
   S_00B404_CU_EN(0xffff));
-   radeon_set_sh_reg(cs, R_00B204_SPI_SHADER_PGM_RSRC4_GS,
- S_00B204_CU_EN(0xffff) |
- S_00B204_SPI_SHADER_LATE_ALLOC_GS_GFX10(0));
 radeon_set_sh_reg(cs, R_00B104_SPI_SHADER_PGM_RSRC4_VS,
   S_00B104_CU_EN(0xffff));
radeon_set_sh_reg(cs, R_00B004_SPI_SHADER_PGM_RSRC4_PS,
@@ -291,28 +288,55 @@ si_emit_graphics(struct radv_physical_device *physical_device,
    S_028A44_ES_VERTS_PER_SUBGRP(64) |
    S_028A44_GS_PRIMS_PER_SUBGRP(4));
}
-   radeon_set_sh_reg(cs, R_00B21C_SPI_SHADER_PGM_RSRC3_GS,
- S_00B21C_CU_EN(0xffff) | S_00B21C_WAVE_LIMIT(0x3F));
 
-   if (physical_device->rad_info.num_good_cu_per_sh <= 4) {
+   /* Compute LATE_ALLOC_VS.LIMIT. */
+   unsigned num_cu_per_sh = physical_device->rad_info.num_good_cu_per_sh;
+   unsigned late_alloc_limit; /* The limit is per SH. */
+
+   if (physical_device->rad_info.family == CHIP_KABINI) {
+   late_alloc_limit = 0; /* Potential hang on Kabini. */
+   } else if (num_cu_per_sh <= 4) {
/* Too few available compute units per SH. Disallowing
-* VS to run on CU0 could hurt us more than late VS
+* VS to run on one CU could hurt us more than late VS
 * allocation would help.
 *
-* LATE_ALLOC_VS = 2 is the highest safe number.
+* 2 is the highest safe number that allows us to keep
+* all CUs enabled.
 */
-   radeon_set_sh_reg(cs, R_00B118_SPI_SHADER_PGM_RSRC3_VS,
- S_00B118_CU_EN(0xffff) | S_00B118_WAVE_LIMIT(0x3F) );
-   radeon_set_sh_reg(cs, R_00B11C_SPI_SHADER_LATE_ALLOC_VS, S_00B11C_LIMIT(2));
+   late_alloc_limit = 2;
} else {
-   /* Set LATE_ALLOC_VS == 31. It should be less than
-* the number of scratch waves. Limitations:
-* - VS can't execute on CU0.
-* - If HS writes outputs to LDS, LS can't execute on CU0.
+   /* This is a good initial value, allowing 1 late_alloc
+* wave per SIMD on num_cu - 2.
 */
-   radeon_set_sh_reg(cs, R_00B118_SPI_SHADER_PGM_RSRC3_VS,
- S_00B118_CU_EN(0xfffe) | S_00B118_WAVE_LIMIT(0x3F));
-   radeon_set_sh_reg(cs, R_00B11C_SPI_SHADER_LATE_ALLOC_VS, S_00B11C_LIMIT(31));
+   late_alloc_limit = (num_cu_per_sh - 2) * 4;
+   }
+
+   unsigned cu_mask_vs = 0xffff;
+   unsigned cu_mask_gs = 0xffff;
+
+   if (late_alloc_limit > 2) {
+   if (physical_device->rad_info.chip_class >= GFX10) {
+   /* CU2 & CU3 disabled because of the dual CU design */
+   cu_mask_vs = 0xfff3;
+   cu_mask_gs = 0xfff3; /* NGG only */
+   } else {
+   cu_mask_vs = 0xfffe; /* 1 CU disabled */
+   }
+   }
+
+   radeon_set_sh_reg(cs, R_00B118_SPI_SHADER_PGM_RSRC3_VS,
+ S_00B118_CU_EN(cu_mask_vs) |
+ S_00B118_WAVE_LIMIT(0x3F));
+   radeon_set_sh_reg(cs, R_00B11C_SPI_SHADER_LATE_ALLOC_VS,
+ S_00B11C_LIMIT(late_alloc_limit));
+
+   radeon_set_sh_reg(cs, R_00B21C_SPI_SHADER_PGM_RSRC3_GS,
+ S_00B21C_CU_EN(cu_mask_gs) | S_00B21C_WAVE_LIMIT(0x3F));
+
+   if (physical_device->rad_info.chip_class >= GFX10) {
+   radeon_set_sh_reg(cs
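
The selection logic added by this patch can be read in isolation from the register writes. The following is a minimal sketch, not the driver code: the struct late_alloc_config, the function compute_late_alloc() and its boolean parameters are hypothetical names introduced here for illustration, and the default mask value 0xffff (all CUs enabled) is an assumption, since the constant is garbled in the archived patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the three values the patch computes. */
struct late_alloc_config {
    unsigned limit;      /* LATE_ALLOC_VS.LIMIT, per SH */
    unsigned cu_mask_vs; /* CU enable mask for VS */
    unsigned cu_mask_gs; /* CU enable mask for GS */
};

/* Sketch of the decision tree in the patch. The two booleans stand in for
 * the rad_info.family == CHIP_KABINI and rad_info.chip_class >= GFX10 checks.
 */
static struct late_alloc_config
compute_late_alloc(unsigned num_cu_per_sh, bool is_kabini, bool is_gfx10)
{
    /* Assumption: 0xffff means "all CUs enabled". */
    struct late_alloc_config cfg = { 0, 0xffff, 0xffff };

    if (is_kabini) {
        cfg.limit = 0; /* Potential hang on Kabini. */
    } else if (num_cu_per_sh <= 4) {
        cfg.limit = 2; /* Highest value that keeps all CUs enabled. */
    } else {
        cfg.limit = (num_cu_per_sh - 2) * 4; /* 1 wave per SIMD on num_cu - 2 */
    }

    /* CUs are only masked off when late allocation is actually in use. */
    if (cfg.limit > 2) {
        if (is_gfx10) {
            cfg.cu_mask_vs = 0xfff3; /* CU2 and CU3 disabled (dual CU design) */
            cfg.cu_mask_gs = 0xfff3; /* NGG only */
        } else {
            cfg.cu_mask_vs = 0xfffe; /* 1 CU disabled */
        }
    }
    return cfg;
}
```

Under these assumptions, a pre-GFX10 part with 10 good CUs per SH gets a limit of (10 - 2) * 4 = 32 and loses one CU for VS, while a part with 4 or fewer CUs keeps every CU enabled with a limit of 2.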

[Mesa-dev] [PATCH 1/2] radv/gfx10: support pixel shaders without exports

2019-07-16 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index fdb0ed29ea4..31495ec078d 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -4283,9 +4283,15 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
 *stalls without this setting.
 *
 * Don't add this to CB_SHADER_MASK.
+*
+* GFX10 supports pixel shaders without exports by setting both the
+* color and Z formats to SPI_SHADER_ZERO. The hw will skip export
+* instructions if any are present.
 */
struct radv_shader_variant *ps = pipeline->shaders[MESA_SHADER_FRAGMENT];
-   if (!blend.spi_shader_col_format) {
+   if ((pipeline->device->physical_device->rad_info.chip_class <= GFX9 ||
+ps->info.fs.can_discard) &&
+   !blend.spi_shader_col_format) {
if (!ps->info.info.ps.writes_z &&
!ps->info.info.ps.writes_stencil &&
!ps->info.info.ps.writes_sample_mask)
-- 
2.22.0
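
For readers skimming the hunk, the new guard condition can be restated as a small predicate. This is only a sketch: the enum chip_class_sketch and the function takes_no_color_format_fallback() are hypothetical names, and the enum values merely preserve the ordering that the <= GFX9 comparison relies on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's chip class enum; only the
 * relative ordering matters for this sketch. */
enum chip_class_sketch { SK_GFX6, SK_GFX7, SK_GFX8, SK_GFX9, SK_GFX10 };

/* Returns true when the pre-existing "no color format" fallback path is
 * still taken after this patch: always on GFX9 and older, and on GFX10
 * only if the pixel shader can discard. */
static bool
takes_no_color_format_fallback(enum chip_class_sketch chip_class,
                               bool ps_can_discard,
                               unsigned spi_shader_col_format)
{
    return (chip_class <= SK_GFX9 || ps_can_discard) &&
           spi_shader_col_format == 0;
}
```

With this reading, a GFX10 pixel shader that neither exports color nor discards skips the fallback entirely, which matches the comment about setting both formats to SPI_SHADER_ZERO.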


[Mesa-dev] [PATCH] radv: fix gathering clip/cull distance masks for GS

2019-07-16 Thread Samuel Pitoiset

For NGG, the driver relies on the VS outinfo struct.

This fixes
dEQP-VK.clipping.user_defined.clip_*_vert_tess_geom_*

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_nir_to_llvm.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c b/src/amd/vulkan/radv_nir_to_llvm.c
index 76d784b3374..b890ce56f16 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -2407,6 +2407,11 @@ scan_shader_output_decl(struct radv_shader_context *ctx,
 ctx->shader_info->tes.outinfo.cull_dist_mask = (1 << shader->info.cull_distance_array_size) - 1;
 ctx->shader_info->tes.outinfo.cull_dist_mask <<= shader->info.clip_distance_array_size;
}
+   if (stage == MESA_SHADER_GEOMETRY) {
+   ctx->shader_info->vs.outinfo.clip_dist_mask = (1 << shader->info.clip_distance_array_size) - 1;
+   ctx->shader_info->vs.outinfo.cull_dist_mask = (1 << shader->info.cull_distance_array_size) - 1;
+   ctx->shader_info->vs.outinfo.cull_dist_mask <<= shader->info.clip_distance_array_size;
+   }
}
}
 
-- 
2.22.0
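
The mask arithmetic in the hunk above packs the clip distances into the low bits and the cull distances directly above them. A minimal worked sketch follows; the helper names low_bits_mask(), clip_dist_mask() and cull_dist_mask() are hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a mask with the low `count` bits set. */
static uint32_t low_bits_mask(unsigned count)
{
    assert(count < 32); /* keep the shift well defined */
    return (1u << count) - 1u;
}

/* Clip distances occupy bits [0, clip_size). */
static uint32_t clip_dist_mask(unsigned clip_size)
{
    return low_bits_mask(clip_size);
}

/* Cull distances follow the clip distances, so their mask is shifted up
 * by the clip distance array size. */
static uint32_t cull_dist_mask(unsigned clip_size, unsigned cull_size)
{
    assert(clip_size + cull_size < 32);
    return low_bits_mask(cull_size) << clip_size;
}
```

For example, with 2 clip distances and 3 cull distances the clip mask is 0x3 and the cull mask is 0x7 << 2 = 0x1c, so the two masks never overlap.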


[Mesa-dev] [PATCH] radv/gfx10: enable OC_LDS_EN for NGG GS if the ES stage is TES

2019-07-15 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_shader.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index f6b0297d4a3..1e9399de193 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -826,7 +826,8 @@ static void radv_postprocess_config(const struct radv_physical_device *pdevice,
 config_out->rsrc1 |= S_00B228_GS_VGPR_COMP_CNT(gs_vgpr_comp_cnt) |
  S_00B228_WGP_MODE(1);
 config_out->rsrc2 |= S_00B22C_ES_VGPR_COMP_CNT(es_vgpr_comp_cnt) |
-S_00B22C_LDS_SIZE(config_in->lds_size);
+S_00B22C_LDS_SIZE(config_in->lds_size) |
+S_00B22C_OC_LDS_EN(es_stage == MESA_SHADER_TESS_EVAL);
} else if (pdevice->rad_info.chip_class >= GFX9 &&
   stage == MESA_SHADER_GEOMETRY) {
unsigned es_type = info->gs.es_type;
-- 
2.22.0

