from:"Bas Nieuwenhuizen"

Re: Benefits of cryptographic hash functions for uniquely identifying Vulkan shaders.

2023-07-03 Thread Bas Nieuwenhuizen

We throw away the original for space though, so there is nothing to compare
on collision (hence the cryptographic hash).

On Mon, Jul 3, 2023 at 10:23 AM abel.berna...@gmail.com <
abel.berna...@gmail.com> wrote:

> Two cents, sorry if too obvious.
>
> If you want to squeeze more performance out of this, it seems valid to
> fall back to a full comparison in case of a collision. The algorithm will be
> correct irrespective of your (bad) luck with hash collisions, and at worst,
> with an insignificant probability, the time cost is O(n*n), but the typical
> cost will remain close to O(n).
>
> That way you can try cheaper hashing algorithms without worry.
>
> Regards.
>
>
>
> On Thu, 29 Jun 2023 at 13:35, Marek Olšák  wrote:
>
>> If there is a hash collision, it will cause a GPU hang. A cryptographic
>> hash function reduces that chance to practically zero.
>>
>> Marek
>>
>> On Thu, Jun 29, 2023, 07:04 mikolajlubiak1337 <
>> mikolajlubiak1...@proton.me> wrote:
>>
>>> Hi,
>>> I have recently read the Phoronix article[1] about you switching to BLAKE3
>>> instead of SHA1.
>>> If BLAKE3 is a cryptographic hash function, wouldn't it be faster to use
>>> a non-cryptographic hash function or even a checksum function? Do you need
>>> the benefits of cryptographic hash functions over other hash/checksum
>>> functions for the purpose of uniquely identifying Vulkan shaders?
>>>
>>> [1]: https://www.phoronix.com/news/Mesa-BLAKE3-Shader-Hashing
>>>
>>> -- me
>>>
>>>
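
The trade-off in this thread can be illustrated with a toy cache. This is only a sketch under stated assumptions: weak_hash() is a deliberately poor stand-in (it is not BLAKE3 or SHA1), and struct toy_cache and its helpers are hypothetical names, not Mesa's shader cache. With keep_keys set, a digest match is confirmed by a full comparison, so a cheap hash is safe. With keep_keys cleared, the original is thrown away and a collision silently returns the wrong entry, which is why a collision-resistant hash is needed in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Deliberately weak 8-bit hash (sum of bytes) so collisions are easy to hit. */
static uint8_t weak_hash(const char *key)
{
    uint8_t h = 0;
    while (*key)
        h = (uint8_t)(h + (unsigned char)*key++);
    return h;
}

struct toy_entry {
    int used;
    uint8_t digest;
    const char *key; /* NULL when the original is thrown away */
    int value;
};

#define TOY_SLOTS 8

struct toy_cache {
    struct toy_entry e[TOY_SLOTS];
    int keep_keys; /* 1: keep originals and compare; 0: digest only */
};

static void toy_cache_insert(struct toy_cache *c, const char *key, int value)
{
    for (int i = 0; i < TOY_SLOTS; i++) {
        if (!c->e[i].used) {
            c->e[i].used = 1;
            c->e[i].digest = weak_hash(key);
            c->e[i].key = c->keep_keys ? key : NULL;
            c->e[i].value = value;
            return;
        }
    }
    assert(0 && "toy cache is full");
}

/* Returns the cached value, or -1 on a miss. */
static int toy_cache_lookup(const struct toy_cache *c, const char *key)
{
    uint8_t d = weak_hash(key);
    for (int i = 0; i < TOY_SLOTS; i++) {
        const struct toy_entry *e = &c->e[i];
        if (!e->used || e->digest != d)
            continue;
        /* Full comparison is only possible if the original was kept. */
        if (c->keep_keys && strcmp(e->key, key) != 0)
            continue;
        return e->value;
    }
    return -1;
}
```

The strings "ab" and "ba" collide under weak_hash(). After inserting "ab", looking up "ba" misses in the cache that keeps keys, but returns the entry for "ab" in the digest-only cache.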

Re: Mesa 20.0 backlog

2022-04-24 Thread Bas Nieuwenhuizen

Assuming you meant 22.0, I have a backport for the 22.0 radv patch:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16126

On Fri, Apr 22, 2022 at 6:56 AM Dylan Baker  wrote:
>
> Hi all,
>
> I've spent a good deal of time this week crushing the backlog of
> patches on the mesa 20.0 series before making the release today. As such
> there are now only a dozen outstanding patches, mostly for zink, which I
> couldn't figure out how to correctly backport.
>
> I've touched base with Lionel about the anv patches, but for the remaining
> patches I'd appreciate guidance on what you'd like to do with them.
>
> 2022-03-14 FIXES  d5870c45ae panfrost: Optimise recalculation of max sampler view
> 2022-03-24 FIXES  f348103fce anv: fix dynamic state emission
> 2022-03-24 FIXES  a4f502de32 anv: fix VK_DYNAMIC_STATE_COLOR_WRITE_ENABLE_EXT state
> 2022-03-24 FIXES  1d250b7b95 anv: fix color write enable interaction with color mask
> 2022-03-30 CC     4eca6e3e5d lavapipe: fix xfb availability query copying
> 2022-04-05 CC     3dcb80da9d zink: fix barrier generation for ssbo descriptors
> 2022-04-11 FIXES  dd078d13cb zink: fix tessellation shader key matching.
> 2022-04-13 FIXES  bbdf22ce13 radv: Fix barriers with cp dma
> 2022-04-18 CC     8806f444a5 zink: fix extended restart prim types without dynamic state2
> 2022-04-21 CC     373c8001d6 zink: set VK_QUERY_RESULT_WAIT_BIT when copying to qbo
> 2022-04-21 CC     a056cbc691 zink: fix synchronization when drawing from streamout
> 2022-04-21 CC     fc5edf9b68 zink: fix xfb counter buffer barriers
> 2022-04-21 CC     e509598470 zink: remove xfb_barrier flag
>
> Cheers,
> Dylan

Re: [ANNOUNCE] mesa 22.0.0-rc2

2022-02-09 Thread Bas Nieuwenhuizen

On Wed, Feb 9, 2022 at 7:10 PM Dylan Baker  wrote:
>
> Hi all,
>
> I'd like to announce the availability of Mesa 22.0.0-rc2, the second
> release candidate for mesa 22.0.0. We have lots of fixes here, including
> a good deal of zink fixes, and some changes for shared microsoft, egl,
> core mesa, crocus, broadcom, iris, core intel, anv, llvmpipe, svga,
> radeonsi, aco, and radv.
>
> Cheers,
> Dylan
>
> shortlog
> 
>
>
> Charmaine Lee (1):
>   mesa: fix misaligned pointer returned by dlist_alloc
>
> Daniel Stone (1):
>   egl/wayland: Reset buffer age when destroying buffers
>
> Danylo Piliaiev (1):
>   turnip: Unconditionaly remove descriptor set from pool's list on free
>
> Dave Airlie (1):
>   crocus: find correct relocation target for the bo.
>
> Dylan Baker (4):
>   .pick_status.json: Update to 0447a2303fb06d6ad1f64e5f079a74bf2cf540da
>   .pick_status.json: Update to 8335fdfeafbe1fd14cb65f9088bbba15d9eb00dc
>   .pick_status.json: Update to 5e9df85b1a4504c5b4162e77e139056dc80accc6
>   VERSION: bump version for 22.0.0-rc2
>
> Iago Toral Quiroga (1):
>   broadcom/compiler: fix offset alignment for ldunifa when skipping
>
> Jesse Natalie (2):
>   microsoft/compiler: Only prep phis for the current function
>   microsoft/compiler: Only treat tess level location as special if it's a patch constant
>
> Kenneth Graunke (1):
>   iris: Make an iris_foreach_batch macro that skips unsupported batches
>
> Lionel Landwerlin (3):
>   intel/fs: don't set allow_sample_mask for CS intrinsics
>   intel/nir: fix shader call lowering
>   anv: fix conditional render for vkCmdDrawIndirectByteCountEXT
>
> Mike Blumenkrantz (7):
>   zink: disable PIPE_SHADER_CAP_FP16_CONST_BUFFERS
>   llvmpipe: disable PIPE_SHADER_CAP_FP16_CONST_BUFFERS
>   zink: add VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT for query binds
>   zink: use scanout obj when returning resource param info
>   zink: fix PIPE_CAP_TGSI_BALLOT export conditional
>   zink: reject invalid draws
>   zink: min/max blit region in coverage functions
>
> Neha Bhende (1):
>   svga: store shared_mem_size in svga_compute_shader instead of svga_context
>
> Pierre-Eric Pelloux-Prayer (1):
>   radeonsi: limit loop unrolling for LLVM < 13
>
> Rhys Perry (2):
>   aco: don't encode src2 for v_writelane_b32_e64
>   radv: fix R_02881C_PA_CL_VS_OUT_CNTL with mixed cull/clip distances
>
> Samuel Pitoiset (1):
>   Revert "radv: re-apply "Do not access set layout during vkCmdBindDescriptorSets.""

Hi Dylan, can we add

commit 66f7289d568db8711adb885acc56622e2aff252a
Author: Samuel Pitoiset 
Date:   Wed Jan 19 16:15:33 2022 +0100

   radv: add reference counting for descriptor set layouts


if we take that revert? The revert wasn't because the patch was bad
but because we had a better patch.

>
> git tag: mesa-22.0.0-rc2
>
> https://mesa.freedesktop.org/archive/mesa-22.0.0-rc2.tar.xz
> SHA256: 14d6478ad367b22fbb24251f3282d98ba9b8c7758dcd416b33353e1387fd57f7  mesa-22.0.0-rc2.tar.xz
> SHA512: 9e05355a31f1640df6e800ccdf3150720d1a54aa21d9eb748d567b2b64090b09b6bc54318f2f72644b48c8d08f9db0f7ab3d35c9e1b629ded932fd9ed2e87630  mesa-22.0.0-rc2.tar.xz
> PGP:  https://mesa.freedesktop.org/archive/mesa-22.0.0-rc2.tar.xz.sig

Re: [Mesa-dev] Workflow Proposal

2021-10-06 Thread Bas Nieuwenhuizen

On Wed, Oct 6, 2021 at 8:49 PM Jordan Justen  wrote:
>
> Mike Blumenkrantz  writes:
>
> > On Wed, Oct 6, 2021 at 1:27 PM Bas Nieuwenhuizen 
> > wrote:
> >
> >> On Wed, Oct 6, 2021 at 7:07 PM Jason Ekstrand 
> >> wrote:
> >> >
> > >> > My primary gripe with approvals or the  button is that it's the wrong
> >> > granularity.  It's per-MR instead of per-patch.  When people are
> >> > regularly posting MRs that touch a bunch of different stuff, per-patch
> >> > review is pretty common.  I'm not sure I want to lose that.  :-/
>
> Could a hybrid approach work? Marge could just add:
>
> Approved-by: @jljusten
>
> to the commit message based on the state of the MR. But, for MRs where
> r-b is more appropriate, the developer can still manually add
> Reviewed-by.
>
> Personally I don't find adding the r-b and force pushing to be much of a
> burden, but I could see how in some cases of a small MR, it could be
> nice to just click some buttons on the web-page and be done with it.
>
> But, I really would like Marge to add something to the commit messages
> indicating who approved it. Yeah, you can get that info today by
> following the Part-of link, but there are no guarantees about that being
> around forever.
>
> >> Would it be an option to get Marge to not remove existing Rb tags, so
> >> we could get the streamlined process where possible and fall back if
> >> the MRs turn more complicated?
>
> I guess I missed where it was suggested that Marge should remove
> Reviewed-by tags. I don't think Marge should ever remove something from
> the commit message.

AFAIU this is upstream Marge behavior: once you enable the
Approval->Rb tag conversion, it removes existing Rb tags. That is why we
don't have the conversion enabled.

>
> > If people really, truly care about per-patch Approval, couldn't they just
> > split out patches from bigger MRs and get Approvals there? Otherwise it
> > should be trivial enough to check the gitlab MR and see who reviewed which
> > patch if it becomes an issue at a later date. Odds are at that point you're
> > already going to the MR to see wtf someone was thinking...
>
> I don't like the idea of saying "just split out the MRs". That doesn't
> work in a lot of cases where patches have dependencies, and just causes
> potential reviewers to have to look in more places to see the big
> picture.
>
> -Jordan

Re: [Mesa-dev] Workflow Proposal

2021-10-06 Thread Bas Nieuwenhuizen

On Wed, Oct 6, 2021 at 7:07 PM Jason Ekstrand  wrote:
>
> On Wed, Oct 6, 2021 at 11:24 AM Emma Anholt  wrote:
> >
> > On Wed, Oct 6, 2021 at 9:20 AM Mike Blumenkrantz
> >  wrote:
> > >
> > > Hi,
> > >
> > > It's recently come to my attention that gitlab has Approvals. Was anyone 
> > > else aware of this feature? You can just click a button and have your 
> > > name recorded in the system as having signed off on landing a patch? Blew 
> > > my mind.
> > >
> > > So with that being said, we also have this thing in the Mesa repo where 
> > > everyone* has to constantly be adding these esoteric tags like 
> > > Reviewed-by (I reviewed it), and Acked-by (I rubber stamped it), or 
> > > Tested-by (I compiled it and maybe ran glxgears), and so forth.
> > >
> > > * Except some incredibly smart people already know where I'm going with 
> > > this
> > >
> > > Instead of continuing to have to manually update each patch with the 
> > > appropriate and definitely-unforgeable tags, what if we just used 
> > > Approvals in the UI instead? We could then have marge-bot require 
> > > approvals as needed in components and bring reviewing into the current 
> > > year. Just think: no more rewriting all the commit logs and force-pushing 
> > > the branch again before you merge!
> > >
> > > Anyway, I thought maybe this would be a nice idea to improve everyone's 
> > > workflows. What do other people think?
>
> My primary gripe with approvals or the  button is that it's the wrong
> granularity.  It's per-MR instead of per-patch.  When people are
> regularly posting MRs that touch a bunch of different stuff, per-patch
> review is pretty common.  I'm not sure I want to lose that.  :-/

Would it be an option to get Marge to not remove existing Rb tags, so
we could get the streamlined process where possible and fall back if
the MRs turn more complicated?

(as an aside I think we should just drop the tags in git, but I'll
take anything that moves us forward)
>
> --Jason
>
> > I would love to see this be the process across Mesa.  We already don't
> > rewrite commit messages for freedreno and i915g, and I only have to do
> > the rebase (busy-)work for my projects in other areas of the tree.
> >
> > I don't think we should have marge-bot require approvals
> > per-component, though.  There are times when an MR only incidentally
> > touches a component (for example, changing function signatures in
> > gallium), and actually getting a dev from every driver to sign off on
> > it would be too much.

Re: [Mesa-dev] [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi

2021-06-23 Thread Bas Nieuwenhuizen

On Wed, Jun 23, 2021 at 4:50 PM Daniel Vetter  wrote:
>
> On Wed, Jun 23, 2021 at 4:02 PM Christian König
>  wrote:
> >
> > Am 23.06.21 um 15:49 schrieb Daniel Vetter:
> > > On Wed, Jun 23, 2021 at 3:44 PM Christian König
> > >  wrote:
> > >> Am 23.06.21 um 15:38 schrieb Bas Nieuwenhuizen:
> > >>> On Wed, Jun 23, 2021 at 2:59 PM Christian König
> > >>>  wrote:
> > >>>> Am 23.06.21 um 14:18 schrieb Daniel Vetter:
> > >>>>> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
> > >>>>>  wrote:
> > >>>>>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter 
> > >>>>>>  wrote:
> > >>>>>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
> > >>>>>>>
> > >>>>>>> Implicit fencing done properly needs to treat the implicit fencing
> > >>>>>>> slots like a funny kind of IPC mailbox. In other words it needs to be
> > >>>>>>> explicit. This is the only way it will mesh well with explicit
> > >>>>>>> fencing userspace like vk, and it's also the bare minimum required to
> > >>>>>>> be able to manage anything else that wants to use the same buffer on
> > >>>>>>> multiple engines in parallel, and still be able to share it through
> > >>>>>>> implicit sync.
> > >>>>>>>
> > >>>>>>> amdgpu completely lacks such an uapi. Fix this.
> > >>>>>>>
> > >>>>>>> Luckily the concept of ignoring implicit fences exists already, and
> > >>>>>>> takes care of all the complexities of making sure that non-optional
> > >>>>>>> fences (like bo moves) are not ignored. This support was added in
> > >>>>>>>
> > >>>>>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> > >>>>>>> Author: Andres Rodriguez 
> > >>>>>>> Date:   Fri Sep 15 20:44:06 2017 -0400
> > >>>>>>>
> > >>>>>>>drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
> > >>>>>>>
> > >>>>>>> Unfortunately it's the wrong semantics, because it's a bo flag and
> > >>>>>>> disables implicit sync on an allocated buffer completely.
> > >>>>>>>
> > >>>>>>> We _do_ want implicit sync, but control it explicitly. For this we
> > >>>>>>> need a flag on the drm_file, so that a given userspace (like vulkan)
> > >>>>>>> can manage the implicit sync slots explicitly. The other side of the
> > >>>>>>> pipeline (compositor, other process or just different stage in a media
> > >>>>>>> pipeline in the same process) can then either do the same, or fully
> > >>>>>>> participate in the implicit sync as implemented by the kernel by
> > >>>>>>> default.
> > >>>>>>>
> > >>>>>>> By building on the existing flag for buffers we avoid any issues with
> > >>>>>>> opening up additional security concerns - anything this new flag here
> > >>>>>>> allows is already possible.
> > >>>>>>>
> > >>>>>>> All drivers which support this concept of a userspace-specific
> > >>>>>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
> > >>>>>>> that turned out to be a bit too inflexible. See the discussion below,
> > >>>>>>> let's try to do a bit better for amdgpu.
> > >>>>>>>
> > >>>>>>> This alone only allows us to completely avoid any stalls due to
> > >>>>>>> implicit sync, it does not yet allow us to use implicit sync as a
> > >>>>>>> strange form of IPC for sync_file.
> > >>>>>>>
> > >>>>>>> For that we need two more pieces:
> > >>>>>>>

Re: [Mesa-dev] [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi

2021-06-23 Thread Bas Nieuwenhuizen

On Wed, Jun 23, 2021 at 2:59 PM Christian König
 wrote:
>
> Am 23.06.21 um 14:18 schrieb Daniel Vetter:
> > On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
> >  wrote:
> >> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter  
> >> wrote:
> >>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
> >>>
> >>> Implicit fencing done properly needs to treat the implicit fencing
> >>> slots like a funny kind of IPC mailbox. In other words it needs to be
> >>> explicit. This is the only way it will mesh well with explicit
> >>> fencing userspace like vk, and it's also the bare minimum required to
> >>> be able to manage anything else that wants to use the same buffer on
> >>> multiple engines in parallel, and still be able to share it through
> >>> implicit sync.
> >>>
> >>> amdgpu completely lacks such an uapi. Fix this.
> >>>
> >>> Luckily the concept of ignoring implicit fences exists already, and
> >>> takes care of all the complexities of making sure that non-optional
> >>> fences (like bo moves) are not ignored. This support was added in
> >>>
> >>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> >>> Author: Andres Rodriguez 
> >>> Date:   Fri Sep 15 20:44:06 2017 -0400
> >>>
> >>>  drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
> >>>
> >>> Unfortunately it's the wrong semantics, because it's a bo flag and
> >>> disables implicit sync on an allocated buffer completely.
> >>>
> >>> We _do_ want implicit sync, but control it explicitly. For this we
> >>> need a flag on the drm_file, so that a given userspace (like vulkan)
> >>> can manage the implicit sync slots explicitly. The other side of the
> >>> pipeline (compositor, other process or just different stage in a media
> >>> pipeline in the same process) can then either do the same, or fully
> >>> participate in the implicit sync as implemented by the kernel by
> >>> default.
> >>>
> >>> By building on the existing flag for buffers we avoid any issues with
> >>> opening up additional security concerns - anything this new flag here
> >>> allows is already possible.
> >>>
> >>> All drivers which support this concept of a userspace-specific
> >>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
> >>> that turned out to be a bit too inflexible. See the discussion below,
> >>> let's try to do a bit better for amdgpu.
> >>>
> >>> This alone only allows us to completely avoid any stalls due to
> >>> implicit sync, it does not yet allow us to use implicit sync as a
> >>> strange form of IPC for sync_file.
> >>>
> >>> For that we need two more pieces:
> >>>
> >>> - a way to get the current implicit sync fences out of a buffer. Could
> >>>   be done in a driver ioctl, but everyone needs this, and generally a
> >>>   dma-buf is involved anyway to establish the sharing. So an ioctl on
> >>>   the dma-buf makes a ton more sense:
> >>>
> >>>   https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
> >>>
> >>>   Current drivers in upstream solve this by having the opt-out flag
> >>>   on their CS ioctl. This has the downside that very often the CS
> >>>   which must actually stall for the implicit fence is run a while
> >>>   after the implicit fence point was logically sampled per the api
> >>>   spec (vk passes an explicit syncobj around for that afaiui), and so
> >>>   results in oversync. Converting the implicit sync fences into a
> >>>   snap-shot sync_file is actually accurate.
> >>>
> >>> - Similarly we need to be able to set the exclusive implicit fence.
> >>>   Current drivers again do this with a CS ioctl flag, with again the
> >>>   same problems that by the time the CS happens additional dependencies
> >>>   have been added. An explicit ioctl to only insert a s

Re: [Mesa-dev] [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi

2021-06-23 Thread Bas Nieuwenhuizen

the dependency-side shortcut. We
> need both, or this doesn't do much.
>
> v4: Rebase over the amdgpu patch to always set the implicit sync
> fences.

So I think there is still a case missing in this implementation.
Consider these 3 cases

(format: a->b: b waits on a. Yes, I know arrows are hard)

explicit->explicit: This doesn't wait now, which is good
implicit->explicit: This doesn't wait now, which is good
explicit->implicit: This still waits, as the explicit submission still
adds shared fences and most things that set an exclusive fence for
implicit sync will hence wait on it.

This is probably good enough for what radv needs now but also sounds
like a risk wrt baking in new uapi behavior that we don't want to be
the end result.

Within AMDGPU this is probably solvable in two ways:

1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
2) Have an EXPLICIT fence owner that is used for explicit submissions
that is ignored by AMDGPU_SYNC_NE_OWNER.

But this doesn't solve cross-driver interactions here.
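
The cases above can be written down as a small truth table. This is only a sketch: the enum and function names are hypothetical and none of this is amdgpu code. Two things are assumptions rather than statements from the mail: that implicit->implicit waits (the classic implicit sync behaviour), and that the behaviour one would eventually want is for only implicit->implicit to wait.

```c
#include <assert.h>
#include <stdbool.h>

enum submit_kind { SUBMIT_EXPLICIT, SUBMIT_IMPLICIT };

/* a->b: does consumer b wait on producer a with the RFC as described? */
static bool waits_as_posted(enum submit_kind a, enum submit_kind b)
{
    (void)a; /* the producer's kind makes no difference, which is the issue */
    return b == SUBMIT_IMPLICIT;
}

/* Assumed target behaviour: only implicit->implicit waits. */
static bool waits_if_fixed(enum submit_kind a, enum submit_kind b)
{
    return a == SUBMIT_IMPLICIT && b == SUBMIT_IMPLICIT;
}

/* Checks the three cases listed above plus the assumed fourth one. */
static void check_cases(void)
{
    assert(!waits_as_posted(SUBMIT_EXPLICIT, SUBMIT_EXPLICIT));
    assert(!waits_as_posted(SUBMIT_IMPLICIT, SUBMIT_EXPLICIT));
    assert(waits_as_posted(SUBMIT_EXPLICIT, SUBMIT_IMPLICIT));
    assert(waits_as_posted(SUBMIT_IMPLICIT, SUBMIT_IMPLICIT));
}
```

The only cell where the two functions disagree is explicit->implicit, which is the missing case pointed out above.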

>
> Cc: mesa-dev@lists.freedesktop.org
> Cc: Bas Nieuwenhuizen 
> Cc: Dave Airlie 
> Cc: Rob Clark 
> Cc: Kristian H. Kristensen 
> Cc: Michel Dänzer 
> Cc: Daniel Stone 
> Cc: Sumit Semwal 
> Cc: "Christian König" 
> Cc: Alex Deucher 
> Cc: Daniel Vetter 
> Cc: Deepak R Varma 
> Cc: Chen Li 
> Cc: Kevin Wang 
> Cc: Dennis Li 
> Cc: Luben Tuikov 
> Cc: linaro-mm-...@lists.linaro.org
> Signed-off-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++
>  include/uapi/drm/amdgpu_drm.h   | 10 ++
>  4 files changed, 42 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 65df34c17264..c5386d13eb4a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> struct amdgpu_bo *gds;
> struct amdgpu_bo *gws;
> struct amdgpu_bo *oa;
> +   bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> int r;
>
> INIT_LIST_HEAD(&p->validated);
> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>
> e->bo_va = amdgpu_vm_bo_find(vm, bo);
>
> -   if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> +   if (bo->tbo.base.dma_buf &&
> +   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
> e->chain = dma_fence_chain_alloc();
> if (!e->chain) {
> r = -ENOMEM;
> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>  {
> struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
> struct amdgpu_bo_list_entry *e;
> +   bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> int r;
>
> list_for_each_entry(e, &p->validated, tv.head) {
> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> struct dma_resv *resv = bo->tbo.base.resv;
> enum amdgpu_sync_mode sync_mode;
>
> -   sync_mode = amdgpu_bo_explicit_sync(bo) ?
> +   sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
> AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
> r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
>  &fpriv->vm);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index c080ba15ae77..f982626b5328 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
> return 0;
>  }
>
> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> + struct drm_file *filp)
> +{
> +   struct drm_amdgpu_setparam *setparam = data;
> +   struct amdgpu_fpriv *fpriv = filp->driver_priv;
> +
> +   switch (setparam->param) {
> +   case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> +   if (setparam->value)
> +   WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> +   else
> +   WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> +   break;
> +   d

Re: [Mesa-dev] [PATCH 01/11] drm/amdgpu: Comply with implicit fencing rules

2021-05-21 Thread Bas Nieuwenhuizen

On Fri, May 21, 2021 at 4:37 PM Daniel Vetter  wrote:
>
> On Fri, May 21, 2021 at 11:46:23AM +0200, Bas Nieuwenhuizen wrote:
> > On Fri, May 21, 2021 at 11:10 AM Daniel Vetter  
> > wrote:
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > index 88a24a0b5691..cc8426e1e8a8 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > @@ -617,8 +617,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> > > amdgpu_bo_list_for_each_entry(e, p->bo_list) {
> > > struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);
> > >
> > > -   /* Make sure we use the exclusive slot for shared BOs */
> > > -   if (bo->prime_shared_count)
> > > +   /* Make sure we use the exclusive slot for all potentially shared BOs */
> > > +   if (!(bo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID))
> > > e->tv.num_shared = 0;
> >
> > I think it also makes sense to skip this with
> > AMDGPU_GEM_CREATE_EXPLICIT_SYNC? It can be shared but I don't think
> > anyone expects implicit sync to happen with those.
>
> Ah yes, I missed this entirely. So the "no implicit flag" is already
> there, and the _only_ thing that's missing really is a way to fish out the
> implicit fences, and set them.
>
> https://lore.kernel.org/dri-devel/20210520190007.534046-1-ja...@jlekstrand.net/
>
> So I think all that's really needed in radv is not setting
> RADEON_FLAG_IMPLICIT_SYNC for winsys buffers when Jason's dma-buf ioctls
> are present (means you need to do some import/export and keep the fd
> around for winsys buffers, but shouldn't be too bad), and then control the
> implicit fences entirely explicitly like vk expects.

That is the part I'm less sure about. This is a BO-wide flag, so we are
also disabling implicit sync in the compositor. If the compositor only
reads from the buffer, that is OK, as the inserted exclusive fence will
work for that. But as I learned recently, the app-provided buffer may
end up being written to by the X server, which opens a whole can of
potential problems if implicit sync gets disabled between X server
operations on the app-provided buffer. Setting that flag on the WSI
buffer therefore carries new risks, which is why I've said a
submission-based flag would be preferred.

I can certainly try it out though.
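
The difference between the two kinds of flag can be sketched with a toy model. The struct and function names below are hypothetical stand-ins, not the amdgpu or radv structures, and the model only captures who is affected by the opt-out, not how fences are actually handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object shared between processes. */
struct model_bo {
    bool explicit_sync; /* BO-wide flag, applies to every user of the BO */
};

/* Hypothetical stand-in for one command submission. */
struct model_submission {
    bool no_implicit_sync; /* opt-out private to the submitter */
};

/* BO-wide model: the flag on the buffer decides for every submitter. */
static bool bo_flag_uses_implicit_sync(const struct model_bo *bo)
{
    assert(bo != NULL);
    return !bo->explicit_sync;
}

/* Submission model: each submitter decides for itself. */
static bool submit_flag_uses_implicit_sync(const struct model_submission *s)
{
    assert(s != NULL);
    return !s->no_implicit_sync;
}
```

In the BO-wide model, setting the flag on a WSI buffer gives the same answer for the app and for the X server. In the submission model, the app can opt out while the X server's submissions on the same buffer keep using implicit sync.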

>
> Are you bored enough to type this up for radv? I'll give Jason's kernel
> stuff another review meanwhile.
> -Daniel
>
> > > e->bo_va = amdgpu_vm_bo_find(vm, bo);
> > > }
> > > --
> > > 2.31.0
> > >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 01/11] drm/amdgpu: Comply with implicit fencing rules

2021-05-21 Thread Bas Nieuwenhuizen

the wrong context with a synchronous dma_fence_wait. See
>   submit_fence_sync() leading to msm_gem_sync_object(). Investing into
>   a scheduler might be a good idea.
>
> - all the remaining drivers are ttm based, where I hope they do
>   appropriately obey implicit fences already. I didn't do the full
>   audit there because a) not following the contract would confuse ttm
>   quite well and b) reading non-standard scheduler and submit code
>   which isn't based on drm/scheduler is a pain.
>
> Onwards to the display side.
>
> - Any driver using the drm_gem_plane_helper_prepare_fb() helper will
>   handle this correctly. Overwhelmingly most drivers get this right,
>   except a few totally don't. I'll follow up with a patch to make this
>   the default and avoid a bunch of bugs.
>
> - I didn't audit the ttm drivers, but given that dma_resv started
>   there I hope they get this right.
>
> In conclusion this IS the contract, both as documented and
> overwhelmingly implemented, specifically as implemented by all render
> drivers except amdgpu.
>
> Amdgpu tried to fix this already in
>
> commit 049aca4363d8af87cab8d53de5401602db3b
> Author: Christian König 
> Date:   Wed Sep 19 16:54:35 2018 +0200
>
> drm/amdgpu: fix using shared fence for exported BOs v2
>
> but this fix falls short on a number of areas:
>
> - It's racy, by the time the buffer is shared it might be too late. To
>   make sure there's definitely never a problem we need to set the
>   fences correctly for any buffer that's potentially exportable.
>
> - It's breaking uapi, dma-buf fds support poll() and differentiate
>   between, which was introduced in
>
> commit 9b495a5887994a6d74d5c261d012083a92b94738
> Author: Maarten Lankhorst 
> Date:   Tue Jul 1 12:57:43 2014 +0200
>
> dma-buf: add poll support, v3
>
> - Christian König wants to nack new uapi building further on this
>   dma_resv contract because it breaks amdgpu, quoting
>
>   "Yeah, and that is exactly the reason why I will NAK this uAPI change.
>
>   "This doesn't works for amdgpu at all for the reasons outlined above."
>
>   
> https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b72...@gmail.com/
>
>   Rejecting new development because your own driver is broken and
>   violates established cross driver contracts and uapi is really not
>   how upstream works.
>
> Now this patch will have a severe performance impact on anything that
> runs on multiple engines. So we can't just merge it outright, but need
> a bit of a plan:
>
> - amdgpu needs a proper uapi for handling implicit fencing. The funny
>   thing is that to do it correctly, implicit fencing must be treated
>   as a very strange IPC mechanism for transporting fences, where both
>   setting the fence and dependency intercepts must be handled
>   explicitly. Current best practice is a per-bo flag to indicate
>   writes, and a per-bo flag to skip implicit fencing in the CS
>   ioctl as a new chunk.
>
> - Since amdgpu has been shipping with broken behaviour we need an
>   opt-out flag from the butchered implicit fencing model to enable the
>   proper explicit implicit fencing model.
>
> - for kernel memory fences due to bo moves at least the i915 idea is
>   to use ttm_bo->moving. amdgpu probably needs the same.
>
> - since the current p2p dma-buf interface assumes the kernel memory
>   fence is in the exclusive dma_resv fence slot we need to add a new
>   fence slot for kernel fences, which must never be ignored. Since
>   currently only amdgpu supports this there's no real problem here
>   yet, until amdgpu gains a NO_IMPLICIT CS flag.
>
> - New userspace needs to ship in enough desktop distros so that users
>   won't notice the perf impact. I think we can ignore LTS distros who
>   upgrade their kernels but not their mesa3d snapshot.
>
> - Then when this is all in place we can merge this patch here.
>
> What is not a solution to this problem here is trying to make the
> dma_resv rules in the kernel more clever. The fundamental issue here
> is that the amdgpu CS uapi is the least expressive one across all
> drivers (only equalled by panfrost, which has an actual excuse) by not
> allowing any userspace control over how implicit sync is conducted.
>
> Until this is fixed it's completely pointless to make the kernel more
> clever to improve amdgpu, because all we're doing is papering over
> this uapi design issue. amdgpu needs to attain the status quo
> established by other drivers first, once that's achieved we can tackle
> the remaining issues in a consistent way across drivers.
>
> Cc: mesa-dev@lists.freedesktop.org
> Cc: Bas Nieuwenhuizen 
> Cc: Dave Airlie 
> C

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-05-03 Thread Bas Nieuwenhuizen

On Mon, May 3, 2021 at 5:00 PM Jason Ekstrand  wrote:
>
> Sorry for the top-post but there's no good thing to reply to here...
>
> One of the things pointed out to me recently by Daniel Vetter that I
> didn't fully understand before is that dma_buf has a very subtle
> second requirement beyond finite time completion:  Nothing required
> for signaling a dma-fence can allocate memory.  Why?  Because the act
> of allocating memory may wait on your dma-fence.  This, as it turns
> out, is a massively more strict requirement than finite time
> completion and, I think, throws out all of the proposals we have so
> far.
>
> Take, for instance, Marek's proposal for userspace involvement with
> dma-fence by asking the kernel for a next serial and the kernel
> trusting userspace to signal it.  That doesn't work at all if
> allocating memory to trigger a dma-fence can blow up.  There's simply
> no way for the kernel to trust userspace to not do ANYTHING which
> might allocate memory.  I don't even think there's a way userspace can
> trust itself there.  It also blows up my plan of moving the fences to
> transition boundaries.
>
> Not sure where that leaves us.

Honestly, the more I look at things, the more I think userspace-signalable
fences with a timeout sound like a valid solution for these issues,
especially since (as has been mentioned countless times in this email
thread) userspace already has a lot of ways to cause timeouts and/or
GPU hangs through GPU work.

Adding a timeout on the signaling side of a dma_fence would ensure:

- The dma_fence signals in finite time
- If the timeout case does not allocate memory then memory allocation
  is not a blocker for signaling.

Of course you lose the full dependency graph and we need to make sure
garbage collection of fences works correctly when we have cycles.
However, the latter sounds very doable and the former sounds like it is
to some extent inevitable.
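
To make the timeout idea a bit more concrete, here is a toy Python model. It is only a sketch: the class and method names are made up for illustration, it uses a plain integer clock, and it is not the kernel's dma_fence API. The point it tries to show is that the fence becomes signaled either because userspace signals it or because the deadline passes, and that the timeout path only flips flags that already exist (i.e. it needs no new allocation).

```python
class TimeoutFence:
    """Toy model of a userspace-signalable fence with a signaling deadline.

    Hypothetical names; this is not kernel code.
    """

    def __init__(self, deadline):
        # All state is set up at creation time, so neither signaling path
        # needs to create anything new later on.
        self.deadline = deadline
        self.signaled = False
        self.timed_out = False

    def signal(self):
        """Called by (untrusted) userspace when its work is done."""
        self.signaled = True

    def poll(self, now):
        """Return True once the fence has signaled.

        If userspace never signals, the fence signals itself when the
        deadline is reached, which bounds the completion time.
        """
        if not self.signaled and now >= self.deadline:
            self.signaled = True
            self.timed_out = True
        return self.signaled


# Well-behaved userspace: signals before the deadline.
good = TimeoutFence(deadline=10)
good.signal()

# Misbehaving userspace: never signals, so the timeout kicks in.
stuck = TimeoutFence(deadline=10)
```

A waiter would treat `timed_out` the same way it treats a GPU hang today: the fence is done, but the contents it was guarding may be garbage.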

I feel like I'm missing some requirement here, given that we
immediately went to much more complicated things, but I can't find it.
Thoughts?

- Bas
>
> --Jason
>
> On Mon, May 3, 2021 at 9:42 AM Alex Deucher  wrote:
> >
> > On Sat, May 1, 2021 at 6:27 PM Marek Olšák  wrote:
> > >
> > > On Wed, Apr 28, 2021 at 5:07 AM Michel Dänzer  wrote:
> > >>
> > >> On 2021-04-28 8:59 a.m., Christian König wrote:
> > >> > Hi Dave,
> > >> >
> > >> > Am 27.04.21 um 21:23 schrieb Marek Olšák:
> > >> >> Supporting interop with any device is always possible. It depends on 
> > >> >> which drivers we need to interoperate with and update them. We've 
> > >> >> already found the path forward for amdgpu. We just need to find out 
> > >> >> how many other drivers need to be updated and evaluate the 
> > >> >> cost/benefit aspect.
> > >> >>
> > >> >> Marek
> > >> >>
> > >> >> On Tue, Apr 27, 2021 at 2:38 PM Dave Airlie wrote:
> > >> >>
> > >> >> On Tue, 27 Apr 2021 at 22:06, Christian König wrote:
> > >> >> >
> > >> >> > Correct, we wouldn't have synchronization between devices with
> > >> >> > and without user queues any more.
> > >> >> >
> > >> >> > That could only be a problem for A+I Laptops.
> > >> >>
> > >> >> Since I think you mentioned you'd only be enabling this on newer
> > >> >> chipsets, won't it be a problem for A+A where one A is a generation
> > >> >> behind the other?
> > >> >>
> > >> >
> > >> > Crap, that is a good point as well.
> > >> >
> > >> >>
> > >> >> I'm not really liking where this is going btw, seems like an ill
> > >> >> thought out concept, if AMD is really going down the road of designing
> > >> >> hw that is currently Linux incompatible, you are going to have to
> > >> >> accept a big part of the burden in bringing this support in to more
> > >> >> than just amd drivers for upcoming generations of gpu.
> > >> >>
> > >> >
> > >> > Well we don't really like that either, but we have no other option as 
> > >> > far as I can see.
> > >>
> > >> I don't really understand what "future hw may remove support for kernel 
> > >> queues" means exactly. While the per-context queues can be mapped to 
> > >> userspace directly, they don't *have* to be, do they? I.e. the kernel 
> > >> driver should be able to either intercept userspace access to the 
> > >> queues, or in the worst case do it all itself, and provide the existing 
> > >> synchronization semantics as needed?
> > >>
> > >> Surely there are resource limits for the per-context queues, so the 
> > >> kernel driver needs to do some kind of virtualization / multi-plexing 
> > >> anyway, or we'll get sad user faces when there's no queue available for 
> > >> .
> > >>
> > >> I'm probably missing something though, awaiting enlightenment. :)
> > >
> > >
> > > The hw interface for userspace is that the ring buffer is mapped to the 
> > > process address space alongside a doorbell aperture

Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

2021-04-20 Thread Bas Nieuwenhuizen

On Tue, Apr 20, 2021 at 8:16 PM Daniel Stone  wrote:

> On Tue, 20 Apr 2021 at 19:00, Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
>> Am 20.04.21 um 19:44 schrieb Daniel Stone:
>>
>> But winsys is something _completely_ different. Yes, you're using the GPU
>> to do things with buffers A, B, and C to produce buffer Z. Yes, you're
>> using vkQueuePresentKHR to schedule that work. Yes, Mutter's composition
>> job might depend on a Chromium composition job which depends on GTA's
>> render job which depends on GTA's compute job which might take a year to
>> complete. Mutter's composition job needs to complete in 'reasonable'
>> (again, FSVO) time, no matter what. The two are compatible.
>>
>> How? Don't lump them together. Isolate them aggressively, and
>> _predictably_ in a way that you can reason about.
>>
>> What clients do in their own process space is their own business. Games
>> can deadlock themselves if they get wait-before-signal wrong. Compute jobs
>> can run for a year. Their problem. Winsys is not that, because you're
>> crossing every isolation boundary possible. Process, user, container, VM -
>> every kind of privilege boundary. Thus far, dma_fence has protected us from
>> the most egregious abuses by guaranteeing bounded-time completion; it also
>> acts as a sequencing primitive, but from the perspective of a winsys person
>> that's of secondary importance, which is probably one of the bigger
>> disconnects between winsys people and GPU driver people.
>>
>>
>> Finally somebody who understands me :)
>>
>> Well the question is then how do we get winsys and your own process space
>> together then?
>>
>
> It's a jarring transition. If you take a very narrow view and say 'it's
> all GPU work in shared buffers so it should all work the same', then
> client<->winsys looks the same as client<->client gbuffer. But this is a
> trap.
>

I think this is where we have a serious gap in what a winsys
or a compositor is. Like if you have only a single Wayland server running
on a physical machine this is easy. But add a VR compositor, an
intermediate compositor (say gamescope), Xwayland and some containers/VMs,
some video capture (or, gasp, a browser that doubles as compositor) and
this story gets seriously complicated. Like who are you protecting from
whom? At what point is something client<->winsys vs. client<->client?



> Just because you can mmap() a file on an NFS server in New Zealand doesn't
> mean that you should have the same expectations of memory access to that
> file as you do of a pointer from alloca(). Even if the primitives look
> the same, you are crossing significant boundaries, and those do not come
> without a compromise and a penalty.
>
>
>> Anyway, one of the great things about winsys (there are some! trust me)
>> is we don't need to be as hopelessly general as for game engines, nor as
>> hyperoptimised. We place strict demands on our clients, and we literally
>> kill them every single time they get something wrong in a way that's
>> visible to us. Our demands on the GPU are so embarrassingly simple that you
>> can run every modern desktop environment on GPUs which don't have unified
>> shaders. And on certain platforms who don't share tiling formats between
>> texture/render-target/scanout ... and it all still runs fast enough that
>> people don't complain.
>>
>>
>> Ignoring everything below since that is the display pipeline I'm not
>> really interested in. My concern is how to get the buffer from the client
>> to the server without allowing the client to get the server into trouble?
>>
>> My thinking is still to use timeouts to acquire texture locks. E.g. when
>> the compositor needs to access texture it grabs a lock and if that lock
>> isn't available in less than 20ms whoever is holding it is killed hard and
>> the lock given to the compositor.
>>
>> It's perfectly fine if a process has a hung queue, but if it tries to
>> send buffers which should be filled by that queue to the compositor it just
>> gets a corrupted window content.
>>
>
> Kill the client hard. If the compositor has speculatively queued sampling
> against rendering which never completed, let it access garbage. You'll have
> one frame of garbage (outdated content, all black, random pattern; the
> failure mode is equally imperfect, because there is no perfect answer),
> then the compositor will notice the client has disappeared and remove all
> its resources.
>
> It's not possible to completely prevent this situation if the compositor
> wants to speculatively pipeline work, only ameliorate it. From a
> system-global point of view, just expose the situation and let it bubble
> up. Watch the number of fences which failed to retire in time, and destroy
> the context if there are enough of them (maybe 1, maybe 100). Watch the
> number of contexts of a file description that get forcibly destroyed, and destroy
> the file description if there are enough of them. Watch the number of
> descriptions which

Re: [Mesa-dev] AMD Vulkan driver entry point

2020-08-15 Thread Bas Nieuwenhuizen

On Thu, Aug 13, 2020 at 6:57 PM vivek pandya  wrote:
>
> Hello,
>
> I found the following function:
>
> src/amd/vulkan/radv_pipeline.c
> 4998:VkResult radv_CreateGraphicsPipelines(

In src/amd/vulkan/radv_entrypoints_gen.py we generate a bunch of code
that implements the vkGet*ProcAddr lookup, and that generated code
will end up referencing radv_CreateGraphicsPipelines.
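
As a rough illustration of why there is no direct caller, here is a small Python sketch of a name-based dispatch table. It is only a model: the real generated code is C, and the helper names below (`build_dispatch_table`, `get_proc_addr`, `DRIVER_FUNCS`) are made up for this example. The only assumption taken from the driver is the naming pattern, where the API name `vkFoo` maps to the driver function `radv_Foo`.

```python
def radv_CreateGraphicsPipelines(*args):
    """Stand-in for the driver function; nothing below calls it by name."""
    return "VK_SUCCESS"


def radv_DestroyPipeline(*args):
    """Second stand-in so the table has more than one entry."""
    return None


# Hypothetical registry of driver functions, keyed by their own names.
DRIVER_FUNCS = {
    func.__name__: func
    for func in (radv_CreateGraphicsPipelines, radv_DestroyPipeline)
}


def build_dispatch_table(api_names, driver_funcs, prefix="radv_"):
    """Map API names (vkFoo) to driver functions (radv_Foo) by name."""
    table = {}
    for api_name in api_names:
        impl = driver_funcs.get(prefix + api_name[len("vk"):])
        if impl is not None:
            table[api_name] = impl
    return table


TABLE = build_dispatch_table(
    ["vkCreateGraphicsPipelines", "vkDestroyPipeline", "vkNotImplemented"],
    DRIVER_FUNCS,
)


def get_proc_addr(name):
    """Model of the vkGet*ProcAddr lookup: return the function or None."""
    return TABLE.get(name)
```

The application (or loader) asks for "vkCreateGraphicsPipelines" by string and then calls through the returned pointer, so a search for callers of the `radv_` symbol only finds the generated table.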
>
> but I could not find any caller to this function. Can someone please explain 
> how this works?
>
> Thanks,
> Vivek
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Remove classic drivers or fork src/mesa for gallium?

2020-03-29 Thread Bas Nieuwenhuizen

On Sun, Mar 29, 2020 at 11:36 PM Eric Engestrom  wrote:
>
>
>
> On 2020-03-29 at 20:58, Jason Ekstrand  wrote:
> > On Sun, Mar 29, 2020 at 11:45 AM Kristian Høgsberg wrote:
> > >
> > > As for loading, doesn't glvnd solve that?
> >
> > Not really.  There are still problems if you have HW drivers from both
> > repos on the same system and someone has to decide which one to use.
> > We would either have to come up with a good solution to that problem
> > or we would have to delete/disable all of the drivers still in master
> > in the LTS branch.  In any case, there are real problems to solve
> > there.
>
> That's a simple packaging issue, and IMO it's ok to just say in the
> announcement email "this 'legacy drivers' branch also contains old versions
> of the new drivers. If you ship both these and a modern version of Mesa,
> make sure not to build the same drivers from both trees".
>
> Packagers will then pick the right `-D{dri,gallium,vulkan}-drivers=` lists
> to avoid collisions on their distros.

Wouldn't this be much safer anyway with a small patch to remove those
"new" drivers from the meson options list?


Re: [Mesa-dev] [ANNOUNCE] Mesa 20.0 branchpoint planned for 2020/01/29, Milestone opened

2020-01-29 Thread Bas Nieuwenhuizen

On Tue, Jan 28, 2020 at 8:46 PM Dylan Baker  wrote:
>
> Quoting Dylan Baker (2020-01-22 10:27:05)
> > Hi list, due to some last minute changes in plan I'll be managing the 20.0
> > release. The release calendar has been updated, but the gitlab milestone
> > wasn't opened. That has been corrected, and is here
> > https://gitlab.freedesktop.org/mesa/mesa/-/milestones/9, please add any
> > issues or MRs you would like to land before the branchpoint to the milestone.
> >
> > Thanks,
> > Dylan
> >
>
> Hi list,
>
> There are still a fair number of issues and MRs opened for the 20.0 branch
> point, should we postpone the branch point?

IMO we should primarily look at what is needed to be ready in time for
the Spring 2020 distro releases* in this release cycle. It doesn't
make sense to add a few more features if we're effectively postponing
improvements (most of which should already be committed!) getting into
the hands of users.

*: I do not know what the timelines for these are.


>
> Dylan

Re: [Mesa-dev] -fno-common build failures (default from upcoming gcc release 10)

2020-01-21 Thread Bas Nieuwenhuizen

I think this ended up spawning a bunch of work in
https://gitlab.freedesktop.org/mesa/mesa/issues/2385

On Mon, Jan 20, 2020 at 3:41 PM Stefan Dirsch  wrote:
>
> Hi
>
> Starting from the upcoming GCC release 10, the default of -fcommon option will
> change to -fno-common. Due to this we're going to see a lot of build failures
> in Mesa drivers like
>
> multiple definition of `syncobj_handle'; 
> src/amd/vulkan/9198681@@vulkan_radeon@sha/meson-generated_.._radv_entrypoints.c.o
>  (symbol from plugin):(.text+0x0): first defined here
> [  213s]
> /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld:
> src/amd/vulkan/9198681@@vulkan_radeon@sha/radv_wsi_wayland.c.o (symbol from
> plugin): in function `radv_GetPhysicalDeviceWaylandPresentationSupportKHR':
>
> I'm wondering if there is already anybody working on fixing these issues or is
> it recommended to work around it by setting -fcommon manually somehow? How can
> the latter be done when using meson/ninja as build tool?
>
> Thanks,
> Stefan
>
> Public Key available
> --
> Stefan Dirsch (Res. & Dev.)   SUSE Software Solutions Germany GmbH
> Tel: 0911-740 53 0            Maxfeldstraße 5
> FAX: 0911-740 53 479          D-90409 Nürnberg
> http://www.suse.de            Germany
> 
> (HRB 36809, AG Nürnberg) Geschäftsführer: Felix Imendörffer
> 

Re: [Mesa-dev] Merge bot ("Marge") enabled

2019-12-17 Thread Bas Nieuwenhuizen

On Tue, Dec 17, 2019 at 11:09 PM Marek Olšák  wrote:
>
> Hi Eric,
>
> Does it mean people no longer need push access, because Marge can merge 
> anything?
>
> So any random person can create a merge request and immediately assign it to 
> Marge to merge it?

Per https://docs.gitlab.com/ee/user/permissions.html, only people with
Developer access (which is pretty much what we also use for push
access) can assign merge requests.
>
> Marek
>
> On Fri, Dec 13, 2019 at 4:35 PM Eric Anholt  wrote:
>>
>> I finally got back around to experimenting with the gitlab merge bot,
>> and it turns out that the day I spent a few weeks back I had actually
>> given up 5 minutes before the finish line.
>>
>> Marge is now enabled for mesa/mesa, piglit, and parallel-deqp-runner.
>> How you interact with marge:
>>
>> - Collect your reviews
>> - Put reviewed-by tags in your commits
>> - When you would have clicked "Merge when pipeline succeeds" (or,
>> worse, rebase and then merge when pipeline succeeds), instead edit the
>> assignee of the MR (top right panel of the UI) and assign to Marge Bot
>> - Marge will eventually take your MR, rebase it and let the pipeline run.
>> - If the pipeline passes, Marge will merge it
>> - If the pipeline fails, Marge will note it in the logs and unassign
>> herself (so your next push with a "fix" won't get auto-merged until
>> you decide to again).
>>
>> In the commit logs of the commits that Marge rebased (they'll always
>> be rebased), you'll get:
>>
>> Part-of: 
>> 
>>
>> In the final commit of that MR, you'll get:
>>
>> Tested-by: Marge Bot
>> 
>>
>> I feel like this is a major improvement to our workflow, in terms of
>> linking commits directly to their discussions without indirecting
>> through google.
>>
>> Note that one Marge instance will only process one MR at a time, so we
>> could end up backed up.  There's a mode that will form merge trains,
>> but I don't understand that mode enough yet to trust it. I think for
>> Mesa at this point this is going to be fine, as we should still be
>> able to push tens of MRs through per day.  As we scale up, we may find
>> that we need a separate Marge for piglit and other projects, which I
>> should be able to set up reasonably at this point.
>>
>> Once we settle in with Marge and learn to trust our robot overlords,
>> I'll update the contributor docs to direct people to Marge instead of
>> the "merge when pipeline succeeds" button.  I'm also hoping that once
>> our commit logs are full of links to discussions, we can drop the
>> mandatory squashing of r-b tags into commit messages and thus make our
>> process even easier for new contributors.

Re: [Mesa-dev] Running the CI pipeline on personal Mesa branches

2019-12-06 Thread Bas Nieuwenhuizen

On Fri, Dec 6, 2019 at 10:49 AM Michel Dänzer  wrote:
>
>
> I just merged
> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2794 , which
> affects people who want to run the CI pipeline on personal Mesa branches:
>
> Pushing changes to a personal branch now always creates a pipeline, but
> none of the jobs in it run by default. (There are no longer any special
> branch names affecting this, because creating MRs from such special
> branches resulted in duplicate CI job runs)
>
> The container stage jobs can be triggered manually from the GitLab UI
> (and maybe also via the GitLab API, for people who'd like to automate
> this? I haven't looked into that). The build/test stage jobs run
> automatically once all their dependencies have passed.
>
> As an example, in order to run one of the "piglit-*" test jobs, one has
> to manually trigger the "x86_build" and "x86_test" jobs.
>
> The pipelines created for merge requests still run all jobs by default
> as before.
>
>
> The main motivation for these changes is to avoid wasting CI runner
> resources. In that same spirit, please also cancel any unneeded
> build/test jobs. This can be done already before those jobs start
> running, e.g. while the container stage jobs run.

No complaint about not running the pipelines by default in personal
repositories, but expecting people to cancel automatically spawned CI
jobs as normal part of their workflow seems incredibly fiddly and
fragile to me. Are we *that* constrained?


>
>
> --
> Earthling Michel Dänzer   |   https://redhat.com
> Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] Declaration of _CmdBeginTransformFeedbackEXT

2019-10-03 Thread Bas Nieuwenhuizen

{$BUILD_DIR?}/src/amd/vulkan/radv_entrypoints.h
{$BUILD_DIR?}/src/intel/vulkan/anv_entrypoints.h

generated using the entrypoints.py scripts in the corresponding source
directories.
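
To give an idea of what such a generator does, here is a heavily simplified Python sketch that turns a list of entrypoints into C prototypes. This is an assumption-laden illustration, not the actual Mesa script: the real scripts derive the list from the Vulkan XML registry and use templates, while here the entrypoint list and the `emit_declarations` helper are made up and written out by hand.

```python
def emit_declarations(prefix, entrypoints):
    """Return C prototypes named <prefix>_<Entrypoint>, one per line.

    `entrypoints` is a list of (return_type, name, params) tuples, where
    `name` is the Vulkan command name without the leading "vk".
    """
    lines = []
    for ret_type, name, params in entrypoints:
        lines.append(f"{ret_type} {prefix}_{name}({', '.join(params)});")
    return "\n".join(lines)


# Hand-written stand-in for what would normally come from the XML registry.
ENTRYPOINTS = [
    ("void", "CmdBeginTransformFeedbackEXT", [
        "VkCommandBuffer commandBuffer",
        "uint32_t firstCounterBuffer",
        "uint32_t counterBufferCount",
        "const VkBuffer *pCounterBuffers",
        "const VkDeviceSize *pCounterBufferOffsets",
    ]),
]

RADV_HEADER = emit_declarations("radv", ENTRYPOINTS)
ANV_HEADER = emit_declarations("anv", ENTRYPOINTS)
```

Because the prototypes only exist in the build directory after the generator has run, a `git grep` in the source tree finds the definitions but never the declarations.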

On Thu, Oct 3, 2019 at 5:46 PM Andreas Bergmeier  wrote:
>
> First off, thanks for git grep. Did not know that yet.
> Second - both hits are the definition. I am searching for the declaration.
>
> On Thu, 3 Oct 2019 at 16:38, Ilia Mirkin  wrote:
>>
>> $ git grep CmdBeginTransformFeedback
>> src/amd/vulkan/radv_cmd_buffer.c:void radv_CmdBeginTransformFeedbackEXT(
>> src/intel/vulkan/genX_cmd_buffer.c:void genX(CmdBeginTransformFeedbackEXT)(
>>
>> Not *that* hard to search for...
>>
>> On Thu, Oct 3, 2019 at 10:35 AM  wrote:
>> >
>> >
>> >
>> > Sorry for being a bit thick but it seems like I cannot find where 
>> > *_CmdBeginTransformFeedbackEXT functions are getting declared. I would 
>> > assume, that it is somewhere in a XML to header stage, but could not yet 
>> > figure out, which header.
>> > Probably using some macro magic, that makes the code non-searchable :(
>> > Could anyone please point me in the right direction?

[Mesa-dev] gitlab issue migration, labels & triage

2019-09-18 Thread Bas Nieuwenhuizen

Hi all,

One thing I realized during the migration is that end users generally
cannot edit labels on an issue[1] and there is no component selection
anymore.

So we end up with a bunch of changes:

1) Bugs come in without labels
2) People are not consistently fixing up labels for issues
3) Labels are not sent in email updates, prompting some IRC talk of
adding the component in the titles already to make email filters work.

While having email filtering work completely would be awesome, I think
my biggest issue here is "ownership". When a bug came into the radv
bugzilla component I took some ownership of managing it, e.g. moving
bugs with a wrong component, taking an initial stab at a fix, etc.

My fear would be that we don't consistently triage new bugs and that a
number of them don't end up on anyone's radar. Contrary to MRs, where
I expect most of them to be made by long-time developers, bugs tend
to be filed by people who are not really affiliated with mesa, so I
feel the effect is probably stronger.

Has anybody put any thought into how to best manage things here? e.g.
some process, or do we want some form of automatic labeling, or is my
concern overblown?
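
For the automatic labeling option, something as simple as keyword matching on the issue title might already catch a good share of reports. The following Python sketch is only meant to show the shape of it: the label names and keywords are made up, and it deliberately does not talk to GitLab at all.

```python
import re

# Hypothetical label -> keyword mapping; a real one would need curation.
LABEL_KEYWORDS = {
    "radv": ["radv", "amd vulkan"],
    "anv": ["anv", "intel vulkan"],
    "radeonsi": ["radeonsi"],
}


def suggest_labels(title):
    """Return a sorted list of labels whose keywords appear in the title.

    Keywords are matched case-insensitively on word boundaries, so "anv"
    does not match inside an unrelated word such as "canvas".
    """
    lowered = title.lower()
    labels = []
    for label, keywords in LABEL_KEYWORDS.items():
        if any(re.search(r"\b" + re.escape(word) + r"\b", lowered)
               for word in keywords):
            labels.append(label)
    return sorted(labels)
```

Issues for which this returns an empty list would still need a human to look at them, so this would reduce the triage load rather than replace it.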

- Bas

[1] https://docs.gitlab.com/ee/user/permissions.html

Re: [Mesa-dev] [PATCH v2] ac: fix exclusive scans on GFX8-GFX9

2019-08-21 Thread Bas Nieuwenhuizen

Reviewed-by: Bas Nieuwenhuizen 

On Wed, Aug 21, 2019 at 4:26 PM Samuel Pitoiset
 wrote:
>
> This fixes a regression introduced with scan operations
> on GFX10. Note that some subgroups CTS still fail on GFX10 but
> I assume it's a different issue.
>
> This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.
>
> v2: - move the logic back to ac_build_scan()
>
> Fixes: 227c29a80de "amd/common/gfx10: implement scan & reduce operations"
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/common/ac_llvm_build.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
> index 05871f5ea98..5abae00d8f6 100644
> --- a/src/amd/common/ac_llvm_build.c
> +++ b/src/amd/common/ac_llvm_build.c
> @@ -4221,10 +4221,9 @@ ac_build_scan(struct ac_llvm_context *ctx, nir_op op, 
> LLVMValueRef src, LLVMValu
> if (ctx->chip_class >= GFX10) {
> result = inclusive ? src : identity;
> } else {
> -   if (inclusive)
> -   result = src;
> -   else
> -   result = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 
> 0xf, 0xf, false);
> +   if (!inclusive)
> +   src = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 
> 0xf, 0xf, false);
> +   result = src;
> }
> if (maxprefix <= 1)
> return result;
> --
> 2.22.1
>

Re: [Mesa-dev] [PATCH] ac: fix exclusive scans on GFX8-GFX9

2019-08-21 Thread Bas Nieuwenhuizen

On Wed, Aug 21, 2019 at 3:45 PM Samuel Pitoiset
 wrote:
>
> This fixes a regression introduced with scan operations
> on GFX10. Note that some subgroups CTS still fail on GFX10 but
> I assume it's a different issue.
>
> This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.
>
> Fixes: 227c29a80de "amd/common/gfx10: implement scan & reduce operations"
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/common/ac_llvm_build.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
> index 05871f5ea98..d72eaa2db46 100644
> --- a/src/amd/common/ac_llvm_build.c
> +++ b/src/amd/common/ac_llvm_build.c
> @@ -4221,10 +4221,7 @@ ac_build_scan(struct ac_llvm_context *ctx, nir_op op, 
> LLVMValueRef src, LLVMValu
> if (ctx->chip_class >= GFX10) {
> result = inclusive ? src : identity;
> } else {
> -   if (inclusive)
> -   result = src;
> -   else
> -   result = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 
> 0xf, 0xf, false);
> +   result = src;
> }
> if (maxprefix <= 1)
> return result;
> @@ -4333,6 +4330,8 @@ ac_build_exclusive_scan(struct ac_llvm_context *ctx, 
> LLVMValueRef src, nir_op op
> get_reduction_identity(ctx, op, 
> ac_get_type_size(LLVMTypeOf(src)));
> result = LLVMBuildBitCast(ctx->builder, ac_build_set_inactive(ctx, 
> src, identity),
>   LLVMTypeOf(identity), "");
> +   if (ctx->chip_class <= GFX9)
> +   result = ac_build_dpp(ctx, identity, result, dpp_wf_sr1, 0xf, 
> 0xf, false);

Kinda annoying that we still do the inclusive/exclusive logic for
gfx10 inside ac_build_scan. Can we keep this inside the function by
using an intermediate src?
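
For reference, the relationship between the two scan flavours on GFX8-GFX9 can be modelled in a few lines of Python. This is only a sketch of the arithmetic, with made-up helper names, and it assumes that dpp_wf_sr1 behaves like a whole-wave shift right by one lane that fills lane 0 with the identity value: an exclusive scan is then just an inclusive scan of the shifted input.

```python
from itertools import accumulate
import operator


def shift_right_1(lanes, identity):
    """Model of a wave shift right by one lane; lane 0 gets the identity."""
    if not lanes:
        return []
    return [identity] + lanes[:-1]


def inclusive_scan(lanes, op):
    """Lane i receives op applied over lanes 0..i."""
    return list(accumulate(lanes, op))


def exclusive_scan(lanes, op, identity):
    """Lane i receives op applied over lanes 0..i-1 (identity for lane 0).

    The shift is applied to an intermediate source first, then the same
    inclusive scan is reused.
    """
    shifted_src = shift_right_1(lanes, identity)
    return inclusive_scan(shifted_src, op)
```

With the lanes holding 1, 2, 3, 4 and addition as the operation, the inclusive scan gives 1, 3, 6, 10 and the exclusive scan gives 0, 1, 3, 6.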

> result = ac_build_scan(ctx, op, result, identity, ctx->wave_size, 
> false);
>
> return ac_build_wwm(ctx, result);
> --
> 2.22.1
>

Re: [Mesa-dev] [PATCH] radv: implement VK_AMD_shader_core_properties2

2019-08-21 Thread Bas Nieuwenhuizen

r-b

On Wed, Aug 21, 2019 at 9:01 AM Samuel Pitoiset
 wrote:
>
> Trivial extension that matches PAL.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_device.c  | 9 +
>  src/amd/vulkan/radv_extensions.py | 1 +
>  2 files changed, 10 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index cc45ac95c08..5fde4577e4e 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -1300,6 +1300,15 @@ void radv_GetPhysicalDeviceProperties2(
> properties->vgprAllocationGranularity = 4;
> break;
> }
> +   case 
> VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_CORE_PROPERTIES_2_AMD: {
> +   VkPhysicalDeviceShaderCoreProperties2AMD *properties =
> +   (VkPhysicalDeviceShaderCoreProperties2AMD 
> *)ext;
> +
> +   properties->shaderCoreFeatures = 0;
> +   properties->activeComputeUnitCount =
> +   pdevice->rad_info.num_good_compute_units;
> +   break;
> +   }
> case 
> VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VERTEX_ATTRIBUTE_DIVISOR_PROPERTIES_EXT: {
> VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT 
> *properties =
> 
> (VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT *)ext;
> diff --git a/src/amd/vulkan/radv_extensions.py 
> b/src/amd/vulkan/radv_extensions.py
> index 3624970dd37..b28d74f5746 100644
> --- a/src/amd/vulkan/radv_extensions.py
> +++ b/src/amd/vulkan/radv_extensions.py
> @@ -143,6 +143,7 @@ EXTENSIONS = [
>  Extension('VK_AMD_rasterization_order',   1, 
> 'device->has_out_of_order_rast'),
>  Extension('VK_AMD_shader_ballot', 1, 
> 'device->use_shader_ballot'),
>  Extension('VK_AMD_shader_core_properties',1, True),
> +Extension('VK_AMD_shader_core_properties2',   1, True),
>  Extension('VK_AMD_shader_info',   1, True),
>  Extension('VK_AMD_shader_trinary_minmax', 1, True),
>  Extension('VK_GOOGLE_decorate_string',1, True),
> --
> 2.22.1
>

Re: [Mesa-dev] [PATCH] radv: allow to enable VK_AMD_shader_ballot only on GFX8+

2019-08-21 Thread Bas Nieuwenhuizen

r-b

On Wed, Aug 21, 2019 at 8:34 AM Samuel Pitoiset
 wrote:
>
> Scans aren't implemented on SI/CIK.
>
> Cc: 19.2 
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_device.c | 3 ++-
>  src/amd/vulkan/radv_shader.c | 2 +-
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index cc45ac95c08..4aafe6e78aa 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -383,7 +383,8 @@ radv_physical_device_init(struct radv_physical_device 
> *device,
>   device->rad_info.family == 
> CHIP_RENOIR ||
>   device->rad_info.chip_class >= 
> GFX10;
>
> -   device->use_shader_ballot = device->instance->perftest_flags & 
> RADV_PERFTEST_SHADER_BALLOT;
> +   device->use_shader_ballot = device->rad_info.chip_class >= GFX8 &&
> +   device->instance->perftest_flags & 
> RADV_PERFTEST_SHADER_BALLOT;
>
> /* Determine the number of threads per wave for all stages. */
> device->cs_wave_size = 64;
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 1e6a9a950d8..f2a8ac8abe3 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -297,7 +297,7 @@ radv_shader_compile_to_nir(struct radv_device *device,
> .lower_ubo_ssbo_access_to_offsets = true,
> .caps = {
> .amd_gcn_shader = true,
> -   .amd_shader_ballot = 
> device->instance->perftest_flags & RADV_PERFTEST_SHADER_BALLOT,
> +   .amd_shader_ballot = 
> device->physical_device->use_shader_ballot,
> .amd_trinary_minmax = true,
> .derivative_group = true,
> .descriptor_array_dynamic_indexing = true,
> --
> 2.22.1
>

Re: [Mesa-dev] [PATCH 2/2] radv/gfx10: do not use NGG with NAVI14

2019-08-21 Thread Bas Nieuwenhuizen

r-b for both.

On Wed, Aug 21, 2019 at 10:51 AM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_pipeline.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index 64bd0d64401..c049a2844b8 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -2320,6 +2320,7 @@ radv_fill_shader_keys(struct radv_device *device,
> }
>
> if (device->physical_device->rad_info.chip_class >= GFX10 &&
> +   device->physical_device->rad_info.family != CHIP_NAVI14 &&
> !(device->instance->debug_flags & RADV_DEBUG_NO_NGG)) {
> if (nir[MESA_SHADER_TESS_CTRL]) {
> keys[MESA_SHADER_TESS_EVAL].vs_common_out.as_ngg = 
> true;
> --
> 2.22.1
>

Re: [Mesa-dev] [PATCH 2/2] radv/gfx10: hardcode some depth+stencil formats in the format table

2019-08-20 Thread Bas Nieuwenhuizen

r-b for both.

On Tue, Aug 20, 2019 at 3:19 PM Samuel Pitoiset
 wrote:
>
> The script doesn't handle them correctly and D16_UNORM_S8_UINT
> isn't supported by the hardware, mark it as invalid.
>
> This fixes warning when generating gfx10_format_table.h.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111393
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/gfx10_format_table.py | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/amd/vulkan/gfx10_format_table.py 
> b/src/amd/vulkan/gfx10_format_table.py
> index 81b0bed92aa..f55b302bf82 100644
> --- a/src/amd/vulkan/gfx10_format_table.py
> +++ b/src/amd/vulkan/gfx10_format_table.py
> @@ -66,6 +66,11 @@ HARDCODED = {
>  'VK_FORMAT_BC6H_SFLOAT_BLOCK': hardcoded_format('BC6_SFLOAT'),
>  'VK_FORMAT_BC7_UNORM_BLOCK': hardcoded_format('BC7_UNORM'),
>  'VK_FORMAT_BC7_SRGB_BLOCK': hardcoded_format('BC7_SRGB'),
> +
> +# DS
> +'VK_FORMAT_D16_UNORM_S8_UINT': hardcoded_format('INVALID'),
> +'VK_FORMAT_D24_UNORM_S8_UINT': hardcoded_format('8_24_UNORM'),
> +'VK_FORMAT_D32_SFLOAT_S8_UINT': hardcoded_format('X24_8_32_FLOAT'),
>  }
>
>
> --
> 2.22.1
>

Re: [Mesa-dev] [PATCH 2/2] radv: force enable VK_AMD_shader_ballot for Wolfenstein Youngblood

2019-08-20 Thread Bas Nieuwenhuizen

want to cc to 19.2?

r-b for both


On Tue, Aug 20, 2019 at 4:47 PM Samuel Pitoiset
 wrote:
>
> This gives a nice boost, +20% at this time on my Vega 56. Shader
> ballot should be enabled by default at some point but it reduces
> performance a bit (-6%) with Wolfenstein II. Enable it only for
> Youngblood at the moment, like what we did for Talos in the past.
>
> As a bonus, it gets rid of some minor artifacts that only
> happen when ballot is disabled, for some reason.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_device.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 49518d43218..c04f6a27e82 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -554,6 +554,14 @@ radv_handle_per_app_options(struct radv_instance 
> *instance,
>  */
> if (HAVE_LLVM < 0x900)
> instance->debug_flags |= RADV_DEBUG_NO_LOAD_STORE_OPT;
> +   } else if (!strcmp(name, "Wolfenstein: Youngblood")) {
> +   if (!(instance->debug_flags & RADV_DEBUG_NO_SHADER_BALLOT)) {
> +   /* Force enable VK_AMD_shader_ballot because it looks
> +* safe and it gives a nice boost (+20% on Vega 56 at
> +* this time).
> +*/
> +   instance->perftest_flags |= 
> RADV_PERFTEST_SHADER_BALLOT;
> +   }
> }
>  }
>
> --
> 2.22.1
>
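The per-application override in the quoted patch can be sketched in isolation as follows. This is only an illustration: the flag values and the helper name `apply_app_options` are hypothetical and do not match the real RADV enums; the point is that the user's opt-out flag is checked before the default is forced on.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag values; the real RADV enums differ. */
#define DEBUG_NO_SHADER_BALLOT (1u << 0)
#define PERFTEST_SHADER_BALLOT (1u << 1)

/* Returns the perftest flags after applying per-application defaults. */
static uint32_t
apply_app_options(const char *app_name, uint32_t debug_flags,
		  uint32_t perftest_flags)
{
	assert(app_name != NULL);

	/* Force the feature on for one title, unless the user opted out. */
	if (!strcmp(app_name, "Wolfenstein: Youngblood") &&
	    !(debug_flags & DEBUG_NO_SHADER_BALLOT))
		perftest_flags |= PERFTEST_SHADER_BALLOT;

	return perftest_flags;
}
```

Any other application name leaves the flags exactly as the user set them.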

Re: [Mesa-dev] [PATCH 2/2] radv/gfx10: do not emit PA_SC_TILE_STEERING_OVERRIDE twice

2019-08-20 Thread Bas Nieuwenhuizen

r-b for both

On Mon, Aug 19, 2019 at 2:57 PM Samuel Pitoiset
 wrote:
>
> CLEAR_STATE emits it for us.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/si_cmd_buffer.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
> index a5057fe25a2..68ec925f2b5 100644
> --- a/src/amd/vulkan/si_cmd_buffer.c
> +++ b/src/amd/vulkan/si_cmd_buffer.c
> @@ -366,8 +366,6 @@ si_emit_graphics(struct radv_physical_device 
> *physical_device,
> radeon_set_context_reg(cs, R_028C50_PA_SC_NGG_MODE_CNTL,
>S_028C50_MAX_DEALLOCS_IN_WAVE(512));
> radeon_set_context_reg(cs, 
> R_028C58_VGT_VERTEX_REUSE_BLOCK_CNTL, 14);
> -   radeon_set_context_reg(cs, 
> R_02835C_PA_SC_TILE_STEERING_OVERRIDE,
> -  
> physical_device->rad_info.pa_sc_tile_steering_override);
> radeon_set_context_reg(cs, R_02807C_DB_RMI_L2_CACHE_CONTROL,
>
> S_02807C_Z_WR_POLICY(V_02807C_CACHE_STREAM_WR) |
>
> S_02807C_S_WR_POLICY(V_02807C_CACHE_STREAM_WR) |
> --
> 2.22.1
>

Re: [Mesa-dev] [PATCH 1/2] radv: only account for tile_swizzle for color surfaces with DCC

2019-08-02 Thread Bas Nieuwenhuizen

r-b for both

On Thu, Aug 1, 2019 at 3:41 PM Samuel Pitoiset
 wrote:
>
> It's 0 for depth surfaces with TC compat HTILE enabled.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_image.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> index f3237dd5985..221b554e73e 100644
> --- a/src/amd/vulkan/radv_image.c
> +++ b/src/amd/vulkan/radv_image.c
> @@ -483,6 +483,8 @@ si_set_mutable_tex_desc_fields(struct radv_device *device,
> meta_va = gpu_address + image->dcc_offset;
> if (chip_class <= GFX8)
> meta_va += base_level_info->dcc_offset;
> +
> +   meta_va |= (uint32_t)plane->surface.tile_swizzle << 8;
> } else if (!is_storage_image &&
>radv_image_is_tc_compat_htile(image)) {
> meta_va = gpu_address + image->htile_offset;
> @@ -490,10 +492,8 @@ si_set_mutable_tex_desc_fields(struct radv_device 
> *device,
>
> if (meta_va) {
> state[6] |= S_008F28_COMPRESSION_EN(1);
> -   if (chip_class <= GFX9) {
> +   if (chip_class <= GFX9)
> state[7] = meta_va >> 8;
> -   state[7] |= plane->surface.tile_swizzle;
> -   }
> }
> }
>
> --
> 2.22.0
>
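A small sketch of the address packing touched by the quoted patch: the descriptor word holds the metadata address shifted right by 8, so a swizzle OR'd into bit 8 and up of the address ends up in the low bits of that word. The helper name `pack_meta_address` and its standalone parameters are hypothetical, and the 256-byte alignment check is an assumption made for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns the value that would be stored in state[7]
 * for a metadata address on GFX9 and older. */
static uint32_t
pack_meta_address(uint64_t meta_va, uint8_t tile_swizzle, int has_dcc)
{
	/* Assumed: metadata is 256-byte aligned, so the shift drops only zeros. */
	assert((meta_va & 0xff) == 0);

	/* After the patch, only DCC surfaces contribute a swizzle. */
	if (has_dcc)
		meta_va |= (uint64_t)tile_swizzle << 8;

	return (uint32_t)(meta_va >> 8);
}
```

For the TC-compat HTILE path the swizzle is simply never applied, which matches the commit message's statement that it is 0 there.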

Re: [Mesa-dev] [PATCH 1/2] radv: remove radv_get_image_cmask_info()

2019-08-02 Thread Bas Nieuwenhuizen

r-b for both

On Thu, Aug 1, 2019 at 5:56 PM Samuel Pitoiset
 wrote:
>
> It's unnecessary to duplicate fields in another struct.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_device.c |  4 ++--
>  src/amd/vulkan/radv_image.c  | 38 +---
>  src/amd/vulkan/radv_meta_clear.c | 11 +
>  src/amd/vulkan/radv_private.h| 13 ++-
>  4 files changed, 21 insertions(+), 45 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 29be192443a..9aa731a252c 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -4400,7 +4400,7 @@ radv_initialise_color_surface(struct radv_device 
> *device,
>
> cb->cb_color_pitch = S_028C64_TILE_MAX(pitch_tile_max);
> cb->cb_color_slice = S_028C68_TILE_MAX(slice_tile_max);
> -   cb->cb_color_cmask_slice = iview->image->cmask.slice_tile_max;
> +   cb->cb_color_cmask_slice = 
> surf->u.legacy.cmask_slice_tile_max;
>
> cb->cb_color_attrib |= 
> S_028C74_TILE_MODE_INDEX(tile_mode_index);
>
> @@ -4420,7 +4420,7 @@ radv_initialise_color_surface(struct radv_device 
> *device,
>
> /* CMASK variables */
> va = radv_buffer_get_va(iview->bo) + iview->image->offset;
> -   va += iview->image->cmask.offset;
> +   va += iview->image->cmask_offset;
> cb->cb_color_cmask = va >> 8;
>
> va = radv_buffer_get_va(iview->bo) + iview->image->offset;
> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> index 8ff93e4344c..aaaf15ec8dc 100644
> --- a/src/amd/vulkan/radv_image.c
> +++ b/src/amd/vulkan/radv_image.c
> @@ -939,7 +939,7 @@ si_make_texture_descriptor(struct radv_device *device,
>   
> S_008F24_META_RB_ALIGNED(image->planes[0].surface.u.gfx9.cmask.rb_aligned);
>
> if (radv_image_is_tc_compat_cmask(image)) {
> -   va = gpu_address + image->offset + 
> image->cmask.offset;
> +   va = gpu_address + image->offset + 
> image->cmask_offset;
>
> fmask_state[5] |= 
> S_008F24_META_DATA_ADDRESS(va >> 40);
> fmask_state[6] |= S_008F28_COMPRESSION_EN(1);
> @@ -952,7 +952,7 @@ si_make_texture_descriptor(struct radv_device *device,
> fmask_state[5] |= S_008F24_LAST_ARRAY(last_layer);
>
> if (radv_image_is_tc_compat_cmask(image)) {
> -   va = gpu_address + image->offset + 
> image->cmask.offset;
> +   va = gpu_address + image->offset + 
> image->cmask_offset;
>
> fmask_state[6] |= S_008F28_COMPRESSION_EN(1);
> fmask_state[7] |= va >> 8;
> @@ -1138,45 +1138,27 @@ radv_image_alloc_fmask(struct radv_device *device,
> image->alignment = MAX2(image->alignment, image->fmask.alignment);
>  }
>
> -static void
> -radv_image_get_cmask_info(struct radv_device *device,
> - struct radv_image *image,
> - struct radv_cmask_info *out)
> -{
> -   assert(image->plane_count == 1);
> -
> -   if (device->physical_device->rad_info.chip_class >= GFX9) {
> -   out->alignment = image->planes[0].surface.cmask_alignment;
> -   out->size = image->planes[0].surface.cmask_size;
> -   return;
> -   }
> -
> -   out->slice_tile_max = 
> image->planes[0].surface.u.legacy.cmask_slice_tile_max;
> -   out->alignment = image->planes[0].surface.cmask_alignment;
> -   out->slice_size = image->planes[0].surface.cmask_slice_size;
> -   out->size = image->planes[0].surface.cmask_size;
> -}
> -
>  static void
>  radv_image_alloc_cmask(struct radv_device *device,
>struct radv_image *image)
>  {
> +   unsigned cmask_alignment = image->planes[0].surface.cmask_alignment;
> +   unsigned cmask_size = image->planes[0].surface.cmask_size;
> uint32_t clear_value_size = 0;
> -   radv_image_get_cmask_info(device, image, &image->cmask);
>
> -   if (!image->cmask.size)
> +   if (!cmask_size)
> return;
>
> -   assert(image->cmask.alignment);
> +   assert(cmask_alignment);
>
> -   image->cmask.offset = align64(image->size, image->cmask.alignment);
> +   image->cmask_offset = align64(image->size, cmask_alignment);
> /* + 8 for storing the clear values */
> if (!image->clear_value_offset) {
> -   image->clear_value_offset = image->cmask.offset + 
> image->cmask.size;
> +   image->clear_value_offset = image->cmask_offset + cmask_size;
> clear_value_size = 8;
> }
> -   image->size = image->cmask.offset + image->cmask.size + 
> clear_value_size;
> -   image->alignment = MAX2(image->alignment,

Re: [Mesa-dev] [PATCH 4/4] radv/gfx10: use the correct target machine for Wave32

2019-08-01 Thread Bas Nieuwenhuizen

r-b for patch 1,2,4

On Thu, Aug 1, 2019 at 10:40 AM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_llvm_helper.cpp | 30 +
>  src/amd/vulkan/radv_shader.c|  3 ++-
>  src/amd/vulkan/radv_shader_helper.h |  3 ++-
>  3 files changed, 26 insertions(+), 10 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_llvm_helper.cpp 
> b/src/amd/vulkan/radv_llvm_helper.cpp
> index 2b14ddcf184..612548e4219 100644
> --- a/src/amd/vulkan/radv_llvm_helper.cpp
> +++ b/src/amd/vulkan/radv_llvm_helper.cpp
> @@ -28,8 +28,10 @@
>  class radv_llvm_per_thread_info {
>  public:
> radv_llvm_per_thread_info(enum radeon_family arg_family,
> -   enum ac_target_machine_options arg_tm_options)
> -   : family(arg_family), tm_options(arg_tm_options), 
> passes(NULL) {}
> +   enum ac_target_machine_options arg_tm_options,
> +   unsigned arg_wave_size)
> +   : family(arg_family), tm_options(arg_tm_options),
> + wave_size(arg_wave_size), passes(NULL), passes_wave32(NULL) 
> {}
>
> ~radv_llvm_per_thread_info()
> {
> @@ -47,19 +49,28 @@ public:
> if (!passes)
> return false;
>
> +   if (llvm_info.tm_wave32) {
> +   passes_wave32 = 
> ac_create_llvm_passes(llvm_info.tm_wave32);
> +   if (!passes_wave32)
> +   return false;
> +   }
> +
> return true;
> }
>
> bool compile_to_memory_buffer(LLVMModuleRef module,
>   char **pelf_buffer, size_t *pelf_size)
> {
> -   return ac_compile_module_to_elf(passes, module, pelf_buffer, 
> pelf_size);
> +   struct ac_compiler_passes *p = wave_size == 32 ? 
> passes_wave32 : passes;
> +   return ac_compile_module_to_elf(p, module, pelf_buffer, 
> pelf_size);
> }
>
> bool is_same(enum radeon_family arg_family,
> -enum ac_target_machine_options arg_tm_options) {
> +enum ac_target_machine_options arg_tm_options,
> +unsigned arg_wave_size) {
> if (arg_family == family &&
> -   arg_tm_options == tm_options)
> +   arg_tm_options == tm_options &&
> +   arg_wave_size == wave_size)
> return true;
> return false;
> }
> @@ -67,7 +78,9 @@ public:
>  private:
> enum radeon_family family;
> enum ac_target_machine_options tm_options;
> +   unsigned wave_size;
> struct ac_compiler_passes *passes;
> +   struct ac_compiler_passes *passes_wave32;
>  };
>
>  /* we have to store a linked list per thread due to the possibility of 
> multiple gpus being required */
> @@ -99,17 +112,18 @@ bool radv_compile_to_elf(struct ac_llvm_compiler *info,
>  bool radv_init_llvm_compiler(struct ac_llvm_compiler *info,
>  bool thread_compiler,
>  enum radeon_family family,
> -enum ac_target_machine_options tm_options)
> +enum ac_target_machine_options tm_options,
> +unsigned wave_size)
>  {
> if (thread_compiler) {
> for (auto &I : radv_llvm_per_thread_list) {
> -   if (I.is_same(family, tm_options)) {
> +   if (I.is_same(family, tm_options, wave_size)) {
> *info = I.llvm_info;
> return true;
> }
> }
>
> -   radv_llvm_per_thread_list.emplace_back(family, tm_options);
> +   radv_llvm_per_thread_list.emplace_back(family, tm_options, 
> wave_size);
> radv_llvm_per_thread_info &tinfo = 
> radv_llvm_per_thread_list.back();
>
> if (!tinfo.init()) {
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index f0ab2d5e467..5e3b1378a14 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -1163,7 +1163,8 @@ shader_variant_compile(struct radv_device *device,
> radv_init_llvm_once();
> radv_init_llvm_compiler(&ac_llvm,
> thread_compiler,
> -   chip_family, tm_options);
> +   chip_family, tm_options,
> +   
> radv_get_shader_wave_size(device->physical_device, stage));
> if (gs_copy_shader) {
> assert(shader_count == 1);
> radv_compile_gs_copy_shader(&ac_llvm, *shaders, ,
> diff --git a/src/amd/vulkan/radv_shader_helper.h 
> b/src/amd/vulkan/radv_shader_helper.h
> index d9dace0b495..c64d2df676b 100644
> ---

Re: [Mesa-dev] [PATCH 3/4] radv/gfx10: determine correct wave size when lowering subgroups

2019-08-01 Thread Bas Nieuwenhuizen

So I'm not sure we can actually do this.

AFAIU even though we use a 32-lane wave internally we still have to
expose 64 lanes externally, because with
VkPhysicalDeviceSubgroupProperties we say the subgroup size is 64.

So we have to ~emulate a 64-lane wave that always has the upper bits empty.
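
To illustrate what emulating a 64-lane wave on 32-lane hardware would mean for a ballot, here is a minimal sketch. The function names are hypothetical and this is not how RADV or NIR implement it; it only shows the visible contract, where the application always sees a 64-bit mask whose upper half is empty when the hardware wave is 32 lanes wide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: widen a hardware ballot to the 64-bit mask the API exposes. */
static uint64_t
widen_ballot(uint64_t hw_ballot, unsigned wave_size)
{
	assert(wave_size == 32 || wave_size == 64);

	/* In a 32-lane wave, lanes 32..63 do not exist and must read as 0. */
	if (wave_size == 32)
		return hw_ballot & 0xffffffffull;
	return hw_ballot;
}

/* Number of active lanes in an API-visible ballot. */
static unsigned
ballot_bit_count(uint64_t ballot)
{
	unsigned count = 0;

	/* Clear the lowest set bit until none remain. */
	for (; ballot; ballot &= ballot - 1)
		count++;
	return count;
}
```

With this contract a shader written against a subgroup size of 64 still gets consistent ballot results, at the cost of never seeing more than 32 active lanes.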

On Thu, Aug 1, 2019 at 10:40 AM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_shader.c | 30 +-
>  1 file changed, 17 insertions(+), 13 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 97fa80b348c..f0ab2d5e467 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -124,6 +124,17 @@ unsigned shader_io_get_unique_index(gl_varying_slot slot)
> unreachable("illegal slot in get unique index\n");
>  }
>
> +static uint8_t
> +radv_get_shader_wave_size(const struct radv_physical_device *pdevice,
> + gl_shader_stage stage)
> +{
> +   if (stage == MESA_SHADER_COMPUTE)
> +   return pdevice->cs_wave_size;
> +   else if (stage == MESA_SHADER_FRAGMENT)
> +   return pdevice->ps_wave_size;
> +   return pdevice->ge_wave_size;
> +}
> +
>  VkResult radv_CreateShaderModule(
> VkDevice _device,
> const VkShaderModuleCreateInfo* pCreateInfo,
> @@ -422,9 +433,13 @@ radv_shader_compile_to_nir(struct radv_device *device,
>
> nir_lower_global_vars_to_local(nir);
> nir_remove_dead_variables(nir, nir_var_function_temp);
> +
> +   uint8_t wave_size = radv_get_shader_wave_size(device->physical_device,
> + nir->info.stage);
> +
> nir_lower_subgroups(nir, &(struct nir_lower_subgroups_options) {
> -   .subgroup_size = 64,
> -   .ballot_bit_size = 64,
> +   .subgroup_size = wave_size,
> +   .ballot_bit_size = wave_size,
> .lower_to_scalar = 1,
> .lower_subgroup_masks = 1,
> .lower_shuffle = 1,
> @@ -667,17 +682,6 @@ radv_get_shader_binary_size(size_t code_size)
> return code_size + DEBUGGER_NUM_MARKERS * 4;
>  }
>
> -static uint8_t
> -radv_get_shader_wave_size(const struct radv_physical_device *pdevice,
> - gl_shader_stage stage)
> -{
> -   if (stage == MESA_SHADER_COMPUTE)
> -   return pdevice->cs_wave_size;
> -   else if (stage == MESA_SHADER_FRAGMENT)
> -   return pdevice->ps_wave_size;
> -   return pdevice->ge_wave_size;
> -}
> -
>  static void radv_postprocess_config(const struct radv_physical_device 
> *pdevice,
> const struct ac_shader_config *config_in,
> const struct radv_shader_variant_info 
> *info,
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 6/6] radv/gfx10: implement a GE bug workaround

2019-07-31 Thread Bas Nieuwenhuizen

r-b for the series

On Wed, Jul 31, 2019 at 9:36 AM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_pipeline.c | 27 +++
>  1 file changed, 23 insertions(+), 4 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index b3952846f43..d62066cbee4 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -3592,6 +3592,7 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
> *ctx_cs,
> bool es_enable_prim_id = outinfo->export_prim_id ||
>  (es && es->info.info.uses_prim_id);
> bool break_wave_at_eoi = false;
> +   unsigned ge_cntl;
> unsigned nparams;
>
> if (es_type == MESA_SHADER_TESS_EVAL) {
> @@ -3674,10 +3675,28 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
> *ctx_cs,
>
> S_028838_INDEX_BUF_EDGE_FLAG_ENA(!radv_pipeline_has_tess(pipeline) &&
> 
> !radv_pipeline_has_gs(pipeline)));
>
> -   radeon_set_uconfig_reg(ctx_cs, R_03096C_GE_CNTL,
> -  S_03096C_PRIM_GRP_SIZE(ngg_state->max_gsprims) 
> |
> -  
> S_03096C_VERT_GRP_SIZE(ngg_state->hw_max_esverts) |
> -  S_03096C_BREAK_WAVE_AT_EOI(break_wave_at_eoi));
> +   ge_cntl = S_03096C_PRIM_GRP_SIZE(ngg_state->max_gsprims) |
> + S_03096C_VERT_GRP_SIZE(ngg_state->hw_max_esverts) |
> + S_03096C_BREAK_WAVE_AT_EOI(break_wave_at_eoi);
> +
> +   /* Bug workaround for a possible hang with non-tessellation cases.
> +* Tessellation always sets GE_CNTL.VERT_GRP_SIZE = 0
> +*
> +* Requirement: GE_CNTL.VERT_GRP_SIZE = 
> VGT_GS_ONCHIP_CNTL.ES_VERTS_PER_SUBGRP - 5
> +*/
> +   if ((pipeline->device->physical_device->rad_info.family == 
> CHIP_NAVI10 ||
> +pipeline->device->physical_device->rad_info.family == 
> CHIP_NAVI12 ||
> +pipeline->device->physical_device->rad_info.family == 
> CHIP_NAVI14) &&
> +   !radv_pipeline_has_tess(pipeline) &&
> +   ngg_state->hw_max_esverts != 256) {
> +   ge_cntl &= C_03096C_VERT_GRP_SIZE;
> +
> +   if (ngg_state->hw_max_esverts > 5) {
> +   ge_cntl |= 
> S_03096C_VERT_GRP_SIZE(ngg_state->hw_max_esverts - 5);
> +   }
> +   }
> +
> +   radeon_set_uconfig_reg(ctx_cs, R_03096C_GE_CNTL, ge_cntl);
>  }
>
>  static void
> --
> 2.22.0
>
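The workaround logic in the quoted patch can be restated as a small pure function. The helper name `ge_vert_grp_size` and the boolean parameters are hypothetical simplifications; `affected_chip` stands in for the NAVI10/NAVI12/NAVI14 family check, and the return value is what would go into the GE_CNTL.VERT_GRP_SIZE field.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper mirroring the quoted workaround: returns the value
 * to program into GE_CNTL.VERT_GRP_SIZE. */
static unsigned
ge_vert_grp_size(unsigned hw_max_esverts, bool affected_chip, bool has_tess)
{
	assert(hw_max_esverts <= 256);

	/* Unaffected chips, tessellation and the 256 case keep the value. */
	if (!affected_chip || has_tess || hw_max_esverts == 256)
		return hw_max_esverts;

	/* Otherwise the field is cleared, then set to N - 5 when N > 5. */
	return hw_max_esverts > 5 ? hw_max_esverts - 5 : 0;
}
```

Note that values of 5 or less end up as 0 rather than wrapping around, which follows from the field being cleared before the conditional OR.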

Re: [Mesa-dev] [PATCH] radv/gfx10: add Wave32 support for compute shaders

2019-07-30 Thread Bas Nieuwenhuizen

r-b
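
For context on why the wave size matters in the compute path of the patch quoted below, here is a minimal sketch of the waves-per-threadgroup calculation. The helper name is hypothetical, and it assumes the divisor is the compute wave size (the quoted hunk is cut off before that argument, so this is an assumption, not a quote).

```c
#include <assert.h>

/* Hypothetical helper: waves needed for one compute threadgroup. */
static unsigned
waves_per_threadgroup(unsigned size_x, unsigned size_y, unsigned size_z,
		      unsigned wave_size)
{
	assert(wave_size == 32 || wave_size == 64);

	unsigned threads = size_x * size_y * size_z;

	/* Round up: a partially filled wave still occupies a whole wave. */
	return (threads + wave_size - 1) / wave_size;
}
```

An 8x8x1 threadgroup needs one Wave64 but two Wave32 waves, which is why a hardcoded 64 would undercount once Wave32 is enabled.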

On Tue, Jul 30, 2019 at 6:29 PM Samuel Pitoiset
 wrote:
>
> It can be enabled with RADV_PERFTEST=cswave32.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_debug.h   |  1 +
>  src/amd/vulkan/radv_device.c  | 12 +++-
>  src/amd/vulkan/radv_nir_to_llvm.c | 14 +-
>  src/amd/vulkan/radv_pipeline.c|  3 ++-
>  src/amd/vulkan/radv_private.h |  3 +++
>  src/amd/vulkan/radv_shader.c  | 25 ++---
>  src/amd/vulkan/radv_shader.h  |  1 +
>  7 files changed, 53 insertions(+), 6 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_debug.h b/src/amd/vulkan/radv_debug.h
> index 723fabda57f..6414e882676 100644
> --- a/src/amd/vulkan/radv_debug.h
> +++ b/src/amd/vulkan/radv_debug.h
> @@ -64,6 +64,7 @@ enum {
> RADV_PERFTEST_BO_LIST=  0x20,
> RADV_PERFTEST_SHADER_BALLOT  =  0x40,
> RADV_PERFTEST_TC_COMPAT_CMASK = 0x80,
> +   RADV_PERFTEST_CS_WAVE_32 = 0x100,
>  };
>
>  bool
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 65e3ccf91ad..29be192443a 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -383,6 +383,14 @@ radv_physical_device_init(struct radv_physical_device 
> *device,
>
> device->use_shader_ballot = device->instance->perftest_flags & 
> RADV_PERFTEST_SHADER_BALLOT;
>
> +   /* Determine the number of threads per wave for all stages. */
> +   device->cs_wave_size = 64;
> +
> +   if (device->rad_info.chip_class >= GFX10) {
> +   if (device->instance->perftest_flags & 
> RADV_PERFTEST_CS_WAVE_32)
> +   device->cs_wave_size = 32;
> +   }
> +
> radv_physical_device_init_mem_types(device);
> radv_fill_device_extension_table(device, 
> &device->supported_extensions);
>
> @@ -494,6 +502,7 @@ static const struct debug_control radv_perftest_options[] 
> = {
> {"bolist", RADV_PERFTEST_BO_LIST},
> {"shader_ballot", RADV_PERFTEST_SHADER_BALLOT},
> {"tccompatcmask", RADV_PERFTEST_TC_COMPAT_CMASK},
> +   {"cswave32", RADV_PERFTEST_CS_WAVE_32},
> {NULL, 0}
>  };
>
> @@ -1930,7 +1939,8 @@ VkResult radv_CreateDevice(
> device->scratch_waves = MAX2(32 * 
> physical_device->rad_info.num_good_compute_units,
>  max_threads_per_block / 64);
>
> -   device->dispatch_initiator = S_00B800_COMPUTE_SHADER_EN(1);
> +   device->dispatch_initiator = S_00B800_COMPUTE_SHADER_EN(1) |
> +
> S_00B800_CS_W32_EN(device->physical_device->cs_wave_size == 32);
>
> if (device->physical_device->rad_info.chip_class >= GFX7) {
> /* If the KMD allows it (there is a KMD hw register for it),
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 020c6d17771..feaab8f6370 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -4317,6 +4317,15 @@ static void declare_esgs_ring(struct 
> radv_shader_context *ctx)
> LLVMSetAlignment(ctx->esgs_ring, 64 * 1024);
>  }
>
> +static uint8_t
> +radv_nir_shader_wave_size(struct nir_shader *const *shaders, int 
> shader_count,
> + const struct radv_nir_compiler_options *options)
> +{
> +   if (shaders[0]->info.stage == MESA_SHADER_COMPUTE)
> +   return options->cs_wave_size;
> +   return 64;
> +}
> +
>  static
>  LLVMModuleRef ac_translate_nir_to_llvm(struct ac_llvm_compiler *ac_llvm,
> struct nir_shader *const *shaders,
> @@ -4333,8 +4342,11 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct 
> ac_llvm_compiler *ac_llvm,
> options->unsafe_math ? AC_FLOAT_MODE_UNSAFE_FP_MATH :
>AC_FLOAT_MODE_DEFAULT;
>
> +   uint8_t wave_size = radv_nir_shader_wave_size(shaders,
> + shader_count, options);
> +
> ac_llvm_context_init(&ctx.ac, ac_llvm, options->chip_class,
> -options->family, float_mode, 64);
> +options->family, float_mode, wave_size);
> ctx.context = ctx.ac.context;
>
> radv_nir_shader_info_init(&shader_info->info);
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index 583b600dfdd..6b8b7bbe25a 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -4648,7 +4648,8 @@ radv_compute_generate_pm4(struct radv_pipeline 
> *pipeline)
> threads_per_threadgroup = compute_shader->info.cs.block_size[0] *
>   compute_shader->info.cs.block_size[1] *
>   compute_shader->info.cs.block_size[2];
> -   waves_per_threadgroup = DIV_ROUND_UP(threads_per_threadgroup, 64);
> +   waves_per_threadgroup = DIV_ROUND_UP(threads_per_threadgroup,
> +

Re: [Mesa-dev] [PATCH] radv/gfx10: only compile the GS copy shader on-demand

2019-07-30 Thread Bas Nieuwenhuizen

r-b

On Tue, Jul 30, 2019 at 3:11 PM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_pipeline.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index 583b600dfdd..e11196bd82e 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -2626,7 +2626,8 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
>
> if(modules[MESA_SHADER_GEOMETRY]) {
> struct radv_shader_binary *gs_copy_binary = NULL;
> -   if (!pipeline->gs_copy_shader) {
> +   if (!pipeline->gs_copy_shader &&
> +   !radv_pipeline_has_ngg(pipeline)) {
> pipeline->gs_copy_shader = radv_create_gs_copy_shader(
> device, nir[MESA_SHADER_GEOMETRY], 
> &gs_copy_binary,
> 
> keys[MESA_SHADER_GEOMETRY].has_multiview_view_index);
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv/gfx10: do not use the fast depth or stencil clear bytes path

2019-07-29 Thread Bas Nieuwenhuizen

r-b

On Mon, Jul 29, 2019 at 2:35 PM Samuel Pitoiset
 wrote:
>
>
> On 7/29/19 2:30 PM, Bas Nieuwenhuizen wrote:
> > On Mon, Jul 29, 2019 at 2:20 PM Samuel Pitoiset
> >  wrote:
> >>
> >> On 7/29/19 2:15 PM, Bas Nieuwenhuizen wrote:
> >>> On Mon, Jul 29, 2019 at 2:11 PM Samuel Pitoiset
> >>>  wrote:
> >>>> The HTILE masks seem to be different and so we need to rework that
> >>>> path. Just disabled for now and implement later.
> >>> The HTILE masks are not different per amdvlk?
> >>>
> >>> Can you at least rework the commit message to reflect that?
> >> "It needs to be reworked on GFX10, so just disable it for now." ?
> > How about just "It causes issues on GFX10"? We don't know it needs to
> > be reworked either?
> Looks like it needs but whatever, I'm fine with that, so Rb?
> >
> >
> >>>> This fixes rendering issues with vkmark and Wreckfest at least.
> >>>>
> >>>> Signed-off-by: Samuel Pitoiset 
> >>>> ---
> >>>>src/amd/vulkan/radv_meta_clear.c | 5 +++--
> >>>>1 file changed, 3 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/src/amd/vulkan/radv_meta_clear.c 
> >>>> b/src/amd/vulkan/radv_meta_clear.c
> >>>> index b93ba3e0b29..8ddc2e38cd4 100644
> >>>> --- a/src/amd/vulkan/radv_meta_clear.c
> >>>> +++ b/src/amd/vulkan/radv_meta_clear.c
> >>>> @@ -1005,7 +1005,7 @@ radv_can_fast_clear_depth(struct radv_cmd_buffer 
> >>>> *cmd_buffer,
> >>>>   if (!view_mask && clear_rect->layerCount != 
> >>>> iview->image->info.array_size)
> >>>>   return false;
> >>>>
> >>>> -   if (cmd_buffer->device->physical_device->rad_info.chip_class < 
> >>>> GFX9 &&
> >>>> +   if (cmd_buffer->device->physical_device->rad_info.chip_class != 
> >>>> GFX9 &&
> >>>>   (!(aspects & VK_IMAGE_ASPECT_DEPTH_BIT) ||
> >>>>   ((vk_format_aspects(iview->image->vk_format) & 
> >>>> VK_IMAGE_ASPECT_STENCIL_BIT) &&
> >>>>!(aspects & VK_IMAGE_ASPECT_STENCIL_BIT))))
> >>>> @@ -1048,7 +1048,8 @@ radv_fast_clear_depth(struct radv_cmd_buffer 
> >>>> *cmd_buffer,
> >>>> 
> >>>> iview->image->planes[0].surface.htile_size, clear_word);
> >>>>   } else {
> >>>>   /* Only clear depth or stencil bytes in the HTILE 
> >>>> buffer. */
> >>>> -   
> >>>> assert(cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9);
> >>>> +   /* TODO: Implement that path for GFX10. */
> >>>> +   
> >>>> assert(cmd_buffer->device->physical_device->rad_info.chip_class == GFX9);
> >>>>   flush_bits = clear_htile_mask(cmd_buffer, 
> >>>> iview->image->bo,
> >>>> iview->image->offset + 
> >>>> iview->image->htile_offset,
> >>>> 
> >>>> iview->image->planes[0].surface.htile_size, clear_word,
> >>>> --
> >>>> 2.22.0
> >>>>

Re: [Mesa-dev] [PATCH] radv/gfx10: do not use the fast depth or stencil clear bytes path

2019-07-29 Thread Bas Nieuwenhuizen

On Mon, Jul 29, 2019 at 2:20 PM Samuel Pitoiset
 wrote:
>
>
> On 7/29/19 2:15 PM, Bas Nieuwenhuizen wrote:
> > On Mon, Jul 29, 2019 at 2:11 PM Samuel Pitoiset
> >  wrote:
> >> The HTILE masks seem to be different and so we need to rework that
> >> path. Just disabled for now and implement later.
> > The HTILE masks are not different per amdvlk?
> >
> > Can you at least rework the commit message to reflect that?
> "It needs to be reworked on GFX10, so just disable it for now." ?

How about just "It causes issues on GFX10"? We don't know it needs to
be reworked either?


> >> This fixes rendering issues with vkmark and Wreckfest at least.
> >>
> >> Signed-off-by: Samuel Pitoiset 
> >> ---
> >>   src/amd/vulkan/radv_meta_clear.c | 5 +++--
> >>   1 file changed, 3 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/src/amd/vulkan/radv_meta_clear.c 
> >> b/src/amd/vulkan/radv_meta_clear.c
> >> index b93ba3e0b29..8ddc2e38cd4 100644
> >> --- a/src/amd/vulkan/radv_meta_clear.c
> >> +++ b/src/amd/vulkan/radv_meta_clear.c
> >> @@ -1005,7 +1005,7 @@ radv_can_fast_clear_depth(struct radv_cmd_buffer 
> >> *cmd_buffer,
> >>  if (!view_mask && clear_rect->layerCount != 
> >> iview->image->info.array_size)
> >>  return false;
> >>
> >> -   if (cmd_buffer->device->physical_device->rad_info.chip_class < 
> >> GFX9 &&
> >> +   if (cmd_buffer->device->physical_device->rad_info.chip_class != 
> >> GFX9 &&
> >>  (!(aspects & VK_IMAGE_ASPECT_DEPTH_BIT) ||
> >>  ((vk_format_aspects(iview->image->vk_format) & 
> >> VK_IMAGE_ASPECT_STENCIL_BIT) &&
> >>   !(aspects & VK_IMAGE_ASPECT_STENCIL_BIT))))
> >> @@ -1048,7 +1048,8 @@ radv_fast_clear_depth(struct radv_cmd_buffer 
> >> *cmd_buffer,
> >>
> >> iview->image->planes[0].surface.htile_size, clear_word);
> >>  } else {
> >>  /* Only clear depth or stencil bytes in the HTILE buffer. 
> >> */
> >> -   
> >> assert(cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9);
> >> +   /* TODO: Implement that path for GFX10. */
> >> +   
> >> assert(cmd_buffer->device->physical_device->rad_info.chip_class == GFX9);
> >>  flush_bits = clear_htile_mask(cmd_buffer, 
> >> iview->image->bo,
> >>iview->image->offset + 
> >> iview->image->htile_offset,
> >>
> >> iview->image->planes[0].surface.htile_size, clear_word,
> >> --
> >> 2.22.0
> >>

Re: [Mesa-dev] [PATCH] radv/gfx10: do not use the fast depth or stencil clear bytes path

2019-07-29 Thread Bas Nieuwenhuizen

On Mon, Jul 29, 2019 at 2:11 PM Samuel Pitoiset
 wrote:
>
> The HTILE masks seem to be different and so we need to rework that
> path. Just disabled for now and implement later.

The HTILE masks are not different per amdvlk?

Can you at least rework the commit message to reflect that?
>
> This fixes rendering issues with vkmark and Wreckfest at least.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_meta_clear.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_meta_clear.c 
> b/src/amd/vulkan/radv_meta_clear.c
> index b93ba3e0b29..8ddc2e38cd4 100644
> --- a/src/amd/vulkan/radv_meta_clear.c
> +++ b/src/amd/vulkan/radv_meta_clear.c
> @@ -1005,7 +1005,7 @@ radv_can_fast_clear_depth(struct radv_cmd_buffer 
> *cmd_buffer,
> if (!view_mask && clear_rect->layerCount != 
> iview->image->info.array_size)
> return false;
>
> -   if (cmd_buffer->device->physical_device->rad_info.chip_class < GFX9 &&
> +   if (cmd_buffer->device->physical_device->rad_info.chip_class != GFX9 
> &&
> (!(aspects & VK_IMAGE_ASPECT_DEPTH_BIT) ||
> ((vk_format_aspects(iview->image->vk_format) & 
> VK_IMAGE_ASPECT_STENCIL_BIT) &&
>  !(aspects & VK_IMAGE_ASPECT_STENCIL_BIT))))
> @@ -1048,7 +1048,8 @@ radv_fast_clear_depth(struct radv_cmd_buffer 
> *cmd_buffer,
>   
> iview->image->planes[0].surface.htile_size, clear_word);
> } else {
> /* Only clear depth or stencil bytes in the HTILE buffer. */
> -   
> assert(cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9);
> +   /* TODO: Implement that path for GFX10. */
> +   
> assert(cmd_buffer->device->physical_device->rad_info.chip_class == GFX9);
> flush_bits = clear_htile_mask(cmd_buffer, iview->image->bo,
>   iview->image->offset + 
> iview->image->htile_offset,
>   
> iview->image->planes[0].surface.htile_size, clear_word,
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv: implement VK_EXT_index_type_uint8

2019-07-29 Thread Bas Nieuwenhuizen

r-b

On Mon, Jul 29, 2019 at 10:47 AM Samuel Pitoiset
 wrote:
>
> Natively supported on VI+.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c  | 60 +++
>  src/amd/vulkan/radv_device.c  |  6 
>  src/amd/vulkan/radv_extensions.py |  1 +
>  3 files changed, 61 insertions(+), 6 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index d9783e6ca8a..e0ea47b5745 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -2541,6 +2541,21 @@ struct radv_draw_info {
> uint64_t strmout_buffer_offset;
>  };
>
> +static uint32_t
> +radv_get_primitive_reset_index(struct radv_cmd_buffer *cmd_buffer)
> +{
> +   switch (cmd_buffer->state.index_type) {
> +   case V_028A7C_VGT_INDEX_8:
> +   return 0xffu;
> +   case V_028A7C_VGT_INDEX_16:
> +   return 0xffffu;
> +   case V_028A7C_VGT_INDEX_32:
> +   return 0xffffffffu;
> +   default:
> +   unreachable("invalid index type");
> +   }
> +}
> +
>  static void
>  si_emit_ia_multi_vgt_param(struct radv_cmd_buffer *cmd_buffer,
>bool instanced_draw, bool indirect_draw,
> @@ -2612,7 +2627,7 @@ radv_emit_draw_registers(struct radv_cmd_buffer 
> *cmd_buffer,
>
> if (primitive_reset_en) {
> uint32_t primitive_reset_index =
> -   state->index_type ? 0xffffffffu : 0xffffu;
> +   radv_get_primitive_reset_index(cmd_buffer);
>
> if (primitive_reset_index != 
> state->last_primitive_reset_index) {
> radeon_set_context_reg(cs,
> @@ -3233,6 +3248,36 @@ void radv_CmdBindVertexBuffers(
> cmd_buffer->state.dirty |= RADV_CMD_DIRTY_VERTEX_BUFFER;
>  }
>
> +static uint32_t
> +vk_to_index_type(VkIndexType type)
> +{
> +   switch (type) {
> +   case VK_INDEX_TYPE_UINT8_EXT:
> +   return V_028A7C_VGT_INDEX_8;
> +   case VK_INDEX_TYPE_UINT16:
> +   return V_028A7C_VGT_INDEX_16;
> +   case VK_INDEX_TYPE_UINT32:
> +   return V_028A7C_VGT_INDEX_32;
> +   default:
> +   unreachable("invalid index type");
> +   }
> +}
> +
> +static uint32_t
> +radv_get_vgt_index_size(uint32_t type)
> +{
> +   switch (type) {
> +   case V_028A7C_VGT_INDEX_8:
> +   return 1;
> +   case V_028A7C_VGT_INDEX_16:
> +   return 2;
> +   case V_028A7C_VGT_INDEX_32:
> +   return 4;
> +   default:
> +   unreachable("invalid index type");
> +   }
> +}
> +
>  void radv_CmdBindIndexBuffer(
> VkCommandBuffer commandBuffer,
> VkBuffer buffer,
> @@ -3251,12 +3296,12 @@ void radv_CmdBindIndexBuffer(
>
> cmd_buffer->state.index_buffer = index_buffer;
> cmd_buffer->state.index_offset = offset;
> -   cmd_buffer->state.index_type = indexType; /* vk matches hw */
> +   cmd_buffer->state.index_type = vk_to_index_type(indexType);
> cmd_buffer->state.index_va = radv_buffer_get_va(index_buffer->bo);
> cmd_buffer->state.index_va += index_buffer->offset + offset;
>
> -   int index_size_shift = cmd_buffer->state.index_type ? 2 : 1;
> -   cmd_buffer->state.max_index_count = (index_buffer->size - offset) >> 
> index_size_shift;
> +   int index_size = radv_get_vgt_index_size(indexType);
> +   cmd_buffer->state.max_index_count = (index_buffer->size - offset) / 
> index_size;
> cmd_buffer->state.dirty |= RADV_CMD_DIRTY_INDEX_BUFFER;
> radv_cs_add_buffer(cmd_buffer->device->ws, cmd_buffer->cs, 
> index_buffer->bo);
>  }
> @@ -4275,7 +4320,7 @@ radv_emit_draw_packets(struct radv_cmd_buffer 
> *cmd_buffer,
> }
>
> if (info->indexed) {
> -   int index_size = state->index_type ? 4 : 2;
> +   int index_size = 
> radv_get_vgt_index_size(state->index_type);
> uint64_t index_va;
>
> index_va = state->index_va;
> @@ -4354,8 +4399,11 @@ static bool radv_need_late_scissor_emission(struct 
> radv_cmd_buffer *cmd_buffer,
> if (cmd_buffer->state.dirty & used_states)
> return true;
>
> +   uint32_t primitive_reset_index =
> +   radv_get_primitive_reset_index(cmd_buffer);
> +
> if (info->indexed && state->pipeline->graphics.prim_restart_enable &&
> -   (state->index_type ? 0xffffffffu : 0xffffu) != 
> state->last_primitive_reset_index)
> +   primitive_reset_index != state->last_primitive_reset_index)
> return true;
>
> return false;
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 9ba100df6e8..65e3ccf91ad 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -987,6 +987,12 @@ void
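
For readers skimming the archive, here is a rough standalone sketch of the idea behind the patch above: once 8-bit indices exist, the index size and the primitive restart value can no longer be derived from a two-way "16 or 32" test, so both become lookups on the index type. This is not the RADV code. The enum and the sketch_* names are made up for illustration and do not reproduce the real V_028A7C_VGT_INDEX_* values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware index types; the real
 * V_028A7C_VGT_INDEX_* register values are not reproduced here. */
enum sketch_index_type {
    SKETCH_INDEX_8,
    SKETCH_INDEX_16,
    SKETCH_INDEX_32,
};

/* Size in bytes of a single index of the given type. */
static uint32_t sketch_index_size(enum sketch_index_type type)
{
    switch (type) {
    case SKETCH_INDEX_8:  return 1;
    case SKETCH_INDEX_16: return 2;
    case SKETCH_INDEX_32: return 4;
    }
    assert(!"invalid index type");
    return 0;
}

/* Primitive restart index: all bits set for the given index width. */
static uint32_t sketch_primitive_reset_index(enum sketch_index_type type)
{
    switch (type) {
    case SKETCH_INDEX_8:  return 0xffu;
    case SKETCH_INDEX_16: return 0xffffu;
    case SKETCH_INDEX_32: return 0xffffffffu;
    }
    assert(!"invalid index type");
    return 0;
}

/* Number of whole indices that fit in the bound part of the buffer.
 * A division replaces the old shift because the size is now a lookup. */
static uint64_t sketch_max_index_count(uint64_t buffer_size, uint64_t offset,
                                       enum sketch_index_type type)
{
    assert(offset <= buffer_size);
    return (buffer_size - offset) / sketch_index_size(type);
}
```

Note that the size helper takes the hardware-side type, so a caller starting from an API-level index type would have to translate it first.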

Re: [Mesa-dev] [PATCH] ac: do not crash when the buffer data format is invalid

2019-07-29 Thread Bas Nieuwenhuizen

r-b

On Mon, Jul 29, 2019 at 12:00 PM Samuel Pitoiset
 wrote:
>
> This might happen when a pipeline doesn't define the vertex input
> state, so the buffer data format is 0 (aka INVALID).
>
> This fixes crashes when compiling some shaders on GFX10.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/common/ac_llvm_build.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
> index 250bfc5229e..278f8893432 100644
> --- a/src/amd/common/ac_llvm_build.c
> +++ b/src/amd/common/ac_llvm_build.c
> @@ -1508,6 +1508,7 @@ ac_get_tbuffer_format(struct ac_llvm_context *ctx,
> unsigned format;
> switch (dfmt) {
> default: unreachable("bad dfmt");
> +   case V_008F0C_BUF_DATA_FORMAT_INVALID: format = 
> V_008F0C_IMG_FORMAT_INVALID; break;
> case V_008F0C_BUF_DATA_FORMAT_8: format = 
> V_008F0C_IMG_FORMAT_8_UINT; break;
> case V_008F0C_BUF_DATA_FORMAT_8_8: format = 
> V_008F0C_IMG_FORMAT_8_8_UINT; break;
> case V_008F0C_BUF_DATA_FORMAT_8_8_8_8: format = 
> V_008F0C_IMG_FORMAT_8_8_8_8_UINT; break;
> --
> 2.22.0
>

[Mesa-dev] [PATCH] radv: Set correct metadata size for GFX9+.

2019-07-25 Thread Bas Nieuwenhuizen

Without correct size, radeonsi assumes the metadata is incorrect,
which can and will cause issues.

Since the metadata is really incorrect without the size, let us
fix that.

Fixes: e43cc3e3afc "radv/gfx9: handle GFX9 opaque metadata"
---
 src/amd/vulkan/radv_image.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 0941cbb..541ff4086f4 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -1034,7 +1034,8 @@ radv_query_opaque_metadata(struct radv_device *device,
for (i = 0; i <= image->info.levels - 1; i++)
md->metadata[10+i] = 
image->planes[0].surface.u.legacy.level[i].offset >> 8;
md->size_metadata = (11 + image->info.levels - 1) * 4;
-   }
+   } else
+   md->size_metadata = 10 * 4;
 }
 
 void
-- 
2.21.0
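
As an illustration only, the size rule this patch ends up with can be written as a small standalone function. It assumes, based on the hunk above, that the first ten dwords of the metadata are always present and that only the legacy (pre-GFX9) layout appends one offset dword per mip level. The function name and parameters are hypothetical, not RADV API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size in bytes of the opaque metadata blob, following the hunk above.
 * Assumption: 10 fixed dwords, plus one dword per mip level on the
 * legacy (pre-GFX9) path only. */
static uint32_t sketch_opaque_metadata_size(bool is_gfx9_plus, uint32_t levels)
{
    const uint32_t fixed_dwords = 10;

    if (is_gfx9_plus)
        return fixed_dwords * 4;

    /* Equivalent to (11 + levels - 1) * 4 in the existing code. */
    assert(levels >= 1);
    return (fixed_dwords + levels) * 4;
}
```

Before the fix the GFX9+ case never wrote a size at all, which is what the consumer tripped over.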


Re: [Mesa-dev] [PATCH] radv/gfx10: Disable DCC with scanout.

2019-07-25 Thread Bas Nieuwenhuizen

bleh, you're right 

So we should not be using DCC ...

On Thu, Jul 25, 2019 at 4:37 PM Samuel Pitoiset
 wrote:
>
> It's already disabled later in this function?
>
> On 7/25/19 4:34 PM, Bas Nieuwenhuizen wrote:
> > (a) radv does not set the DCC fields required yet.
> > (b) radeonsi just broke their DCC metadata.
> >
> > Fixes: f8b6c5a1a63 "radeonsi: rewrite si_get_opaque_metadata, also for 
> > gfx10 support"
> > ---
> >   src/amd/vulkan/radv_image.c | 3 +++
> >   1 file changed, 3 insertions(+)
> >
> > diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> > index 0941cbb..4bcdb70214a 100644
> > --- a/src/amd/vulkan/radv_image.c
> > +++ b/src/amd/vulkan/radv_image.c
> > @@ -161,6 +161,9 @@ radv_use_dcc_for_image(struct radv_device *device,
> >   if (image->shareable)
> >   return false;
> >
> > + if (radv_surface_has_scanout(device, create_info))
> > + return false;
> > +
> >   /* TODO: Enable DCC for storage images. */
> >   if ((pCreateInfo->usage & VK_IMAGE_USAGE_STORAGE_BIT) ||
> >   (pCreateInfo->flags & VK_IMAGE_CREATE_EXTENDED_USAGE_BIT))

[Mesa-dev] [PATCH] radv/gfx10: Disable DCC with scanout.

2019-07-25 Thread Bas Nieuwenhuizen

(a) radv does not set the DCC fields required yet.
(b) radeonsi just broke their DCC metadata.

Fixes: f8b6c5a1a63 "radeonsi: rewrite si_get_opaque_metadata, also for gfx10 
support"
---
 src/amd/vulkan/radv_image.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 0941cbb..4bcdb70214a 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -161,6 +161,9 @@ radv_use_dcc_for_image(struct radv_device *device,
if (image->shareable)
return false;
 
+   if (radv_surface_has_scanout(device, create_info))
+   return false;
+
/* TODO: Enable DCC for storage images. */
if ((pCreateInfo->usage & VK_IMAGE_USAGE_STORAGE_BIT) ||
(pCreateInfo->flags & VK_IMAGE_CREATE_EXTENDED_USAGE_BIT))
-- 
2.21.0


Re: [Mesa-dev] [PATCH] radv/gfx10: use L2 for DMA copy/fill operations

2019-07-25 Thread Bas Nieuwenhuizen

r-b

though it sounds like some of our cache flushes might not be ideal.

On Thu, Jul 25, 2019 at 3:35 PM Samuel Pitoiset
 wrote:
>
> It's coherent and faster. GFX7-GFX9 should also support this, but
> for now only use L2 on GFX10 because it's untested on previous gens.
>
> This fixes dEQP-VK.memory.pipeline_barrier.transfer_*
>
> This also fixes some missing geometry in Dawn Of War III because
> VBOs weren't updated correctly.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/si_cmd_buffer.c | 16 
>  1 file changed, 16 insertions(+)
>
> diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
> index 21a90cb2514..94f759139ee 100644
> --- a/src/amd/vulkan/si_cmd_buffer.c
> +++ b/src/amd/vulkan/si_cmd_buffer.c
> @@ -1501,6 +1501,14 @@ void si_cp_dma_buffer_copy(struct radv_cmd_buffer 
> *cmd_buffer,
> unsigned dma_flags = 0;
> unsigned byte_count = MIN2(size, 
> cp_dma_max_byte_count(cmd_buffer));
>
> +   if (cmd_buffer->device->physical_device->rad_info.chip_class 
> >= GFX10) {
> +   /* DMA operations via L2 are coherent and faster.
> +* TODO: GFX7-GFX9 should also support this but it
> +* requires tests/benchmarks.
> +*/
> +   dma_flags |= CP_DMA_USE_L2;
> +   }
> +
> si_cp_dma_prepare(cmd_buffer, byte_count,
>   size + skipped_size + realign_size,
>   &dma_flags);
> @@ -1545,6 +1553,14 @@ void si_cp_dma_clear_buffer(struct radv_cmd_buffer 
> *cmd_buffer, uint64_t va,
> unsigned byte_count = MIN2(size, 
> cp_dma_max_byte_count(cmd_buffer));
> unsigned dma_flags = CP_DMA_CLEAR;
>
> +   if (cmd_buffer->device->physical_device->rad_info.chip_class 
> >= GFX10) {
> +   /* DMA operations via L2 are coherent and faster.
> +* TODO: GFX7-GFX9 should also support this but it
> +* requires tests/benchmarks.
> +*/
> +   dma_flags |= CP_DMA_USE_L2;
> +   }
> +
> si_cp_dma_prepare(cmd_buffer, byte_count, size, &dma_flags);
>
> /* Emit the clear packet. */
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH v2] radv/gfx10: fix intensity formats by setting ALPHA_IS_ON_MSB

2019-07-25 Thread Bas Nieuwenhuizen

On Wed, Jul 24, 2019 at 4:47 PM Samuel Pitoiset
 wrote:
>
> This fixes
> dEQP-VK.rasterization.primitive_size.points.point_size_*
>
> This also fixes some black squares with the Sascha SSAO demo.
>
> v2: - do not set for multiple channels
> - call vi_alpha_is_on_msb() for pre-GFX10
> - remove unused 'swap'
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_image.c | 17 +++--
>  1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> index 0941cbb..d46946269e6 100644
> --- a/src/amd/vulkan/radv_image.c
> +++ b/src/amd/vulkan/radv_image.c
> @@ -617,6 +617,15 @@ static unsigned gfx9_border_color_swizzle(const enum 
> vk_swizzle swizzle[4])
> return bc_swizzle;
>  }
>
> +static bool vi_alpha_is_on_msb(struct radv_device *device, VkFormat format)
> +{
> +   const struct vk_format_description *desc = 
> vk_format_description(format);
> +
> +   if (device->physical_device->rad_info.chip_class >= GFX10 && 
> desc->nr_channels == 1)
> +   return desc->swizzle[3] == VK_SWIZZLE_X;

In Vulkan we never have an alpha-only format, so this will always be false.

r-b anyway.

> +
> +   return radv_translate_colorswap(format, false) <= 1;
> +}
>  /**
>   * Build the sampler view descriptor for a texture (GFX10).
>   */
> @@ -691,11 +700,9 @@ gfx10_make_texture_descriptor(struct radv_device *device,
> state[7] = 0;
>
> if (radv_dcc_enabled(image, first_level)) {
> -   unsigned swap = radv_translate_colorswap(vk_format, FALSE);
> -
> state[6] |= 
> S_00A018_MAX_UNCOMPRESSED_BLOCK_SIZE(V_028C78_MAX_BLOCK_SIZE_256B) |
> 
> S_00A018_MAX_COMPRESSED_BLOCK_SIZE(V_028C78_MAX_BLOCK_SIZE_128B) |
> -   S_00A018_ALPHA_IS_ON_MSB(swap <= 1);
> +   
> S_00A018_ALPHA_IS_ON_MSB(vi_alpha_is_on_msb(device, vk_format));
> }
>
> /* Initialize the sampler view for FMASK. */
> @@ -849,9 +856,7 @@ si_make_texture_descriptor(struct radv_device *device,
> state[5] |= S_008F24_LAST_ARRAY(last_layer);
> }
> if (image->dcc_offset) {
> -   unsigned swap = radv_translate_colorswap(vk_format, FALSE);
> -
> -   state[6] = S_008F28_ALPHA_IS_ON_MSB(swap <= 1);
> +   state[6] = 
> S_008F28_ALPHA_IS_ON_MSB(vi_alpha_is_on_msb(device, vk_format));
> } else {
> /* The last dword is unused by hw. The shader uses it to clear
>  * bits in the first dword of sampler state.
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv/gfx10: fix intensity formats by setting ALPHA_IS_ON_MSB

2019-07-24 Thread Bas Nieuwenhuizen

On Wed, Jul 24, 2019 at 3:00 PM Samuel Pitoiset
 wrote:
>
> This fixes
> dEQP-VK.rasterization.primitive_size.points.point_size_*
>
> This also fixes some black squares with the Sascha SSAO demo.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_image.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> index 0941cbb..59d6d0ced78 100644
> --- a/src/amd/vulkan/radv_image.c
> +++ b/src/amd/vulkan/radv_image.c
> @@ -617,6 +617,19 @@ static unsigned gfx9_border_color_swizzle(const enum 
> vk_swizzle swizzle[4])
> return bc_swizzle;
>  }
>
> +static bool vi_alpha_is_on_msb(struct radv_device *device, VkFormat format)
> +{
> +   const struct vk_format_description *desc = 
> vk_format_description(format);
> +
> +   /* Formats with 3 channels can't have alpha. */
> +   if (desc->nr_channels == 3)
> +   return true; /* same as xxxA; is any value OK here? */

I don't think this is correct. For formats with multiple channels,
this bit is not about "does this format have alpha", but "is the alpha
channel on MSB or LSB". IIRC even for RG the "alpha" is just the G
component, no explicit alpha needed.


> +
> +   if (device->physical_device->rad_info.chip_class >= GFX10 && 
> desc->nr_channels == 1)
> +   return desc->swizzle[3] == VK_SWIZZLE_X;
> +
> +   return radv_translate_colorswap(format, false) <= 1;
> +}
>  /**
>   * Build the sampler view descriptor for a texture (GFX10).
>   */
> @@ -695,7 +708,7 @@ gfx10_make_texture_descriptor(struct radv_device *device,
>
> state[6] |= 
> S_00A018_MAX_UNCOMPRESSED_BLOCK_SIZE(V_028C78_MAX_BLOCK_SIZE_256B) |
> 
> S_00A018_MAX_COMPRESSED_BLOCK_SIZE(V_028C78_MAX_BLOCK_SIZE_128B) |
> -   S_00A018_ALPHA_IS_ON_MSB(swap <= 1);
> +   
> S_00A018_ALPHA_IS_ON_MSB(vi_alpha_is_on_msb(device, vk_format));
> }
>
> /* Initialize the sampler view for FMASK. */
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 5/5] radv/gfx10: enable VK_EXT_transform_feedback

2019-07-23 Thread Bas Nieuwenhuizen

r-b for the series if you resolve my comment on patch 4.

On Tue, Jul 23, 2019 at 3:21 PM Samuel Pitoiset
 wrote:
>
> When a pipeline uses transform feedback, the driver falls back to
> the legacy path because NGG support for streamout is a non-trivial
> amount of work.
>
> AMDVLK also uses the legacy path for streamout, while RadeonSI
> uses the new NGG path.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_extensions.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_extensions.py 
> b/src/amd/vulkan/radv_extensions.py
> index e9addad0035..8e1d61dfaaf 100644
> --- a/src/amd/vulkan/radv_extensions.py
> +++ b/src/amd/vulkan/radv_extensions.py
> @@ -129,7 +129,7 @@ EXTENSIONS = [
>  Extension('VK_EXT_shader_stencil_export', 1, True),
>  Extension('VK_EXT_shader_subgroup_ballot',1, True),
>  Extension('VK_EXT_shader_subgroup_vote',  1, True),
> -Extension('VK_EXT_transform_feedback',1, 
> 'device->rad_info.chip_class < GFX10'),
> +Extension('VK_EXT_transform_feedback',1, True),
>  Extension('VK_EXT_vertex_attribute_divisor',  3, True),
>  Extension('VK_EXT_ycbcr_image_arrays',1, True),
>  Extension('VK_AMD_buffer_marker', 1, True),
> --
> 2.22.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 4/5] radv/gfx10: do not enable NGG if a pipeline uses XFB

2019-07-23 Thread Bas Nieuwenhuizen

On Tue, Jul 23, 2019 at 3:21 PM Samuel Pitoiset
 wrote:
>
> NGG GS for streamout requires a bunch of work, so enable it with
> the legacy path only for now.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_pipeline.c | 28 
>  1 file changed, 28 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index a7ff0e2d139..0903e5abf37 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -33,6 +33,7 @@
>  #include "radv_shader.h"
>  #include "nir/nir.h"
>  #include "nir/nir_builder.h"
> +#include "nir/nir_xfb_info.h"
>  #include "spirv/nir_spirv.h"
>  #include "vk_util.h"
>
> @@ -2269,6 +2270,16 @@ radv_generate_graphics_pipeline_key(struct 
> radv_pipeline *pipeline,
> return key;
>  }
>
> +static bool
> +radv_nir_stage_uses_xfb(const nir_shader *nir)
> +{
> +   nir_xfb_info *xfb = nir_gather_xfb_info(nir, NULL);
> +   bool uses_xfb = !!xfb;
> +
> +   ralloc_free(xfb);
> +   return uses_xfb;
> +}
> +
>  static void
>  radv_fill_shader_keys(struct radv_device *device,
>   struct radv_shader_variant_key *keys,
> @@ -2321,6 +2332,23 @@ radv_fill_shader_keys(struct radv_device *device,
>  */
> keys[MESA_SHADER_TESS_EVAL].vs_common_out.as_ngg = 
> false;
> }
> +
> +   /* TODO: Implement streamout support for NGG. */
> +   bool uses_xfb = false;
> +   if ((nir[MESA_SHADER_VERTEX] &&
> +radv_nir_stage_uses_xfb(nir[MESA_SHADER_VERTEX])) ||
> +   (nir[MESA_SHADER_TESS_EVAL] &&
> +radv_nir_stage_uses_xfb(nir[MESA_SHADER_TESS_EVAL])) ||
> +   (nir[MESA_SHADER_GEOMETRY] &&
> +radv_nir_stage_uses_xfb(nir[MESA_SHADER_GEOMETRY])))
> +   uses_xfb = true;

Transform feedback can only happen on the last stage before the PS, right?
Can we first determine what the last shader is and only then check for
XFB? That way we don't have to scan 3 shaders.
> +
> +   if (uses_xfb) {
> +   if (nir[MESA_SHADER_TESS_CTRL])
> +   
> keys[MESA_SHADER_TESS_EVAL].vs_common_out.as_ngg = false;
> +   else
> +   keys[MESA_SHADER_VERTEX].vs_common_out.as_ngg 
> = false;
> +   }
> }
>
> for(int i = 0; i < MESA_SHADER_STAGES; ++i)
> --
> 2.22.0
>
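
To make the suggestion above concrete, here is a minimal standalone sketch, assuming that only the last pre-rasterization stage (GS if present, else TES, else VS) can carry transform feedback outputs. The types and sketch_* names are invented for this example. In the real patch the per-stage check would be the nir_gather_xfb_info() based helper, which is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stage indices, in pipeline order. */
enum sketch_stage {
    SKETCH_VERTEX,
    SKETCH_TESS_CTRL,
    SKETCH_TESS_EVAL,
    SKETCH_GEOMETRY,
    SKETCH_STAGE_COUNT,
};

/* Stand-in for a compiled shader; only the XFB property is modelled. */
struct sketch_shader {
    bool has_xfb_outputs;
};

/* Return the last stage before the PS: GS, else TES, else VS.
 * Returns -1 if none of them is present. */
static int sketch_last_vertex_stage(const struct sketch_shader *const *shaders)
{
    assert(shaders != NULL);

    if (shaders[SKETCH_GEOMETRY])
        return SKETCH_GEOMETRY;
    if (shaders[SKETCH_TESS_EVAL])
        return SKETCH_TESS_EVAL;
    if (shaders[SKETCH_VERTEX])
        return SKETCH_VERTEX;
    return -1;
}

/* Only the last stage is inspected, so at most one shader is scanned. */
static bool sketch_pipeline_uses_xfb(const struct sketch_shader *const *shaders)
{
    int last = sketch_last_vertex_stage(shaders);

    return last >= 0 && shaders[last]->has_xfb_outputs;
}
```

With this shape, XFB decorations on an earlier stage are ignored on purpose, which matches the assumption stated above rather than anything verified against the hardware path.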

Re: [Mesa-dev] [PATCH v3] radv/gfx10: fix VS input VGPRs with the legacy path

2019-07-23 Thread Bas Nieuwenhuizen

r-b

On Tue, Jul 23, 2019 at 2:44 PM Samuel Pitoiset
 wrote:
>
> For some reason, InstanceID is VGPR3 although StepRate0 is set to 1.
>
> v3: fix instanceID input VGPR for geometry
> v2: fix instanceID
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 12 +---
>  src/amd/vulkan/radv_shader.c  |  8 ++--
>  2 files changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 336bae28614..cf73cdc692b 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -852,9 +852,15 @@ declare_vs_input_vgprs(struct radv_shader_context *ctx, 
> struct arg_info *args)
> }
> } else {
> if (ctx->ac.chip_class >= GFX10) {
> -   add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); 
> /* user vgpr */
> -   add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); 
> /* user vgpr */
> -   add_arg(args, ARG_VGPR, ctx->ac.i32, 
> &ctx->abi.instance_id);
> +   if (ctx->options->key.vs_common_out.as_ngg) {
> +   add_arg(args, ARG_VGPR, ctx->ac.i32, 
> NULL); /* user vgpr */
> +   add_arg(args, ARG_VGPR, ctx->ac.i32, 
> NULL); /* user vgpr */
> +   add_arg(args, ARG_VGPR, ctx->ac.i32, 
> &ctx->abi.instance_id);
> +   } else {
> +   add_arg(args, ARG_VGPR, ctx->ac.i32, 
> NULL); /* unused */
> +   add_arg(args, ARG_VGPR, ctx->ac.i32, 
> &ctx->vs_prim_id);
> +   add_arg(args, ARG_VGPR, ctx->ac.i32, 
> &ctx->abi.instance_id);
> +   }
> } else {
> add_arg(args, ARG_VGPR, ctx->ac.i32, 
> &ctx->abi.instance_id);
> add_arg(args, ARG_VGPR, ctx->ac.i32, 
> &ctx->vs_prim_id);
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 3adaf52e152..06122664a13 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -765,7 +765,7 @@ static void radv_postprocess_config(const struct 
> radv_physical_device *pdevice,
> if (info->vs.export_prim_id) {
> vgpr_comp_cnt = 2;
> } else if (info->info.vs.needs_instance_id) {
> -   vgpr_comp_cnt = 1;
> +   vgpr_comp_cnt = pdevice->rad_info.chip_class 
> >= GFX10 ? 3 : 1;
> } else {
> vgpr_comp_cnt = 0;
> }
> @@ -837,7 +837,11 @@ static void radv_postprocess_config(const struct 
> radv_physical_device *pdevice,
>
> if (es_type == MESA_SHADER_VERTEX) {
> /* VGPR0-3: (VertexID, InstanceID / StepRate0, ...) */
> -   es_vgpr_comp_cnt = info->info.vs.needs_instance_id ? 
> 1 : 0;
> +   if (info->info.vs.needs_instance_id) {
> +   es_vgpr_comp_cnt = 
> pdevice->rad_info.chip_class >= GFX10 ? 3 : 1;
> +   } else {
> +   es_vgpr_comp_cnt = 0;
> +   }
> } else if (es_type == MESA_SHADER_TESS_EVAL) {
> es_vgpr_comp_cnt = info->info.uses_prim_id ? 3 : 2;
> } else {
> --
> 2.22.0
>
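
The VGPR count selection in the radv_shader.c hunk above can be restated as a tiny standalone function, shown here purely as a reading aid. It mirrors the quoted logic for the legacy hardware VS and nothing more. The function name and the plain integer generation number are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of additional input VGPRs to enable for a legacy (non-NGG)
 * hardware VS, mirroring the selection in the quoted hunk.
 * gfx_level is a plain generation number (6 for GFX6, 10 for GFX10). */
static unsigned sketch_vs_vgpr_comp_cnt(int gfx_level,
                                        bool export_prim_id,
                                        bool needs_instance_id)
{
    /* GFX6 is the oldest generation the driver handles. */
    assert(gfx_level >= 6);

    if (export_prim_id)
        return 2;
    if (needs_instance_id)
        return gfx_level >= 10 ? 3 : 1; /* InstanceID sits in VGPR3 on GFX10 */
    return 0;
}
```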

Re: [Mesa-dev] [PATCH v2] radv/gfx10: fix VS input VGPRs with the legacy path

2019-07-23 Thread Bas Nieuwenhuizen

r-b

On Tue, Jul 23, 2019 at 2:10 PM Samuel Pitoiset
 wrote:
>
> For some reason, InstanceID is VGPR3 although StepRate0 is set to 1.
>
> v2: fix instanceID
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 12 +---
>  src/amd/vulkan/radv_shader.c  |  2 +-
>  2 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 336bae28614..cf73cdc692b 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -852,9 +852,15 @@ declare_vs_input_vgprs(struct radv_shader_context *ctx, 
> struct arg_info *args)
> }
> } else {
> if (ctx->ac.chip_class >= GFX10) {
> -   add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); 
> /* user vgpr */
> -   add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); 
> /* user vgpr */
> -   add_arg(args, ARG_VGPR, ctx->ac.i32, 
> &ctx->abi.instance_id);
> +   if (ctx->options->key.vs_common_out.as_ngg) {
> +   add_arg(args, ARG_VGPR, ctx->ac.i32, 
> NULL); /* user vgpr */
> +   add_arg(args, ARG_VGPR, ctx->ac.i32, 
> NULL); /* user vgpr */
> +   add_arg(args, ARG_VGPR, ctx->ac.i32, 
> &ctx->abi.instance_id);
> +   } else {
> +   add_arg(args, ARG_VGPR, ctx->ac.i32, 
> NULL); /* unused */
> +   add_arg(args, ARG_VGPR, ctx->ac.i32, 
> &ctx->vs_prim_id);
> +   add_arg(args, ARG_VGPR, ctx->ac.i32, 
> &ctx->abi.instance_id);
> +   }
> } else {
> add_arg(args, ARG_VGPR, ctx->ac.i32, 
> &ctx->abi.instance_id);
> add_arg(args, ARG_VGPR, ctx->ac.i32, 
> &ctx->vs_prim_id);
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 3adaf52e152..3d1b56e7f60 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -765,7 +765,7 @@ static void radv_postprocess_config(const struct 
> radv_physical_device *pdevice,
> if (info->vs.export_prim_id) {
> vgpr_comp_cnt = 2;
> } else if (info->info.vs.needs_instance_id) {
> -   vgpr_comp_cnt = 1;
> +   vgpr_comp_cnt = pdevice->rad_info.chip_class 
> >= GFX10 ? 3 : 1;
> } else {
> vgpr_comp_cnt = 0;
> }
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv/gfx10: fix VS input VGPRs with the legacy path

2019-07-23 Thread Bas Nieuwenhuizen

So does this work with tests that use multiple instances?

If so, r-b.

On Tue, Jul 23, 2019 at 1:29 PM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 336bae28614..9cea92e8a69 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -851,7 +851,8 @@ declare_vs_input_vgprs(struct radv_shader_context *ctx, 
> struct arg_info *args)
> add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); 
> /* unused */
> }
> } else {
> -   if (ctx->ac.chip_class >= GFX10) {
> +   if (ctx->ac.chip_class >= GFX10 &&
> +   ctx->options->key.vs_common_out.as_ngg) {
> add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); 
> /* user vgpr */
> add_arg(args, ARG_VGPR, ctx->ac.i32, NULL); 
> /* user vgpr */
> add_arg(args, ARG_VGPR, ctx->ac.i32, 
> &ctx->abi.instance_id);
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv/gfx10: enable CLEAR_STATE

2019-07-23 Thread Bas Nieuwenhuizen

r-b

On Tue, Jul 23, 2019 at 8:37 AM Samuel Pitoiset
 wrote:
>
> It actually works.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_device.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 992e12840f7..93b03afda22 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -354,8 +354,7 @@ radv_physical_device_init(struct radv_physical_device 
> *device,
> /* The mere presence of CLEAR_STATE in the IB causes random GPU hangs
>  * on GFX6.
>  */
> -   device->has_clear_state = device->rad_info.chip_class >= GFX7 &&
> - device->rad_info.chip_class <= GFX9;
> +   device->has_clear_state = device->rad_info.chip_class >= GFX7;
>
> device->cpdma_prefetch_writes_memory = device->rad_info.chip_class <= 
> GFX8;
>
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv: fix dumping disassembly with RADV_DEBUG=shaders

2019-07-23 Thread Bas Nieuwenhuizen

r-b

On Tue, Jul 23, 2019 at 9:51 AM Samuel Pitoiset
 wrote:
>
> Fixes: a20a9d0c5e7 ("radv: dont store disasm string unless keep_shader_info 
> flag set")
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_shader.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 3adaf52e152..736388c555c 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -1013,7 +1013,8 @@ radv_shader_variant_create(struct radv_device *device,
> return NULL;
> }
>
> -   if (device->keep_shader_info) {
> +   if (device->keep_shader_info ||
> +   (device->instance->debug_flags & 
> RADV_DEBUG_DUMP_SHADERS)) {
> const char *disasm_data;
> size_t disasm_size;
> if (!ac_rtld_get_section_by_name(&rtld_binary, 
> ".AMDGPU.disasm", &disasm_data, &disasm_size)) {
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 2/2] radv/gfx10: correctly determine the number of vertices per primitive

2019-07-22 Thread Bas Nieuwenhuizen

On Mon, Jul 22, 2019 at 6:01 PM Ilia Mirkin  wrote:
>
> On Mon, Jul 22, 2019 at 11:49 AM Samuel Pitoiset
>  wrote:
> >
> > For TES as NGG.
> >
> > Signed-off-by: Samuel Pitoiset 
> > ---
> >  src/amd/vulkan/radv_nir_to_llvm.c | 17 -
> >  1 file changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> > b/src/amd/vulkan/radv_nir_to_llvm.c
> > index 336bae28614..6e5a283f923 100644
> > --- a/src/amd/vulkan/radv_nir_to_llvm.c
> > +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> > @@ -112,6 +112,7 @@ struct radv_shader_context {
> > unsigned gs_max_out_vertices;
> > unsigned gs_output_prim;
> >
> > +   unsigned tes_point_mode;
> > unsigned tes_primitive_mode;
> >
> > uint32_t tcs_patch_outputs_read;
> > @@ -3304,7 +3305,6 @@ handle_ngg_outputs_post(struct radv_shader_context 
> > *ctx)
> >  {
> > LLVMBuilderRef builder = ctx->ac.builder;
> > struct ac_build_if_state if_state;
> > -   unsigned num_vertices = 3;
> > LLVMValueRef tmp;
> >
> > assert((ctx->stage == MESA_SHADER_VERTEX ||
> > @@ -3322,6 +3322,20 @@ handle_ngg_outputs_post(struct radv_shader_context 
> > *ctx)
> > ac_unpack_param(&ctx->ac, ctx->gs_vtx_offset[2], 0, 16),
> > };
> >
> > +   /* Determine the number of vertices per primitive. */
> > +   unsigned num_vertices;
> > +
> > +   if (ctx->stage == MESA_SHADER_VERTEX) {
> > +   num_vertices = 3; /* TODO: optimize for points & lines */
> > +   } else {
> > +   if (ctx->tes_point_mode)
> > +   num_vertices = 1;
> > +   else if (ctx->tes_primitive_mode == GL_LINES)
> > +   num_vertices = 2;
> > +   else
> > +   num_vertices = 3;
> > +   }
> > +
> > /* TODO: streamout */
> >
> > /* Copy Primitive IDs from GS threads to the LDS address 
> > corresponding
> > @@ -4435,6 +4449,7 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct 
> > ac_llvm_compiler *ac_llvm,
> > ctx.tcs_num_inputs = 
> > util_last_bit64(shader_info->info.vs.ls_outputs_written);
> > ctx.tcs_num_patches = get_tcs_num_patches();
> > } else if (shaders[i]->info.stage == MESA_SHADER_TESS_EVAL) 
> > {
> > +   ctx.tes_point_mode = 
> > shaders[i]->info.tess.point_mode;
>
> Drive-by-comment without reading the full context...
>
> What if there's e.g. a GS which produces not-points? This bool will be
> set, and the logic above will say num_vertices = 1, which presumably
> is bad.

The invariant you're probably missing here is that
handle_ngg_outputs_post only gets called if there is no GS.
(And the GS epilogue does not care about these tessellation variables.)
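
To make that invariant concrete, here is a minimal standalone sketch of the selection logic from the patch. The enum names and the function name are placeholders invented for illustration (the patch itself compares tes_primitive_mode against GL_LINES), so treat this as an approximation rather than the driver code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the shader stage and tessellation mode enums. */
enum stage { STAGE_VERTEX, STAGE_TESS_EVAL, STAGE_GEOMETRY };
enum tess_prim { TESS_PRIM_TRIANGLES, TESS_PRIM_QUADS, TESS_PRIM_ISOLINES };

/* Vertices per output primitive for the last stage before rasterization.
 * Only meaningful when no geometry shader is present. */
unsigned num_vertices_per_prim(enum stage stage, bool tes_point_mode,
                               enum tess_prim tes_prim)
{
    /* The invariant from the reply: this path is never taken with a GS,
     * so the TES point mode cannot be overridden by GS output here. */
    assert(stage == STAGE_VERTEX || stage == STAGE_TESS_EVAL);

    if (stage == STAGE_VERTEX)
        return 3; /* conservative; the patch leaves points/lines as a TODO */
    if (tes_point_mode)
        return 1;
    if (tes_prim == TESS_PRIM_ISOLINES)
        return 2;
    return 3;
}
```

With a GS in the pipeline the assert would fire, which is the point: the GS path has its own epilogue and never reaches this function.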

>
>   -ilia
>
> > ctx.tes_primitive_mode = 
> > shaders[i]->info.tess.primitive_mode;
> > ctx.abi.load_tess_varyings = load_tes_input;
> > ctx.abi.load_tess_coord = load_tess_coord;
> > --
> > 2.22.0
> >

Re: [Mesa-dev] [PATCH] radv: fix crash in vkCmdClearAttachments with unused attachment

2019-07-22 Thread Bas Nieuwenhuizen

r-b

On Mon, Jul 22, 2019 at 10:09 AM Samuel Pitoiset
 wrote:
>
> depth_stencil_attachment and/or ds_resolve attachment can be NULL.
>
> This fixes crashes with
> dEQP-VK.renderpass.suballocation.unused_clear_attachments.*
>
> Cc: 19.1 
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_meta_clear.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_meta_clear.c 
> b/src/amd/vulkan/radv_meta_clear.c
> index dd2ba402f40..b93ba3e0b29 100644
> --- a/src/amd/vulkan/radv_meta_clear.c
> +++ b/src/amd/vulkan/radv_meta_clear.c
> @@ -1688,7 +1688,7 @@ emit_clear(struct radv_cmd_buffer *cmd_buffer,
> if (ds_resolve_clear)
> ds_att = subpass->ds_resolve_attachment;
>
> -   if (ds_att->attachment == VK_ATTACHMENT_UNUSED)
> +   if (!ds_att || ds_att->attachment == VK_ATTACHMENT_UNUSED)
> return;
>
> VkImageLayout image_layout = ds_att->layout;
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] ac/nir: fix txf_ms with an offset

2019-07-21 Thread Bas Nieuwenhuizen

r-b

On Fri, Jul 19, 2019 at 9:19 PM Rhys Perry  wrote:
>
> Seems to fix some hair artifacts in Max Payne 3:
> https://github.com/daniel-schuermann/mesa/issues/76
>
> Signed-off-by: Rhys Perry 
> Fixes: f4e499ec791 ('radv: add initial non-conformant radv vulkan driver')
> ---
>  src/amd/common/ac_nir_to_llvm.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index 96bf89a8bf9..549a26ea243 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -3784,7 +3784,7 @@ static void visit_tex(struct ac_nir_context *ctx, 
> nir_tex_instr *instr)
> goto write_result;
> }
>
> -   if (args.offset && instr->op != nir_texop_txf) {
> +   if (args.offset && instr->op != nir_texop_txf && instr->op != 
> nir_texop_txf_ms) {
> LLVMValueRef offset[3], pack;
> for (unsigned chan = 0; chan < 3; ++chan)
> offset[chan] = ctx->ac.i32_0;
> @@ -3919,7 +3919,7 @@ static void visit_tex(struct ac_nir_context *ctx, 
> nir_tex_instr *instr)
> args.coords[sample_chan], fmask_ptr);
> }
>
> -   if (args.offset && instr->op == nir_texop_txf) {
> +   if (args.offset && (instr->op == nir_texop_txf || instr->op == 
> nir_texop_txf_ms)) {
> int num_offsets = 
> instr->src[offset_src].src.ssa->num_components;
> num_offsets = MIN2(num_offsets, instr->coord_components);
> for (unsigned i = 0; i < num_offsets; ++i) {
> --
> 2.21.0
>

Re: [Mesa-dev] [PATCH 7/7] radv/gfx10: update descriptors for inline uniform blocks

2019-07-21 Thread Bas Nieuwenhuizen

On Thu, Jul 18, 2019 at 3:51 PM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 6feb55e3916..19dcae3a476 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -1373,9 +1373,16 @@ radv_load_resource(struct ac_shader_abi *abi, 
> LLVMValueRef index,
> uint32_t desc_type = S_008F0C_DST_SEL_X(V_008F0C_SQ_SEL_X) |
> S_008F0C_DST_SEL_Y(V_008F0C_SQ_SEL_Y) |
> S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
> -   S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W) |
> -   S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_FLOAT) |
> -   S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
> +   S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W);
> +
> +   if (ctx->ac.chip_class >= GFX10) {
> +   desc_type |= 
> S_008F0C_FORMAT(V_008F0C_IMG_FORMAT_32_FLOAT) |
> +S_008F0C_OOB_SELECT(3) |

We really should get some enum/define values for OOB_SELECT.

Anyway, not a blocker

Reviewed-by: Bas Nieuwenhuizen 

for the series
> +S_008F0C_RESOURCE_LEVEL(1);
> +   } else {
> +   desc_type |= 
> S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_FLOAT) |
> +
> S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
> +   }
>
> LLVMValueRef desc_components[4] = {
> LLVMBuildPtrToInt(ctx->ac.builder, desc_ptr, 
> ctx->ac.intptr, ""),
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH v2 2/3] radv/gfx10: do not emit VGT_GS_MODE

2019-07-18 Thread Bas Nieuwenhuizen

We might want to merge this into patch 1: with only patch 1 applied,
R_028A84_VGT_PRIMITIVEID_EN gets emitted twice.

Either way, r-b for the series.

On Thu, Jul 18, 2019 at 10:14 AM Samuel Pitoiset
 wrote:
>
> Unnecessary.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_pipeline.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index bcb7ccc803d..b11d79f4811 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -3274,6 +3274,9 @@ radv_pipeline_generate_vgt_gs_mode(struct radeon_cmdbuf 
> *ctx_cs,
> unsigned vgt_primitiveid_en = 0;
> uint32_t vgt_gs_mode = 0;
>
> +   if (radv_pipeline_has_ngg(pipeline))
> +   return;
> +
> if (radv_pipeline_has_gs(pipeline)) {
> const struct radv_shader_variant *gs =
> pipeline->shaders[MESA_SHADER_GEOMETRY];
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 4/4] radv/gfx10: do not always execute a barrier before the second shader

2019-07-18 Thread Bas Nieuwenhuizen

r-b

On Thu, Jul 18, 2019 at 10:04 AM Samuel Pitoiset
 wrote:
>
>
> On 7/18/19 2:29 AM, Bas Nieuwenhuizen wrote:
> > On Wed, Jul 17, 2019 at 3:44 PM Samuel Pitoiset
> >  wrote:
> >> With NGG, empty waves may still be required to export data.
> >>
> >> This fixes dEQP-VK.ycbcr.format.*_unorm.geometry_*.
> >>
> >> Signed-off-by: Samuel Pitoiset 
> >> ---
> >>   src/amd/vulkan/radv_nir_to_llvm.c | 31 ++-
> >>   1 file changed, 30 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> >> b/src/amd/vulkan/radv_nir_to_llvm.c
> >> index 3e18303879e..7e623414adc 100644
> >> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> >> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> >> @@ -4448,8 +4448,37 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct 
> >> ac_llvm_compiler *ac_llvm,
> >>  declare_esgs_ring();
> >>  }
> >>
> >> -   if (i)
> >> +   bool nested_barrier = false;
> >> +
> >> +   if (i) {
> >> +   if (shaders[i]->info.stage == MESA_SHADER_GEOMETRY 
> >> &&
> >> +   ctx.options->key.vs_common_out.as_ngg) {
> >> +   nested_barrier = false;
> >> +   } else {
> >> +   nested_barrier = true;
> >> +   }
> >> +   }
> > We can simplify this to
> >
> > nested_barrier = i && (shaders[i]->info.stage != MESA_SHADER_GEOMETRY
> > || !ctx.options->key.vs_common_out.as_ngg);
> >
> > Otherwise r-b, I'm just surprised an s_barrier is okay.
> I'm going to move the NGG GS prologue into that inner if, so I would
> prefer to keep this way.
> >> +
> >> +   if (nested_barrier) {
> >> +   /* Execute a barrier before the second shader in
> >> +* a merged shader.
> >> +*
> >> +* Execute the barrier inside the conditional 
> >> block,
> >> +* so that empty waves can jump directly to 
> >> s_endpgm,
> >> +* which will also signal the barrier.
> >> +*
> >> +* This is possible in gfx9, because an empty wave
> >> +* for the second shader does not participate in
> >> +* the epilogue. With NGG, empty waves may still
> >> +* be required to export data (e.g. GS output 
> >> vertices),
> >> +* so we cannot let them exit early.
> >> +*
> >> +* If the shader is TCS and the TCS epilog is 
> >> present
> >> +* and contains a barrier, it will wait there and 
> >> then
> >> +* reach s_endpgm.
> >> +   */
> >>  ac_emit_barrier(, ctx.stage);
> >> +   }
> >>
> >>  nir_foreach_variable(variable, [i]->outputs)
> >>  scan_shader_output_decl(, variable, 
> >> shaders[i], shaders[i]->info.stage);
> >> --
> >> 2.22.0
> >>

Re: [Mesa-dev] [PATCH 4/4] radv/gfx10: do not always execute a barrier before the second shader

2019-07-17 Thread Bas Nieuwenhuizen

On Wed, Jul 17, 2019 at 3:44 PM Samuel Pitoiset
 wrote:
>
> With NGG, empty waves may still be required to export data.
>
> This fixes dEQP-VK.ycbcr.format.*_unorm.geometry_*.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 31 ++-
>  1 file changed, 30 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 3e18303879e..7e623414adc 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -4448,8 +4448,37 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct 
> ac_llvm_compiler *ac_llvm,
> declare_esgs_ring();
> }
>
> -   if (i)
> +   bool nested_barrier = false;
> +
> +   if (i) {
> +   if (shaders[i]->info.stage == MESA_SHADER_GEOMETRY &&
> +   ctx.options->key.vs_common_out.as_ngg) {
> +   nested_barrier = false;
> +   } else {
> +   nested_barrier = true;
> +   }
> +   }

We can simplify this to

nested_barrier = i && (shaders[i]->info.stage != MESA_SHADER_GEOMETRY
|| !ctx.options->key.vs_common_out.as_ngg);

Otherwise r-b, I'm just surprised an s_barrier is okay.
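
For what it's worth, the two forms can be checked against each other exhaustively. The sketch below is standalone: the function names are made up, and the two booleans stand in for the `shaders[i]->info.stage == MESA_SHADER_GEOMETRY` and `ctx.options->key.vs_common_out.as_ngg` conditions from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Original form from the patch: nested if/else. */
bool nested_barrier_branchy(int i, bool is_geometry, bool as_ngg)
{
    bool nested_barrier = false;

    assert(i >= 0); /* i is a shader index, never negative */
    if (i) {
        if (is_geometry && as_ngg)
            nested_barrier = false;
        else
            nested_barrier = true;
    }
    return nested_barrier;
}

/* Suggested single-expression form. */
bool nested_barrier_simplified(int i, bool is_geometry, bool as_ngg)
{
    return i && (!is_geometry || !as_ngg);
}

/* Count the inputs on which the two forms disagree. */
int count_mismatches(void)
{
    int mismatches = 0;

    for (int i = 0; i < 2; i++)
        for (int g = 0; g < 2; g++)
            for (int n = 0; n < 2; n++)
                if (nested_barrier_branchy(i, g, n) !=
                    nested_barrier_simplified(i, g, n))
                    mismatches++;
    return mismatches;
}
```

The only case where the barrier is skipped for a second shader is a geometry shader compiled as NGG, in both forms.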
> +
> +   if (nested_barrier) {
> +   /* Execute a barrier before the second shader in
> +* a merged shader.
> +*
> +* Execute the barrier inside the conditional block,
> +* so that empty waves can jump directly to s_endpgm,
> +* which will also signal the barrier.
> +*
> +* This is possible in gfx9, because an empty wave
> +* for the second shader does not participate in
> +* the epilogue. With NGG, empty waves may still
> +* be required to export data (e.g. GS output 
> vertices),
> +* so we cannot let them exit early.
> +*
> +* If the shader is TCS and the TCS epilog is present
> +* and contains a barrier, it will wait there and then
> +* reach s_endpgm.
> +   */
> ac_emit_barrier(, ctx.stage);
> +   }
>
> nir_foreach_variable(variable, [i]->outputs)
> scan_shader_output_decl(, variable, shaders[i], 
> shaders[i]->info.stage);
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 3/4] radv/gfx10: set BREAK_WAVE_AT_EOI if TES or GS enable the primitive ID

2019-07-17 Thread Bas Nieuwenhuizen

On Wed, Jul 17, 2019 at 3:44 PM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_pipeline.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index de933937f03..8b6e62a75f5 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -3452,6 +3452,14 @@ radv_pipeline_generate_hw_ngg(struct radeon_cmdbuf 
> *ctx_cs,
> bool break_wave_at_eoi = false;
> unsigned nparams;
>
> +   if (es_type == MESA_SHADER_TESS_EVAL) {
> +   struct radv_shader_variant *gs =
> +   pipeline->shaders[MESA_SHADER_GEOMETRY];
> +
> +   if (es_enable_prim_id || (gs && gs->info.info.uses_prim_id))
> +   break_wave_at_eoi = true;
> +   }
> +

r-b
> nparams = MAX2(outinfo->param_exports, 1);
> radeon_set_context_reg(ctx_cs, R_0286C4_SPI_VS_OUT_CONFIG,
>S_0286C4_VS_EXPORT_COUNT(nparams - 1) |
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 1/4] radv: move emitting VGT_GS_MODE into the HW VS path

2019-07-17 Thread Bas Nieuwenhuizen

On Thu, Jul 18, 2019 at 2:05 AM Bas Nieuwenhuizen
 wrote:
>
> On Wed, Jul 17, 2019 at 3:44 PM Samuel Pitoiset
>  wrote:
> >
> > It's useless for NGG anyways.
> >
> > Signed-off-by: Samuel Pitoiset 
> > ---
> >  src/amd/vulkan/radv_pipeline.c | 43 ++
> >  1 file changed, 33 insertions(+), 10 deletions(-)
> >
> > diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> > index fdeb31c453e..686fd371f0f 100644
> > --- a/src/amd/vulkan/radv_pipeline.c
> > +++ b/src/amd/vulkan/radv_pipeline.c
> > @@ -3272,27 +3272,18 @@ radv_pipeline_generate_vgt_gs_mode(struct 
> > radeon_cmdbuf *ctx_cs,
>
> Can you rename the function?

Actually, now that I see your later patches, how about we keep this
function, return immediately for NGG, and then move the NGG primitive ID
handling into the NGG code path?
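
Roughly what I have in mind, as a standalone sketch. Everything here is a hypothetical stand-in (the fake_* types, the register enum, and the function names are not the radv API); it only models which function emits which register:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real radv types; not the driver's API. */
enum reg { REG_VGT_GS_MODE, REG_VGT_PRIMITIVEID_EN, REG_COUNT };

struct fake_cs {
    unsigned writes[REG_COUNT]; /* how many times each register was emitted */
};

struct fake_pipeline {
    bool has_ngg;
};

void emit_reg(struct fake_cs *cs, enum reg r)
{
    assert(r < REG_COUNT);
    cs->writes[r]++;
}

/* Legacy path: return early for NGG so nothing is emitted from here. */
void generate_vgt_gs_mode(struct fake_cs *cs, const struct fake_pipeline *p)
{
    if (p->has_ngg)
        return;

    emit_reg(cs, REG_VGT_PRIMITIVEID_EN);
    emit_reg(cs, REG_VGT_GS_MODE);
}

/* NGG path: owns its own primitive ID programming. */
void generate_hw_ngg(struct fake_cs *cs)
{
    emit_reg(cs, REG_VGT_PRIMITIVEID_EN);
}

void generate_pipeline(struct fake_cs *cs, const struct fake_pipeline *p)
{
    generate_vgt_gs_mode(cs, p);
    if (p->has_ngg)
        generate_hw_ngg(cs);
}
```

With that split, an NGG pipeline never touches VGT_GS_MODE and writes VGT_PRIMITIVEID_EN exactly once, and the legacy path keeps both writes in one place.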


>
>
> > pipeline->shaders[MESA_SHADER_TESS_EVAL] :
> > pipeline->shaders[MESA_SHADER_VERTEX];
> > unsigned vgt_primitiveid_en = 0;
> > -   uint32_t vgt_gs_mode = 0;
> >
> > -   if (radv_pipeline_has_gs(pipeline)) {
> > -   const struct radv_shader_variant *gs =
> > -   pipeline->shaders[MESA_SHADER_GEOMETRY];
> > -
> > -   vgt_gs_mode = ac_vgt_gs_mode(gs->info.gs.vertices_out,
> > -
> > pipeline->device->physical_device->rad_info.chip_class);
> > -   } else if (radv_pipeline_has_ngg(pipeline)) {
> > +   if (radv_pipeline_has_ngg(pipeline)) {
> > bool enable_prim_id =
> > outinfo->export_prim_id || 
> > vs->info.info.uses_prim_id;
> >
> > vgt_primitiveid_en |= 
> > S_028A84_PRIMITIVEID_EN(enable_prim_id) |
> >   
> > S_028A84_NGG_DISABLE_PROVOK_REUSE(enable_prim_id);
> > } else if (outinfo->export_prim_id || vs->info.info.uses_prim_id) {
> > -   vgt_gs_mode = S_028A40_MODE(V_028A40_GS_SCENARIO_A);
> > vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(1);
> > }
> >
> > radeon_set_context_reg(ctx_cs, R_028A84_VGT_PRIMITIVEID_EN, 
> > vgt_primitiveid_en);
> > -   radeon_set_context_reg(ctx_cs, R_028A40_VGT_GS_MODE, vgt_gs_mode);
> >  }
> >
> >  static void
> > @@ -3370,6 +3361,38 @@ radv_pipeline_generate_hw_vs(struct radeon_cmdbuf 
> > *ctx_cs,
> >cull_dist_mask << 8 |
> >clip_dist_mask);
> >
> > +   /* We always write VGT_GS_MODE in the VS state, because every switch
> > +* between different shader pipelines involving a different GS or 
> > no GS
> > +* at all involves a switch of the VS (different GS use different 
> > copy
> > +* shaders). On the other hand, when the API switches from a GS to 
> > no
> > +* GS and then back to the same GS used originally, the GS state is 
> > not
> > +* sent again.
> > +*/
> > +   unsigned vgt_gs_mode;
> > +   if (!radv_pipeline_has_gs(pipeline)) {
> > +   const struct radv_vs_output_info *outinfo =
> > +   get_vs_output_info(pipeline);
> > +   const struct radv_shader_variant *vs =
> > +   pipeline->shaders[MESA_SHADER_TESS_EVAL] ?
> > +   pipeline->shaders[MESA_SHADER_TESS_EVAL] :
> > +   pipeline->shaders[MESA_SHADER_VERTEX];
> > +   unsigned mode = V_028A40_GS_OFF;
> > +
> > +   /* PrimID needs GS scenario A. */
> > +   if (outinfo->export_prim_id || vs->info.info.uses_prim_id)
> > +   mode = V_028A40_GS_SCENARIO_A;
> > +
> > +   vgt_gs_mode = S_028A40_MODE(mode);
> > +   } else {
> > +   const struct radv_shader_variant *gs =
> > +   pipeline->shaders[MESA_SHADER_GEOMETRY];
> > +
> > +   vgt_gs_mode = ac_vgt_gs_mode(gs->info.gs.vertices_out,
> > +
> > pipeline->device->physical_device->rad_info.chip_class);
> > +   }
> > +
> > +   radeon_set_context_reg(ctx_cs, R_028A40_VGT_GS_MODE, vgt_gs_mode);
> > +
>
> Can you keep this in a separate function (possibly with the name
> radv_pipeline_generate_vgt_gs_mode)?
> > if (pipeline->device->physical_device->rad_info.chip_class <= GFX8)
> > radeon_set_context_reg(ctx_cs, R_028AB4_VGT_REUSE_OFF,
> >outinfo->writes_viewport_index);
> > --
> > 2.22.0
> >

Re: [Mesa-dev] [PATCH 1/4] radv: move emitting VGT_GS_MODE into the HW VS path

2019-07-17 Thread Bas Nieuwenhuizen

On Wed, Jul 17, 2019 at 3:44 PM Samuel Pitoiset
 wrote:
>
> It's useless for NGG anyways.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_pipeline.c | 43 ++
>  1 file changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index fdeb31c453e..686fd371f0f 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -3272,27 +3272,18 @@ radv_pipeline_generate_vgt_gs_mode(struct 
> radeon_cmdbuf *ctx_cs,

Can you rename the function?


> pipeline->shaders[MESA_SHADER_TESS_EVAL] :
> pipeline->shaders[MESA_SHADER_VERTEX];
> unsigned vgt_primitiveid_en = 0;
> -   uint32_t vgt_gs_mode = 0;
>
> -   if (radv_pipeline_has_gs(pipeline)) {
> -   const struct radv_shader_variant *gs =
> -   pipeline->shaders[MESA_SHADER_GEOMETRY];
> -
> -   vgt_gs_mode = ac_vgt_gs_mode(gs->info.gs.vertices_out,
> -
> pipeline->device->physical_device->rad_info.chip_class);
> -   } else if (radv_pipeline_has_ngg(pipeline)) {
> +   if (radv_pipeline_has_ngg(pipeline)) {
> bool enable_prim_id =
> outinfo->export_prim_id || vs->info.info.uses_prim_id;
>
> vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(enable_prim_id) 
> |
>   
> S_028A84_NGG_DISABLE_PROVOK_REUSE(enable_prim_id);
> } else if (outinfo->export_prim_id || vs->info.info.uses_prim_id) {
> -   vgt_gs_mode = S_028A40_MODE(V_028A40_GS_SCENARIO_A);
> vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(1);
> }
>
> radeon_set_context_reg(ctx_cs, R_028A84_VGT_PRIMITIVEID_EN, 
> vgt_primitiveid_en);
> -   radeon_set_context_reg(ctx_cs, R_028A40_VGT_GS_MODE, vgt_gs_mode);
>  }
>
>  static void
> @@ -3370,6 +3361,38 @@ radv_pipeline_generate_hw_vs(struct radeon_cmdbuf 
> *ctx_cs,
>cull_dist_mask << 8 |
>clip_dist_mask);
>
> +   /* We always write VGT_GS_MODE in the VS state, because every switch
> +* between different shader pipelines involving a different GS or no 
> GS
> +* at all involves a switch of the VS (different GS use different copy
> +* shaders). On the other hand, when the API switches from a GS to no
> +* GS and then back to the same GS used originally, the GS state is 
> not
> +* sent again.
> +*/
> +   unsigned vgt_gs_mode;
> +   if (!radv_pipeline_has_gs(pipeline)) {
> +   const struct radv_vs_output_info *outinfo =
> +   get_vs_output_info(pipeline);
> +   const struct radv_shader_variant *vs =
> +   pipeline->shaders[MESA_SHADER_TESS_EVAL] ?
> +   pipeline->shaders[MESA_SHADER_TESS_EVAL] :
> +   pipeline->shaders[MESA_SHADER_VERTEX];
> +   unsigned mode = V_028A40_GS_OFF;
> +
> +   /* PrimID needs GS scenario A. */
> +   if (outinfo->export_prim_id || vs->info.info.uses_prim_id)
> +   mode = V_028A40_GS_SCENARIO_A;
> +
> +   vgt_gs_mode = S_028A40_MODE(mode);
> +   } else {
> +   const struct radv_shader_variant *gs =
> +   pipeline->shaders[MESA_SHADER_GEOMETRY];
> +
> +   vgt_gs_mode = ac_vgt_gs_mode(gs->info.gs.vertices_out,
> +
> pipeline->device->physical_device->rad_info.chip_class);
> +   }
> +
> +   radeon_set_context_reg(ctx_cs, R_028A40_VGT_GS_MODE, vgt_gs_mode);
> +

Can you keep this in a separate function (possibly with the name
radv_pipeline_generate_vgt_gs_mode)?
> if (pipeline->device->physical_device->rad_info.chip_class <= GFX8)
> radeon_set_context_reg(ctx_cs, R_028AB4_VGT_REUSE_OFF,
>outinfo->writes_viewport_index);
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv: fix VGT_GS_MODE if VS uses the primitive ID

2019-07-17 Thread Bas Nieuwenhuizen

r-b

On Wed, Jul 17, 2019 at 10:54 AM Samuel Pitoiset
 wrote:
>
> Found by inspection.
>
> Cc: 
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_pipeline.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index a3323ae8135..f6cb3611c9d 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -3264,6 +3264,10 @@ radv_pipeline_generate_vgt_gs_mode(struct 
> radeon_cmdbuf *ctx_cs,
> struct radv_pipeline *pipeline)
>  {
> const struct radv_vs_output_info *outinfo = 
> get_vs_output_info(pipeline);
> +   const struct radv_shader_variant *vs =
> +   pipeline->shaders[MESA_SHADER_TESS_EVAL] ?
> +   pipeline->shaders[MESA_SHADER_TESS_EVAL] :
> +   pipeline->shaders[MESA_SHADER_VERTEX];
> unsigned vgt_primitiveid_en = 0;
> uint32_t vgt_gs_mode = 0;
>
> @@ -3274,16 +3278,12 @@ radv_pipeline_generate_vgt_gs_mode(struct 
> radeon_cmdbuf *ctx_cs,
> vgt_gs_mode = ac_vgt_gs_mode(gs->info.gs.vertices_out,
>  
> pipeline->device->physical_device->rad_info.chip_class);
> } else if (radv_pipeline_has_ngg(pipeline)) {
> -   const struct radv_shader_variant *vs =
> -   pipeline->shaders[MESA_SHADER_TESS_EVAL] ?
> -   pipeline->shaders[MESA_SHADER_TESS_EVAL] :
> -   pipeline->shaders[MESA_SHADER_VERTEX];
> bool enable_prim_id =
> outinfo->export_prim_id || vs->info.info.uses_prim_id;
>
> vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(enable_prim_id) 
> |
>   
> S_028A84_NGG_DISABLE_PROVOK_REUSE(enable_prim_id);
> -   } else if (outinfo->export_prim_id) {
> +   } else if (outinfo->export_prim_id || vs->info.info.uses_prim_id) {
> vgt_gs_mode = S_028A40_MODE(V_028A40_GS_SCENARIO_A);
> vgt_primitiveid_en |= S_028A84_PRIMITIVEID_EN(1);
> }
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv/gfx10: set the pgm rsrc3/4 regs using index sh reg set

2019-07-17 Thread Bas Nieuwenhuizen

r-b

On Tue, Jul 16, 2019 at 7:28 AM Dave Airlie  wrote:
>
> From: Dave Airlie 
>
> This is ported from AMDVLK, it's probably not required unless
> we want to use "real time queues", but it might be nice to just have
> in place.
> ---
>  src/amd/common/sid.h   |  1 +
>  src/amd/vulkan/radv_cs.h   | 18 +++
>  src/amd/vulkan/si_cmd_buffer.c | 42 +++---
>  3 files changed, 42 insertions(+), 19 deletions(-)
>
> diff --git a/src/amd/common/sid.h b/src/amd/common/sid.h
> index d464b6a110e..0b996e54884 100644
> --- a/src/amd/common/sid.h
> +++ b/src/amd/common/sid.h
> @@ -196,6 +196,7 @@
>  #define PKT3_INCREMENT_CE_COUNTER  0x84
>  #define PKT3_INCREMENT_DE_COUNTER  0x85
>  #define PKT3_WAIT_ON_CE_COUNTER0x86
> +#define PKT3_SET_SH_REG_INDEX  0x9B
>  #define PKT3_LOAD_CONTEXT_REG  0x9F /* new for VI */
>
>  #define PKT_TYPE_S(x)   (((unsigned)(x) & 0x3) << 30)
> diff --git a/src/amd/vulkan/radv_cs.h b/src/amd/vulkan/radv_cs.h
> index eb1aedb0327..d21acba7e8e 100644
> --- a/src/amd/vulkan/radv_cs.h
> +++ b/src/amd/vulkan/radv_cs.h
> @@ -97,6 +97,24 @@ static inline void radeon_set_sh_reg(struct radeon_cmdbuf 
> *cs, unsigned reg, uns
> radeon_emit(cs, value);
>  }
>
> +static inline void radeon_set_sh_reg_idx(const struct radv_physical_device 
> *pdevice,
> +struct radeon_cmdbuf *cs,
> +unsigned reg, unsigned idx,
> +unsigned value)
> +{
> +   assert(reg >= SI_SH_REG_OFFSET && reg < SI_SH_REG_END);
> +   assert(cs->cdw + 3 <= cs->max_dw);
> +   assert(idx);
> +
> +   unsigned opcode = PKT3_SET_SH_REG_INDEX;
> +   if (pdevice->rad_info.chip_class < GFX10)
> +   opcode = PKT3_SET_SH_REG;
> +
> +   radeon_emit(cs, PKT3(opcode, 1, 0));
> +   radeon_emit(cs, (reg - SI_SH_REG_OFFSET) >> 2 | (idx << 28));
> +   radeon_emit(cs, value);
> +}
> +
>  static inline void radeon_set_uconfig_reg_seq(struct radeon_cmdbuf *cs, 
> unsigned reg, unsigned num)
>  {
> assert(reg >= CIK_UCONFIG_REG_OFFSET && reg < CIK_UCONFIG_REG_END);
> diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
> index a832dbd89eb..f789cdd1ce6 100644
> --- a/src/amd/vulkan/si_cmd_buffer.c
> +++ b/src/amd/vulkan/si_cmd_buffer.c
> @@ -262,20 +262,24 @@ si_emit_graphics(struct radv_physical_device 
> *physical_device,
> if (physical_device->rad_info.chip_class >= GFX7) {
> if (physical_device->rad_info.chip_class >= GFX10) {
> /* Logical CUs 16 - 31 */
> -   radeon_set_sh_reg(cs, 
> R_00B404_SPI_SHADER_PGM_RSRC4_HS,
> - S_00B404_CU_EN(0x));
> -   radeon_set_sh_reg(cs, 
> R_00B204_SPI_SHADER_PGM_RSRC4_GS,
> - S_00B204_CU_EN(0x) |
> - 
> S_00B204_SPI_SHADER_LATE_ALLOC_GS_GFX10(0));
> -   radeon_set_sh_reg(cs, 
> R_00B104_SPI_SHADER_PGM_RSRC4_VS,
> - S_00B104_CU_EN(0x));
> -   radeon_set_sh_reg(cs, 
> R_00B004_SPI_SHADER_PGM_RSRC4_PS,
> - S_00B004_CU_EN(0x));
> +   radeon_set_sh_reg_idx(physical_device,
> + cs, 
> R_00B404_SPI_SHADER_PGM_RSRC4_HS,
> + 3, S_00B404_CU_EN(0x));
> +   radeon_set_sh_reg_idx(physical_device,
> + cs, 
> R_00B204_SPI_SHADER_PGM_RSRC4_GS,
> + 3, S_00B204_CU_EN(0x) |
> + 
> S_00B204_SPI_SHADER_LATE_ALLOC_GS_GFX10(0));
> +   radeon_set_sh_reg_idx(physical_device,
> + cs, 
> R_00B104_SPI_SHADER_PGM_RSRC4_VS,
> + 3, S_00B104_CU_EN(0x));
> +   radeon_set_sh_reg_idx(physical_device,
> + cs, 
> R_00B004_SPI_SHADER_PGM_RSRC4_PS,
> + 3, S_00B004_CU_EN(0x));
> }
>
> if (physical_device->rad_info.chip_class >= GFX9) {
> -   radeon_set_sh_reg(cs, 
> R_00B41C_SPI_SHADER_PGM_RSRC3_HS,
> - S_00B41C_CU_EN(0x) | 
> S_00B41C_WAVE_LIMIT(0x3F));
> +   radeon_set_sh_reg_idx(physical_device, cs, 
> R_00B41C_SPI_SHADER_PGM_RSRC3_HS,
> + 3, S_00B41C_CU_EN(0x) | 
> S_00B41C_WAVE_LIMIT(0x3F));
> } else {
>

[Mesa-dev] [PATCH] radv: Only save the descriptor set if we have one.

2019-07-16 Thread Bas Nieuwenhuizen

After a reset, the descriptor set pointer can be non-NULL but still
invalid if the valid mask does not contain the relevant bit.

CC: 
---
 src/amd/vulkan/radv_meta.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_meta.c b/src/amd/vulkan/radv_meta.c
index 5e619c2f181..448a6168bd2 100644
--- a/src/amd/vulkan/radv_meta.c
+++ b/src/amd/vulkan/radv_meta.c
@@ -86,7 +86,7 @@ radv_meta_save(struct radv_meta_saved_state *state,
 
if (state->flags & RADV_META_SAVE_DESCRIPTORS) {
state->old_descriptor_set0 = descriptors_state->sets[0];
-   if (!state->old_descriptor_set0)
+   if (!(descriptors_state->valid & 1) || 
!state->old_descriptor_set0)
state->flags &= ~RADV_META_SAVE_DESCRIPTORS;
}
 
-- 
2.21.0


Re: [Mesa-dev] [PATCH] radv/gfx10: implement VK_EXT_post_depth_coverage

2019-07-16 Thread Bas Nieuwenhuizen

r-b

On Tue, Jul 16, 2019 at 5:11 PM Samuel Pitoiset
 wrote:
>
> I did implement this extension a while ago but it didn't work
> on pre-GFX10 for some reason. Now all CTS tests pass.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_extensions.py | 1 +
>  src/amd/vulkan/radv_nir_to_llvm.c | 1 +
>  src/amd/vulkan/radv_pipeline.c| 1 +
>  src/amd/vulkan/radv_shader.c  | 1 +
>  src/amd/vulkan/radv_shader.h  | 1 +
>  5 files changed, 5 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_extensions.py 
> b/src/amd/vulkan/radv_extensions.py
> index 8b6ba6a4df0..e9addad0035 100644
> --- a/src/amd/vulkan/radv_extensions.py
> +++ b/src/amd/vulkan/radv_extensions.py
> @@ -120,6 +120,7 @@ EXTENSIONS = [
>  Extension('VK_EXT_memory_priority',   1, True),
>  Extension('VK_EXT_pci_bus_info',  2, True),
>  Extension('VK_EXT_pipeline_creation_feedback',1, True),
> +Extension('VK_EXT_post_depth_coverage',   1, 
> 'device->rad_info.chip_class >= GFX10'),
>  Extension('VK_EXT_queue_family_foreign',  1, True),
>  Extension('VK_EXT_sample_locations',  1, True),
>  Extension('VK_EXT_sampler_filter_minmax', 1, 
> 'device->rad_info.chip_class >= GFX7'),
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index a689003d473..3e18303879e 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -4637,6 +4637,7 @@ ac_fill_shader_info(struct radv_shader_variant_info 
> *shader_info, struct nir_sha
>  break;
>  case MESA_SHADER_FRAGMENT:
>  shader_info->fs.early_fragment_test = 
> nir->info.fs.early_fragment_tests;
> +shader_info->fs.post_depth_coverage = 
> nir->info.fs.post_depth_coverage;
>  break;
>  case MESA_SHADER_GEOMETRY:
>  shader_info->gs.vertices_in = nir->info.gs.vertices_in;
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index 31495ec078d..7056ac8ca60 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -3822,6 +3822,7 @@ radv_compute_db_shader_control(const struct radv_device 
> *device,
> S_02880C_MASK_EXPORT_ENABLE(mask_export_enable) |
> S_02880C_Z_ORDER(z_order) |
> S_02880C_DEPTH_BEFORE_SHADER(ps->info.fs.early_fragment_test) 
> |
> +   
> S_02880C_PRE_SHADER_DEPTH_COVERAGE_ENABLE(ps->info.fs.post_depth_coverage) |
> S_02880C_EXEC_ON_HIER_FAIL(ps->info.info.ps.writes_memory) |
> S_02880C_EXEC_ON_NOOP(ps->info.info.ps.writes_memory) |
> S_02880C_DUAL_QUAD_DISABLE(disable_rbplus);
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 1e9399de193..75f1ce3e869 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -270,6 +270,7 @@ radv_shader_compile_to_nir(struct radv_device *device,
> .int64_atomics = true,
> .multiview = true,
> .physical_storage_buffer_address = true,
> +   .post_depth_coverage = true,
> .runtime_descriptor_array = true,
> .shader_viewport_index_layer = true,
> .stencil_export = true,
> diff --git a/src/amd/vulkan/radv_shader.h b/src/amd/vulkan/radv_shader.h
> index 360591349a8..fea0d1c8df1 100644
> --- a/src/amd/vulkan/radv_shader.h
> +++ b/src/amd/vulkan/radv_shader.h
> @@ -283,6 +283,7 @@ struct radv_shader_variant_info {
> uint32_t float16_shaded_mask;
> bool can_discard;
> bool early_fragment_test;
> +   bool post_depth_coverage;
> } fs;
> struct {
> unsigned block_size[3];
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 2/2] radv/gfx10: fallback to the legacy path if tess and extreme geometry

2019-07-16 Thread Bas Nieuwenhuizen

r-b for the series

On Tue, Jul 16, 2019 at 4:39 PM Samuel Pitoiset
 wrote:
>
> This is unsupported and hangs.
>
> This fixes GPU hangs with
> dEQP-VK.tessellation.geometry_interaction.limits.output_required_*.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_pipeline.c | 12 
>  src/amd/vulkan/radv_shader.c   |  2 +-
>  2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index d1eede172dc..a22e605ca1c 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -2306,6 +2306,18 @@ radv_fill_shader_keys(struct radv_device *device,
> } else {
> keys[MESA_SHADER_VERTEX].vs_common_out.as_ngg = true;
> }
> +
> +   if (nir[MESA_SHADER_TESS_CTRL] &&
> +   nir[MESA_SHADER_GEOMETRY] &&
> +   nir[MESA_SHADER_GEOMETRY]->info.gs.invocations *
> +   nir[MESA_SHADER_GEOMETRY]->info.gs.vertices_out > 256) {
> +   /* Fallback to the legacy path if tessellation is
> +* enabled with extreme geometry because
> +* EN_MAX_VERT_OUT_PER_GS_INSTANCE doesn't work and it
> +* might hang.
> +*/
> +   keys[MESA_SHADER_TESS_EVAL].vs_common_out.as_ngg = 
> false;
> +   }
> }
>
> for(int i = 0; i < MESA_SHADER_STAGES; ++i)
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 1e9399de193..6bafcb2f869 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -796,7 +796,7 @@ static void radv_postprocess_config(const struct 
> radv_physical_device *pdevice,
> break;
> }
>
> -   if (pdevice->rad_info.chip_class >= GFX10 &&
> +   if (pdevice->rad_info.chip_class >= GFX10 && info->is_ngg &&
> (stage == MESA_SHADER_VERTEX || stage == MESA_SHADER_TESS_EVAL || 
> stage == MESA_SHADER_GEOMETRY)) {
> unsigned gs_vgpr_comp_cnt, es_vgpr_comp_cnt;
> gl_shader_stage es_stage = stage;
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv/gfx10: disable the TC compat zrange workaround

2019-07-16 Thread Bas Nieuwenhuizen

r-b

On Tue, Jul 16, 2019 at 5:35 PM Samuel Pitoiset
 wrote:
>
> Unnecessary.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 7 ++-
>  src/amd/vulkan/radv_device.c | 2 ++
>  src/amd/vulkan/radv_image.c  | 7 ---
>  src/amd/vulkan/radv_private.h| 1 +
>  4 files changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index a6d4e0d0e21..b4301c0da15 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1356,7 +1356,8 @@ radv_update_zrange_precision(struct radv_cmd_buffer 
> *cmd_buffer,
> uint32_t db_z_info = ds->db_z_info;
> uint32_t db_z_info_reg;
>
> -   if (!radv_image_is_tc_compat_htile(image))
> +   if (!cmd_buffer->device->physical_device->has_tc_compat_zrange_bug ||
> +   !radv_image_is_tc_compat_htile(image))
> return;
>
> if (!radv_layout_has_htile(image, layout,
> @@ -1566,6 +1567,10 @@ radv_set_tc_compat_zrange_metadata(struct 
> radv_cmd_buffer *cmd_buffer,
>  {
> struct radeon_cmdbuf *cs = cmd_buffer->cs;
> uint64_t va = radv_buffer_get_va(image->bo);
> +
> +   if (!cmd_buffer->device->physical_device->has_tc_compat_zrange_bug)
> +   return;
> +
> va += image->offset + image->tc_compat_zrange_offset;
>
> radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, 
> cmd_buffer->state.predicating));
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 9d75305fc2b..b397a9a8aa0 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -363,6 +363,8 @@ radv_physical_device_init(struct radv_physical_device 
> *device,
> device->has_scissor_bug = device->rad_info.family == CHIP_VEGA10 ||
>   device->rad_info.family == CHIP_RAVEN;
>
> +   device->has_tc_compat_zrange_bug = device->rad_info.chip_class < 
> GFX10;
> +
> /* Out-of-order primitive rasterization. */
> device->has_out_of_order_rast = device->rad_info.chip_class >= GFX8 &&
> device->rad_info.max_se >= 2;
> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> index ccbec36849e..4d3ed71c23c 100644
> --- a/src/amd/vulkan/radv_image.c
> +++ b/src/amd/vulkan/radv_image.c
> @@ -1186,14 +1186,15 @@ radv_image_alloc_dcc(struct radv_image *image)
>  }
>
>  static void
> -radv_image_alloc_htile(struct radv_image *image)
> +radv_image_alloc_htile(struct radv_device *device, struct radv_image *image)
>  {
> image->htile_offset = align64(image->size, 
> image->planes[0].surface.htile_alignment);
>
> /* + 8 for storing the clear values */
> image->clear_value_offset = image->htile_offset + 
> image->planes[0].surface.htile_size;
> image->size = image->clear_value_offset + 8;
> -   if (radv_image_is_tc_compat_htile(image)) {
> +   if (radv_image_is_tc_compat_htile(image) &&
> +   device->physical_device->has_tc_compat_zrange_bug) {
> /* Metadata for the TC-compatible HTILE hardware bug which
>  * have to be fixed by updating ZRANGE_PRECISION when doing
>  * fast depth clears to 0.0f.
> @@ -1402,7 +1403,7 @@ radv_image_create(VkDevice _device,
> if (radv_image_can_enable_htile(image) &&
> !(device->instance->debug_flags & 
> RADV_DEBUG_NO_HIZ)) {
> image->tc_compatible_htile = 
> image->planes[0].surface.flags & RADEON_SURF_TC_COMPATIBLE_HTILE;
> -   radv_image_alloc_htile(image);
> +   radv_image_alloc_htile(device, image);
> } else {
> radv_image_disable_htile(image);
> }
> diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
> index e1b5b456ef3..931d4039397 100644
> --- a/src/amd/vulkan/radv_private.h
> +++ b/src/amd/vulkan/radv_private.h
> @@ -317,6 +317,7 @@ struct radv_physical_device {
> bool has_clear_state;
> bool cpdma_prefetch_writes_memory;
> bool has_scissor_bug;
> +   bool has_tc_compat_zrange_bug;
>
> bool has_out_of_order_rast;
> bool out_of_order_rast_allowed;
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv/gfx10: don't set array pitch field on images

2019-07-15 Thread Bas Nieuwenhuizen

r-b

On Tue, Jul 16, 2019 at 1:25 AM Dave Airlie  wrote:
>
> From: Dave Airlie 
>
> Setting this seems to be broken; amdvlk only sets it for quilted
> textures, and I'm not sure what those are.
>
> Fixes: dEQP-VK.glsl.texture_functions.query.texturesize*3d*
> ---
>  src/amd/vulkan/radv_image.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> index ccbec36849e..66a948fde4a 100644
> --- a/src/amd/vulkan/radv_image.c
> +++ b/src/amd/vulkan/radv_image.c
> @@ -682,7 +682,7 @@ gfx10_make_texture_descriptor(struct radv_device *device,
>  */
> state[4] = S_00A010_DEPTH(type == V_008F1C_SQ_RSRC_IMG_3D ? depth - 1 
> : last_layer) |
>S_00A010_BASE_ARRAY(first_layer);
> -   state[5] = S_00A014_ARRAY_PITCH(!!(type == V_008F1C_SQ_RSRC_IMG_3D)) |
> +   state[5] = S_00A014_ARRAY_PITCH(0) |
>S_00A014_MAX_MIP(image->info.samples > 1 ?
> util_logbase2(image->info.samples) :
> image->info.levels - 1) |
> --
> 2.17.1
>

Re: [Mesa-dev] [PATCH] radv: remove unused code in radv_export_param()

2019-07-15 Thread Bas Nieuwenhuizen

R-b

On Mon, Jul 15, 2019, 8:49 AM Samuel Pitoiset 
wrote:

> It was a hack for geometry shaders.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 16 +---
>  1 file changed, 1 insertion(+), 15 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 00c7df8574b..a5eb8404108 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -2593,21 +2593,7 @@ radv_export_param(struct radv_shader_context *ctx,
> unsigned index,
>  static LLVMValueRef
>  radv_load_output(struct radv_shader_context *ctx, unsigned index,
> unsigned chan)
>  {
> -   LLVMValueRef output;
> -
> -   if (ctx->vertexptr) {
> -   LLVMValueRef gep_idx[3] = {
> -   ctx->ac.i32_0, /* implicit C-style array */
> -   ctx->ac.i32_0, /* second value of struct */
> -   ctx->ac.i32_1, /* stream 1: source data index */
> -   };
> -
> -   gep_idx[2] = LLVMConstInt(ctx->ac.i32,
> ac_llvm_reg_index_soa(index, chan), false);
> -   output = LLVMBuildGEP(ctx->ac.builder, ctx->vertexptr,
> gep_idx, 3, "");
> -   } else {
> -   output = ctx->abi.outputs[ac_llvm_reg_index_soa(index,
> chan)];
> -   }
> -
> +   LLVMValueRef output =
> ctx->abi.outputs[ac_llvm_reg_index_soa(index, chan)];
> return LLVMBuildLoad(ctx->ac.builder, output, "");
>  }
>
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv/gfx10: add missing conversions for 16-bit exports

2019-07-15 Thread Bas Nieuwenhuizen

And R-b after suggestion

On Mon, Jul 15, 2019, 11:22 PM Bas Nieuwenhuizen 
wrote:

>
>
> On Mon, Jul 15, 2019, 10:45 AM Samuel Pitoiset 
> wrote:
>
>> This fixes
>> dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_*
>>
>> Found with RADV_DEBUG=checkir
>>
>> Signed-off-by: Samuel Pitoiset 
>> ---
>>  src/amd/vulkan/radv_nir_to_llvm.c | 9 +
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c
>> b/src/amd/vulkan/radv_nir_to_llvm.c
>> index 339c9d93423..fa26a450a91 100644
>> --- a/src/amd/vulkan/radv_nir_to_llvm.c
>> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
>> @@ -3691,6 +3691,13 @@ static void gfx10_ngg_gs_emit_epilogue_2(struct
>> radv_shader_context *ctx)
>> gep_idx[2] = LLVMConstInt(ctx->ac.i32,
>> out_idx, false);
>> tmp = LLVMBuildGEP(builder, vertexptr,
>> gep_idx, 3, "");
>> tmp = LLVMBuildLoad(builder, tmp, "");
>> +
>> +   LLVMTypeRef type =
>> LLVMGetAllocatedType(ctx->abi.outputs[ac_llvm_reg_index_soa(i, j)]);
>> +   if (ac_get_type_size(type) == 2) {
>> +   tmp =
>> LLVMBuildBitCast(ctx->ac.builder, tmp, ctx->ac.i32, "");
>>
>
> Can we use ac_to_integer here? That way we don't need to care about
> floatness.
>
> +   tmp =
>> LLVMBuildTrunc(ctx->ac.builder, tmp, ctx->ac.i16, "");
>> +   }
>> +
>> outputs[noutput].values[j] =
>> ac_to_float(&ctx->ac, tmp);
>> }
>>
>> @@ -3771,6 +3778,8 @@ static void gfx10_ngg_gs_emit_vertex(struct
>> radv_shader_context *ctx,
>> LLVMValueRef ptr = LLVMBuildGEP(builder,
>> vertexptr, gep_idx, 3, "");
>>
>> out_val = ac_to_integer(&ctx->ac, out_val);
>> +   out_val = LLVMBuildZExtOrBitCast(ctx->ac.builder,
>> out_val, ctx->ac.i32, "");
>> +
>> LLVMBuildStore(builder, out_val, ptr);
>> }
>> }
>> --
>> 2.22.0
>>

Re: [Mesa-dev] [PATCH] radv/gfx10: add missing conversions for 16-bit exports

2019-07-15 Thread Bas Nieuwenhuizen

On Mon, Jul 15, 2019, 10:45 AM Samuel Pitoiset 
wrote:

> This fixes
> dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_*
>
> Found with RADV_DEBUG=checkir
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 339c9d93423..fa26a450a91 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -3691,6 +3691,13 @@ static void gfx10_ngg_gs_emit_epilogue_2(struct
> radv_shader_context *ctx)
> gep_idx[2] = LLVMConstInt(ctx->ac.i32,
> out_idx, false);
> tmp = LLVMBuildGEP(builder, vertexptr,
> gep_idx, 3, "");
> tmp = LLVMBuildLoad(builder, tmp, "");
> +
> +   LLVMTypeRef type =
> LLVMGetAllocatedType(ctx->abi.outputs[ac_llvm_reg_index_soa(i, j)]);
> +   if (ac_get_type_size(type) == 2) {
> +   tmp =
> LLVMBuildBitCast(ctx->ac.builder, tmp, ctx->ac.i32, "");
>

Can we use ac_to_integer here? That way we don't need to care about
floatness.
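
To make the intent concrete, here is a rough plain-C model of the idea, not the actual LLVM builder code. It assumes each output slot is 32 bits wide and that a 16-bit value is stored zero-extended and read back by truncation; the example_* names are hypothetical and exist only for illustration. Reinterpreting the bits as an integer first is what lets the truncation ignore whether the value started out as a float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 32-bit output slot holding a 16-bit value.
 * Storing zero-extends the 16-bit payload to 32 bits. */
static uint32_t example_store_16bit(uint16_t payload)
{
    return (uint32_t)payload;
}

/* Loading truncates the 32-bit slot back to its low 16 bits. */
static uint16_t example_load_16bit(uint32_t slot)
{
    return (uint16_t)slot;
}

/* Reinterpret a 32-bit float as an integer without changing any bits.
 * Once the value is an integer, the truncation above no longer depends
 * on the original type. Assumes IEEE 754 single precision. */
static uint32_t example_bits_of_float(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

With this model, a store followed by a load returns the original 16-bit payload, which is the round trip the patch needs between emitting a vertex and exporting it.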

+   tmp =
> LLVMBuildTrunc(ctx->ac.builder, tmp, ctx->ac.i16, "");
> +   }
> +
> outputs[noutput].values[j] =
> ac_to_float(&ctx->ac, tmp);
> }
>
> @@ -3771,6 +3778,8 @@ static void gfx10_ngg_gs_emit_vertex(struct
> radv_shader_context *ctx,
> LLVMValueRef ptr = LLVMBuildGEP(builder,
> vertexptr, gep_idx, 3, "");
>
> out_val = ac_to_integer(&ctx->ac, out_val);
> +   out_val = LLVMBuildZExtOrBitCast(ctx->ac.builder,
> out_val, ctx->ac.i32, "");
> +
> LLVMBuildStore(builder, out_val, ptr);
> }
> }
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv/gfx10: enable OC_LDS_EN for NGG GS if the ES stage is TES

2019-07-15 Thread Bas Nieuwenhuizen

r-b

On Mon, Jul 15, 2019 at 6:46 PM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_shader.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index f6b0297d4a3..1e9399de193 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -826,7 +826,8 @@ static void radv_postprocess_config(const struct 
> radv_physical_device *pdevice,
> config_out->rsrc1 |= 
> S_00B228_GS_VGPR_COMP_CNT(gs_vgpr_comp_cnt) |
>  S_00B228_WGP_MODE(1);
> config_out->rsrc2 |= 
> S_00B22C_ES_VGPR_COMP_CNT(es_vgpr_comp_cnt) |
> -S_00B22C_LDS_SIZE(config_in->lds_size);
> +S_00B22C_LDS_SIZE(config_in->lds_size) |
> +S_00B22C_OC_LDS_EN(es_stage == 
> MESA_SHADER_TESS_EVAL);
> } else if (pdevice->rad_info.chip_class >= GFX9 &&
>stage == MESA_SHADER_GEOMETRY) {
> unsigned es_type = info->gs.es_type;
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH v2] radv/gfx10: enable 1D textures

2019-07-12 Thread Bas Nieuwenhuizen

R-b

On Fri, Jul 12, 2019, 8:17 AM Samuel Pitoiset 
wrote:

> Mirror RadeonSI. This also fixes crashes in addrlib.
>
> v2: - fix ac_nir_to_llvm
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/common/ac_nir_to_llvm.c| 14 +++---
>  src/amd/vulkan/radv_image.c|  4 ++--
>  src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c |  6 --
>  3 files changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c
> b/src/amd/common/ac_nir_to_llvm.c
> index 1fbbe507eae..96bf89a8bf9 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -84,7 +84,7 @@ get_ac_sampler_dim(const struct ac_llvm_context *ctx,
> enum glsl_sampler_dim dim,
>  {
> switch (dim) {
> case GLSL_SAMPLER_DIM_1D:
> -   if (ctx->chip_class >= GFX9)
> +   if (ctx->chip_class == GFX9)
> return is_array ? ac_image_2darray : ac_image_2d;
> return is_array ? ac_image_1darray : ac_image_1d;
> case GLSL_SAMPLER_DIM_2D:
> @@ -1360,7 +1360,7 @@ static LLVMValueRef build_tex_intrinsic(struct
> ac_nir_context *ctx,
> }
>
> /* Fixup for GFX9 which allocates 1D textures as 2D. */
> -   if (instr->op == nir_texop_lod && ctx->ac.chip_class >= GFX9) {
> +   if (instr->op == nir_texop_lod && ctx->ac.chip_class == GFX9) {
> if ((args->dim == ac_image_2darray ||
>  args->dim == ac_image_2d) && !args->coords[1]) {
> args->coords[1] = ctx->ac.i32_0;
> @@ -2334,7 +2334,7 @@ static void get_image_coords(struct ac_nir_context
> *ctx,
>   dim ==
> GLSL_SAMPLER_DIM_SUBPASS_MS);
> bool is_ms = (dim == GLSL_SAMPLER_DIM_MS ||
>   dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
> -   bool gfx9_1d = ctx->ac.chip_class >= GFX9 && dim ==
> GLSL_SAMPLER_DIM_1D;
> +   bool gfx9_1d = ctx->ac.chip_class == GFX9 && dim ==
> GLSL_SAMPLER_DIM_1D;
> assert(!add_frag_pos && "Input attachments should be lowered by
> this point.");
> count = image_type_to_components_count(dim, is_array);
>
> @@ -2706,7 +2706,7 @@ static LLVMValueRef visit_image_size(struct
> ac_nir_context *ctx,
> z = LLVMBuildSDiv(ctx->ac.builder, z, six, "");
> res = LLVMBuildInsertElement(ctx->ac.builder, res, z, two,
> "");
> }
> -   if (ctx->ac.chip_class >= GFX9 && dim == GLSL_SAMPLER_DIM_1D &&
> is_array) {
> +   if (ctx->ac.chip_class == GFX9 && dim == GLSL_SAMPLER_DIM_1D &&
> is_array) {
> LLVMValueRef layers =
> LLVMBuildExtractElement(ctx->ac.builder, res, two, "");
> res = LLVMBuildInsertElement(ctx->ac.builder, res, layers,
> ctx->ac.i32_1, "");
> @@ -3829,7 +3829,7 @@ static void visit_tex(struct ac_nir_context *ctx,
> nir_tex_instr *instr)
> break;
> case GLSL_SAMPLER_DIM_1D:
> num_src_deriv_channels = 1;
> -   if (ctx->ac.chip_class >= GFX9) {
> +   if (ctx->ac.chip_class == GFX9) {
> num_dest_deriv_channels = 2;
> } else {
> num_dest_deriv_channels = 1;
> @@ -3877,7 +3877,7 @@ static void visit_tex(struct ac_nir_context *ctx,
> nir_tex_instr *instr)
> args.coords[2] = apply_round_slice(&ctx->ac,
> args.coords[2]);
> }
>
> -   if (ctx->ac.chip_class >= GFX9 &&
> +   if (ctx->ac.chip_class == GFX9 &&
> instr->sampler_dim == GLSL_SAMPLER_DIM_1D &&
> instr->op != nir_texop_lod) {
> LLVMValueRef filler;
> @@ -3963,7 +3963,7 @@ static void visit_tex(struct ac_nir_context *ctx,
> nir_tex_instr *instr)
> LLVMValueRef z = LLVMBuildExtractElement(ctx->ac.builder,
> result, two, "");
> z = LLVMBuildSDiv(ctx->ac.builder, z, six, "");
> result = LLVMBuildInsertElement(ctx->ac.builder, result,
> z, two, "");
> -   } else if (ctx->ac.chip_class >= GFX9 &&
> +   } else if (ctx->ac.chip_class == GFX9 &&
>instr->op == nir_texop_txs &&
>instr->sampler_dim == GLSL_SAMPLER_DIM_1D &&
>instr->is_array) {
> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> index 368bd5d839d..ccbec36849e 100644
> --- a/src/amd/vulkan/radv_image.c
> +++ b/src/amd/vulkan/radv_image.c
> @@ -649,7 +649,7 @@ gfx10_make_texture_descriptor(struct radv_device
> *device,
> }
>
> type = radv_tex_dim(image->type, view_type,
> image->info.array_size, image->info.samples,
> -   is_storage_image,
> device->physical_device->rad_info.chip_class >= GFX9);
> +   is_storage_image,
>

Re: [Mesa-dev] [PATCH 8/8] radv/gfx10: emit DISABLE_CONSERVATIVE_ZPASS_COUNTS

2019-07-12 Thread Bas Nieuwenhuizen

r-b for the series

On Fri, Jul 12, 2019 at 12:17 PM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index dacd8c8d803..86b5c812405 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1993,10 +1993,12 @@ void radv_set_db_count_control(struct radv_cmd_buffer 
> *cmd_buffer)
> } else {
> const struct radv_subpass *subpass = 
> cmd_buffer->state.subpass;
> uint32_t sample_rate = subpass ? 
> util_logbase2(subpass->max_sample_count) : 0;
> +   bool gfx10_perfect = 
> cmd_buffer->device->physical_device->rad_info.chip_class >= GFX10 && 
> has_perfect_queries;
>
> if (cmd_buffer->device->physical_device->rad_info.chip_class 
> >= GFX7) {
> db_count_control =
> 
> S_028004_PERFECT_ZPASS_COUNTS(has_perfect_queries) |
> +   
> S_028004_DISABLE_CONSERVATIVE_ZPASS_COUNTS(gfx10_perfect) |
> S_028004_SAMPLE_RATE(sample_rate) |
> S_028004_ZPASS_ENABLE(1) |
> S_028004_SLICE_EVEN_ENABLE(1) |
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 1/2] radv: store a pointer to rad_info in the pipeline

2019-07-12 Thread Bas Nieuwenhuizen

Please don't introduce multiple ways to do the same thing (this applies to
both patches in the series).

On Fri, Jul 12, 2019 at 10:39 AM Samuel Pitoiset
 wrote:
>
> Cleanup.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_pipeline.c | 173 -
>  src/amd/vulkan/radv_private.h  |   2 +
>  2 files changed, 87 insertions(+), 88 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index 9b68650fd36..e81afdd426c 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -178,7 +178,7 @@ radv_pipeline_scratch_init(struct radv_device *device,
>   
> pipeline->shaders[i]->config.scratch_bytes_per_wave);
>
> max_stage_waves = MIN2(max_stage_waves,
> - 4 * 
> device->physical_device->rad_info.num_good_compute_units *
> + 4 * pipeline->info->num_good_compute_units *
>   (256 / 
> pipeline->shaders[i]->config.num_vgprs));
> max_waves = MAX2(max_waves, max_stage_waves);
> }
> @@ -1092,7 +1092,7 @@ radv_pipeline_init_multisample_state(struct 
> radv_pipeline *pipeline,
>  {
> const VkPipelineMultisampleStateCreateInfo *vkms = 
> pCreateInfo->pMultisampleState;
> struct radv_multisample_state *ms = &pipeline->graphics.ms;
> -   unsigned num_tile_pipes = 
> pipeline->device->physical_device->rad_info.num_tile_pipes;
> +   unsigned num_tile_pipes = pipeline->info->num_tile_pipes;
> bool out_of_order_rast = false;
> int ps_iter_samples = 1;
> uint32_t mask = 0x;
> @@ -1141,7 +1141,7 @@ radv_pipeline_init_multisample_state(struct 
> radv_pipeline *pipeline,
> S_028A4C_MULTI_SHADER_ENGINE_PRIM_DISCARD_ENABLE(1) |
> S_028A4C_FORCE_EOV_CNTDWN_ENABLE(1) |
> S_028A4C_FORCE_EOV_REZ_ENABLE(1);
> -   ms->pa_sc_mode_cntl_0 = 
> S_028A48_ALTERNATE_RBS_PER_TILE(pipeline->device->physical_device->rad_info.chip_class
>  >= GFX9) |
> +   ms->pa_sc_mode_cntl_0 = 
> S_028A48_ALTERNATE_RBS_PER_TILE(pipeline->info->chip_class >= GFX9) |
> S_028A48_VPORT_SCISSOR_ENABLE(1);
>
> if (ms->num_samples > 1) {
> @@ -1492,7 +1492,7 @@ calculate_gs_info(const VkGraphicsPipelineCreateInfo 
> *pCreateInfo,
> struct radv_gs_state gs = {0};
> struct radv_shader_variant_info *gs_info = 
> &pipeline->shaders[MESA_SHADER_GEOMETRY]->info;
> struct radv_es_output_info *es_info;
> -   if (pipeline->device->physical_device->rad_info.chip_class >= GFX9)
> +   if (pipeline->info->chip_class >= GFX9)
> es_info = radv_pipeline_has_tess(pipeline) ? 
> &gs_info->tes.es_info : &gs_info->vs.es_info;
> else
> es_info = radv_pipeline_has_tess(pipeline) ?
> @@ -1835,15 +1835,14 @@ calculate_ngg_info(const VkGraphicsPipelineCreateInfo 
> *pCreateInfo,
>  static void
>  calculate_gs_ring_sizes(struct radv_pipeline *pipeline, const struct 
> radv_gs_state *gs)
>  {
> -   struct radv_device *device = pipeline->device;
> -   unsigned num_se = device->physical_device->rad_info.max_se;
> +   unsigned num_se = pipeline->info->max_se;
> unsigned wave_size = 64;
> unsigned max_gs_waves = 32 * num_se; /* max 32 per SE on GCN */
> /* On GFX6-GFX7, the value comes from VGT_GS_VERTEX_REUSE = 16.
>  * On GFX8+, the value comes from VGT_VERTEX_REUSE_BLOCK_CNTL = 30 
> (+2).
>  */
> unsigned gs_vertex_reuse =
> -   (device->physical_device->rad_info.chip_class >= GFX8 ? 32 : 
> 16) * num_se;
> +   (pipeline->info->chip_class >= GFX8 ? 32 : 16) * num_se;
> unsigned alignment = 256 * num_se;
> /* The maximum size is 63.999 MB per SE. */
> unsigned max_size = ((unsigned)(63.999 * 1024 * 1024) & ~255) * 
> num_se;
> @@ -1862,13 +1861,13 @@ calculate_gs_ring_sizes(struct radv_pipeline 
> *pipeline, const struct radv_gs_sta
> esgs_ring_size = align(esgs_ring_size, alignment);
> gsvs_ring_size = align(gsvs_ring_size, alignment);
>
> -   if (pipeline->device->physical_device->rad_info.chip_class <= GFX8)
> +   if (pipeline->info->chip_class <= GFX8)
> pipeline->graphics.esgs_ring_size = CLAMP(esgs_ring_size, 
> min_esgs_ring_size, max_size);
>
> pipeline->graphics.gsvs_ring_size = MIN2(gsvs_ring_size, max_size);
>  }
>
> -static void si_multiwave_lds_size_workaround(struct radv_device *device,
> +static void si_multiwave_lds_size_workaround(struct radv_pipeline *pipeline,
>  unsigned *lds_size)
>  {
> /* If tessellation is all offchip and on-chip GS isn't used, this
> @@ -1880,8 +1879,8 @@ static void si_multiwave_lds_size_workaround(struct 
> radv_device *device,
>  *   Make

Re: [Mesa-dev] [PATCH 1/2] radv: add more assertions to make sure packets are correctly emitted

2019-07-12 Thread Bas Nieuwenhuizen

Okay, r-b for the series then.

On Fri, Jul 12, 2019 at 12:00 PM Samuel Pitoiset
 wrote:
>
>
> On 7/12/19 11:54 AM, Bas Nieuwenhuizen wrote:
> > On Fri, Jul 12, 2019 at 11:13 AM Samuel Pitoiset
> >  wrote:
> >> Signed-off-by: Samuel Pitoiset 
> >> ---
> >>   src/amd/vulkan/radv_cs.h | 6 +++---
> >>   1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/src/amd/vulkan/radv_cs.h b/src/amd/vulkan/radv_cs.h
> >> index 5f8b59c34cb..2ba7da1fb44 100644
> >> --- a/src/amd/vulkan/radv_cs.h
> >> +++ b/src/amd/vulkan/radv_cs.h
> >> @@ -42,7 +42,7 @@ static inline unsigned radeon_check_space(struct 
> >> radeon_winsys *ws,
> >>
> >>   static inline void radeon_set_config_reg_seq(struct radeon_cmdbuf *cs, 
> >> unsigned reg, unsigned num)
> >>   {
> >> -assert(reg < SI_CONTEXT_REG_OFFSET);
> >> +assert(reg < SI_CONTEXT_REG_OFFSET && reg < SI_CONFIG_REG_END);
> > Shouldn't the first condition be "reg >= SI_CONFIG_REG_OFFSET" ?
> Right, will fix before pushing.
> >
> >
> >>   assert(cs->cdw + 2 + num <= cs->max_dw);
> >>   assert(num);
> >>   radeon_emit(cs, PKT3(PKT3_SET_CONFIG_REG, num, 0));
> >> @@ -57,7 +57,7 @@ static inline void radeon_set_config_reg(struct 
> >> radeon_cmdbuf *cs, unsigned reg,
> >>
> >>   static inline void radeon_set_context_reg_seq(struct radeon_cmdbuf *cs, 
> >> unsigned reg, unsigned num)
> >>   {
> >> -assert(reg >= SI_CONTEXT_REG_OFFSET);
> >> +assert(reg >= SI_CONTEXT_REG_OFFSET && reg < SI_CONTEXT_REG_END);
> >>   assert(cs->cdw + 2 + num <= cs->max_dw);
> >>   assert(num);
> >>   radeon_emit(cs, PKT3(PKT3_SET_CONTEXT_REG, num, 0));
> >> @@ -75,7 +75,7 @@ static inline void radeon_set_context_reg_idx(struct 
> >> radeon_cmdbuf *cs,
> >>unsigned reg, unsigned idx,
> >>unsigned value)
> >>   {
> >> -   assert(reg >= SI_CONTEXT_REG_OFFSET);
> >> +   assert(reg >= SI_CONTEXT_REG_OFFSET && reg < SI_CONTEXT_REG_END);
> >>  assert(cs->cdw + 3 <= cs->max_dw);
> >>  radeon_emit(cs, PKT3(PKT3_SET_CONTEXT_REG, 1, 0));
> >>  radeon_emit(cs, (reg - SI_CONTEXT_REG_OFFSET) >> 2 | (idx << 28));
> >> --
> >> 2.22.0
> >>

Re: [Mesa-dev] [PATCH 1/2] radv: add more assertions to make sure packets are correctly emitted

2019-07-12 Thread Bas Nieuwenhuizen

On Fri, Jul 12, 2019 at 11:13 AM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cs.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cs.h b/src/amd/vulkan/radv_cs.h
> index 5f8b59c34cb..2ba7da1fb44 100644
> --- a/src/amd/vulkan/radv_cs.h
> +++ b/src/amd/vulkan/radv_cs.h
> @@ -42,7 +42,7 @@ static inline unsigned radeon_check_space(struct 
> radeon_winsys *ws,
>
>  static inline void radeon_set_config_reg_seq(struct radeon_cmdbuf *cs, 
> unsigned reg, unsigned num)
>  {
> -assert(reg < SI_CONTEXT_REG_OFFSET);
> +assert(reg < SI_CONTEXT_REG_OFFSET && reg < SI_CONFIG_REG_END);

Shouldn't the first condition be "reg >= SI_CONFIG_REG_OFFSET" ?
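
For illustration, a minimal sketch of the half-open range check I have in mind. The EXAMPLE_* constants are placeholder values chosen for this sketch (an assumption, not copied from the driver headers), and the example_* helpers are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder register ranges, assumed for this sketch only. */
enum {
    EXAMPLE_CONFIG_REG_OFFSET  = 0x08000,
    EXAMPLE_CONFIG_REG_END     = 0x0B000,
    EXAMPLE_CONTEXT_REG_OFFSET = 0x28000,
    EXAMPLE_CONTEXT_REG_END    = 0x29000,
};

/* Suggested form: lower bound inclusive, upper bound exclusive. */
static bool example_is_config_reg(unsigned reg)
{
    return reg >= EXAMPLE_CONFIG_REG_OFFSET && reg < EXAMPLE_CONFIG_REG_END;
}

/* Same shape for context registers, as the patch already does. */
static bool example_is_context_reg(unsigned reg)
{
    return reg >= EXAMPLE_CONTEXT_REG_OFFSET && reg < EXAMPLE_CONTEXT_REG_END;
}

/* The condition as posted: two upper bounds and no lower bound, so a
 * register below the config range still passes. */
static bool example_check_as_posted(unsigned reg)
{
    return reg < EXAMPLE_CONTEXT_REG_OFFSET && reg < EXAMPLE_CONFIG_REG_END;
}
```

Under these assumed values, a register offset such as 0x100 passes the check as posted but is rejected by the suggested one.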


>  assert(cs->cdw + 2 + num <= cs->max_dw);
>  assert(num);
>  radeon_emit(cs, PKT3(PKT3_SET_CONFIG_REG, num, 0));
> @@ -57,7 +57,7 @@ static inline void radeon_set_config_reg(struct 
> radeon_cmdbuf *cs, unsigned reg,
>
>  static inline void radeon_set_context_reg_seq(struct radeon_cmdbuf *cs, 
> unsigned reg, unsigned num)
>  {
> -assert(reg >= SI_CONTEXT_REG_OFFSET);
> +assert(reg >= SI_CONTEXT_REG_OFFSET && reg < SI_CONTEXT_REG_END);
>  assert(cs->cdw + 2 + num <= cs->max_dw);
>  assert(num);
>  radeon_emit(cs, PKT3(PKT3_SET_CONTEXT_REG, num, 0));
> @@ -75,7 +75,7 @@ static inline void radeon_set_context_reg_idx(struct 
> radeon_cmdbuf *cs,
>   unsigned reg, unsigned idx,
>   unsigned value)
>  {
> -   assert(reg >= SI_CONTEXT_REG_OFFSET);
> +   assert(reg >= SI_CONTEXT_REG_OFFSET && reg < SI_CONTEXT_REG_END);
> assert(cs->cdw + 3 <= cs->max_dw);
> radeon_emit(cs, PKT3(PKT3_SET_CONTEXT_REG, 1, 0));
> radeon_emit(cs, (reg - SI_CONTEXT_REG_OFFSET) >> 2 | (idx << 28));
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv/gfx10: do not set alignment on the ngg_emit pointer

2019-07-11 Thread Bas Nieuwenhuizen

R-b

On Thu, Jul 11, 2019, 6:33 PM Samuel Pitoiset 
wrote:

> This is invalid and this fixes a crash in LLVM.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index bf712b7fe45..32548857b57 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -4326,7 +4326,6 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct
> ac_llvm_compiler *ac_llvm,
> ctx.gs_ngg_emit = 
> LLVMBuildIntToPtr(ctx.ac.builder,
> ctx.ac.i32_0,
>
> LLVMPointerType(LLVMArrayType(ctx.ac.i32, 0), AC_ADDR_SPACE_LDS),
> "ngg_emit");
> -   LLVMSetAlignment(ctx.gs_ngg_emit, 4);
> }
>
> ctx.gs_max_out_vertices =
> shaders[i]->info.gs.vertices_out;
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 2/2] radv: report shader stage name when dumping LLVM IR

2019-07-11 Thread Bas Nieuwenhuizen

R-b for the series

On Thu, Jul 11, 2019, 6:04 PM Samuel Pitoiset 
wrote:

> For debugging purposes.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 21 +
>  1 file changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 32548857b57..e4ab5847729 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -4434,8 +4434,13 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct
> ac_llvm_compiler *ac_llvm,
>
> LLVMBuildRetVoid(ctx.ac.builder);
>
> -   if (options->dump_preoptir)
> +   if (options->dump_preoptir) {
> +   fprintf(stderr, "%s LLVM IR:\n\n",
> +   radv_get_shader_name(shader_info,
> +shaders[shader_count -
> 1]->info.stage));
> ac_dump_module(ctx.ac.module);
> +   fprintf(stderr, "\n");
> +   }
>
> ac_llvm_finalize_module(&ctx, ac_llvm->passmgr, options);
>
> @@ -4489,13 +4494,18 @@ static void ac_compile_llvm_module(struct
> ac_llvm_compiler *ac_llvm,
>struct radv_shader_binary **rbinary,
>struct radv_shader_variant_info
> *shader_info,
>gl_shader_stage stage,
> +  const char *name,
>const struct radv_nir_compiler_options
> *options)
>  {
> char *elf_buffer = NULL;
> size_t elf_size = 0;
> char *llvm_ir_string = NULL;
> -   if (options->dump_shader)
> +
> +   if (options->dump_shader) {
> +   fprintf(stderr, "%s LLVM IR:\n\n", name);
> ac_dump_module(llvm_module);
> +   fprintf(stderr, "\n");
> +   }
>
> if (options->record_llvm_ir) {
> char *llvm_ir = LLVMPrintModuleToString(llvm_module);
> @@ -4585,7 +4595,10 @@ radv_compile_nir_shader(struct ac_llvm_compiler
> *ac_llvm,
>options);
>
> ac_compile_llvm_module(ac_llvm, llvm_module, rbinary, shader_info,
> -  nir[nir_count - 1]->info.stage, options);
> +  nir[nir_count - 1]->info.stage,
> +  radv_get_shader_name(shader_info,
> +   nir[nir_count -
> 1]->info.stage),
> +  options);
>
> for (int i = 0; i < nir_count; ++i)
> ac_fill_shader_info(shader_info, nir[i], options);
> @@ -4737,7 +4750,7 @@ radv_compile_gs_copy_shader(struct ac_llvm_compiler
> *ac_llvm,
> ac_llvm_finalize_module(&ctx, ac_llvm->passmgr, options);
>
> ac_compile_llvm_module(ac_llvm, ctx.ac.module, rbinary,
> shader_info,
> -  MESA_SHADER_VERTEX, options);
> +  MESA_SHADER_VERTEX, "GS Copy Shader",
> options);
> (*rbinary)->is_gs_copy_shader = true;
>
>  }
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 2/2] radv/gfx10: fix exporting clip/cull distances for GS

2019-07-11 Thread Bas Nieuwenhuizen

R-b

On Thu, Jul 11, 2019, 5:02 PM Samuel Pitoiset 
wrote:

> This fixes dEQP-VK.clipping.user_defined.clip_distance.*geom*.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 7da061f7f33..bf712b7fe45 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -3656,7 +3656,8 @@ static void gfx10_ngg_gs_emit_epilogue_2(struct
> radv_shader_context *ctx)
> noutput++;
> }
>
> -   radv_llvm_export_vs(ctx, outputs, noutput, outinfo, false);
> +   radv_llvm_export_vs(ctx, outputs, noutput, outinfo,
> +
>  ctx->options->key.vs_common_out.export_clip_dists);
> FREE(outputs);
> }
> ac_build_endif(&ctx->ac, 5145);
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 1/2] radv/gfx10: fix exporting the subpass view index for GS

2019-07-11 Thread Bas Nieuwenhuizen

R-b

On Thu, Jul 11, 2019, 5:02 PM Samuel Pitoiset 
wrote:

> This fixes dEQP-VK.multiview.*geometry*.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 16 +++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 11498bc27aa..7da061f7f33 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -3583,11 +3583,12 @@ static void gfx10_ngg_gs_emit_epilogue_2(struct
> radv_shader_context *ctx)
> ac_build_ifcc(&ctx->ac, tmp, 5145);
> {
> struct radv_vs_output_info *outinfo =
> &ctx->shader_info->vs.outinfo;
> +   bool export_view_index =
> ctx->options->key.has_multiview_view_index;
> struct radv_shader_output_values *outputs;
> unsigned noutput = 0;
>
> /* Allocate a temporary array for the output values. */
> -   unsigned num_outputs = util_bitcount64(ctx->output_mask);
> +   unsigned num_outputs = util_bitcount64(ctx->output_mask) +
> export_view_index;
> outputs = calloc(num_outputs, sizeof(outputs[0]));
>
> memset(outinfo->vs_output_param_offset,
> AC_EXP_PARAM_UNDEFINED,
> @@ -3642,6 +3643,19 @@ static void gfx10_ngg_gs_emit_epilogue_2(struct
> radv_shader_context *ctx)
> noutput++;
> }
>
> +   /* Export ViewIndex. */
> +   if (export_view_index) {
> +   outinfo->writes_layer = true;
> +
> +   outputs[noutput].slot_name = VARYING_SLOT_LAYER;
> +   outputs[noutput].slot_index = 0;
> +   outputs[noutput].usage_mask = 0x1;
> +   outputs[noutput].values[0] = ac_to_float(&ctx->ac,
> ctx->abi.view_index);
> +   for (unsigned j = 1; j < 4; j++)
> +   outputs[noutput].values[j] = ctx->ac.f32_0;
> +   noutput++;
> +   }
> +
> radv_llvm_export_vs(ctx, outputs, noutput, outinfo, false);
> FREE(outputs);
> }
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv/gfx10: enable 1D textures

2019-07-11 Thread Bas Nieuwenhuizen

r-b

On Thu, Jul 11, 2019 at 5:22 PM Samuel Pitoiset
 wrote:
>
> Mirror RadeonSI. This also fixes crashes in addrlib.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_image.c| 4 ++--
>  src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c | 6 --
>  2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> index 368bd5d839d..ccbec36849e 100644
> --- a/src/amd/vulkan/radv_image.c
> +++ b/src/amd/vulkan/radv_image.c
> @@ -649,7 +649,7 @@ gfx10_make_texture_descriptor(struct radv_device *device,
> }
>
> type = radv_tex_dim(image->type, view_type, image->info.array_size, 
> image->info.samples,
> -   is_storage_image, 
> device->physical_device->rad_info.chip_class >= GFX9);
> +   is_storage_image, 
> device->physical_device->rad_info.chip_class == GFX9);
> if (type == V_008F1C_SQ_RSRC_IMG_1D_ARRAY) {
> height = 1;
> depth = image->info.array_size;
> @@ -796,7 +796,7 @@ si_make_texture_descriptor(struct radv_device *device,
> data_format = V_008F14_IMG_DATA_FORMAT_S8_16;
> }
> type = radv_tex_dim(image->type, view_type, image->info.array_size, 
> image->info.samples,
> -   is_storage_image, 
> device->physical_device->rad_info.chip_class >= GFX9);
> +   is_storage_image, 
> device->physical_device->rad_info.chip_class == GFX9);
> if (type == V_008F1C_SQ_RSRC_IMG_1D_ARRAY) {
> height = 1;
> depth = image->info.array_size;
> diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c 
> b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c
> index 3f4cad861c2..598baa2addc 100644
> --- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c
> +++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c
> @@ -90,8 +90,10 @@ static int radv_amdgpu_winsys_surface_init(struct 
> radeon_winsys *_ws,
> struct ac_surf_config config;
>
> memcpy(&config.info, surf_info, sizeof(config.info));
> -   config.is_3d = !!(type == RADEON_SURF_TYPE_3D);
> -   config.is_cube = !!(type == RADEON_SURF_TYPE_CUBEMAP);
> +   config.is_1d = type == RADEON_SURF_TYPE_1D ||
> +  type == RADEON_SURF_TYPE_1D_ARRAY;
> +   config.is_3d = type == RADEON_SURF_TYPE_3D;
> +   config.is_cube = type == RADEON_SURF_TYPE_CUBEMAP;
>
> return ac_compute_surface(ws->addrlib, &ws->info, &config, mode, 
> surf);
>  }
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv/gfx10: fix maximum number of mip levels for 3D images

2019-07-11 Thread Bas Nieuwenhuizen

r-b

On Thu, Jul 11, 2019 at 2:24 PM Samuel Pitoiset
 wrote:
>
> The dimensions also have to be adjusted if the number of supported
> mip levels is changed.
>
> This fixes dEQP-VK.api.info.image_format_properties.3d.*.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_formats.c | 14 ++
>  1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_formats.c b/src/amd/vulkan/radv_formats.c
> index 26fc4b9ba18..98c84edbdc1 100644
> --- a/src/amd/vulkan/radv_formats.c
> +++ b/src/amd/vulkan/radv_formats.c
> @@ -1150,10 +1150,16 @@ static VkResult 
> radv_get_image_format_properties(struct radv_physical_device *ph
> maxArraySize = chip_class >= GFX10 ? 8192 : 2048;
> break;
> case VK_IMAGE_TYPE_3D:
> -   maxExtent.width = 2048;
> -   maxExtent.height = 2048;
> -   maxExtent.depth = 2048;
> -   maxMipLevels = chip_class >= GFX10 ? 14 : 12; /* 
> log2(maxWidth) + 1 */
> +   if (chip_class >= GFX10) {
> +   maxExtent.width = 8192;
> +   maxExtent.height = 8192;
> +   maxExtent.depth = 8192;
> +   } else {
> +   maxExtent.width = 2048;
> +   maxExtent.height = 2048;
> +   maxExtent.depth = 2048;
> +   }
> +   maxMipLevels = util_logbase2(maxExtent.width) + 1;
> maxArraySize = 1;
> break;
> }
> --
> 2.22.0
>
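
The relationship the patch relies on, that a full mip chain has
floor(log2(largest dimension)) + 1 levels, can be checked with a small
standalone sketch. The helper names floor_log2() and max_mip_levels() are
hypothetical and not Mesa functions; the sketch assumes util_logbase2() in the
patch computes the same floor(log2(n)).

```c
#include <assert.h>

/* Hypothetical stand-in for Mesa's util_logbase2(): floor(log2(n)), n > 0. */
static unsigned floor_log2(unsigned n)
{
	unsigned log = 0;

	assert(n > 0);
	while (n >>= 1)
		log++;
	return log;
}

/* Number of levels in a full mip chain whose base level is max_extent wide. */
static unsigned max_mip_levels(unsigned max_extent)
{
	return floor_log2(max_extent) + 1;
}
```

With these helpers a 2048 limit gives 12 levels and an 8192 limit gives 14,
which is why reporting 14 levels together with a 2048 extent was inconsistent.
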

Re: [Mesa-dev] [PATCH] radv/gfx10: disable TC-compat HTILE for multisampled D32_SFLOAT format

2019-07-11 Thread Bas Nieuwenhuizen

r-b

On Thu, Jul 11, 2019 at 11:54 AM Samuel Pitoiset
 wrote:
>
> For some reason D32_SFLOAT is also affected on GFX10; it works
> fine with previous generations.
>
> This fixes some dEQP-VK.renderpass2.depth_stencil_resolve.*.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_image.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> index 6245873a4ed..368bd5d839d 100644
> --- a/src/amd/vulkan/radv_image.c
> +++ b/src/amd/vulkan/radv_image.c
> @@ -83,9 +83,12 @@ radv_use_tc_compat_htile_for_image(struct radv_device 
> *device,
> return false;
>
> /* FIXME: for some reason TC compat with 2/4/8 samples breaks some cts
> -* tests - disable for now */
> +* tests - disable for now. On GFX10 D32_SFLOAT is affected as well.
> +*/
> if (pCreateInfo->samples >= 2 &&
> -   pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT)
> +   (pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT ||
> +(pCreateInfo->format == VK_FORMAT_D32_SFLOAT &&
> + device->physical_device->rad_info.chip_class == GFX10)))
> return false;
>
> /* GFX9 supports both 32-bit and 16-bit depth surfaces, while GFX8 
> only
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 5/9] radv/gfx10: implement support for GS as NGG

2019-07-11 Thread Bas Nieuwenhuizen

Reviewed-by: Bas Nieuwenhuizen 

for the series.

On Thu, Jul 11, 2019 at 8:44 AM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 540 +-
>  src/amd/vulkan/radv_pipeline.c|   5 +-
>  src/amd/vulkan/radv_private.h |  24 ++
>  src/amd/vulkan/radv_shader.c  |   5 +
>  4 files changed, 568 insertions(+), 6 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 176e95537c1..dc37c937155 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -105,7 +105,12 @@ struct radv_shader_context {
>
> bool is_gs_copy_shader;
> LLVMValueRef gs_next_vertex[4];
> +   LLVMValueRef gs_curprim_verts[4];
> +   LLVMValueRef gs_generated_prims[4];
> +   LLVMValueRef gs_ngg_emit;
> +   LLVMValueRef gs_ngg_scratch;
> unsigned gs_max_out_vertices;
> +   unsigned gs_output_prim;
>
> unsigned tes_primitive_mode;
>
> @@ -116,6 +121,8 @@ struct radv_shader_context {
> uint32_t tcs_num_patches;
> uint32_t max_gsvs_emit_size;
> uint32_t gsvs_vertex_size;
> +
> +   LLVMValueRef vertexptr; /* GFX10 only */
>  };
>
>  enum radeon_llvm_calling_convention {
> @@ -1846,6 +1853,10 @@ static LLVMValueRef load_sample_mask_in(struct 
> ac_shader_abi *abi)
>  }
>
>
> +static void gfx10_ngg_gs_emit_vertex(struct radv_shader_context *ctx,
> +unsigned stream,
> +LLVMValueRef *addrs);
> +
>  static void
>  visit_emit_vertex(struct ac_shader_abi *abi, unsigned stream, LLVMValueRef 
> *addrs)
>  {
> @@ -1854,6 +1865,11 @@ visit_emit_vertex(struct ac_shader_abi *abi, unsigned 
> stream, LLVMValueRef *addr
> unsigned offset = 0;
> struct radv_shader_context *ctx = radv_shader_context_from_abi(abi);
>
> +   if (ctx->options->key.vs_common_out.as_ngg) {
> +   gfx10_ngg_gs_emit_vertex(ctx, stream, addrs);
> +   return;
> +   }
> +
> /* Write vertex attribute values to GSVS ring */
> gs_next_vertex = LLVMBuildLoad(ctx->ac.builder,
>ctx->gs_next_vertex[stream],
> @@ -1919,6 +1935,12 @@ static void
>  visit_end_primitive(struct ac_shader_abi *abi, unsigned stream)
>  {
> struct radv_shader_context *ctx = radv_shader_context_from_abi(abi);
> +
> +   if (ctx->options->key.vs_common_out.as_ngg) {
> +   LLVMBuildStore(ctx->ac.builder, ctx->ac.i32_0, 
> ctx->gs_curprim_verts[stream]);
> +   return;
> +   }
> +
> ac_build_sendmsg(&ctx->ac, AC_SENDMSG_GS_OP_CUT | AC_SENDMSG_GS | 
> (stream << 8), ctx->gs_wave_id);
>  }
>
> @@ -2571,8 +2593,20 @@ radv_export_param(struct radv_shader_context *ctx, 
> unsigned index,
>  static LLVMValueRef
>  radv_load_output(struct radv_shader_context *ctx, unsigned index, unsigned 
> chan)
>  {
> -   LLVMValueRef output =
> -   ctx->abi.outputs[ac_llvm_reg_index_soa(index, chan)];
> +   LLVMValueRef output;
> +
> +   if (ctx->vertexptr) {
> +   LLVMValueRef gep_idx[3] = {
> +   ctx->ac.i32_0, /* implicit C-style array */
> +   ctx->ac.i32_0, /* second value of struct */
> +   ctx->ac.i32_1, /* stream 1: source data index */
> +   };
> +
> +   gep_idx[2] = LLVMConstInt(ctx->ac.i32, 
> ac_llvm_reg_index_soa(index, chan), false);
> +   output = LLVMBuildGEP(ctx->ac.builder, ctx->vertexptr, 
> gep_idx, 3, "");
> +   } else {
> +   output = ctx->abi.outputs[ac_llvm_reg_index_soa(index, chan)];
> +   }
>
> return LLVMBuildLoad(ctx->ac.builder, output, "");
>  }
> @@ -2940,7 +2974,7 @@ handle_vs_outputs_post(struct radv_shader_context *ctx,
> outputs[noutput].usage_mask =
> 
> ctx->shader_info->info.tes.output_usage_mask[i];
> } else {
> -   assert(ctx->is_gs_copy_shader);
> +   assert(ctx->is_gs_copy_shader || 
> ctx->options->key.vs_common_out.as_ngg);
> outputs[noutput].usage_mask =
> 
> ctx->shader_info->info.gs.output_usage_mask[i];
> }
> @@ -3090,6 +3124,20 @@ static LLVMValueRef get_wave_id_in_tg(struct 
> radv_shader_context *ctx)
> return ac_unpa

Re: [Mesa-dev] [PATCH 6/6] radv: switch to the new VS exports path

2019-07-10 Thread Bas Nieuwenhuizen

Reviewed-by: Bas Nieuwenhuizen 

for the series.

On Wed, Jul 10, 2019 at 3:15 PM Samuel Pitoiset
 wrote:
>
> It will help for GS as NGG on GFX10.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 118 +-
>  1 file changed, 2 insertions(+), 116 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 597d006284a..176e95537c1 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -2884,12 +2884,8 @@ handle_vs_outputs_post(struct radv_shader_context *ctx,
>bool export_clip_dists,
>struct radv_vs_output_info *outinfo)
>  {
> -   unsigned pos_idx, num_pos_exports = 0;
> -   struct ac_export_args pos_args[4] = {};
> -   LLVMValueRef psize_value = NULL, layer_value = NULL, 
> viewport_index_value = NULL;
> struct radv_shader_output_values *outputs;
> unsigned noutput = 0;
> -   int i;
>
> if (ctx->options->key.has_multiview_view_index) {
> LLVMValueRef* tmp_out = 
> &ctx->abi.outputs[ac_llvm_reg_index_soa(VARYING_SLOT_LAYER, 0)];
> @@ -2905,61 +2901,18 @@ handle_vs_outputs_post(struct radv_shader_context 
> *ctx,
>
> memset(outinfo->vs_output_param_offset, AC_EXP_PARAM_UNDEFINED,
>sizeof(outinfo->vs_output_param_offset));
> -
> -   for(unsigned location = VARYING_SLOT_CLIP_DIST0; location <= 
> VARYING_SLOT_CLIP_DIST1; ++location) {
> -   if (ctx->output_mask & (1ull << location)) {
> -   unsigned output_usage_mask, length;
> -   LLVMValueRef slots[4];
> -   unsigned j;
> -
> -   if (ctx->stage == MESA_SHADER_VERTEX &&
> -   !ctx->is_gs_copy_shader) {
> -   output_usage_mask =
> -   
> ctx->shader_info->info.vs.output_usage_mask[location];
> -   } else if (ctx->stage == MESA_SHADER_TESS_EVAL) {
> -   output_usage_mask =
> -   
> ctx->shader_info->info.tes.output_usage_mask[location];
> -   } else {
> -   assert(ctx->is_gs_copy_shader);
> -   output_usage_mask =
> -   
> ctx->shader_info->info.gs.output_usage_mask[location];
> -   }
> -
> -   length = util_last_bit(output_usage_mask);
> -
> -   for (j = 0; j < length; j++)
> -   slots[j] = ac_to_float(&ctx->ac, 
> radv_load_output(ctx, location, j));
> -
> -   for (i = length; i < 4; i++)
> -   slots[i] = LLVMGetUndef(ctx->ac.f32);
> -
> -   unsigned index = 2 + (location - 
> VARYING_SLOT_CLIP_DIST0);
> -   si_llvm_init_export_args(ctx, &slots[0], 0xf,
> - V_008DFC_SQ_EXP_POS + index,
> - &pos_args[index]);
> -   }
> -   }
> -
> -   LLVMValueRef pos_values[4] = {ctx->ac.f32_0, ctx->ac.f32_0, 
> ctx->ac.f32_0, ctx->ac.f32_1};
> -   if (ctx->output_mask & (1ull << VARYING_SLOT_POS)) {
> -   for (unsigned j = 0; j < 4; j++)
> -   pos_values[j] = radv_load_output(ctx, 
> VARYING_SLOT_POS, j);
> -   }
> -   si_llvm_init_export_args(ctx, pos_values, 0xf, V_008DFC_SQ_EXP_POS, 
> &pos_args[0]);
> +   outinfo->pos_exports = 0;
>
> if (ctx->output_mask & (1ull << VARYING_SLOT_PSIZ)) {
> outinfo->writes_pointsize = true;
> -   psize_value = radv_load_output(ctx, VARYING_SLOT_PSIZ, 0);
> }
>
> if (ctx->output_mask & (1ull << VARYING_SLOT_LAYER)) {
> outinfo->writes_layer = true;
> -   layer_value = radv_load_output(ctx, VARYING_SLOT_LAYER, 0);
> }
>
> if (ctx->output_mask & (1ull << VARYING_SLOT_VIEWPORT)) {
> outinfo->writes_viewport_index = true;
> -   viewport_index_value = radv_load_output(ctx, 
> VARYING_SLOT_VIEWPORT, 0);
> }
>
> if (ctx->shader_info->info.so.num_outputs &&
> @@ -2968,72 +2921,6 @@ handle_vs_outputs_post(struct radv_shader_context *ctx,
> radv_emit_streamout(ctx,

Re: [Mesa-dev] [PATCH 2/2] radv: remove extra code for exporting LayerID to the next stage

2019-07-10 Thread Bas Nieuwenhuizen

r-b for both

On Wed, Jul 10, 2019 at 1:00 PM Samuel Pitoiset
 wrote:
>
> Now that the output usage mask is set to 0x1 the LayerID is
> correctly exported in the loop above.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 19 ++-
>  1 file changed, 2 insertions(+), 17 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index e54e58c58f6..bd14f9fff1b 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -2712,7 +2712,7 @@ radv_emit_streamout(struct radv_shader_context *ctx, 
> unsigned stream)
>
>  static void
>  handle_vs_outputs_post(struct radv_shader_context *ctx,
> -  bool export_prim_id, bool export_layer_id,
> +  bool export_prim_id,
>bool export_clip_dists,
>struct radv_vs_output_info *outinfo)
>  {
> @@ -2916,18 +2916,6 @@ handle_vs_outputs_post(struct radv_shader_context *ctx,
> outinfo->export_prim_id = true;
> }
>
> -   if (export_layer_id && layer_value) {
> -   LLVMValueRef values[4];
> -
> -   values[0] = layer_value;
> -   for (unsigned j = 1; j < 4; j++)
> -   values[j] = ctx->ac.f32_0;
> -
> -   radv_export_param(ctx, param_count, values, 0x1);
> -
> -   outinfo->vs_output_param_offset[VARYING_SLOT_LAYER] = 
> param_count++;
> -   }
> -
> outinfo->pos_exports = num_pos_exports;
> outinfo->param_exports = param_count;
>  }
> @@ -3202,7 +3190,6 @@ handle_ngg_outputs_post(struct radv_shader_context *ctx)
> ac_nir_build_if(&if_state, ctx, is_es_thread);
> {
> handle_vs_outputs_post(ctx, 
> ctx->options->key.vs_common_out.export_prim_id,
> -  
> ctx->options->key.vs_common_out.export_layer_id,
>
> ctx->options->key.vs_common_out.export_clip_dists,
>ctx->stage == MESA_SHADER_TESS_EVAL ? 
> &ctx->shader_info->tes.outinfo : &ctx->shader_info->vs.outinfo);
> }
> @@ -3471,7 +3458,6 @@ handle_shader_outputs_post(struct ac_shader_abi *abi, 
> unsigned max_outputs,
> handle_es_outputs_post(ctx, 
> &ctx->shader_info->vs.es_info);
> else
> handle_vs_outputs_post(ctx, 
> ctx->options->key.vs_common_out.export_prim_id,
> -  
> ctx->options->key.vs_common_out.export_layer_id,
>
> ctx->options->key.vs_common_out.export_clip_dists,
> &ctx->shader_info->vs.outinfo);
> break;
> @@ -3491,7 +3477,6 @@ handle_shader_outputs_post(struct ac_shader_abi *abi, 
> unsigned max_outputs,
> handle_es_outputs_post(ctx, 
> &ctx->shader_info->tes.es_info);
> else
> handle_vs_outputs_post(ctx, 
> ctx->options->key.vs_common_out.export_prim_id,
> -  
> ctx->options->key.vs_common_out.export_layer_id,
>
> ctx->options->key.vs_common_out.export_clip_dists,
>
> &ctx->shader_info->tes.outinfo);
> break;
> @@ -4109,7 +4094,7 @@ ac_gs_copy_shader_emit(struct radv_shader_context *ctx)
> radv_emit_streamout(ctx, stream);
>
> if (stream == 0) {
> -   handle_vs_outputs_post(ctx, false, false, true,
> +   handle_vs_outputs_post(ctx, false, true,
> &ctx->shader_info->vs.outinfo);
> }
>
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv: compute correct number of input vertices for NGG

2019-07-09 Thread Bas Nieuwenhuizen

On Tue, Jul 9, 2019 at 9:19 AM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_pipeline.c | 25 -
>  1 file changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index 5942e20dafe..96b20c1c730 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -1616,6 +1616,29 @@ static void clamp_gsprims_to_esverts(unsigned 
> *max_gsprims, unsigned max_esverts
> *max_gsprims = MIN2(*max_gsprims, 1 + max_reuse);
>  }
>
> +static unsigned
> +radv_get_num_input_vertices(struct radv_pipeline *pipeline)
> +{
> +   if (radv_pipeline_has_gs(pipeline)) {
> +   struct radv_shader_variant *gs =
> +   radv_get_shader(pipeline, MESA_SHADER_GEOMETRY);
> +
> +   return gs->info.gs.vertices_in;
> +   }
> +
> +   if (radv_pipeline_has_tess(pipeline)) {
> +   struct radv_shader_variant *tes = radv_get_shader(pipeline, 
> MESA_SHADER_TESS_EVAL);
> +
> +   if (tes->info.tes.point_mode)
> +   return 1;
> +   if (tes->info.tes.primitive_mode == GL_ISOLINES)
> +   return 2;
> +   return 3;
> +   }
> +
> +   return 3;

I think this should be based on
pCreateInfo->pInputAssemblyState->topology, instead of assuming 3.

However for consistency with radeonsi for now, this is r-b


> +}
> +
>  static struct radv_ngg_state
>  calculate_ngg_info(const VkGraphicsPipelineCreateInfo *pCreateInfo,
>struct radv_pipeline *pipeline)
> @@ -1625,7 +1648,7 @@ calculate_ngg_info(const VkGraphicsPipelineCreateInfo 
> *pCreateInfo,
> struct radv_es_output_info *es_info =
> radv_pipeline_has_tess(pipeline) ? &gs_info->tes.es_info : 
> &gs_info->vs.es_info;
> unsigned gs_type = radv_pipeline_has_gs(pipeline) ? 
> MESA_SHADER_GEOMETRY : MESA_SHADER_VERTEX;
> -   unsigned max_verts_per_prim = 3; // triangles
> +   unsigned max_verts_per_prim = radv_get_num_input_vertices(pipeline);
> unsigned min_verts_per_prim =
> gs_type == MESA_SHADER_GEOMETRY ? max_verts_per_prim : 1;
> unsigned gs_num_invocations = gs_info ? MAX2(gs_info->gs.invocations, 
> 1) : 1;
> --
> 2.22.0
>
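
The review comment above about deriving the vertex count from
pCreateInfo->pInputAssemblyState->topology could look roughly like the sketch
below for the no-GS, no-tessellation case. This is only an illustration: the
enum and function names are hypothetical, the enum stands in for
VkPrimitiveTopology so the snippet compiles without the Vulkan headers, and
adjacency and patch topologies are left out because they only apply when a GS
or tessellation is present.

```c
/* Hypothetical enum for this sketch only; real code would switch on
 * VkPrimitiveTopology from pInputAssemblyState->topology. */
enum sketch_topology {
	SKETCH_TOPOLOGY_POINT_LIST,
	SKETCH_TOPOLOGY_LINE_LIST,
	SKETCH_TOPOLOGY_LINE_STRIP,
	SKETCH_TOPOLOGY_TRIANGLE_LIST,
	SKETCH_TOPOLOGY_TRIANGLE_STRIP,
	SKETCH_TOPOLOGY_TRIANGLE_FAN,
};

/* Vertices per input primitive when there is no GS or tessellation. */
static unsigned sketch_verts_per_prim(enum sketch_topology topology)
{
	switch (topology) {
	case SKETCH_TOPOLOGY_POINT_LIST:
		return 1;
	case SKETCH_TOPOLOGY_LINE_LIST:
	case SKETCH_TOPOLOGY_LINE_STRIP:
		return 2;
	case SKETCH_TOPOLOGY_TRIANGLE_LIST:
	case SKETCH_TOPOLOGY_TRIANGLE_STRIP:
	case SKETCH_TOPOLOGY_TRIANGLE_FAN:
		return 3;
	}
	/* Unknown value: fall back to the conservative maximum the patch uses. */
	return 3;
}
```

Returning 3 for anything unrecognised keeps the behaviour of the patch as
posted, so a change along these lines would only lower the count for point and
line topologies.
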

Re: [Mesa-dev] [PATCH] radv: do not emit VGT_FLUSH on GFX10

2019-07-08 Thread Bas Nieuwenhuizen

r-b

On Mon, Jul 8, 2019 at 1:45 PM Samuel Pitoiset
 wrote:
>
> We don't need it.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_device.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 5a92e5276d9..09614067a4a 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -2747,8 +2747,11 @@ radv_get_preamble_cs(struct radv_queue *queue,
> if (esgs_ring_bo || gsvs_ring_bo || tess_rings_bo)  {
> radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
> radeon_emit(cs, EVENT_TYPE(V_028A90_VS_PARTIAL_FLUSH) 
> | EVENT_INDEX(4));
> -   radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
> -   radeon_emit(cs, EVENT_TYPE(V_028A90_VGT_FLUSH) | 
> EVENT_INDEX(0));
> +
> +   if 
> (queue->device->physical_device->rad_info.chip_class < GFX10) {
> +   radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
> +   radeon_emit(cs, 
> EVENT_TYPE(V_028A90_VGT_FLUSH) | EVENT_INDEX(0));
> +   }
> }
>
> radv_emit_gs_ring_sizes(queue, cs, esgs_ring_bo, 
> esgs_ring_size,
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH v2] radv: add an option for disabling NGG on GFX10

2019-07-07 Thread Bas Nieuwenhuizen

r-b

On Sun, Jul 7, 2019 at 7:50 PM Samuel Pitoiset
 wrote:
>
> Will be useful for testing the legacy path.
>
> v2: add to get_hash_flags() too
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_debug.h| 1 +
>  src/amd/vulkan/radv_device.c   | 1 +
>  src/amd/vulkan/radv_pipeline.c | 5 -
>  src/amd/vulkan/radv_private.h  | 2 ++
>  4 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_debug.h b/src/amd/vulkan/radv_debug.h
> index 75e28000e14..723fabda57f 100644
> --- a/src/amd/vulkan/radv_debug.h
> +++ b/src/amd/vulkan/radv_debug.h
> @@ -52,6 +52,7 @@ enum {
> RADV_DEBUG_NOTHREADLLVM  = 0x40,
> RADV_DEBUG_NOBINNING = 0x80,
> RADV_DEBUG_NO_LOAD_STORE_OPT = 0x100,
> +   RADV_DEBUG_NO_NGG= 0x200,
>  };
>
>  enum {
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 4a1078a1b52..5a92e5276d9 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -474,6 +474,7 @@ static const struct debug_control radv_debug_options[] = {
> {"nothreadllvm", RADV_DEBUG_NOTHREADLLVM},
> {"nobinning", RADV_DEBUG_NOBINNING},
> {"noloadstoreopt", RADV_DEBUG_NO_LOAD_STORE_OPT},
> +   {"nongg", RADV_DEBUG_NO_NGG},
> {NULL, 0}
>  };
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index 69acfdaec7d..ff39c140572 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -157,6 +157,8 @@ static uint32_t get_hash_flags(struct radv_device *device)
>
> if (device->instance->debug_flags & RADV_DEBUG_UNSAFE_MATH)
> hash_flags |= RADV_HASH_SHADER_UNSAFE_MATH;
> +   if (device->instance->debug_flags & RADV_DEBUG_NO_NGG)
> +   hash_flags |= RADV_HASH_SHADER_NO_NGG;
> if (device->instance->perftest_flags & RADV_PERFTEST_SISCHED)
> hash_flags |= RADV_HASH_SHADER_SISCHED;
> return hash_flags;
> @@ -2253,7 +2255,8 @@ radv_fill_shader_keys(struct radv_device *device,
> keys[MESA_SHADER_VERTEX].vs.out.as_es = true;
> }
>
> -   if (device->physical_device->rad_info.chip_class >= GFX10) {
> +   if (!(device->instance->debug_flags & RADV_DEBUG_NO_NGG) &&
> +   device->physical_device->rad_info.chip_class >= GFX10) {
> keys[MESA_SHADER_VERTEX].vs.out.as_ngg = true;
> }
>
> diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
> index fd1f8972adc..21db7fbbbc9 100644
> --- a/src/amd/vulkan/radv_private.h
> +++ b/src/amd/vulkan/radv_private.h
> @@ -1390,6 +1390,8 @@ struct radv_shader_module;
>  #define RADV_HASH_SHADER_IS_GEOM_COPY_SHADER (1 << 0)
>  #define RADV_HASH_SHADER_SISCHED (1 << 1)
>  #define RADV_HASH_SHADER_UNSAFE_MATH (1 << 2)
> +#define RADV_HASH_SHADER_NO_NGG  (1 << 3)
> +
>  void
>  radv_hash_shaders(unsigned char *hash,
>   const VkPipelineShaderStageCreateInfo **stages,
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] radv: add an option for disabling NGG on GFX10

2019-07-07 Thread Bas Nieuwenhuizen

Please add the option to get_hash_flags in radv_pipeline.c too, so it
does not poison the cache.
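
To illustrate the concern: if an option changes the generated code but is not
part of the cache key, a shader compiled with the option can be returned for a
run without it. A minimal sketch, assuming a cache key made of a shader hash
plus option flags; the struct and the sketch_* functions are hypothetical, and
only the two constants mirror the values used in this patch and its v2.

```c
#include <stdint.h>

/* Values as they appear in the patch (debug flag) and its v2 (hash flag). */
#define RADV_DEBUG_NO_NGG        0x200
#define RADV_HASH_SHADER_NO_NGG  (1 << 3)

/* Hypothetical cache key for this sketch: shader hash plus option flags. */
struct sketch_cache_key {
	uint64_t shader_hash;
	uint32_t hash_flags;
};

/* Fold every codegen-affecting debug option into the flags that get hashed. */
static uint32_t sketch_get_hash_flags(uint64_t debug_flags)
{
	uint32_t hash_flags = 0;

	if (debug_flags & RADV_DEBUG_NO_NGG)
		hash_flags |= RADV_HASH_SHADER_NO_NGG;
	return hash_flags;
}

/* Two keys hit the same cache entry only if hash and flags both match. */
static int sketch_keys_equal(struct sketch_cache_key a, struct sketch_cache_key b)
{
	return a.shader_hash == b.shader_hash && a.hash_flags == b.hash_flags;
}
```

With the flag folded in, the same shader compiled with and without nongg gets
two distinct keys, so neither run can pick up the other's binary.
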

On Sun, Jul 7, 2019 at 7:35 PM Samuel Pitoiset
 wrote:
>
> Will be useful for testing the legacy path.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_debug.h| 1 +
>  src/amd/vulkan/radv_device.c   | 1 +
>  src/amd/vulkan/radv_pipeline.c | 3 ++-
>  3 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_debug.h b/src/amd/vulkan/radv_debug.h
> index 75e28000e14..723fabda57f 100644
> --- a/src/amd/vulkan/radv_debug.h
> +++ b/src/amd/vulkan/radv_debug.h
> @@ -52,6 +52,7 @@ enum {
> RADV_DEBUG_NOTHREADLLVM  = 0x40,
> RADV_DEBUG_NOBINNING = 0x80,
> RADV_DEBUG_NO_LOAD_STORE_OPT = 0x100,
> +   RADV_DEBUG_NO_NGG= 0x200,
>  };
>
>  enum {
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 4a1078a1b52..5a92e5276d9 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -474,6 +474,7 @@ static const struct debug_control radv_debug_options[] = {
> {"nothreadllvm", RADV_DEBUG_NOTHREADLLVM},
> {"nobinning", RADV_DEBUG_NOBINNING},
> {"noloadstoreopt", RADV_DEBUG_NO_LOAD_STORE_OPT},
> +   {"nongg", RADV_DEBUG_NO_NGG},
> {NULL, 0}
>  };
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index 69acfdaec7d..db0fb50bbe7 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -2253,7 +2253,8 @@ radv_fill_shader_keys(struct radv_device *device,
> keys[MESA_SHADER_VERTEX].vs.out.as_es = true;
> }
>
> -   if (device->physical_device->rad_info.chip_class >= GFX10) {
> +   if (!(device->instance->debug_flags & RADV_DEBUG_NO_NGG) &&
> +   device->physical_device->rad_info.chip_class >= GFX10) {
> keys[MESA_SHADER_VERTEX].vs.out.as_ngg = true;
> }
>
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH] ac: select the GFX ring when halting waves with UMR on GFX10

2019-07-07 Thread Bas Nieuwenhuizen

r-b

On Sun, Jul 7, 2019 at 7:32 PM Samuel Pitoiset
 wrote:
>
> GFX10 has two rings, so UMR wants to know which one to halt.
> Select the first one by default.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/common/ac_debug.c   | 9 ++---
>  src/amd/common/ac_debug.h   | 3 ++-
>  src/amd/vulkan/radv_debug.c | 3 ++-
>  src/gallium/drivers/radeonsi/si_debug.c | 2 +-
>  4 files changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/src/amd/common/ac_debug.c b/src/amd/common/ac_debug.c
> index e4cb6a13a3a..1632106fdb9 100644
> --- a/src/amd/common/ac_debug.c
> +++ b/src/amd/common/ac_debug.c
> @@ -769,12 +769,15 @@ static int compare_wave(const void *p1, const void *p2)
>  }
>
>  /* Return wave information. "waves" should be a large enough array. */
> -unsigned ac_get_wave_info(struct ac_wave_info waves[AC_MAX_WAVES_PER_CHIP])
> +unsigned ac_get_wave_info(enum chip_class chip_class,
> + struct ac_wave_info waves[AC_MAX_WAVES_PER_CHIP])
>  {
> -   char line[2000];
> +   char line[2000], cmd[128];
> unsigned num_waves = 0;
>
> -   FILE *p = popen("umr -O halt_waves -wa", "r");
> +   sprintf(cmd, "umr -O halt_waves -wa %s", chip_class >= GFX10 ? 
> "gfx_0.0.0" : "gfx");
> +
> +   FILE *p = popen(cmd, "r");
> if (!p)
> return 0;
>
> diff --git a/src/amd/common/ac_debug.h b/src/amd/common/ac_debug.h
> index 23343fe1304..0d5c1dd9eac 100644
> --- a/src/amd/common/ac_debug.h
> +++ b/src/amd/common/ac_debug.h
> @@ -64,6 +64,7 @@ void ac_parse_ib(FILE *f, uint32_t *ib, int num_dw, const int *trace_ids,
>  bool ac_vm_fault_occured(enum chip_class chip_class,
>  uint64_t *old_dmesg_timestamp, uint64_t *out_addr);
>
> -unsigned ac_get_wave_info(struct ac_wave_info waves[AC_MAX_WAVES_PER_CHIP]);
> +unsigned ac_get_wave_info(enum chip_class chip_class,
> + struct ac_wave_info waves[AC_MAX_WAVES_PER_CHIP]);
>
>  #endif
> diff --git a/src/amd/vulkan/radv_debug.c b/src/amd/vulkan/radv_debug.c
> index 2f661c0208f..42296745543 100644
> --- a/src/amd/vulkan/radv_debug.c
> +++ b/src/amd/vulkan/radv_debug.c
> @@ -445,7 +445,8 @@ radv_dump_annotated_shaders(struct radv_pipeline *pipeline,
> VkShaderStageFlagBits active_stages, FILE *f)
>  {
> struct ac_wave_info waves[AC_MAX_WAVES_PER_CHIP];
> -   unsigned num_waves = ac_get_wave_info(waves);
> +   enum chip_class chip_class = pipeline->device->physical_device->rad_info.chip_class;
> +   unsigned num_waves = ac_get_wave_info(chip_class, waves);
>
> fprintf(f, COLOR_CYAN "The number of active waves = %u" COLOR_RESET
> "\n\n", num_waves);
> diff --git a/src/gallium/drivers/radeonsi/si_debug.c b/src/gallium/drivers/radeonsi/si_debug.c
> index c9c78733099..8265159c0d0 100644
> --- a/src/gallium/drivers/radeonsi/si_debug.c
> +++ b/src/gallium/drivers/radeonsi/si_debug.c
> @@ -1080,7 +1080,7 @@ static void si_print_annotated_shader(struct si_shader *shader,
>  static void si_dump_annotated_shaders(struct si_context *sctx, FILE *f)
>  {
> struct ac_wave_info waves[AC_MAX_WAVES_PER_CHIP];
> -   unsigned num_waves = ac_get_wave_info(waves);
> +   unsigned num_waves = ac_get_wave_info(sctx->chip_class, waves);
>
> fprintf(f, COLOR_CYAN "The number of active waves = %u" COLOR_RESET
> "\n\n", num_waves);
> --
> 2.22.0
>
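
For readers following the thread, a minimal sketch of the ring selection the
patch adds. The enum below is a stand-in for Mesa's chip_class, and
build_umr_halt_cmd() is a hypothetical helper name, not something in the tree.
It uses snprintf() where the patch uses sprintf(), purely so the sketch can
report truncation; that swap is an assumption of the sketch, not part of the
patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for Mesa's enum chip_class; only the ordering matters here. */
enum chip_class { GFX9 = 9, GFX10 = 10 };

/* Hypothetical helper: writes the umr command line into buf.
 * Returns 0 on success, -1 if the command did not fit. */
static int build_umr_halt_cmd(enum chip_class chip_class, char *buf, size_t size)
{
	assert(buf != NULL && size > 0);

	/* GFX10 has two GFX rings, so the first one is named explicitly. */
	const char *ring = chip_class >= GFX10 ? "gfx_0.0.0" : "gfx";
	int n = snprintf(buf, size, "umr -O halt_waves -wa %s", ring);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The resulting string is what the patch then hands to popen(cmd, "r").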

Re: [Mesa-dev] [PATCH 2/2] radv: do not crash when generating binning state for unknown chips

2019-07-04 Thread Bas Nieuwenhuizen

r-b

On Thu, Jul 4, 2019 at 8:51 AM Samuel Pitoiset
 wrote:
>
> These values are only useful if binning is enabled.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_pipeline.c | 44 +-
>  1 file changed, 22 insertions(+), 22 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index 71d3be240b2..49687405705 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -2691,29 +2691,29 @@ radv_pipeline_generate_binning_state(struct radeon_cmdbuf *ctx_cs,
>
> VkExtent2D bin_size = radv_compute_bin_size(pipeline, pCreateInfo);
>
> -   unsigned context_states_per_bin; /* allowed range: [1, 6] */
> -   unsigned persistent_states_per_bin; /* allowed range: [1, 32] */
> -   unsigned fpovs_per_batch; /* allowed range: [0, 255], 0 = unlimited */
> -
> -   switch (pipeline->device->physical_device->rad_info.family) {
> -   case CHIP_VEGA10:
> -   case CHIP_VEGA12:
> -   case CHIP_VEGA20:
> -   context_states_per_bin = 1;
> -   persistent_states_per_bin = 1;
> -   fpovs_per_batch = 63;
> -   break;
> -   case CHIP_RAVEN:
> -   case CHIP_RAVEN2:
> -   context_states_per_bin = 6;
> -   persistent_states_per_bin = 32;
> -   fpovs_per_batch = 63;
> -   break;
> -   default:
> -   unreachable("unhandled family while determining binning state.");
> -   }
> -
> if (pipeline->device->pbb_allowed && bin_size.width && bin_size.height) {
> +   unsigned context_states_per_bin; /* allowed range: [1, 6] */
> +   unsigned persistent_states_per_bin; /* allowed range: [1, 32] */
> +   unsigned fpovs_per_batch; /* allowed range: [0, 255], 0 = unlimited */
> +
> +   switch (pipeline->device->physical_device->rad_info.family) {
> +   case CHIP_VEGA10:
> +   case CHIP_VEGA12:
> +   case CHIP_VEGA20:
> +   context_states_per_bin = 1;
> +   persistent_states_per_bin = 1;
> +   fpovs_per_batch = 63;
> +   break;
> +   case CHIP_RAVEN:
> +   case CHIP_RAVEN2:
> +   context_states_per_bin = 6;
> +   persistent_states_per_bin = 32;
> +   fpovs_per_batch = 63;
> +   break;
> +   default:
> +   unreachable("unhandled family while determining binning state.");
> +   }
> +
> pa_sc_binner_cntl_0 =
> S_028C44_BINNING_MODE(V_028C44_BINNING_ALLOWED) |
> S_028C44_BIN_SIZE_X(bin_size.width == 16) |
> --
> 2.22.0
>
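
The shape of this fix, as a minimal sketch: the per-family lookup only runs
once binning is known to be enabled, so a family missing from the table is
never inspected otherwise. The enum, the struct and get_binning_params() are
hypothetical stand-ins, and assert() stands in for Mesa's unreachable().

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in family list; CHIP_UNKNOWN models a chip the table does not cover. */
enum chip_family { CHIP_VEGA10, CHIP_RAVEN, CHIP_UNKNOWN };

struct binning_params {
	unsigned context_states_per_bin;    /* allowed range: [1, 6] */
	unsigned persistent_states_per_bin; /* allowed range: [1, 32] */
	unsigned fpovs_per_batch;           /* allowed range: [0, 255], 0 = unlimited */
};

/* Hypothetical helper: fills *out and returns true only when binning is
 * enabled. With binning disabled the family is never looked at. */
static bool get_binning_params(enum chip_family family, bool binning_enabled,
			       struct binning_params *out)
{
	if (!binning_enabled)
		return false;

	switch (family) {
	case CHIP_VEGA10:
		*out = (struct binning_params){ 1, 1, 63 };
		return true;
	case CHIP_RAVEN:
		*out = (struct binning_params){ 6, 32, 63 };
		return true;
	default:
		/* Only reachable for an unknown chip with binning enabled. */
		assert(!"unhandled family while determining binning state.");
		return false;
	}
}
```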

Re: [Mesa-dev] [PATCH 1/2] radv: fix potential crash in the compute resolve path

2019-07-04 Thread Bas Nieuwenhuizen

r-b

On Thu, Jul 4, 2019 at 8:51 AM Samuel Pitoiset
 wrote:
>
> If the destination attachment is UNUSED.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_meta_resolve_cs.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_meta_resolve_cs.c b/src/amd/vulkan/radv_meta_resolve_cs.c
> index 7d3cc166e0d..13c61509b21 100644
> --- a/src/amd/vulkan/radv_meta_resolve_cs.c
> +++ b/src/amd/vulkan/radv_meta_resolve_cs.c
> @@ -917,12 +917,13 @@ radv_cmd_buffer_resolve_subpass_cs(struct radv_cmd_buffer *cmd_buffer)
> for (uint32_t i = 0; i < subpass->color_count; ++i) {
> struct radv_subpass_attachment src_att = subpass->color_attachments[i];
> struct radv_subpass_attachment dst_att = subpass->resolve_attachments[i];
> -   struct radv_image_view *src_iview = fb->attachments[src_att.attachment].attachment;
> -   struct radv_image_view *dst_iview = fb->attachments[dst_att.attachment].attachment;
>
> if (dst_att.attachment == VK_ATTACHMENT_UNUSED)
> continue;
>
> +   struct radv_image_view *src_iview = fb->attachments[src_att.attachment].attachment;
> +   struct radv_image_view *dst_iview = fb->attachments[dst_att.attachment].attachment;
> +
> VkImageResolve region = {
> .extent = (VkExtent3D){ fb->width, fb->height, 0 },
> .srcSubresource = (VkImageSubresourceLayers) {
> --
> 2.22.0
>
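
The reordering matters because VK_ATTACHMENT_UNUSED is ~0U, so using it as an
array index before the check reads far out of bounds. A minimal sketch of the
check-then-index order, with a local ATTACHMENT_UNUSED define and a
hypothetical sum_resolved_views() helper standing in for the real loop body:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as Vulkan's VK_ATTACHMENT_UNUSED, defined locally so the
 * sketch builds without the Vulkan headers. */
#define ATTACHMENT_UNUSED (~0u)

/* Hypothetical helper: adds up the views referenced by the used resolve
 * attachments. The int array stands in for the framebuffer's image views. */
static int sum_resolved_views(const uint32_t *dst_att, size_t color_count,
			      const int *views, size_t view_count)
{
	int sum = 0;

	for (size_t i = 0; i < color_count; ++i) {
		/* Check the attachment index before it is used as an index. */
		if (dst_att[i] == ATTACHMENT_UNUSED)
			continue;

		assert(dst_att[i] < view_count);
		sum += views[dst_att[i]];
	}
	return sum;
}
```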

Re: [Mesa-dev] [PATCH 5/5] radv: only allocate a 32-bit value for the TC-compat range metadata

2019-07-02 Thread Bas Nieuwenhuizen

r-b

On Tue, Jul 2, 2019 at 2:47 PM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_image.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> index eeccce0d82f..dc598d9eecf 100644
> --- a/src/amd/vulkan/radv_image.c
> +++ b/src/amd/vulkan/radv_image.c
> @@ -990,8 +990,8 @@ radv_image_alloc_htile(struct radv_image *image)
>  * have to be fixed by updating ZRANGE_PRECISION when doing
>  * fast depth clears to 0.0f.
>  */
> -   image->tc_compat_zrange_offset = image->clear_value_offset + 8;
> -   image->size = image->clear_value_offset + 16;
> +   image->tc_compat_zrange_offset = image->size;
> +   image->size = image->tc_compat_zrange_offset + 4;
> }
> image->alignment = align64(image->alignment, image->planes[0].surface.htile_alignment);
>  }
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 3/5] radv: remove set but unused aspect mask during depth layout transitions

2019-07-02 Thread Bas Nieuwenhuizen

I really like just always filling the struct completely. It provides a
better abstraction and fewer surprises.

On Tue, Jul 2, 2019 at 2:47 PM Samuel Pitoiset
 wrote:
>
> The decompress/resummarize pass always uses the depth aspect.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
> index fc8184200fc..322e705621f 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -4895,7 +4895,6 @@ static void radv_handle_depth_image_transition(struct radv_cmd_buffer *cmd_buffe
> } else if (radv_layout_is_htile_compressed(image, src_layout, src_queue_mask) &&
>!radv_layout_is_htile_compressed(image, dst_layout, dst_queue_mask)) {
> VkImageSubresourceRange local_range = *range;
> -   local_range.aspectMask = VK_IMAGE_ASPECT_DEPTH_BIT;
> local_range.baseMipLevel = 0;
> local_range.levelCount = 1;
>
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 4/5] radv: remove unused code in radv_update_tc_compat_zrange_metadata()

2019-07-02 Thread Bas Nieuwenhuizen

r-b

On Tue, Jul 2, 2019 at 2:47 PM Samuel Pitoiset
 wrote:
>
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
> index 322e705621f..a89d804aa65 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1534,8 +1534,6 @@ radv_update_tc_compat_zrange_metadata(struct radv_cmd_buffer *cmd_buffer,
>   struct radv_image *image,
>   VkClearDepthStencilValue ds_clear_value)
>  {
> -   uint64_t va = radv_buffer_get_va(image->bo);
> -   va += image->offset + image->tc_compat_zrange_offset;
> uint32_t cond_val;
>
> /* Conditionally set DB_Z_INFO.ZRANGE_PRECISION to 0 when the last
> --
> 2.22.0
>

Re: [Mesa-dev] [PATCH 1/5] radv: add radv_get_depth_pipeline() helper

2019-07-02 Thread Bas Nieuwenhuizen

r-b

On Tue, Jul 2, 2019 at 2:47 PM Samuel Pitoiset
 wrote:
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_meta_decompress.c | 66 +--
>  1 file changed, 41 insertions(+), 25 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_meta_decompress.c b/src/amd/vulkan/radv_meta_decompress.c
> index 578a287d07b..fa5de24314a 100644
> --- a/src/amd/vulkan/radv_meta_decompress.c
> +++ b/src/amd/vulkan/radv_meta_decompress.c
> @@ -320,6 +320,43 @@ enum radv_depth_op {
> DEPTH_RESUMMARIZE,
>  };
>
> +static VkPipeline *
> +radv_get_depth_pipeline(struct radv_cmd_buffer *cmd_buffer,
> +   struct radv_image *image, enum radv_depth_op op)
> +{
> +   struct radv_meta_state *state = &cmd_buffer->device->meta_state;
> +   uint32_t samples = image->info.samples;
> +   uint32_t samples_log2 = ffs(samples) - 1;
> +   VkPipeline *pipeline;
> +
> +   if (!state->depth_decomp[samples_log2].decompress_pipeline) {
> +   VkResult ret;
> +
> +   ret = create_pipeline(cmd_buffer->device, VK_NULL_HANDLE, samples,
> + state->depth_decomp[samples_log2].pass,
> + state->depth_decomp[samples_log2].p_layout,
> + &state->depth_decomp[samples_log2].decompress_pipeline,
> + &state->depth_decomp[samples_log2].resummarize_pipeline);
> +   if (ret != VK_SUCCESS) {
> +   cmd_buffer->record_result = ret;
> +   return NULL;
> +   }
> +   }
> +
> +   switch (op) {
> +   case DEPTH_DECOMPRESS:
> +   pipeline = &state->depth_decomp[samples_log2].decompress_pipeline;
> +   break;
> +   case DEPTH_RESUMMARIZE:
> +   pipeline = &state->depth_decomp[samples_log2].resummarize_pipeline;
> +   break;
> +   default:
> +   unreachable("unknown operation");
> +   }
> +
> +   return pipeline;
> +}
> +
>  static void radv_process_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
>  struct radv_image *image,
>  VkImageSubresourceRange *subresourceRange,
> @@ -336,41 +373,20 @@ static void radv_process_depth_image_inplace(struct radv_cmd_buffer *cmd_buffer,
> uint32_t samples = image->info.samples;
> uint32_t samples_log2 = ffs(samples) - 1;
> struct radv_meta_state *meta_state = &cmd_buffer->device->meta_state;
> -   VkPipeline pipeline_h;
> +   VkPipeline *pipeline;
>
> if (!radv_image_has_htile(image))
> return;
>
> -   if (!meta_state->depth_decomp[samples_log2].decompress_pipeline) {
> -   VkResult ret = create_pipeline(cmd_buffer->device, VK_NULL_HANDLE, samples,
> -  meta_state->depth_decomp[samples_log2].pass,
> -  meta_state->depth_decomp[samples_log2].p_layout,
> -  &meta_state->depth_decomp[samples_log2].decompress_pipeline,
> -  &meta_state->depth_decomp[samples_log2].resummarize_pipeline);
> -   if (ret != VK_SUCCESS) {
> -   cmd_buffer->record_result = ret;
> -   return;
> -   }
> -   }
> -
> radv_meta_save(&saved_state, cmd_buffer,
>RADV_META_SAVE_GRAPHICS_PIPELINE |
>RADV_META_SAVE_SAMPLE_LOCATIONS |
>RADV_META_SAVE_PASS);
>
> -   switch (op) {
> -   case DEPTH_DECOMPRESS:
> -   pipeline_h = meta_state->depth_decomp[samples_log2].decompress_pipeline;
> -   break;
> -   case DEPTH_RESUMMARIZE:
> -   pipeline_h = meta_state->depth_decomp[samples_log2].resummarize_pipeline;
> -   break;
> -   default:
> -   unreachable("unknown operation");
> -   }
> +   pipeline = radv_get_depth_pipeline(cmd_buffer, image, op);
>
> -   radv_CmdBindPipeline(cmd_buffer_h, VK_PIPELINE_BIND_POINT_GRAPHICS,
> -pipeline_h);
> +   radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer),
> +VK_PIPELINE_BIND_POINT_GRAPHICS, *pipeline);
>
> radv_CmdSetViewport(cmd_buffer_h, 0, 1, &(VkViewport) {
> .x = 0,
> --
> 2.22.0
>
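
A minimal sketch of the lookup the new helper performs: the sample count (a
power of two) is turned into a table index with ffs(), and the operation picks
which pipeline slot is returned by pointer. The types and select_pipeline()
are hypothetical stand-ins, with int in place of VkPipeline; the lazy
create_pipeline() step is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h> /* ffs() is POSIX */

enum depth_op { DEPTH_DECOMPRESS, DEPTH_RESUMMARIZE };

/* Stand-in for one entry of the per-sample-count pipeline table. */
struct depth_pipelines {
	int decompress;
	int resummarize;
};

/* 1 -> 0, 2 -> 1, 4 -> 2, ... for power-of-two sample counts. */
static unsigned samples_to_log2(unsigned samples)
{
	assert(samples != 0 && (samples & (samples - 1)) == 0);
	return (unsigned)ffs((int)samples) - 1;
}

/* Hypothetical helper: returns a pointer to the slot matching op. */
static int *select_pipeline(struct depth_pipelines *table, unsigned samples,
			    enum depth_op op)
{
	struct depth_pipelines *entry = &table[samples_to_log2(samples)];

	switch (op) {
	case DEPTH_DECOMPRESS:
		return &entry->decompress;
	case DEPTH_RESUMMARIZE:
		return &entry->resummarize;
	}
	return NULL; /* unknown operation */
}
```

Returning a pointer rather than the handle itself matches the patch, where the
caller dereferences it when binding the pipeline.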
