from:"Ilia Mirkin"

Re: [Mesa-dev] piglit default main branch results - Re: Rename "master" branch to "main"?

2021-03-29 Thread Ilia Mirkin

Hi Jordan,

https://cgit.freedesktop.org/piglit

This still displays the "master" branch by default. I think you're
supposed to change ... something ... in the repo, which indicates
which is the default branch.

Cheers,

  -ilia

On Mon, Mar 29, 2021 at 8:31 PM Jordan Justen  wrote:
>
> On 2021-03-24 23:47:46, Jordan Justen wrote:
> > On 2021-03-24 21:14:57, Jason Ekstrand wrote:
> > >
> > > Jordan, is there any way you can make the script sort by last updated 
> > > before it
> > > re-targets the MRs so they at least stay in the same order? Updating them 
> > > in MR
> > > # order wouldn't be bad either, I guess.
> > >
> >
> > It already does process them sorting by the oldest "update time"
> > first, so roughly speaking the order sorted by update time should be
> > the same.
> >
> > I don't know what might happen if 2 MRs were updated within the same
> > second. But, the updates are actually a bit slow (maybe ~1 update per
> > second), so there doesn't seem to be much risk, as far as I can see,
> > of, for instance, 10 updates happening within the same second.
> >
>
> Crucible and piglit are now changed over to using main.
>
> Piglit has had at least 2 MR's merge to main now:
>  * https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/463
>  * https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/497
>
> It appears that the "last updated sort order" stayed 100% the same
> before and after the script ran to update the open MR branch targets.
>
> It took ~48 seconds for the script to update 58 piglit merge-requests.
>
> I have only seen one issue so far for the piglit default branch
> change. It appears that the piglit patchwork scripting isn't
> configured for the main branch. I think it's long past time to retire
> the piglit patchwork, so I made a request to the admins to do that. I
> think Mesa already retired patchwork.
>
> -Jordan
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC] Concrete proposal to split classic

2021-03-29 Thread Ilia Mirkin

Probably nv30 would do well to "move on" as well. But it also presents
an interesting question -- the nv30 driver has lots of problems. I
have no plans to fix them, nor am I aware of anyone else with such
plans. However if such a developer were to turn up, would it be
reasonable to assume that their work would ultimately land in this
"lts" branch/tree/whatever? Some of the "fixes" will require large-ish
changes to the driver...

Cheers,

  -ilia

On Mon, Mar 29, 2021 at 6:48 PM Marek Olšák  wrote:
>
> Alright that's r300 and swr that are going to find a new home in the lts 
> branch. Do any other gallium drivers want to join them?
>
> Marek
>
> On Mon., Mar. 29, 2021, 13:51 Zielinski, Jan,  wrote:
>>
>> On Thursday, March 25, 2021 8:47 Marek Olšák wrote:
>> > Same thinking could be applied to other gallium drivers for old hardware 
>> > that don't receive new development and are becoming more and more 
>> > irrelevant every year due to their age.
>>
>> Can we also keep Gallium for OpenSWR driver on -lts branch? We currently are 
>> focusing effort on other OSS projects, and want to maintain OpenSWR at its 
>> current feature level, but we are often seeing Mesa core changes causing 
>> problems in OpenSWR, that we can’t always address right away. So, we would 
>> like to point our users to a stable branch, that limits the amount of effort 
>> required for OpenSWR to support its existing users.
>>
>> Jan
>>
>> On Wed, Mar 24, 2021, at 09:15, Jason Ekstrand wrote:
>> > On Wed, Mar 24, 2021 at 10:28 AM Rob Clark  
>> > wrote:
>> > >
>> > > On Mon, Mar 22, 2021 at 3:15 PM Dylan Baker  
>> > > wrote:
>> > > >
>> > > > Hi list,
>> > > >
>> > > > We've talked about it a number of times, but I think it's time time to
>> > > > discuss splitting the classic drivers off of the main development 
>> > > > branch
>> > > > again, although this time I have a concrete plan for how this would
>> > > > work.
>> > > >
>> > > > First, why? Basically, all of the classic drivers are in maintanence
>> > > > mode (even i965). Second, many of them rely on code that no one works
>> > > > on, and very few people still understand. There is no CI for most of
>> > > > them, and the Intel CI is not integrated with gitlab, so it's easy to
>> > > > unintentionally break them, and this breakage usually isn't noticed
>> > > > until just before or just after a release. 21.0 was held up (in small
>> > > > part, also me just getting behind) because of such breakages.
>> > > >
>> > > > I konw there is some interest in getting i915g in good enough shape 
>> > > > that
>> > > > it could replace i915c, at least for the common case. I also am aware
>> > > > that Dave, Ilia, and Eric (with some pointers from Ken) have been
>> > > > working on a gallium driver to replace i965. Neither of those things 
>> > > > are
>> > > > ready yet, but I've taken them into account.
>> > > >
>> > > > Here's the plan:
>> > > >
>> > > > 1) 21.1 release happens
>> > > > 2) we remove classic from master
>> > > > 3) 21.1 reaches EOL because of 21.2
>> > > > 4) we fork the 21.1 branch into a "classic-lts"¹ branch
>> > > > 5) we disable all vulkan and gallium drivers in said branch, at least 
>> > > > at
>> > > >the Meson level
>> > >
>> > > I'm +1 for the -lts branch.. the layering between mesa "classic" and
>> > > gallium is already starting to get poked thru in the name of
>> > > performance, and we've already discovered cases of classic drivers
>> > > being broken for multiple months with no one noticing.  I think a
>> > > slower moving -lts branch is the best approach to keeping things
>> > > working for folks with older hw.
>> > >
>> > > But possibly there is some value in not completely disabling gallium
>> > > completely in the -lts branch.  We do have some older gallium drivers
>> > > which do not have CI coverage and I think are not used frequently by
>> > > developers who are tracking the latest main/master branch.  I'm not
>> > > suggesting that we remove them from the main (non-lts) branch but it
>> > > might be useful to be able to recommend users of those drivers stick
>> > > with the -lts version for better stability?
>> >
>> > I agree with this.  Generally, I don't think we should delete anything
>> > from the -lts branch.  Doing so only risks more breakage.  We probably
>> > want to change some meson build defaults to not build anything but old
>> > drivers but that's it.
>> >
>> > --Jason
>> >
>> > > BR,
>> > > -R
>> > >
>> > > > 6) We change the name and precidence of the glvnd loader file
>> > > > 7) apply any build fixups (turn of intel generators for versions >= 
>> > > > 7.5,
>> > > >for example
>> > > > 8) maintain that branch with build and critical bug fixes only
>> > > >
>> > > > This gives ditros and end users two options.
>> > > > 1) then can build *only* the legacy branch in the a normal Mesa 
>> > > > provides
>> > > >libGL interfaces fashion
>> > > > 2) They can use glvnd

Re: [Mesa-dev] Mesa 21.0 release

2021-03-10 Thread Ilia Mirkin

On Wed, Mar 10, 2021 at 3:39 PM Dylan Baker  wrote:
> 0464117ad9 ci: remove nouveau from shader-db runs

Hey Dylan,

You're going to have a bad time if you don't backport this, as CI will
hang instead of completing. The "fix" for the hanging introduced
practical issues at runtime, and I figured it was better to have
applications work than to have CI.

Should be pretty obvious what to do - just remove all the nouveau
lines from that script.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] st/pbo: fix pbo uploads without PIPE_CAP_TGSI_VS_LAYER_VIEWPORT

2020-12-14 Thread Ilia Mirkin

Hm, I thought reading gl_Layer in a fragment shader is well-defined
(to be 0) when there's no GS. Reading back over
ARB_fragment_layer_viewport, it doesn't seem to address this point at
all. It just says that if GS doesn't write gl_Layer, then it will be
read as 0 by the fragment shader. I'd be curious if there are any
interpretations that allow this case to be undefined.

It's definitely the case on NVIDIA hw that you'll get a 0 (since it
reflects the layer being rendered). The pbo logic has worked well on
NVIDIA hw for many years, so this is just a regression there.

Cheers,

  -ilia

On Mon, Dec 14, 2020 at 10:55 AM Mike Blumenkrantz
 wrote:
>
> Hi Ilia,
>
> I'm not entirely sure what you're asking here.
>
> This patch doesn't change anything related to gl_Layer reads, it just forces 
> the geometry shader codepath unconditionally when VS_LAYER_VIEWPORT isn't 
> enabled in order to successfully write the gl_Layer output. If anything, this 
> should be beneficial to those nvidia chipsets based on your description since 
> previously the fragment shader would've had nothing to read.
>
>
> Mike
>
> On Mon, Dec 14, 2020 at 10:48 AM Ilia Mirkin  wrote:
>>
>> Hey Mike,
>>
>> This is in reference to your change
>> https://cgit.freedesktop.org/mesa/mesa/commit/?id=614c2ac2f48955537efcfefaf0609d6c03e5
>> .
>>
>> A fragment shader should still be able to read gl_Layer even without
>> PIPE_CAP_TGSI_VS_LAYER_VIEWPORT. A frag shader can read gl_Layer any
>> time ARB_fragment_layer_viewport is supported -- I forget the precise
>> conditions for it, but basically any DX10 hardware supports that.
>>
>> However the VS_LAYER_VIEWPORT feature isn't supported on any NVIDIA
>> hardware until fairly late models (GM20x+). So this will regress the
>> majority case (single-layer) on all NVIDIA hw.
>>
>> Can you instead adjust the conditions to allow this for the drivers
>> that implement layer/viewport sysvals in frag shaders? Or just
>> implement ARB_fragment_layer_viewport in zink?
>>
>> Thanks,
>>
>>   -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] st/pbo: fix pbo uploads without PIPE_CAP_TGSI_VS_LAYER_VIEWPORT

2020-12-14 Thread Ilia Mirkin

Hey Mike,

This is in reference to your change
https://cgit.freedesktop.org/mesa/mesa/commit/?id=614c2ac2f48955537efcfefaf0609d6c03e5
.

A fragment shader should still be able to read gl_Layer even without
PIPE_CAP_TGSI_VS_LAYER_VIEWPORT. A frag shader can read gl_Layer any
time ARB_fragment_layer_viewport is supported -- I forget the precise
conditions for it, but basically any DX10 hardware supports that.

However the VS_LAYER_VIEWPORT feature isn't supported on any NVIDIA
hardware until fairly late models (GM20x+). So this will regress the
majority case (single-layer) on all NVIDIA hw.

Can you instead adjust the conditions to allow this for the drivers
that implement layer/viewport sysvals in frag shaders? Or just
implement ARB_fragment_layer_viewport in zink?

Thanks,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] frontends/va/postproc: Use the actual image height when blitting

2020-11-30 Thread Ilia Mirkin

Many (most?) codecs decode stuff in 16px increments. So even though
the video is 1080 in height, the decoded surface is 1088. However the
video decoder should know that only 1080 pixels are valid. The src/dst
regions should account for this.

I suspect that there's an issue in the vl_compositor_cs logic which
might not account for ... something? It's trying to scale texture
coordinates, so using the full texture width/height is clearly right
for scaling. However perhaps the thing being scaled is wrong if it's
in the coordinate system of the destination surface rather than the
video texture. It's a bit hard to follow the shaders without
pseudocode like there is in vl_compositor_gfx. Perhaps writing such
pseudocode will make the problem more apparent?

Cheers,

  -ilia

On Mon, Nov 30, 2020 at 12:35 PM Thong Thai  wrote:
>
> Hi Ilia,
>
> This was the issue I was trying to fix:
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6736#note_626808
>
> It's doing a blit from 1088px (the actual video is only 1080px) to
> 1440px. The blit code scales and uses both the 1080 box height and 1088
> resource height, but I'm guessing the small difference is the problem
> I'm dealing with:
>
>u_bitcast_f2u(x_scale / src->width0),
>u_bitcast_f2u(y_scale / src->height0),
>
> p y_scale / src->height0
> $4 = 0.000689338252
>
> p y_scale / 1080
> $6 = 0.00069461
>
> I'm not sure how to properly fix it though... any ideas would be
> appreciated.
>
> Thanks,
>
> Thong Thai
>
> On 2020-11-30 10:52 a.m., Liu, Leo wrote:
> > +Thong.
> >
> > -Original Message-
> > From: mesa-dev  On Behalf Of Ilia 
> > Mirkin
> > Sent: November 25, 2020 5:48 PM
> > To: ML Mesa-dev ; Thai, Thong 
> > 
> > Subject: Re: [Mesa-dev] frontends/va/postproc: Use the actual image height 
> > when blitting
> >
> > Hi Thong,
> >
> > In this change:
> >
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fmesa%2Fmesa%2Fcommit%2F%3Fid%3D49465babdb35d88ed8a283e925d6cd346255d50cdata=04%7C01%7Cleo.liu%40amd.com%7C92b8ad99a46447208c3008d891943b09%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637419413135677095%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=g1pJUxsSM68kEjNZ6VDJGufaOF1Wwcezl1ehezq8mqo%3Dreserved=0
> >
> > You appear to change the height of the underlying resource while doing a 
> > blit. This is never legal though -- the height of the resource is 
> > effectively immutable (and subject to various constraints). I'm not sure 
> > what issue you're having, but this is not the way to fix it.
> >
> > Cheers,
> >
> >-ilia
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fmesa-devdata=04%7C01%7Cleo.liu%40amd.com%7C92b8ad99a46447208c3008d891943b09%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637419413135687091%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=ug6N9kvs1JzjDhHPpB%2F9ttrCaahVgm7NJ6Bk0zCxc4A%3Dreserved=0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] frontends/va/postproc: Use the actual image height when blitting

2020-11-25 Thread Ilia Mirkin

Hi Thong,

In this change:

https://cgit.freedesktop.org/mesa/mesa/commit/?id=49465babdb35d88ed8a283e925d6cd346255d50c

You appear to change the height of the underlying resource while doing
a blit. This is never legal though -- the height of the resource is
effectively immutable (and subject to various constraints). I'm not
sure what issue you're having, but this is not the way to fix it.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [ANNOUNCE] mesa 20.0.5

2020-04-22 Thread Ilia Mirkin

On Thu, Apr 23, 2020 at 12:39 AM Dylan Baker  wrote:
>
> Quoting Ilia Mirkin (2020-04-22 15:47:59)
> > On Wed, Apr 22, 2020 at 6:39 PM Danylo Piliaiev
> >  wrote:
> > > I'm sorry for this trouble. However looking at it I think: maybe 
> > > something could be
> > > changed about applying patches to stable to safeguard against such issues.
> >
> > We used to get pre-announcements a few days prior to a release which
> > would allow developers to look things over, which would allow us to
> > notice things like that. Not sure when this got dropped.
> >
> > Cheers,
> >
> >   -ilia
>
> That was dropped in favor of a live staging branch that is updated daily (at
> least on week days).

The live staging branch is nice, but IMO not a replacement.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [ANNOUNCE] mesa 20.0.5

2020-04-22 Thread Ilia Mirkin

On Wed, Apr 22, 2020 at 6:39 PM Danylo Piliaiev
 wrote:
> I'm sorry for this trouble. However looking at it I think: maybe something 
> could be
> changed about applying patches to stable to safeguard against such issues.

We used to get pre-announcements a few days prior to a release which
would allow developers to look things over, which would allow us to
notice things like that. Not sure when this got dropped.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] nir: find_msb vs clz

2020-04-01 Thread Ilia Mirkin

On Wed, Apr 1, 2020 at 2:39 PM Erik Faye-Lund
 wrote:
>
> While working on the NIR to DXIL conversion code for D3D12, I've
> noticed that we're not exactly doing the best we could here.
>
> First some background:
>
> NIR currently has a few instructions that does kinda the same:
>
> 1. nir_op_ufind_msb: Finds the index of the most significant bit,
> counting from the least significant bit. It returns -1 on zero-input.
>
> 2. nir_op_ifind_msb: A signed version of ufind_msb; looks for the first
> non sign-bit. It's not terribly interesting in this context, as it can
> be trivially lowered if missing, and it doesn't seem like any hardware
> supports this natively. I'm just mentioning it for completeness.

While I can't speak to the current state of the nouveau NIR backend,
the hardware definitely has both of these. (And a cursory look
indicates that both are properly supported without any unnecessary
lowering.) It's known as "FLO" in the NVIDIA intrinsic names, and it's
the "BFIND" instruction in nv50_ir-speak. It's present on all Fermi+
GPUs.

https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp#n2346
https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp#n946

Note that it has a separate bit for whether it's the signed variant or
not. There's also a "SAMT" variant of it, but I honestly don't
remember what that does exactly. We use it when finding the LSB after
reversing the bits. I think makes the op return 32-x or something.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [RFC PATCH] mesa: fix _mesa_draw_nonzero_divisor_bits to return nonzero divisors

2020-02-29 Thread Ilia Mirkin

The bitmask is _EffEnabledNonZeroDivisor, so no need to invert it before
returning.

Fixes: fd6636ebc06d (st/mesa: simplify determination whether a draw needs 
min/max index)
Signed-off-by: Ilia Mirkin 
---

I haven't done any extensive testing on this, but it does seem to fix a
regression on nv50 where the limits were not being supplied. (And
there's an assert to make sure that they are.)

 src/mesa/main/arrayobj.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/arrayobj.h b/src/mesa/main/arrayobj.h
index 19ab65b3242..3efcd577ae5 100644
--- a/src/mesa/main/arrayobj.h
+++ b/src/mesa/main/arrayobj.h
@@ -221,7 +221,7 @@ _mesa_draw_nonzero_divisor_bits(const struct gl_context 
*ctx)
 {
const struct gl_vertex_array_object *const vao = ctx->Array._DrawVAO;
assert(vao->NewArrays == 0);
-   return ~vao->_EffEnabledNonZeroDivisor & ctx->Array._DrawVAOEnabledAttribs;
+   return vao->_EffEnabledNonZeroDivisor & ctx->Array._DrawVAOEnabledAttribs;
 }
 
 
-- 
2.24.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC PATCH] Add GL_MESA_ieee_fp_alu_mode specification draft

2020-02-24 Thread Ilia Mirkin

On Mon, Feb 24, 2020 at 6:56 PM Ilia Mirkin  wrote:
>
> On Sun, Feb 23, 2020 at 8:57 PM Ilia Mirkin  wrote:
> >
> > ---
> >
> > We talked about something like this a while back, but the end result
> > was inconclusive. I added a TGSI MUL_ZERO_WINS shader property for nine.
> > But it'd be nice for wine to be able to control this too.
> >
> > I couldn't actually find any evidence of the discussion from 2017 or so,
> > so ... let's have another one.
>
> Ian, Matteo, thanks for your feedback.
>
> Based on IRC discussion today, it looks like radeonsi is not
> interested in gaining such a feature, and Ian's comments all point
> away from exposing the actual hardware bits, which I think means this
> ext is DOA. I don't think support in just nouveau + r600 justifies the
> effort of getting this going -- too small of a user base between them,
> and they can use nine if they really want better perf. I guess that's
> why this ext died back in 2017 too, but at least now there will
> hopefully be an easier-to-find record of it.
>
> Cheers,
>
>   -ilia

Oh, and for posterity, strfllw in #winehackers tracked down the
original thread from 2017:

https://lists.freedesktop.org/archives/mesa-dev/2017-January/140613.html

Not sure how I missed it, I even looked in Jan 2017, but there we are.
Apparently part of the difficulty for radeonsi is LLVM, but perhaps
ACO could soften some of that blow.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC PATCH] Add GL_MESA_ieee_fp_alu_mode specification draft

2020-02-24 Thread Ilia Mirkin

On Sun, Feb 23, 2020 at 8:57 PM Ilia Mirkin  wrote:
>
> ---
>
> We talked about something like this a while back, but the end result
> was inconclusive. I added a TGSI MUL_ZERO_WINS shader property for nine.
> But it'd be nice for wine to be able to control this too.
>
> I couldn't actually find any evidence of the discussion from 2017 or so,
> so ... let's have another one.

Ian, Matteo, thanks for your feedback.

Based on IRC discussion today, it looks like radeonsi is not
interested in gaining such a feature, and Ian's comments all point
away from exposing the actual hardware bits, which I think means this
ext is DOA. I don't think support in just nouveau + r600 justifies the
effort of getting this going -- too small of a user base between them,
and they can use nine if they really want better perf. I guess that's
why this ext died back in 2017 too, but at least now there will
hopefully be an easier-to-find record of it.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC PATCH] Add GL_MESA_ieee_fp_alu_mode specification draft

2020-02-24 Thread Ilia Mirkin

On Mon, Feb 24, 2020 at 1:10 PM Ian Romanick  wrote:
>
> On 2/23/20 5:57 PM, Ilia Mirkin wrote:
> > ---
> >
> > We talked about something like this a while back, but the end result
> > was inconclusive. I added a TGSI MUL_ZERO_WINS shader property for nine.
> > But it'd be nice for wine to be able to control this too.
> >
> > I couldn't actually find any evidence of the discussion from 2017 or so,
> > so ... let's have another one.
> >
> >  docs/specs/MESA_ieee_fp_alu_mode.spec | 136 ++
> >  1 file changed, 136 insertions(+)
> >  create mode 100644 docs/specs/MESA_ieee_fp_alu_mode.spec
> >
> > diff --git a/docs/specs/MESA_ieee_fp_alu_mode.spec 
> > b/docs/specs/MESA_ieee_fp_alu_mode.spec
> > new file mode 100644
> > index 000..cb274f06571
> > --- /dev/null
> > +++ b/docs/specs/MESA_ieee_fp_alu_mode.spec
> > @@ -0,0 +1,136 @@
> > +Name
> > +
> > +MESA_ieee_fp_alu_mode
> > +
> > +Name Strings
> > +
> > +GL_MESA_ieee_fp_alu_mode
> > +
> > +Contact
> > +
> > +Ilia Mirkin, ilia 'at' x.org
> > +
> > +IP Status
> > +
> > +No known IP issues.
> > +
> > +Status
> > +
> > +Proposed
> > +
> > +Version
> > +
> > +Number
> > +
> > +TBD
> > +
> > +Dependencies
> > +
> > +OpenGL 3.0 or OpenGL ES 3.0 is required.
> > +
> > +The extension is written against the OpenGL GL 3.0 and OpenGL ES 3.0
> > +specifications.
> > +
> > +Overview
> > +
> > +Pre-GL3 hardware did not generally have full IEEE floating point 
> > operation
> > +support. Among other things, 0 * Infinity would work out to 0, and 
> > NaN's
> > +might not be generated, or otherwise be treated improperly. GL3-class 
> > and
> > +later hardware introduced full IEEE FP support, including NaN, 
> > Infinity,
> > +and the proper generation of these.
> > +
> > +Some software targeted at older hardware makes assumptions about how 
> > the
> > +shader ALU works. And to accomodate these, GL3-class hardware has a 
> > way to
> > +change how the shader ALU behaves. There are no standards around this, 
> > and
> > +different hardware has different ways of dealing with it. However these
> > +modes were designed specifically with such older software in mind.
> > +
> > +This extension introduces a way to configure a context to be in 
> > non-IEEE
> > +ALU mode. This extension does not specify precisely what this means, as
> > +each vendor has something different. Generally it means non-IEEE 
> > compliant
> > +handling of multiplication, as well as any other unspecified changes.
>
> I think many of the other things are specified.  They're the non-IEEE
> behaviors of GL_ARB_vertex_program and GL_ARB_fragment_program, and
> those mimic the required behavior of early DX shader models.  There are
> a bunch of cases that specify that zero is generated when IEEE would
> require NaN.
>
> If there's just a small handful of things like this, we'd probably be
> better adding a couple new built-in functions to do the job.  The
> problem on Intel hardware is... we really, really don't want to switch
> to non-IEEE mode because it changes how a bunch of things work, and we
> haven't tested any of that in many years.  I'd much rather put in some
> kind of work-arounds for things that don't want multiplication or pow()
> to generate NaN.

So basically anything that ever involves multiplication needs to have
these variants. Things like dot, the various crazy ops of days past
whose names escape me but involve complex calculations, etc. Things
like pow are questionable (depends on if they get decomposed or not),
and things like rcp/rsq unquestionably produce NaN's (or Infinity,
sorry not 100% sure but easily checked) on NVIDIA irrespective of that
mode being enabled.

Also on Intel hardware, as you mention, the "non-ieee" mode is ...
interesting, so to allow for that, I didn't want to say anything other
than the positive cases. If you have no interest in exposing this, I
could rewrite this in a NVIDIA/AMD-friendly manner.

>
> As for the mechanism, I'm very strongly in favor of something that would
> be locked-in when the shader is compiled.  I really want to avoid any
> potential that an external glEnable could trigger a a recompile.

Stefan Dösinger suggested a context flag on IRC. I'd be fine with that
too, even if I have to go create 2 exts due to GLX/EGL.

>
> The more I think about it... having an extension that adds a

[Mesa-dev] [RFC PATCH] Add GL_MESA_ieee_fp_alu_mode specification draft

2020-02-23 Thread Ilia Mirkin

---

We talked about something like this a while back, but the end result
was inconclusive. I added a TGSI MUL_ZERO_WINS shader property for nine.
But it'd be nice for wine to be able to control this too.

I couldn't actually find any evidence of the discussion from 2017 or so,
so ... let's have another one.

 docs/specs/MESA_ieee_fp_alu_mode.spec | 136 ++
 1 file changed, 136 insertions(+)
 create mode 100644 docs/specs/MESA_ieee_fp_alu_mode.spec

diff --git a/docs/specs/MESA_ieee_fp_alu_mode.spec 
b/docs/specs/MESA_ieee_fp_alu_mode.spec
new file mode 100644
index 000..cb274f06571
--- /dev/null
+++ b/docs/specs/MESA_ieee_fp_alu_mode.spec
@@ -0,0 +1,136 @@
+Name
+
+MESA_ieee_fp_alu_mode
+
+Name Strings
+
+GL_MESA_ieee_fp_alu_mode
+
+Contact
+
+Ilia Mirkin, ilia 'at' x.org
+
+IP Status
+
+No known IP issues.
+
+Status
+
+Proposed
+
+Version
+
+Number
+
+TBD
+
+Dependencies
+
+OpenGL 3.0 or OpenGL ES 3.0 is required.
+
+The extension is written against the OpenGL GL 3.0 and OpenGL ES 3.0
+specifications.
+
+Overview
+
+Pre-GL3 hardware did not generally have full IEEE floating point operation
+support. Among other things, 0 * Infinity would work out to 0, and NaN's
+might not be generated, or otherwise be treated improperly. GL3-class and
+later hardware introduced full IEEE FP support, including NaN, Infinity,
+and the proper generation of these.
+
+Some software targeted at older hardware makes assumptions about how the
+shader ALU works. And to accomodate these, GL3-class hardware has a way to
+change how the shader ALU behaves. There are no standards around this, and
+different hardware has different ways of dealing with it. However these
+modes were designed specifically with such older software in mind.
+
+This extension introduces a way to configure a context to be in non-IEEE
+ALU mode. This extension does not specify precisely what this means, as
+each vendor has something different. Generally it means non-IEEE compliant
+handling of multiplication, as well as any other unspecified changes.
+
+New Tokens
+
+Accepted by the  parameter of Enable, Disable, and IsEnabled, by
+the  parameter of GetBooleanv, GetIntegerv, GetFloatv, and
+GetDoublev:
+
+IEEE_FP_ALU_MODE_MESA  0x
+
+
+Changes to GLSL Section 4.1.4 Floats:
+
+Add the following paragraph:
+
+In case that the shader is being executed in a context with
+IEEE_FP_ALU_MODE_MESA disabled, multiplication shall produce the following
+(non-IEEE-complaint) result:
+
+   float a = 0;
+   float b = Infinity;
+   float c = a * b; // c == 0
+
+There may be other implications from this mode being enabled, including
+clamping of non-finite values, or anything else the hardware mode happens
+to enable to achieve compatibility.
+
+New State
+
+(add to table 6.52, Miscellaneous, p.392)
+
+   Initial
+Get Value  Type   Get Command   Value Description   
Sec.   Attribute
+-  -  ---  --- --  
--  -
+IEEE_FP_ALU_MODE_MESAB IsEnabledTRUE   Whether shader ALU  
 enable
+   is in IEEE FP mode
+
+
+Issues
+
+(1) This specification does not precisely specify what non-IEEE FP mode is.
+
+RESOLVED. Shipping hardware has different ways of dealing with it. For
+example, Intel clamps all values. NVIDIA Tesla series has a
+context-wide mode for controlling whether zero wins in multiplication
+or follows IEEE rules. NVIDIA Fermi+ series as well as ATI/AMD Radeon
+R600+ has separate opcodes which control this (but again, a different
+set of operations are covered).
+
+A single extension which is going to be easy to use for emulation
+software is thus much harder to write if it's to precisely specify
+this.
+
+The applications that want these have already been written and tested
+against these approaches, so we know they all work with whatever the
+hardware has to offer.
+
+(2) Why use an Enable instead of a shader layout token?
+
+RESOLVED. Because some hardware implementations don't allow
+controlling this on a per-stage level. While one could come up with
+rules requiring linked program stages to have the same setting, this
+is going to be extra validation for the implementations to
+implement. Furthermore, one would want these rules to also apply to
+fixed-function-generated shaders equally. Instead a simple mode should
+be able to flip this on and off.
+
+(3) What about FP denorms?
+
+RESOLVED. The same hardware tends to also have a way to control
+whether denorm FP values

[Mesa-dev] [PATCH] gm107/ir: avoid combining geometry shader stores at 0x60

2020-01-06 Thread Ilia Mirkin

This corresponds to gl_PrimitiveID and gl_Layer. When both of these are
stored in a single AST.64 or AST.128 operation, then it appears as
though the whole store fails. Fixes the recently extended
glsl-1.50-transform-feedback-builtins piglit, and also
gtf30.GL3Tests.transform_feedback.transform_feedback_builtins.

The issue was reproduced on GM107 and GP108.

Signed-off-by: Ilia Mirkin 
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 673fcd187ca..2f46b0e886a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -2804,6 +2804,16 @@ MemoryOpt::combineSt(Record *rec, Instruction *st)
if (prog->getType() == Program::TYPE_COMPUTE && rec->rel[0])
   return false;
 
+   // There's really no great place to put this in a generic manner. Seemingly
+   // wide stores at 0x60 don't work in GS shaders on SM50+. Don't combine
+   // those.
+   if (prog->getTarget()->getChipset() >= NVISA_GM107_CHIPSET &&
+   prog->getType() == Program::TYPE_GEOMETRY &&
+   st->getSrc(0)->reg.file == FILE_SHADER_OUTPUT &&
+   rec->rel[0] == NULL &&
+   MIN2(offRc, offSt) == 0x60)
+  return false;
+
// remove any existing load/store records for the store being merged into
// the existing record.
purgeRecords(st, DATA_FILE_COUNT);
-- 
2.24.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nv50,nvc0: fix destination coordinates of blit

2019-12-29 Thread Ilia Mirkin

The fix was found by Karol Herbst a long time ago, but it was unclear
why it helped or if it would create additional problems. This change
adds a comment that explains what's going on, and in the process also
normalizes the nv50 implementation to match.

The coordinates which are fed to gl_Position map directly to pixel
coordinates, since the viewport transform is disabled. If the
framebuffer is MSAA, then that doesn't affect the pixel coordinates at
all, it's just that each pixel has multiple samples.

Note that this makes it really clear that this approach is inappropriate
for EXT_framebuffer_multisample_blit_scaled, and also the 3d path will
fail terribly for direct copies. Thankfully the 2d path normally takes
care of this.

Fixes KHR-GL43.packed_depth_stencil.blit.depth32f_stencil8 as well as
scaling issues in a number of EXT_framebuffer_multisample-related piglit
tests (although they continue to fail due to inaccuracies).

Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/nv50/nv50_surface.c | 12 
 src/gallium/drivers/nouveau/nvc0/nvc0_surface.c | 16 ++--
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_surface.c 
b/src/gallium/drivers/nouveau/nv50/nv50_surface.c
index c64be0a348f..7a2402d72ed 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_surface.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_surface.c
@@ -1356,7 +1356,6 @@ nv50_blit_3d(struct nv50_context *nv50, const struct 
pipe_blit_info *info)
float x0, x1, y0, y1, z;
float dz;
float x_range, y_range;
-   float tri_x, tri_y;
 
blit->mode = nv50_blit_select_mode(info);
blit->color_mask = nv50_blit_derive_color_mask(info);
@@ -1377,14 +1376,11 @@ nv50_blit_3d(struct nv50_context *nv50, const struct 
pipe_blit_info *info)
x_range = (float)info->src.box.width / (float)info->dst.box.width;
y_range = (float)info->src.box.height / (float)info->dst.box.height;
 
-   tri_x = 16384 << nv50_miptree(dst)->ms_x;
-   tri_y = 16384 << nv50_miptree(dst)->ms_y;
-
x0 = (float)info->src.box.x - x_range * (float)info->dst.box.x;
y0 = (float)info->src.box.y - y_range * (float)info->dst.box.y;
 
-   x1 = x0 + tri_x * x_range;
-   y1 = y0 + tri_y * y_range;
+   x1 = x0 + 16384.0f * x_range;
+   y1 = y0 + 16384.0f * y_range;
 
x0 *= (float)(1 << nv50_miptree(src)->ms_x);
x1 *= (float)(1 << nv50_miptree(src)->ms_x);
@@ -1457,7 +1453,7 @@ nv50_blit_3d(struct nv50_context *nv50, const struct 
pipe_blit_info *info)
   PUSH_DATAf(push, y0);
   PUSH_DATAf(push, z);
   BEGIN_NV04(push, NV50_3D(VTX_ATTR_2F_X(0)), 2);
-  PUSH_DATAf(push, tri_x);
+  PUSH_DATAf(push, 16384.0f);
   PUSH_DATAf(push, 0.0f);
   BEGIN_NV04(push, NV50_3D(VTX_ATTR_3F_X(1)), 3);
   PUSH_DATAf(push, x0);
@@ -1465,7 +1461,7 @@ nv50_blit_3d(struct nv50_context *nv50, const struct 
pipe_blit_info *info)
   PUSH_DATAf(push, z);
   BEGIN_NV04(push, NV50_3D(VTX_ATTR_2F_X(0)), 2);
   PUSH_DATAf(push, 0.0f);
-  PUSH_DATAf(push, tri_y);
+  PUSH_DATAf(push, 16384.0f);
   BEGIN_NV04(push, NV50_3D(VERTEX_END_GL), 1);
   PUSH_DATA (push, 0);
}
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
index 5f630149836..2b38fe6e7f5 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
@@ -1276,6 +1276,18 @@ nvc0_blit_3d(struct nvc0_context *nvc0, const struct 
pipe_blit_info *info)
 * render target, with scissors defining the destination region.
 * The vertex is supplied with non-normalized texture coordinates
 * arranged in a way to yield the desired offset and scale.
+*
+* Note that while the source texture is presented to the sampler as
+* non-MSAA (even if it is), the destination texture is treated as MSAA for
+* rendering. This means that
+*  - destination coordinates shouldn't be scaled
+*  - without per-sample rendering, the target will be a solid-fill for all
+*of the samples
+*
+* The last point implies that this process is very bad for 1:1 blits, as
+* well as scaled blits between MSAA surfaces. This works fine for
+* upscaling and downscaling though. The 1:1 blits should ideally be
+* handled by the 2d engine, which can do it perfectly.
 */
 
minx = info->dst.box.x;
@@ -1364,14 +1376,14 @@ nvc0_blit_3d(struct nvc0_context *nvc0, const struct 
pipe_blit_info *info)
   *(vbuf++) = fui(y0);
   *(vbuf++) = fui(z);
 
-  *(vbuf++) = fui(32768 << nv50_miptree(dst)->ms_x);
+  *(vbuf++) = fui(32768.0f);
   *(vbuf++) = fui(0.0f);
   *(vbuf++) = fui(x1);
   *(vbuf++) = fui(y0);
   *(vbuf++) = fui(z);
 
   *(vbuf++) = fui(0.0f);
-  *(vbuf++) = fui(32768 << nv50_miptree(dst)->ms_y);
+  *(vbuf++) = fui(3276

Re: [Mesa-dev] How MESA display image in case of use full software rendering?

2019-12-25 Thread Ilia Mirkin

Your best bet is to recreate your driver as a KMS driver. This should
be relatively straightforward -- lots of helpers exist to cover your
use-case. Perhaps the "tiny" drivers are appropriate. Also a quick
search reveals this:

https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/18842520/Xilinx+DRM+KMS+driver

which suggests that a driver already exists, although I don't see it upstream.

  -ilia

On Wed, Dec 25, 2019 at 4:04 AM Oleg Gavrilchenko
 wrote:
>
> Thank you for answer! I wathed kmscube example. I want to clarify some
> things. I have system on ARM(Xilinx ZYNQ). It is not have GPU, there is
> only DDR framebuffer, which display through HDMI via FPGA logic.
> I wrote for it small fbdev driver. Programs may read/write this
> framebuffer through /dev/fb0. That is all. I not write DRM and KMS
> support for this device.
> I try use OSMesa with this framebuffer, and it work fine.
> Should I add support DRM or KMS support on my driver for  OpenGL work right?
>
> On 12/25/19 6:10 AM, Ilia Mirkin wrote:
> > Not specific to swrast, but there's a "drm" platform via gbm (or a gbm
> > platform with drm output? I'll never remember). You should be able to
> > use this to output directly to a kms output. An example of such an
> > application is kmscube -- it's more geared to hardware accel, but I
> > don't see any reason why it wouldn't work with swrast. See
> > https://cgit.freedesktop.org/mesa/kmscube . I don't think there's any
> > integration with fbdev, but I also don't see why such an integration
> > would be precluded from existing -- however you're better off with
> > kms, as that's the supported thing.
> >
> > On Mon, Dec 23, 2019 at 6:01 AM Oleg Gavrilchenko
> >  wrote:
> >> Hi! Sorry for my English, it is not my first language(Russian).
> >>
> >> I can ask. How MESA display image in case of use full software
> >> rendering? What does interface it use?
> >> It use /dev/fbX, drm or otherwise method?
> >>
> >> ___
> >> mesa-dev mailing list
> >> mesa-dev@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] How MESA display image in case of use full software rendering?

2019-12-24 Thread Ilia Mirkin

Not specific to swrast, but there's a "drm" platform via gbm (or a gbm
platform with drm output? I'll never remember). You should be able to
use this to output directly to a kms output. An example of such an
application is kmscube -- it's more geared to hardware accel, but I
don't see any reason why it wouldn't work with swrast. See
https://cgit.freedesktop.org/mesa/kmscube . I don't think there's any
integration with fbdev, but I also don't see why such an integration
would be precluded from existing -- however you're better off with
kms, as that's the supported thing.

On Mon, Dec 23, 2019 at 6:01 AM Oleg Gavrilchenko
 wrote:
>
> Hi! Sorry for my English, it is not my first language(Russian).
>
> I can ask. How MESA display image in case of use full software
> rendering? What does interface it use?
> It use /dev/fbX, drm or otherwise method?
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] util/atomic: Add p_atomic_add_return for the unlocked path

2019-12-05 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

On Thu, Dec 5, 2019 at 12:51 PM Jason Ekstrand  wrote:
>
> Fixes: 385d13f26d2 "util/atomic: Add a _return variant of p_atomic_add"
> ---
>  src/util/u_atomic.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/util/u_atomic.h b/src/util/u_atomic.h
> index 45e8e2e0188..9cbc6dd1eaa 100644
> --- a/src/util/u_atomic.h
> +++ b/src/util/u_atomic.h
> @@ -92,7 +92,7 @@
>  #define p_atomic_add(_v, _i) ((void) p_atomic_add_return((_v), (_i))
>  #define p_atomic_inc_return(_v) (++(*(_v)))
>  #define p_atomic_dec_return(_v) (--(*(_v)))
> -#define p_atomic_add(_v, _i) (*(_v) = *(_v) + (_i))
> +#define p_atomic_add_return(_v, _i) (*(_v) = *(_v) + (_i))
>  #define p_atomic_cmpxchg(_v, _old, _new) (*(_v) == (_old) ? (*(_v) = (_new), 
> (_old)) : *(_v))
>
>  #endif
> --
> 2.23.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: remove second prototype of brw_batch_reloc

2019-12-05 Thread Ilia Mirkin

I am. Reading code too late at night, apparently. Sorry!

On Thu, Dec 5, 2019 at 4:15 AM Lionel Landwerlin
 wrote:
>
> Hmm... I don't see it.
>
> Are you not confused by brw_batch_reloc/brw_state_reloc?
>
> -Lionel
>
> On 05/12/2019 06:56, Ilia Mirkin wrote:
> > Signed-off-by: Ilia Mirkin 
> > ---
> >   src/mesa/drivers/dri/i965/intel_batchbuffer.h | 5 -
> >   1 file changed, 5 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h 
> > b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
> > index 91720dad5b4..137158f168a 100644
> > --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
> > +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
> > @@ -67,11 +67,6 @@ uint64_t brw_batch_reloc(struct intel_batchbuffer *batch,
> >struct brw_bo *target,
> >uint32_t target_offset,
> >unsigned flags);
> > -uint64_t brw_state_reloc(struct intel_batchbuffer *batch,
> > - uint32_t batch_offset,
> > - struct brw_bo *target,
> > - uint32_t target_offset,
> > - unsigned flags);
> >
> >   #define USED_BATCH(_batch) \
> >  ((uintptr_t)((_batch).map_next - (_batch).batch.map))
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965: remove second prototype of brw_batch_reloc

2019-12-04 Thread Ilia Mirkin

Signed-off-by: Ilia Mirkin 
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
index 91720dad5b4..137158f168a 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
@@ -67,11 +67,6 @@ uint64_t brw_batch_reloc(struct intel_batchbuffer *batch,
  struct brw_bo *target,
  uint32_t target_offset,
  unsigned flags);
-uint64_t brw_state_reloc(struct intel_batchbuffer *batch,
- uint32_t batch_offset,
- struct brw_bo *target,
- uint32_t target_offset,
- unsigned flags);
 
 #define USED_BATCH(_batch) \
((uintptr_t)((_batch).map_next - (_batch).batch.map))
-- 
2.23.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50/ir: fix crash in isUniform for undefined values

2019-11-02 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

On Sat, Nov 2, 2019 at 7:57 PM Karol Herbst  wrote:
>
> Signed-off-by: Karol Herbst 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir.cpp | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> index a181a13a3b1..ae07d967221 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> @@ -274,6 +274,8 @@ LValue::isUniform() const
> if (defs.size() > 1)
>return false;
> Instruction *insn = getInsn();
> +   if (!insn)
> +  return false;
> // let's not try too hard here for now ...
> return !insn->srcExists(1) && insn->getSrc(0)->isUniform();
>  }
> --
> 2.23.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [2/2] state_tracker: Handle texture view min level in st_generate_mipmap()

2019-11-01 Thread Ilia Mirkin

It looks like the _mesa_generate_mipmap fallback has a similar problem
(which would happen for glGenerateMipmap with e.g. a s3tc format).

I think this could all use some piglit tests that iterate through all
or at least many different formats, including both renderable and
non-renderable ones.

Cheers,

  -ilia

On Fri, Nov 1, 2019 at 3:08 PM Paul Gofman  wrote:
>
> Signed-off-by: Paul Gofman 
> ---
>  src/mesa/state_tracker/st_gen_mipmap.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/state_tracker/st_gen_mipmap.c 
> b/src/mesa/state_tracker/st_gen_mipmap.c
> index b71a8ee57bb..c5cf7063836 100644
> --- a/src/mesa/state_tracker/st_gen_mipmap.c
> +++ b/src/mesa/state_tracker/st_gen_mipmap.c
> @@ -56,13 +56,16 @@ st_generate_mipmap(struct gl_context *ctx, GLenum target,
> struct st_context *st = st_context(ctx);
> struct st_texture_object *stObj = st_texture_object(texObj);
> struct pipe_resource *pt = st_get_texobj_resource(texObj);
> -   const uint baseLevel = texObj->BaseLevel;
> +   uint baseLevel = texObj->BaseLevel;
> enum pipe_format format;
> uint lastLevel, first_layer, last_layer;
>
> if (!pt)
>return;
>
> +   if (texObj->Immutable)
> +  baseLevel += texObj->MinLevel;
> +
> /* not sure if this ultimately actually should work,
>but we're not supporting multisampled textures yet. */
> assert(pt->nr_samples < 2);
> --
> 2.23.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] mesa/st: support lowering user-clip-planes automatically

2019-10-18 Thread Ilia Mirkin

On Fri, Oct 18, 2019 at 9:07 AM Ilia Mirkin  wrote:
>
> On Fri, Oct 18, 2019 at 9:04 AM Erik Faye-Lund
>  wrote:
> >
> > On Fri, 2019-10-18 at 08:57 -0400, Ilia Mirkin wrote:
> > > On Fri, Oct 18, 2019 at 8:51 AM Erik Faye-Lund
> > >  wrote:
> > > > On Thu, 2019-10-17 at 22:24 -0400, Ilia Mirkin wrote:
> > > > > In the meanwhile (unless you plan on taking up Jason's
> > > > > suggestion),
> > > > > might I recommend some assert's for the unhandled cases so that
> > > > > there
> > > > > are no surprises?
> > > >
> > > > Good idea. I sent a MR for it here:
> > > >
> > > > https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2380
> > >
> > > Not sure that approach works, esp with SSO.
> >
> > If so, wouldn't that already be a problem with the existing
> > lower_depth_clamp-stuff, then? I mean, I just lifted the logic from
> > there...
>
> Yeah, that looks bogus. I'm moderately sure that checking
> "st->vp/gp/tep" at LinkShader time is wrong. Marek can probably
> confirm.

Actually it might be crazy enough to work -- you'll generate the
(potentially) wrong shader at LinkShader time, but then at draw time,
the key will be regenerated (to look up the appropriate shader, at
which time st->gp/etc *are* reliable), you'll discover that you're
missing the right shader, and recompile. Not exactly ideal, but also
not broken.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] mesa/st: support lowering user-clip-planes automatically

2019-10-18 Thread Ilia Mirkin

On Fri, Oct 18, 2019 at 9:04 AM Erik Faye-Lund
 wrote:
>
> On Fri, 2019-10-18 at 08:57 -0400, Ilia Mirkin wrote:
> > On Fri, Oct 18, 2019 at 8:51 AM Erik Faye-Lund
> >  wrote:
> > > On Thu, 2019-10-17 at 22:24 -0400, Ilia Mirkin wrote:
> > > > In the meanwhile (unless you plan on taking up Jason's
> > > > suggestion),
> > > > might I recommend some assert's for the unhandled cases so that
> > > > there
> > > > are no surprises?
> > >
> > > Good idea. I sent a MR for it here:
> > >
> > > https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2380
> >
> > Not sure that approach works, esp with SSO.
>
> If so, wouldn't that already be a problem with the existing
> lower_depth_clamp-stuff, then? I mean, I just lifted the logic from
> there...

Yeah, that looks bogus. I'm moderately sure that checking
"st->vp/gp/tep" at LinkShader time is wrong. Marek can probably
confirm.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] mesa/st: support lowering user-clip-planes automatically

2019-10-18 Thread Ilia Mirkin

On Fri, Oct 18, 2019 at 8:51 AM Erik Faye-Lund
 wrote:
>
> On Thu, 2019-10-17 at 22:24 -0400, Ilia Mirkin wrote:
> > In the meanwhile (unless you plan on taking up Jason's suggestion),
> > might I recommend some assert's for the unhandled cases so that there
> > are no surprises?
>
> Good idea. I sent a MR for it here:
>
> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2380

Not sure that approach works, esp with SSO. I was thinking just like
somewhere in st_extensions where it's checking these things -- if
GS/tess are supported, make sure caps a/b/c aren't enabled.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] mesa/st: support lowering user-clip-planes automatically

2019-10-17 Thread Ilia Mirkin

In the meanwhile (unless you plan on taking up Jason's suggestion),
might I recommend some assert's for the unhandled cases so that there
are no surprises?

Cheers,

  -ilia

On Thu, Oct 17, 2019 at 11:39 AM Erik Faye-Lund
 wrote:
>
> This is discussed in the merge request thread. Zink currently only support 
> vertex and fragment shaders, so it's the only place this can occur. If 
> someone wants to enable this for drivers that supports geometry or 
> tesselation shaders, they would need to extend this code first. Unless I beat 
> them to it, of course. I don't want to implement this blindly, which is why I 
> left this out for now.
>
> On October 17, 2019 5:09:36 PM GMT+02:00, Ilia Mirkin  
> wrote:
>>
>> Hey Erik,
>>
>> Just saw your change
>> https://cgit.freedesktop.org/mesa/mesa/commit/?id=3298aedd6e9f12cefd5aa7414eeebf69ce7538d1
>> . It looks like you assume that the UCPs will apply in the vertex
>> stage, but given that we support GL compat profiles for GL 4.0+ in
>> st/mesa, the UCPs actually apply to the same stage that clip distances
>> are processed in -- might be TES or GS, depending on the pipeline.
>>
>> Perhaps there's some reason why this works anyways that I'm not seeing...
>>
>> Cheers,
>>
>>   -ilia
>
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] mesa/st: support lowering user-clip-planes automatically

2019-10-17 Thread Ilia Mirkin

Hey Erik,

Just saw your change
https://cgit.freedesktop.org/mesa/mesa/commit/?id=3298aedd6e9f12cefd5aa7414eeebf69ce7538d1
. It looks like you assume that the UCPs will apply in the vertex
stage, but given that we support GL compat profiles for GL 4.0+ in
st/mesa, the UCPs actually apply to the same stage that clip distances
are processed in -- might be TES or GS, depending on the pipeline.

Perhaps there's some reason why this works anyways that I'm not seeing...

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50/ir: remove DUMMY edge type

2019-10-16 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

Based on the dot-output thing ("dotted" edge), presumably it was to
mark BB's that were somehow related? I have no idea what the design
was. Christoph hasn't been seen in ages, perhaps Curro knows, but I
think it's also fine to delete.

Cheers,

  -ilia

On Mon, Oct 14, 2019 at 5:10 PM Karol Herbst  wrote:
>
> it was never used
>
> Signed-off-by: Karol Herbst 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp| 3 ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp | 8 +---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h   | 1 -
>  src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp| 2 --
>  4 files changed, 1 insertion(+), 13 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
> index 9f0e0733326..76fee8c791e 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
> @@ -536,9 +536,6 @@ Function::printCFGraph(const char *filePath)
>   case Graph::Edge::BACK:
>  fprintf(out, "\t%i -> %i;\n", idA, idB);
>  break;
> - case Graph::Edge::DUMMY:
> -fprintf(out, "\t%i -> %i [style=dotted];\n", idA, idB);
> -break;
>   default:
>  assert(0);
>  break;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
> index b1076cf4129..e9a9981746a 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
> @@ -77,7 +77,6 @@ const char *Graph::Edge::typeStr() const
> case FORWARD: return "forward";
> case BACK:return "back";
> case CROSS:   return "cross";
> -   case DUMMY:   return "dummy";
> case UNKNOWN:
> default:
>return "unk";
> @@ -184,7 +183,7 @@ Graph::Node::reachableBy(const Node *node, const Node 
> *term) const
>   continue;
>
>for (EdgeIterator ei = pos->outgoing(); !ei.end(); ei.next()) {
> - if (ei.getType() == Edge::BACK || ei.getType() == Edge::DUMMY)
> + if (ei.getType() == Edge::BACK)
>  continue;
>   if (ei.getNode()->visit(seq))
>  stack.push(ei.getNode());
> @@ -301,7 +300,6 @@ private:
>  switch (ei.getType()) {
>  case Graph::Edge::TREE:
>  case Graph::Edge::FORWARD:
> -case Graph::Edge::DUMMY:
> if (++(ei.getNode()->tag) == ei.getNode()->incidentCountFwd())
>bb.push(ei.getNode());
> break;
> @@ -371,8 +369,6 @@ void Graph::classifyDFS(Node *curr, int& seq)
>
> for (edge = curr->out; edge; edge = edge->next[0]) {
>node = edge->target;
> -  if (edge->type == Edge::DUMMY)
> - continue;
>
>if (node->getSequence() == 0) {
>   edge->type = Edge::TREE;
> @@ -387,8 +383,6 @@ void Graph::classifyDFS(Node *curr, int& seq)
>
> for (edge = curr->in; edge; edge = edge->next[1]) {
>node = edge->origin;
> -  if (edge->type == Edge::DUMMY)
> - continue;
>
>if (node->getSequence() == 0) {
>   edge->type = Edge::TREE;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
> index 115f20e5e99..fc85e78a50c 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
> @@ -47,7 +47,6 @@ public:
>   FORWARD,
>   BACK,
>   CROSS, // e.g. loop break
> - DUMMY
>};
>
>Edge(Node *dst, Node *src, Type kind);
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> index f25bce00884..6df2664da22 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> @@ -624,8 +624,6 @@ 
> RegAlloc::BuildIntervalsPass::collectLiveValues(BasicBlock *bb)
>// trickery to save a loop of OR'ing liveSets
>// aliasing works fine with BitSet::setOr
>for (Graph::EdgeIterator ei = bb->cfg.outgoing(); !ei.end(); 
> ei.next()) {
> - if (ei.getType() == Graph::Edge::DUMMY)
> -continue;
>   if (bbA) {
>  bb->liveSet.setOr(>liveSet, >liveSet);
>  bbA = bb;
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50/ir: remove DUMMY edge type

2019-10-14 Thread Ilia Mirkin

Oh. Probably. Let me have another look then.

On Mon, Oct 14, 2019 at 5:27 PM Karol Herbst  wrote:
>
> isn't that what "UNKNOWN" is for?
>
> On Mon, Oct 14, 2019 at 11:16 PM Ilia Mirkin  wrote:
> >
> > The idea was that this type would be used when you're not sure, and
> > then run the classifier afterwards. Otherwise the classifier doesn't
> > know which edges to classify...
> >
> > On Mon, Oct 14, 2019 at 5:10 PM Karol Herbst  wrote:
> > >
> > > it was never used
> > >
> > > Signed-off-by: Karol Herbst 
> > > ---
> > >  src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp| 3 ---
> > >  src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp | 8 +---
> > >  src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h   | 1 -
> > >  src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp| 2 --
> > >  4 files changed, 1 insertion(+), 13 deletions(-)
> > >
> > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp 
> > > b/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
> > > index 9f0e0733326..76fee8c791e 100644
> > > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
> > > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
> > > @@ -536,9 +536,6 @@ Function::printCFGraph(const char *filePath)
> > >   case Graph::Edge::BACK:
> > >  fprintf(out, "\t%i -> %i;\n", idA, idB);
> > >  break;
> > > - case Graph::Edge::DUMMY:
> > > -fprintf(out, "\t%i -> %i [style=dotted];\n", idA, idB);
> > > -break;
> > >   default:
> > >  assert(0);
> > >  break;
> > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp 
> > > b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
> > > index b1076cf4129..e9a9981746a 100644
> > > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
> > > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
> > > @@ -77,7 +77,6 @@ const char *Graph::Edge::typeStr() const
> > > case FORWARD: return "forward";
> > > case BACK:return "back";
> > > case CROSS:   return "cross";
> > > -   case DUMMY:   return "dummy";
> > > case UNKNOWN:
> > > default:
> > >return "unk";
> > > @@ -184,7 +183,7 @@ Graph::Node::reachableBy(const Node *node, const Node 
> > > *term) const
> > >   continue;
> > >
> > >for (EdgeIterator ei = pos->outgoing(); !ei.end(); ei.next()) {
> > > - if (ei.getType() == Edge::BACK || ei.getType() == Edge::DUMMY)
> > > + if (ei.getType() == Edge::BACK)
> > >  continue;
> > >   if (ei.getNode()->visit(seq))
> > >  stack.push(ei.getNode());
> > > @@ -301,7 +300,6 @@ private:
> > >  switch (ei.getType()) {
> > >  case Graph::Edge::TREE:
> > >  case Graph::Edge::FORWARD:
> > > -case Graph::Edge::DUMMY:
> > > if (++(ei.getNode()->tag) == 
> > > ei.getNode()->incidentCountFwd())
> > >bb.push(ei.getNode());
> > > break;
> > > @@ -371,8 +369,6 @@ void Graph::classifyDFS(Node *curr, int& seq)
> > >
> > > for (edge = curr->out; edge; edge = edge->next[0]) {
> > >node = edge->target;
> > > -  if (edge->type == Edge::DUMMY)
> > > - continue;
> > >
> > >if (node->getSequence() == 0) {
> > >   edge->type = Edge::TREE;
> > > @@ -387,8 +383,6 @@ void Graph::classifyDFS(Node *curr, int& seq)
> > >
> > > for (edge = curr->in; edge; edge = edge->next[1]) {
> > >node = edge->origin;
> > > -  if (edge->type == Edge::DUMMY)
> > > - continue;
> > >
> > >if (node->getSequence() == 0) {
> > >   edge->type = Edge::TREE;
> > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h 
> > > b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
> > > index 115f20e5e99..fc85e78a50c 100644
> > > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
> > > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
> > > @@ -47,7 +47,6 @@ public:
> > >   FORW

Re: [Mesa-dev] [PATCH] nv50/ir: remove DUMMY edge type

2019-10-14 Thread Ilia Mirkin

The idea was that this type would be used when you're not sure, and
then run the classifier afterwards. Otherwise the classifier doesn't
know which edges to classify...

On Mon, Oct 14, 2019 at 5:10 PM Karol Herbst  wrote:
>
> it was never used
>
> Signed-off-by: Karol Herbst 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp| 3 ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp | 8 +---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h   | 1 -
>  src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp| 2 --
>  4 files changed, 1 insertion(+), 13 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
> index 9f0e0733326..76fee8c791e 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp
> @@ -536,9 +536,6 @@ Function::printCFGraph(const char *filePath)
>   case Graph::Edge::BACK:
>  fprintf(out, "\t%i -> %i;\n", idA, idB);
>  break;
> - case Graph::Edge::DUMMY:
> -fprintf(out, "\t%i -> %i [style=dotted];\n", idA, idB);
> -break;
>   default:
>  assert(0);
>  break;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
> index b1076cf4129..e9a9981746a 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.cpp
> @@ -77,7 +77,6 @@ const char *Graph::Edge::typeStr() const
> case FORWARD: return "forward";
> case BACK:return "back";
> case CROSS:   return "cross";
> -   case DUMMY:   return "dummy";
> case UNKNOWN:
> default:
>return "unk";
> @@ -184,7 +183,7 @@ Graph::Node::reachableBy(const Node *node, const Node 
> *term) const
>   continue;
>
>for (EdgeIterator ei = pos->outgoing(); !ei.end(); ei.next()) {
> - if (ei.getType() == Edge::BACK || ei.getType() == Edge::DUMMY)
> + if (ei.getType() == Edge::BACK)
>  continue;
>   if (ei.getNode()->visit(seq))
>  stack.push(ei.getNode());
> @@ -301,7 +300,6 @@ private:
>  switch (ei.getType()) {
>  case Graph::Edge::TREE:
>  case Graph::Edge::FORWARD:
> -case Graph::Edge::DUMMY:
> if (++(ei.getNode()->tag) == ei.getNode()->incidentCountFwd())
>bb.push(ei.getNode());
> break;
> @@ -371,8 +369,6 @@ void Graph::classifyDFS(Node *curr, int& seq)
>
> for (edge = curr->out; edge; edge = edge->next[0]) {
>node = edge->target;
> -  if (edge->type == Edge::DUMMY)
> - continue;
>
>if (node->getSequence() == 0) {
>   edge->type = Edge::TREE;
> @@ -387,8 +383,6 @@ void Graph::classifyDFS(Node *curr, int& seq)
>
> for (edge = curr->in; edge; edge = edge->next[1]) {
>node = edge->origin;
> -  if (edge->type == Edge::DUMMY)
> - continue;
>
>if (node->getSequence() == 0) {
>   edge->type = Edge::TREE;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
> index 115f20e5e99..fc85e78a50c 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_graph.h
> @@ -47,7 +47,6 @@ public:
>   FORWARD,
>   BACK,
>   CROSS, // e.g. loop break
> - DUMMY
>};
>
>Edge(Node *dst, Node *src, Type kind);
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> index f25bce00884..6df2664da22 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> @@ -624,8 +624,6 @@ 
> RegAlloc::BuildIntervalsPass::collectLiveValues(BasicBlock *bb)
>// trickery to save a loop of OR'ing liveSets
>// aliasing works fine with BitSet::setOr
>for (Graph::EdgeIterator ei = bb->cfg.outgoing(); !ei.end(); 
> ei.next()) {
> - if (ei.getType() == Graph::Edge::DUMMY)
> -continue;
>   if (bbA) {
>  bb->liveSet.setOr(>liveSet, >liveSet);
>  bbA = bb;
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nv50/ir: mark STORE destination inputs as used

2019-10-14 Thread Ilia Mirkin

Observed an issue when looking at the code generatedy by the
image-vertex-attrib-input-output piglit test. Even though the test
itself worked fine (due to TIC 0 being used for the image), this needs
to be fixed.

Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index d62d36008e6..8c429026452 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -1591,6 +1591,12 @@ bool Source::scanInstruction(const struct 
tgsi_full_instruction *inst)
   if (insn.getOpcode() == TGSI_OPCODE_STORE &&
   dst.getFile() != TGSI_FILE_MEMORY) {
  info->io.globalAccess |= 0x2;
+
+ if (dst.getFile() == TGSI_FILE_INPUT) {
+// TODO: Handle indirect somehow?
+const int i = dst.getIndex(0);
+info->in[i].mask |= 1;
+ }
   }
 
   if (dst.getFile() == TGSI_FILE_OUTPUT) {
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] gm107/ir: fix loading z offset for layered 3d image bindings

2019-10-13 Thread Ilia Mirkin

Unfortuantely we don't know if a particular load is a real 2d image (as
would be a cube face or 2d array element), or a layer of a 3d image.
Since we pass in the TIC reference, the instruction's type has to match
what's in the TIC (experimentally). In order to properly support
bindless images, this also can't be done by looking at the current
bindings and generating appropriate code.

As a result all plain 2d loads are converted into a pair of 2d/3d loads,
with appropriate predicates to ensure only one of those actually
executes, and the values are all merged in.

This goes somewhat against the current flow, so for GM107 we do the OOB
handling directly in the surface processing logic. Perhaps the other
gens should do something similar, but that is left to another change.

This fixes dEQP tests like image_load_store.3d.*_single_layer and GL-CTS
tests like shader_image_load_store.non-layered_binding without breaking
anything else.

Signed-off-by: Ilia Mirkin 
---

OK, first of all -- to whoever thought that binding single layers of a 3d
image and telling the shader they were regular 2d images was a good idea --
I disagree.

This change feels super super dirty, but I honestly don't see a materially
cleaner way of handling it. Instead of being able to reuse the OOB
handling, it's put in with the coord processing (!), and the surface
conversion function is seriously hacked up.

But splitting it up is harder, since a lot of information has to flow
from stage to stage, like when to do what kind of access, and cloning
the surface op is much easier in the coord processing stage.

 .../nouveau/codegen/nv50_ir_emit_gm107.cpp|  34 ++-
 .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 206 +-
 .../nouveau/codegen/nv50_ir_lowering_nvc0.h   |   4 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_tex.c   |  10 +-
 4 files changed, 201 insertions(+), 53 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index 6eefe8f0025..e244bd0d610 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -122,6 +122,8 @@ private:
void emitSAM();
void emitRAM();
 
+   void emitPSETP();
+
void emitMOV();
void emitS2R();
void emitCS2R();
@@ -690,6 +692,31 @@ CodeEmitterGM107::emitRAM()
  * predicate/cc
  
**/
 
+void
+CodeEmitterGM107::emitPSETP()
+{
+
+   emitInsn(0x5090);
+
+   switch (insn->op) {
+   case OP_AND: emitField(0x18, 3, 0); break;
+   case OP_OR:  emitField(0x18, 3, 1); break;
+   case OP_XOR: emitField(0x18, 3, 2); break;
+   default:
+  assert(!"unexpected operation");
+  break;
+   }
+
+   // emitINV (0x2a);
+   emitPRED(0x27); // TODO: support 3-arg
+   emitINV (0x20, insn->src(1));
+   emitPRED(0x1d, insn->src(1));
+   emitINV (0x0f, insn->src(0));
+   emitPRED(0x0c, insn->src(0));
+   emitPRED(0x03, insn->def(0));
+   emitPRED(0x00);
+}
+
 
/***
  * movement / conversion
  
**/
@@ -3557,7 +3584,12 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
case OP_AND:
case OP_OR:
case OP_XOR:
-  emitLOP();
+  switch (insn->def(0).getFile()) {
+  case FILE_GPR: emitLOP(); break;
+  case FILE_PREDICATE: emitPSETP(); break;
+  default:
+ assert(!"invalid bool op");
+  }
   break;
case OP_NOT:
   emitNOT();
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
index 1f702a987d8..0f68a9a229f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
@@ -1802,6 +1802,9 @@ NVC0LoweringPass::loadSuInfo32(Value *ptr, int slot, 
uint32_t off, bool bindless
 {
uint32_t base = slot * NVC0_SU_INFO__STRIDE;
 
+   // We don't upload surface info for bindless for GM107+
+   assert(!bindless || targ->getChipset() < NVISA_GM107_CHIPSET);
+
if (ptr) {
   ptr = bld.mkOp2v(OP_ADD, TYPE_U32, bld.getSSA(), ptr, bld.mkImm(slot));
   if (bindless)
@@ -2204,7 +2207,7 @@ getDestType(const ImgType type) {
 }
 
 void
-NVC0LoweringPass::convertSurfaceFormat(TexInstruction *su)
+NVC0LoweringPass::convertSurfaceFormat(TexInstruction *su, Instruction 
**loaded)
 {
const TexInstruction::ImgFormatDesc *format = su->tex.format;
int width = format->bits[0] + format->bits[1] +
@@ -2223,21 +2226,38 @@ NVC0LoweringPass::convertSurfaceFormat(TexInstruction 
*su)
if (width < 32)
   untypedDst[0] = bld.getSSA();
 
-   for (int i = 0; i < 4; i++) {
-  typedDst[i] = su->getDef(i);
+   if (loaded &&

Re: [Mesa-dev] Declaration of _CmdBeginTransformFeedbackEXT

2019-10-03 Thread Ilia Mirkin

$ git grep CmdBeginTransformFeedback
src/amd/vulkan/radv_cmd_buffer.c:void radv_CmdBeginTransformFeedbackEXT(
src/intel/vulkan/genX_cmd_buffer.c:void genX(CmdBeginTransformFeedbackEXT)(

Not *that* hard to search for...

On Thu, Oct 3, 2019 at 10:35 AM  wrote:
>
>
>
> Sorry for being a bit thick but it seems like I cannot find where 
> *_CmdBeginTransformFeedbackEXT functions are getting declared. I would 
> assume, that it is somewhere in a XML to header stage, but could not yet 
> figure out, which header.
> Probably using some macro magic, that makes the code non-searchable :(
> Could anyone please point me in the right direction?
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] gallium/tgsi: add support for DEMOTE and READ_HELPER opcodes

2019-10-02 Thread Ilia Mirkin

This mirrors the intrinsics in the GLSL IR. One could imagine an
alternate definition where reading the semantic would account for the
READ_HELPER functionality, but that feels potentially dodgy and could be
subject to CSE unpleasantness.

Signed-off-by: Ilia Mirkin 
---
 .../auxiliary/tgsi/tgsi_info_opcodes.h|  4 ++--
 src/gallium/docs/source/tgsi.rst  | 21 +++
 src/gallium/include/pipe/p_shader_tokens.h|  4 ++--
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp|  7 +--
 4 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_info_opcodes.h 
b/src/gallium/auxiliary/tgsi/tgsi_info_opcodes.h
index 0b9b264bc53..7aecda44b82 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info_opcodes.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_info_opcodes.h
@@ -29,11 +29,11 @@ OPCODE(1, 1, COMP, ROUND)
 OPCODE(1, 1, REPL, EX2)
 OPCODE(1, 1, REPL, LG2)
 OPCODE(1, 2, REPL, POW)
-OPCODE_GAP(31) /* removed */
+OPCODE(0, 0, NONE, DEMOTE)
 OPCODE(1, 1, COMP, U2I64)
 OPCODE(1, 0, OTHR, CLOCK)
 OPCODE(1, 1, COMP, I2I64)
-OPCODE_GAP(35) /* removed */
+OPCODE(1, 0, COMP, READ_HELPER)
 OPCODE(1, 1, REPL, COS)
 OPCODE(1, 1, COMP, DDX)
 OPCODE(1, 1, COMP, DDY)
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 287f4b72729..d58b23f024b 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -681,6 +681,27 @@ This instruction replicates its result.
   Unconditional discard.  Allowed in fragment shaders only.
 
 
+.. opcode:: DEMOTE - Demote Invocation to a Helper
+
+  This demotes the current invocation to a helper, but continues
+  execution (while KILL may or may not terminate the
+  invocation). After this runs, all the usual helper invocation rules
+  apply about discarding buffer and render target writes. This is
+  useful for having accurate derivatives in the other invocations
+  which have not been demoted.
+
+  Allowed in fragment shaders only.
+
+
+.. opcode:: READ_HELPER - Reads Invocation Helper Status
+
+  This is identical to ``TGSI_SEMANTIC_HELPER_INVOCATION``, except
+  this will read the current value, which might change as a result of
+  a ``DEMOTE`` instruction.
+
+  Allowed in fragment shaders only.
+
+
 .. opcode:: TXB - Texture Lookup With Bias
 
   for cube map array textures and shadow cube maps, the bias value
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index b30a257df2f..5770eba0837 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -376,11 +376,11 @@ enum tgsi_opcode {
TGSI_OPCODE_EX2= 28,
TGSI_OPCODE_LG2= 29,
TGSI_OPCODE_POW= 30,
-   /* gap */
+   TGSI_OPCODE_DEMOTE = 31,
TGSI_OPCODE_U2I64  = 32,
TGSI_OPCODE_CLOCK  = 33,
TGSI_OPCODE_I2I64  = 34,
-   /* gap */
+   TGSI_OPCODE_READ_HELPER= 35,
TGSI_OPCODE_COS= 36,
TGSI_OPCODE_DDX= 37,
TGSI_OPCODE_DDY= 38,
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 799c161cfaf..be582f5f01c 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -4107,6 +4107,10 @@ glsl_to_tgsi_visitor::visit(ir_call *ir)
   visit_generic_intrinsic(ir, TGSI_OPCODE_READ_INVOC);
   return;
 
+   case ir_intrinsic_helper_invocation:
+  visit_generic_intrinsic(ir, TGSI_OPCODE_READ_HELPER);
+  return;
+
case ir_intrinsic_invalid:
case ir_intrinsic_generic_load:
case ir_intrinsic_generic_store:
@@ -4120,7 +4124,6 @@ glsl_to_tgsi_visitor::visit(ir_call *ir)
case ir_intrinsic_generic_atomic_comp_swap:
case ir_intrinsic_begin_invocation_interlock:
case ir_intrinsic_end_invocation_interlock:
-   case ir_intrinsic_helper_invocation:
   unreachable("Invalid intrinsic");
}
 }
@@ -4631,7 +4634,7 @@ glsl_to_tgsi_visitor::visit(ir_discard *ir)
 void
 glsl_to_tgsi_visitor::visit(ir_demote *ir)
 {
-   assert(!"demote statement unsupported");
+   emit_asm(ir, TGSI_OPCODE_DEMOTE);
 }
 
 void
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] nvc0: add support for GL_EXT_demote_to_helper_invocation

2019-10-02 Thread Ilia Mirkin

Signed-off-by: Ilia Mirkin 
---

This passes the available piglit tests (once they are fixed to not
require GL 4.5)

 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp| 12 
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   |  1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   |  1 +
 3 files changed, 14 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index c3e3cf2dff5..d62d36008e6 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -821,6 +821,7 @@ static nv50_ir::operation translateOpcode(uint opcode)
NV50_IR_OPCODE_CASE(DDY, DFDY);
NV50_IR_OPCODE_CASE(DDY_FINE, DFDY);
NV50_IR_OPCODE_CASE(KILL, DISCARD);
+   NV50_IR_OPCODE_CASE(DEMOTE, DISCARD);
 
NV50_IR_OPCODE_CASE(SEQ, SET);
NV50_IR_OPCODE_CASE(SGT, SET);
@@ -1581,6 +1582,9 @@ bool Source::scanInstruction(const struct 
tgsi_full_instruction *inst)
if (insn.getOpcode() == TGSI_OPCODE_INTERP_SAMPLE)
   info->prop.fp.readsSampleLocations = true;
 
+   if (insn.getOpcode() == TGSI_OPCODE_DEMOTE)
+  info->prop.fp.usesDiscard = true;
+
if (insn.dstCount()) {
   Instruction::DstRegister dst = insn.getDst(0);
 
@@ -3463,6 +3467,11 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
   if (!tgsi.getDst(0).isMasked(1))
  mkOp1(OP_RDSV, TYPE_U32, dst0[1], mkSysVal(SV_CLOCK, 0))->fixed = 1;
   break;
+   case TGSI_OPCODE_READ_HELPER:
+  if (!tgsi.getDst(0).isMasked(0))
+ mkOp1(OP_RDSV, TYPE_U32, dst0[0], mkSysVal(SV_THREAD_KILL, 0))
+->fixed = 1;
+  break;
case TGSI_OPCODE_KILL_IF:
   val0 = new_LValue(func, FILE_PREDICATE);
   mask = 0;
@@ -3476,6 +3485,9 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
   }
   break;
case TGSI_OPCODE_KILL:
+   case TGSI_OPCODE_DEMOTE:
+  // TODO: Should we make KILL exit that invocation? Some old shaders
+  // don't like that.
   mkOp(OP_DISCARD, TYPE_NONE, NULL);
   break;
case TGSI_OPCODE_TEX:
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 557139494d4..f0846867621 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -330,6 +330,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_FBFETCH_COHERENT:
case PIPE_CAP_TGSI_SKIP_SHRINK_IO_ARRAYS:
case PIPE_CAP_TGSI_ATOMINC_WRAP:
+   case PIPE_CAP_DEMOTE_TO_HELPER_INVOCATION:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index 435c3058a89..06f1fb2b708 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -281,6 +281,7 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_DEST_SURFACE_SRGB_CONTROL:
case PIPE_CAP_TGSI_DIV:
case PIPE_CAP_TGSI_ATOMINC_WRAP:
+   case PIPE_CAP_DEMOTE_TO_HELPER_INVOCATION:
   return 1;
case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
   return nouveau_screen(pscreen)->vram_domain & NOUVEAU_BO_VRAM ? 1 : 0;
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: fix texStore for FORMAT_Z32_FLOAT_S8X24_UINT

2019-09-12 Thread Ilia Mirkin

Yep, it does work out ... you get lucky :) It does seem like it'd be
really easy to just switch to GLubyte like all the other functions do,
but whatever.

On Thu, Sep 12, 2019 at 10:22 PM Marek Olšák  wrote:
>
> Ilia, the patch is OK if GL_STENCIL_INDEX is not allowed, right?
>
> Marek
>
> On Tue, Aug 27, 2019 at 9:24 PM Ilia Mirkin  wrote:
>>
>> By the way, I took the liberty of composing a test which fails with
>> current mesa, and I'm pretty sure it continues to fail with this
>> patch. It does pass if I just make it a GLubyte like all the other
>> instances of the code already do.
>>
>> Piglit test: https://patchwork.freedesktop.org/patch/327460/
>>
>> On Tue, Aug 27, 2019 at 7:37 PM Ilia Mirkin  wrote:
>> >
>> > I don't think the original author was included -- CC was probably
>> > stripped by the ML. Adding here.
>> >
>> > On Tue, Aug 27, 2019 at 6:49 PM Marek Olšák  wrote:
>> > >
>> > > Yes, but if the original author doesn't reply, I'd like to push this.
>> > >
>> > > Marek
>> > >
>> > > On Thu, Aug 15, 2019 at 8:01 PM Ilia Mirkin  wrote:
>> > >>
>> > >> Subtle. The source format *can* be 64-bit, by the way, but if it's
>> > >> GL_DEPTH_COMPONENT it may well be 32-bit.
>> > >>
>> > >> But what if it's GL_STENCIL_INDEX -- could it not be 1-byte? IOW,
>> > >> should this just be a char *, and use byte addressing and be done with
>> > >> it?
>> > >>
>> > >> On Thu, Aug 15, 2019 at 7:56 PM Marek Olšák  wrote:
>> > >> >
>> > >> > From: Jiadong Zhu 
>> > >> >
>> > >> > _mesa_texstore_z32f_x24s8 calculates source rowStride at a
>> > >> > pace of 64-bit, this will make inaccuracy offset if the width
>> > >> > of src image is an odd number. Modify src pointer to int_32* as
>> > >> > source image format is gl_float which is 32-bit per pixel.
>> > >> >
>> > >> > Signed-off-by: Jiadong Zhu 
>> > >> > Signed-off-by: Marek Olšák 
>> > >> > ---
>> > >> >  src/mesa/main/texstore.c | 6 +++---
>> > >> >  1 file changed, 3 insertions(+), 3 deletions(-)
>> > >> >  mode change 100644 => 100755 src/mesa/main/texstore.c
>> > >> >
>> > >> > diff --git a/src/mesa/main/texstore.c b/src/mesa/main/texstore.c
>> > >> > old mode 100644
>> > >> > new mode 100755
>> > >> > index 2913d4bc067..207695041a7
>> > >> > --- a/src/mesa/main/texstore.c
>> > >> > +++ b/src/mesa/main/texstore.c
>> > >> > @@ -531,35 +531,35 @@ _mesa_texstore_s8(TEXSTORE_PARAMS)
>> > >> > return GL_TRUE;
>> > >> >  }
>> > >> >
>> > >> >
>> > >> >  static GLboolean
>> > >> >  _mesa_texstore_z32f_x24s8(TEXSTORE_PARAMS)
>> > >> >  {
>> > >> > GLint img, row;
>> > >> > const GLint srcRowStride
>> > >> >= _mesa_image_row_stride(srcPacking, srcWidth, srcFormat, 
>> > >> > srcType)
>> > >> > -  / sizeof(uint64_t);
>> > >> > + / sizeof(int32_t);
>> > >> >
>> > >> > assert(dstFormat == MESA_FORMAT_Z32_FLOAT_S8X24_UINT);
>> > >> > assert(srcFormat == GL_DEPTH_STENCIL ||
>> > >> >srcFormat == GL_DEPTH_COMPONENT ||
>> > >> >srcFormat == GL_STENCIL_INDEX);
>> > >> > assert(srcFormat != GL_DEPTH_STENCIL ||
>> > >> >srcType == GL_UNSIGNED_INT_24_8 ||
>> > >> >srcType == GL_FLOAT_32_UNSIGNED_INT_24_8_REV);
>> > >> >
>> > >> > /* In case we only upload depth we need to preserve the stencil */
>> > >> > for (img = 0; img < srcDepth; img++) {
>> > >> >uint64_t *dstRow = (uint64_t *) dstSlices[img];
>> > >> > -  const uint64_t *src
>> > >> > - = (const uint64_t *) _mesa_image_address(dims, srcPacking, 
>> > >> > srcAddr,
>> > >> > +  const int32_t *src
>> > >> > + = (const int32_t *) _mesa_image_address(dims, srcPacking, 
>> > >> > srcAddr,
>> > >> > srcWidth, srcHeight,
>> > >> > srcFormat, srcType,
>> > >> > img, 0, 0);
>> > >> >for (row = 0; row < srcHeight; row++) {
>> > >> >   /* The unpack functions with:
>> > >> >*dstType = GL_FLOAT_32_UNSIGNED_INT_24_8_REV
>> > >> >* only write their own dword, so the other dword (stencil
>> > >> >* or depth) is preserved. */
>> > >> >   if (srcFormat != GL_STENCIL_INDEX)
>> > >> >  _mesa_unpack_depth_span(ctx, srcWidth,
>> > >> > --
>> > >> > 2.17.1
>> > >> >
>> > >> > ___
>> > >> > mesa-dev mailing list
>> > >> > mesa-dev@lists.freedesktop.org
>> > >> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] teximage: ensure that TexSubImage checks format

2019-09-02 Thread Ilia Mirkin

We were previously not doing at least some of the checks. This uses the
same logic that is used in glTexImage*.

Signed-off-by: Ilia Mirkin 
---

This seems to leave the Intel CI happy -

https://mesa-ci.01.org/imirkin/builds/32/group/63a9f0ea7bb98050796b649e85481845

And fixes things for the teximage-errors updates I sent in for piglit.

 src/mesa/main/teximage.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index f90765f40fa..e55e435779f 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -2205,6 +2205,15 @@ texsubimage_error_check(struct gl_context *ctx, GLuint 
dimensions,
   return GL_TRUE;
}
 
+   if (!texture_formats_agree(texImage->InternalFormat, format)) {
+  _mesa_error(ctx, GL_INVALID_OPERATION,
+  "%s(incompatible internalFormat = %s, format = %s)",
+  callerName,
+  _mesa_enum_to_string(texImage->InternalFormat),
+  _mesa_enum_to_string(format));
+  return GL_TRUE;
+   }
+
GLenum internalFormat = _mesa_is_gles(ctx) ?
   oes_float_internal_format(ctx, texImage->InternalFormat, type) :
   texImage->InternalFormat;
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH kmscube] kmscube: make fb/gl config format settable on cmdline

2019-08-28 Thread Ilia Mirkin

Instead of changing the code each time, allow the fourcc to be passed
in as an argument.

Signed-off-by: Ilia Mirkin 
---
 common.c |  4 ++--
 common.h |  2 +-
 kmscube.c| 20 +++-
 texturator.c |  2 +-
 4 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/common.c b/common.c
index b6f3e9b..b60b833 100644
--- a/common.c
+++ b/common.c
@@ -40,10 +40,10 @@ gbm_surface_create_with_modifiers(struct gbm_device *gbm,
   const uint64_t *modifiers,
   const unsigned int count);
 
-const struct gbm * init_gbm(int drm_fd, int w, int h, uint64_t modifier)
+const struct gbm * init_gbm(int drm_fd, int w, int h, uint32_t format, 
uint64_t modifier)
 {
gbm.dev = gbm_create_device(drm_fd);
-   gbm.format = GBM_FORMAT_XRGB;
+   gbm.format = format;
gbm.surface = NULL;
 
if (gbm_surface_create_with_modifiers) {
diff --git a/common.h b/common.h
index d461262..6540763 100644
--- a/common.h
+++ b/common.h
@@ -99,7 +99,7 @@ struct gbm {
int width, height;
 };
 
-const struct gbm * init_gbm(int drm_fd, int w, int h, uint64_t modifier);
+const struct gbm * init_gbm(int drm_fd, int w, int h, uint32_t format, 
uint64_t modifier);
 
 
 struct egl {
diff --git a/kmscube.c b/kmscube.c
index 90de638..6a1c2af 100644
--- a/kmscube.c
+++ b/kmscube.c
@@ -46,6 +46,7 @@ static const char *shortopts = "AD:M:m:V:v:";
 static const struct option longopts[] = {
{"atomic", no_argument,   0, 'A'},
{"device", required_argument, 0, 'D'},
+   {"format", required_argument, 0, 'f'},
{"mode",   required_argument, 0, 'M'},
{"modifier", required_argument, 0, 'm'},
{"samples",  required_argument, 0, 's'},
@@ -66,6 +67,7 @@ static void usage(const char *name)
"rgba  -  rgba textured cube\n"
"nv12-2img -  yuv textured (color conversion in 
shader)\n"
"nv12-1img -  yuv textured (single nv12 
texture)\n"
+   "-f, --format=FOURCC  framebuffer format\n"
"-m, --modifier=MODIFIER  hardcode the selected 
modifier\n"
"-s, --samples=N  use MSAA\n"
"-V, --video=FILE video textured cube\n"
@@ -81,6 +83,7 @@ int main(int argc, char *argv[])
char mode_str[DRM_DISPLAY_MODE_LEN] = "";
char *p;
enum mode mode = SMOOTH;
+   uint32_t format = DRM_FORMAT_XRGB;
uint64_t modifier = DRM_FORMAT_MOD_LINEAR;
int samples = 0;
int atomic = 0;
@@ -101,6 +104,21 @@ int main(int argc, char *argv[])
case 'D':
device = optarg;
break;
+   case 'f': {
+   char fourcc[4] = "";
+   int length = strlen(optarg);
+   if (length > 0)
+   fourcc[0] = optarg[0];
+   if (length > 1)
+   fourcc[1] = optarg[1];
+   if (length > 2)
+   fourcc[2] = optarg[2];
+   if (length > 3)
+   fourcc[3] = optarg[3];
+   format = fourcc_code(fourcc[0], fourcc[1],
+fourcc[2], fourcc[3]);
+   break;
+   }
case 'M':
if (strcmp(optarg, "smooth") == 0) {
mode = SMOOTH;
@@ -155,7 +173,7 @@ int main(int argc, char *argv[])
}
 
gbm = init_gbm(drm->fd, drm->mode->hdisplay, drm->mode->vdisplay,
-   modifier);
+   format, modifier);
if (!gbm) {
printf("failed to initialize GBM\n");
return -1;
diff --git a/texturator.c b/texturator.c
index ddd8014..555f81f 100644
--- a/texturator.c
+++ b/texturator.c
@@ -955,7 +955,7 @@ int main(int argc, char *argv[])
}
 
gbm = init_gbm(drm->fd, drm->mode->hdisplay, drm->mode->vdisplay,
-   DRM_FORMAT_MOD_LINEAR);
+   DRM_FORMAT_XRGB, DRM_FORMAT_MOD_LINEAR);
if (!gbm) {
printf("failed to initialize GBM\n");
return -1;
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: fix texStore for FORMAT_Z32_FLOAT_S8X24_UINT

2019-08-27 Thread Ilia Mirkin

By the way, I took the liberty of composing a test which fails with
current mesa, and I'm pretty sure it continues to fail with this
patch. It does pass if I just make it a GLubyte like all the other
instances of the code already do.

Piglit test: https://patchwork.freedesktop.org/patch/327460/

On Tue, Aug 27, 2019 at 7:37 PM Ilia Mirkin  wrote:
>
> I don't think the original author was included -- CC was probably
> stripped by the ML. Adding here.
>
> On Tue, Aug 27, 2019 at 6:49 PM Marek Olšák  wrote:
> >
> > Yes, but if the original author doesn't reply, I'd like to push this.
> >
> > Marek
> >
> > On Thu, Aug 15, 2019 at 8:01 PM Ilia Mirkin  wrote:
> >>
> >> Subtle. The source format *can* be 64-bit, by the way, but if it's
> >> GL_DEPTH_COMPONENT it may well be 32-bit.
> >>
> >> But what if it's GL_STENCIL_INDEX -- could it not be 1-byte? IOW,
> >> should this just be a char *, and use byte addressing and be done with
> >> it?
> >>
> >> On Thu, Aug 15, 2019 at 7:56 PM Marek Olšák  wrote:
> >> >
> >> > From: Jiadong Zhu 
> >> >
> >> > _mesa_texstore_z32f_x24s8 calculates source rowStride at a
> >> > pace of 64-bit, this will make inaccuracy offset if the width
> >> > of src image is an odd number. Modify src pointer to int_32* as
> >> > source image format is gl_float which is 32-bit per pixel.
> >> >
> >> > Signed-off-by: Jiadong Zhu 
> >> > Signed-off-by: Marek Olšák 
> >> > ---
> >> >  src/mesa/main/texstore.c | 6 +++---
> >> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >> >  mode change 100644 => 100755 src/mesa/main/texstore.c
> >> >
> >> > diff --git a/src/mesa/main/texstore.c b/src/mesa/main/texstore.c
> >> > old mode 100644
> >> > new mode 100755
> >> > index 2913d4bc067..207695041a7
> >> > --- a/src/mesa/main/texstore.c
> >> > +++ b/src/mesa/main/texstore.c
> >> > @@ -531,35 +531,35 @@ _mesa_texstore_s8(TEXSTORE_PARAMS)
> >> > return GL_TRUE;
> >> >  }
> >> >
> >> >
> >> >  static GLboolean
> >> >  _mesa_texstore_z32f_x24s8(TEXSTORE_PARAMS)
> >> >  {
> >> > GLint img, row;
> >> > const GLint srcRowStride
> >> >= _mesa_image_row_stride(srcPacking, srcWidth, srcFormat, srcType)
> >> > -  / sizeof(uint64_t);
> >> > + / sizeof(int32_t);
> >> >
> >> > assert(dstFormat == MESA_FORMAT_Z32_FLOAT_S8X24_UINT);
> >> > assert(srcFormat == GL_DEPTH_STENCIL ||
> >> >srcFormat == GL_DEPTH_COMPONENT ||
> >> >srcFormat == GL_STENCIL_INDEX);
> >> > assert(srcFormat != GL_DEPTH_STENCIL ||
> >> >srcType == GL_UNSIGNED_INT_24_8 ||
> >> >srcType == GL_FLOAT_32_UNSIGNED_INT_24_8_REV);
> >> >
> >> > /* In case we only upload depth we need to preserve the stencil */
> >> > for (img = 0; img < srcDepth; img++) {
> >> >uint64_t *dstRow = (uint64_t *) dstSlices[img];
> >> > -  const uint64_t *src
> >> > - = (const uint64_t *) _mesa_image_address(dims, srcPacking, 
> >> > srcAddr,
> >> > +  const int32_t *src
> >> > + = (const int32_t *) _mesa_image_address(dims, srcPacking, 
> >> > srcAddr,
> >> > srcWidth, srcHeight,
> >> > srcFormat, srcType,
> >> > img, 0, 0);
> >> >for (row = 0; row < srcHeight; row++) {
> >> >   /* The unpack functions with:
> >> >*dstType = GL_FLOAT_32_UNSIGNED_INT_24_8_REV
> >> >* only write their own dword, so the other dword (stencil
> >> >* or depth) is preserved. */
> >> >   if (srcFormat != GL_STENCIL_INDEX)
> >> >  _mesa_unpack_depth_span(ctx, srcWidth,
> >> > --
> >> > 2.17.1
> >> >
> >> > ___
> >> > mesa-dev mailing list
> >> > mesa-dev@lists.freedesktop.org
> >> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: fix texStore for FORMAT_Z32_FLOAT_S8X24_UINT

2019-08-27 Thread Ilia Mirkin

I don't think the original author was included -- CC was probably
stripped by the ML. Adding here.

On Tue, Aug 27, 2019 at 6:49 PM Marek Olšák  wrote:
>
> Yes, but if the original author doesn't reply, I'd like to push this.
>
> Marek
>
> On Thu, Aug 15, 2019 at 8:01 PM Ilia Mirkin  wrote:
>>
>> Subtle. The source format *can* be 64-bit, by the way, but if it's
>> GL_DEPTH_COMPONENT it may well be 32-bit.
>>
>> But what if it's GL_STENCIL_INDEX -- could it not be 1-byte? IOW,
>> should this just be a char *, and use byte addressing and be done with
>> it?
>>
>> On Thu, Aug 15, 2019 at 7:56 PM Marek Olšák  wrote:
>> >
>> > From: Jiadong Zhu 
>> >
>> > _mesa_texstore_z32f_x24s8 calculates source rowStride at a
>> > pace of 64-bit, this will make inaccuracy offset if the width
>> > of src image is an odd number. Modify src pointer to int_32* as
>> > source image format is gl_float which is 32-bit per pixel.
>> >
>> > Signed-off-by: Jiadong Zhu 
>> > Signed-off-by: Marek Olšák 
>> > ---
>> >  src/mesa/main/texstore.c | 6 +++---
>> >  1 file changed, 3 insertions(+), 3 deletions(-)
>> >  mode change 100644 => 100755 src/mesa/main/texstore.c
>> >
>> > diff --git a/src/mesa/main/texstore.c b/src/mesa/main/texstore.c
>> > old mode 100644
>> > new mode 100755
>> > index 2913d4bc067..207695041a7
>> > --- a/src/mesa/main/texstore.c
>> > +++ b/src/mesa/main/texstore.c
>> > @@ -531,35 +531,35 @@ _mesa_texstore_s8(TEXSTORE_PARAMS)
>> > return GL_TRUE;
>> >  }
>> >
>> >
>> >  static GLboolean
>> >  _mesa_texstore_z32f_x24s8(TEXSTORE_PARAMS)
>> >  {
>> > GLint img, row;
>> > const GLint srcRowStride
>> >= _mesa_image_row_stride(srcPacking, srcWidth, srcFormat, srcType)
>> > -  / sizeof(uint64_t);
>> > + / sizeof(int32_t);
>> >
>> > assert(dstFormat == MESA_FORMAT_Z32_FLOAT_S8X24_UINT);
>> > assert(srcFormat == GL_DEPTH_STENCIL ||
>> >srcFormat == GL_DEPTH_COMPONENT ||
>> >srcFormat == GL_STENCIL_INDEX);
>> > assert(srcFormat != GL_DEPTH_STENCIL ||
>> >srcType == GL_UNSIGNED_INT_24_8 ||
>> >srcType == GL_FLOAT_32_UNSIGNED_INT_24_8_REV);
>> >
>> > /* In case we only upload depth we need to preserve the stencil */
>> > for (img = 0; img < srcDepth; img++) {
>> >uint64_t *dstRow = (uint64_t *) dstSlices[img];
>> > -  const uint64_t *src
>> > - = (const uint64_t *) _mesa_image_address(dims, srcPacking, 
>> > srcAddr,
>> > +  const int32_t *src
>> > + = (const int32_t *) _mesa_image_address(dims, srcPacking, 
>> > srcAddr,
>> > srcWidth, srcHeight,
>> > srcFormat, srcType,
>> > img, 0, 0);
>> >for (row = 0; row < srcHeight; row++) {
>> >   /* The unpack functions with:
>> >*dstType = GL_FLOAT_32_UNSIGNED_INT_24_8_REV
>> >* only write their own dword, so the other dword (stencil
>> >* or depth) is preserved. */
>> >   if (srcFormat != GL_STENCIL_INDEX)
>> >  _mesa_unpack_depth_span(ctx, srcWidth,
>> > --
>> > 2.17.1
>> >
>> > ___
>> > mesa-dev mailing list
>> > mesa-dev@lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium/vl: use compute preference for all multimedia, not just blit

2019-08-26 Thread Ilia Mirkin

On Mon, Aug 26, 2019 at 6:41 AM Juan A. Suarez Romero
 wrote:
>
> On Sat, 2019-08-17 at 12:17 -0400, Ilia Mirkin wrote:
> > The compute paths in vl are a bit AMD-specific. For example, they (on
> > nouveau), try to use a BGRX8 image format, which is not supported.
> > Fixing all this is probably possible, but since the compute paths aren't
> > in any way better, it's difficult to care.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
> > Fixes: 9364d66cb7 (gallium/auxiliary/vl: Add video compositor compute 
> > shader render)
> >
> > Signed-off-by: Ilia Mirkin 
> > ---
>
>
> This patch didn't apply cleanly on 19.1 branch. I've resolved the conflict as
> https://gitlab.freedesktop.org/mesa/mesa/commit/ac0f71a4afa823f4a6d95bdf0d48cc71d3654c37

Looks right. Thanks for taking care of it!

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] gallium/vl: use compute preference for all multimedia, not just blit

2019-08-17 Thread Ilia Mirkin

The compute paths in vl are a bit AMD-specific. For example, they (on
nouveau), try to use a BGRX8 image format, which is not supported.
Fixing all this is probably possible, but since the compute paths aren't
in any way better, it's difficult to care.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Fixes: 9364d66cb7 (gallium/auxiliary/vl: Add video compositor compute shader 
render)
Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/util/u_screen.c| 2 +-
 src/gallium/auxiliary/vl/vl_compositor.c | 2 +-
 src/gallium/docs/source/screen.rst   | 4 ++--
 src/gallium/drivers/radeonsi/si_get.c| 2 +-
 src/gallium/include/pipe/p_defines.h | 2 +-
 src/gallium/state_trackers/va/postproc.c | 2 +-
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_screen.c 
b/src/gallium/auxiliary/util/u_screen.c
index 0b4bb067d6d..88f4945e755 100644
--- a/src/gallium/auxiliary/util/u_screen.c
+++ b/src/gallium/auxiliary/util/u_screen.c
@@ -331,7 +331,7 @@ u_pipe_screen_get_param_defaults(struct pipe_screen 
*pscreen,
case PIPE_CAP_TGSI_ATOMFADD:
case PIPE_CAP_TGSI_SKIP_SHRINK_IO_ARRAYS:
case PIPE_CAP_IMAGE_LOAD_FORMATTED:
-   case PIPE_CAP_PREFER_COMPUTE_BLIT_FOR_MULTIMEDIA:
+   case PIPE_CAP_PREFER_COMPUTE_FOR_MULTIMEDIA:
case PIPE_CAP_FRAGMENT_SHADER_INTERLOCK:
case PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED:
case PIPE_CAP_ATOMIC_FLOAT_MINMAX:
diff --git a/src/gallium/auxiliary/vl/vl_compositor.c 
b/src/gallium/auxiliary/vl/vl_compositor.c
index 04808e80d84..a381af108b3 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.c
+++ b/src/gallium/auxiliary/vl/vl_compositor.c
@@ -764,7 +764,7 @@ vl_compositor_init(struct vl_compositor *c, struct 
pipe_context *pipe)
 
memset(c, 0, sizeof(*c));
 
-   c->pipe_cs_composit_supported = pipe->screen->get_param(pipe->screen, 
PIPE_CAP_COMPUTE) &&
+   c->pipe_cs_composit_supported = pipe->screen->get_param(pipe->screen, 
PIPE_CAP_PREFER_COMPUTE_FOR_MULTIMEDIA) &&
 pipe->screen->get_param(pipe->screen, PIPE_CAP_TGSI_TEX_TXF_LZ) &&
 pipe->screen->get_param(pipe->screen, PIPE_CAP_TGSI_DIV);
 
diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index c033321ec66..d149a2f4c9f 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -528,8 +528,8 @@ The integer capabilities:
   execution. 0 = throttling is disabled.
 * ``PIPE_CAP_DMABUF``: Whether Linux DMABUF handles are supported by
   resource_from_handle and resource_get_handle.
-* ``PIPE_CAP_PREFER_COMPUTE_BLIT_FOR_MULTIMEDIA``: Whether VDPAU, VAAPI, and
-  OpenMAX should use a compute-based blit instead of pipe_context::blit.
+* ``PIPE_CAP_PREFER_COMPUTE_FOR_MULTIMEDIA``: Whether VDPAU, VAAPI, and
+  OpenMAX should use a compute-based blit instead of pipe_context::blit and 
compute pipeline for compositing images.
 * ``PIPE_CAP_FRAGMENT_SHADER_INTERLOCK``: True if fragment shader interlock
   functionality is supported.
 * ``PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED``: True if driver handles
diff --git a/src/gallium/drivers/radeonsi/si_get.c 
b/src/gallium/drivers/radeonsi/si_get.c
index c9895edafb8..f85a53393aa 100644
--- a/src/gallium/drivers/radeonsi/si_get.c
+++ b/src/gallium/drivers/radeonsi/si_get.c
@@ -156,7 +156,7 @@ static int si_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_FBFETCH:
case PIPE_CAP_COMPUTE_GRID_INFO_LAST_BLOCK:
case PIPE_CAP_IMAGE_LOAD_FORMATTED:
-   case PIPE_CAP_PREFER_COMPUTE_BLIT_FOR_MULTIMEDIA:
+   case PIPE_CAP_PREFER_COMPUTE_FOR_MULTIMEDIA:
case PIPE_CAP_TGSI_DIV:
return 1;
 
diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index 0c79cac5cff..808c2b8cfaf 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -885,7 +885,7 @@ enum pipe_cap
PIPE_CAP_IMAGE_LOAD_FORMATTED,
PIPE_CAP_MAX_FRAMES_IN_FLIGHT,
PIPE_CAP_DMABUF,
-   PIPE_CAP_PREFER_COMPUTE_BLIT_FOR_MULTIMEDIA,
+   PIPE_CAP_PREFER_COMPUTE_FOR_MULTIMEDIA,
PIPE_CAP_FRAGMENT_SHADER_INTERLOCK,
PIPE_CAP_FBFETCH_COHERENT,
PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED,
diff --git a/src/gallium/state_trackers/va/postproc.c 
b/src/gallium/state_trackers/va/postproc.c
index fbc55b7714b..3431b1b48c7 100644
--- a/src/gallium/state_trackers/va/postproc.c
+++ b/src/gallium/state_trackers/va/postproc.c
@@ -222,7 +222,7 @@ static VAStatus vlVaPostProcBlit(vlVaDriver *drv, 
vlVaContext *context,
   blit.filter = PIPE_TEX_MIPFILTER_LINEAR;
 
   if (drv->pipe->screen->get_param(drv->pipe->screen,
-   
PIPE_CAP_PREFER_COMPUTE_BLIT_FOR_MULTIMEDIA))
+   PIPE_CAP_PREFER_COMPUTE_FOR_MULTIMEDIA))
  util_compute_blit(drv-&

Re: [Mesa-dev] [PATCH 15/20] pan/midgard: Use type-appropriate swizzle for texture coordinate

2019-08-16 Thread Ilia Mirkin

On Fri, Aug 16, 2019 at 11:38 AM Alyssa Rosenzweig
 wrote:
>
> The texture coordinate for a 2D texture could be a vec2 or a vec3,
> depending if it's an array texture or not. If it's vec2 (non-array
> texture), we should not reference the z component; otherwise, liveness
> analysis will get very confused when z is never written.
>
> Signed-off-by: Alyssa Rosenzweig 
> ---
>  src/panfrost/midgard/midgard_compile.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/src/panfrost/midgard/midgard_compile.c 
> b/src/panfrost/midgard/midgard_compile.c
> index 67c5c848214..0be9cc32dc2 100644
> --- a/src/panfrost/midgard/midgard_compile.c
> +++ b/src/panfrost/midgard/midgard_compile.c
> @@ -1753,6 +1753,7 @@ emit_texop_native(compiler_context *ctx, nir_tex_instr 
> *instr,
>  for (unsigned i = 0; i < instr->num_srcs; ++i) {
>  int index = nir_src_index(ctx, >src[i].src);
>  midgard_vector_alu_src alu_src = blank_alu_src;
> +unsigned nr_components = 
> nir_src_num_components(instr->src[i].src);
>
>  switch (instr->src[i].src_type) {
>  case nir_tex_src_coord: {
> @@ -1804,7 +1805,12 @@ emit_texop_native(compiler_context *ctx, nir_tex_instr 
> *instr,
>
>  if (instr->sampler_dim == GLSL_SAMPLER_DIM_2D) {
>  /* Array component in w but NIR wants it in 
> z */
> -ins.texture.in_reg_swizzle = SWIZZLE_XYZZ;
> +if (nr_components == 3)
> +ins.texture.in_reg_swizzle = 
> SWIZZLE_XYZZ;
> +else if (nr_components == 2)
> +ins.texture.in_reg_swizzle = 
> SWIZZLE_XYXX;
> +else
> +unreachable("Invalid texture 2D 
> componens");

Just a drive-by review: components

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: fix texStore for FORMAT_Z32_FLOAT_S8X24_UINT

2019-08-15 Thread Ilia Mirkin

Subtle. The source format *can* be 64-bit, by the way, but if it's
GL_DEPTH_COMPONENT it may well be 32-bit.

But what if it's GL_STENCIL_INDEX -- could it not be 1-byte? IOW,
should this just be a char *, and use byte addressing and be done with
it?

On Thu, Aug 15, 2019 at 7:56 PM Marek Olšák  wrote:
>
> From: Jiadong Zhu 
>
> _mesa_texstore_z32f_x24s8 calculates source rowStride at a
> pace of 64-bit, this will make inaccuracy offset if the width
> of src image is an odd number. Modify src pointer to int_32* as
> source image format is gl_float which is 32-bit per pixel.
>
> Signed-off-by: Jiadong Zhu 
> Signed-off-by: Marek Olšák 
> ---
>  src/mesa/main/texstore.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>  mode change 100644 => 100755 src/mesa/main/texstore.c
>
> diff --git a/src/mesa/main/texstore.c b/src/mesa/main/texstore.c
> old mode 100644
> new mode 100755
> index 2913d4bc067..207695041a7
> --- a/src/mesa/main/texstore.c
> +++ b/src/mesa/main/texstore.c
> @@ -531,35 +531,35 @@ _mesa_texstore_s8(TEXSTORE_PARAMS)
> return GL_TRUE;
>  }
>
>
>  static GLboolean
>  _mesa_texstore_z32f_x24s8(TEXSTORE_PARAMS)
>  {
> GLint img, row;
> const GLint srcRowStride
>= _mesa_image_row_stride(srcPacking, srcWidth, srcFormat, srcType)
> -  / sizeof(uint64_t);
> + / sizeof(int32_t);
>
> assert(dstFormat == MESA_FORMAT_Z32_FLOAT_S8X24_UINT);
> assert(srcFormat == GL_DEPTH_STENCIL ||
>srcFormat == GL_DEPTH_COMPONENT ||
>srcFormat == GL_STENCIL_INDEX);
> assert(srcFormat != GL_DEPTH_STENCIL ||
>srcType == GL_UNSIGNED_INT_24_8 ||
>srcType == GL_FLOAT_32_UNSIGNED_INT_24_8_REV);
>
> /* In case we only upload depth we need to preserve the stencil */
> for (img = 0; img < srcDepth; img++) {
>uint64_t *dstRow = (uint64_t *) dstSlices[img];
> -  const uint64_t *src
> - = (const uint64_t *) _mesa_image_address(dims, srcPacking, srcAddr,
> +  const int32_t *src
> + = (const int32_t *) _mesa_image_address(dims, srcPacking, srcAddr,
> srcWidth, srcHeight,
> srcFormat, srcType,
> img, 0, 0);
>for (row = 0; row < srcHeight; row++) {
>   /* The unpack functions with:
>*dstType = GL_FLOAT_32_UNSIGNED_INT_24_8_REV
>* only write their own dword, so the other dword (stencil
>* or depth) is preserved. */
>   if (srcFormat != GL_STENCIL_INDEX)
>  _mesa_unpack_depth_span(ctx, srcWidth,
> --
> 2.17.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Concern about proposed patch "egl/android: remove HAL_PIXEL_FORMAT_BGRA_8888 support"

2019-08-15 Thread Ilia Mirkin

Without commenting on Android-specific issues, of which I'm blissfully
unaware, all NVIDIA G80+ (i.e. DX10+) GPUs support both RGBA8 and
BGRA8 for display and color rendering:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/nouveau/dispnv50/base507c.c#n206

However the nouveau gallium driver only sets the BGRA variant as
PIPE_BIND_DISPLAY_TARGET since supporting both RGBA and BGRA caused
issues, as I recall. And the legacy AddFB() call defaults to BGRA8, so
that has to remain working. I think this may affect which configs are
made available.

[Also note that only Kepler+ NVIDIA GPUs support RGB10 ordering, so we
prefer BGR10 everywhere to make life easier. But I think that's out of
scope for this discussion.]

Cheers,

  -ilia

On Thu, Aug 15, 2019 at 3:49 PM Mauro Rossi  wrote:
>
> Hi,
>
> sorry if I'm communicating in dedicated thread, but I don't know how to reply 
> to existing one.
>
> The proposed patch is based on the wrong assumption that all drivers used 
> with Android have RGBA_, which is not correct, for example nouveau does 
> not support RGBA on primary planes, infact we have a fallback to simpler EGL 
> query in android-x86 and we use BGRA_ (pixel format = 5)
>
> If patch does not solve a specific problem and it will cause a regression at 
> least in nouveau, it is recommended to abandon the patch
>
> Thanks for your feedbacks
>
> Mauro
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] dEQP + llvmpipe

2019-08-13 Thread Ilia Mirkin

On Tue, Aug 13, 2019 at 4:49 PM Eric Anholt  wrote:
>
> Ilia Mirkin  writes:
>
> > Hi Eric,
> >
> > I see that you recently added testing dEQP with llvmpipe in the CI. It
> > looks like a number of your expected failures would be resolved by
> > disabling some llvmpipe optimizations. You can do this by running with
> >
> > GALLIVM_DEBUG=no_rho_approx,no_brilinear,no_quad_lod
> >
> > in the environment. Some of this is detailed in
> > https://bugs.freedesktop.org/show_bug.cgi?id=94957 .
>
> Yeah, this was discussed in the MR.  I think that for GL and GLES
> contexts, we should disable noncormant hacks instead of using env vars
> to falsely claim conformance.

OK. I figured you'd want the wider coverage of dEQP tests (since a
test may fail for many reasons). But mostly wasn't sure if you knew
about that situation. Sounds like you were well aware though.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] dEQP + llvmpipe

2019-08-13 Thread Ilia Mirkin

Hi Eric,

I see that you recently added testing dEQP with llvmpipe in the CI. It
looks like a number of your expected failures would be resolved by
disabling some llvmpipe optimizations. You can do this by running with

GALLIVM_DEBUG=no_rho_approx,no_brilinear,no_quad_lod

in the environment. Some of this is detailed in
https://bugs.freedesktop.org/show_bug.cgi?id=94957 .

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/4] nvc0: add support for ATOMC_WRAP TGSI operations

2019-08-06 Thread Ilia Mirkin

Signed-off-by: Ilia Mirkin 
---
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 10 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |  2 +-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 2dd13e70d0e..c3e3cf2dff5 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -606,6 +606,8 @@ nv50_ir::DataType Instruction::inferSrcType() const
case TGSI_OPCODE_ATOMXOR:
case TGSI_OPCODE_ATOMUMIN:
case TGSI_OPCODE_ATOMUMAX:
+   case TGSI_OPCODE_ATOMDEC_WRAP:
+   case TGSI_OPCODE_ATOMINC_WRAP:
case TGSI_OPCODE_UBFE:
case TGSI_OPCODE_UMSB:
case TGSI_OPCODE_UP2H:
@@ -969,6 +971,8 @@ static nv50_ir::operation translateOpcode(uint opcode)
NV50_IR_OPCODE_CASE(ATOMIMIN, ATOM);
NV50_IR_OPCODE_CASE(ATOMIMAX, ATOM);
NV50_IR_OPCODE_CASE(ATOMFADD, ATOM);
+   NV50_IR_OPCODE_CASE(ATOMDEC_WRAP, ATOM);
+   NV50_IR_OPCODE_CASE(ATOMINC_WRAP, ATOM);
 
NV50_IR_OPCODE_CASE(TEX2, TEX);
NV50_IR_OPCODE_CASE(TXB2, TXB);
@@ -1012,6 +1016,8 @@ static uint16_t opcodeToSubOp(uint opcode)
case TGSI_OPCODE_ATOMUMAX: return NV50_IR_SUBOP_ATOM_MAX;
case TGSI_OPCODE_ATOMIMAX: return NV50_IR_SUBOP_ATOM_MAX;
case TGSI_OPCODE_ATOMFADD: return NV50_IR_SUBOP_ATOM_ADD;
+   case TGSI_OPCODE_ATOMDEC_WRAP: return NV50_IR_SUBOP_ATOM_DEC;
+   case TGSI_OPCODE_ATOMINC_WRAP: return NV50_IR_SUBOP_ATOM_INC;
case TGSI_OPCODE_IMUL_HI:
case TGSI_OPCODE_UMUL_HI:
   return NV50_IR_SUBOP_MUL_HIGH;
@@ -1628,6 +1634,8 @@ bool Source::scanInstruction(const struct 
tgsi_full_instruction *inst)
   case TGSI_OPCODE_ATOMUMAX:
   case TGSI_OPCODE_ATOMIMAX:
   case TGSI_OPCODE_ATOMFADD:
+  case TGSI_OPCODE_ATOMDEC_WRAP:
+  case TGSI_OPCODE_ATOMINC_WRAP:
   case TGSI_OPCODE_LOAD:
  info->io.globalAccess |= (insn.getOpcode() == TGSI_OPCODE_LOAD) ?
 0x1 : 0x2;
@@ -3779,6 +3787,8 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
case TGSI_OPCODE_ATOMUMAX:
case TGSI_OPCODE_ATOMIMAX:
case TGSI_OPCODE_ATOMFADD:
+   case TGSI_OPCODE_ATOMDEC_WRAP:
+   case TGSI_OPCODE_ATOMINC_WRAP:
   handleATOM(dst0, dstTy, tgsi::opcodeToSubOp(tgsi.getOpcode()));
   break;
case TGSI_OPCODE_RESQ:
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index 8a9d3c4d0bd..b1e12432e14 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -280,6 +280,7 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_QUERY_SO_OVERFLOW:
case PIPE_CAP_DEST_SURFACE_SRGB_CONTROL:
case PIPE_CAP_TGSI_DIV:
+   case PIPE_CAP_TGSI_ATOMINC_WRAP:
   return 1;
case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
   return nouveau_screen(pscreen)->vram_domain & NOUVEAU_BO_VRAM ? 1 : 0;
@@ -364,7 +365,6 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED:
case PIPE_CAP_FBFETCH_COHERENT:
case PIPE_CAP_TGSI_SKIP_SHRINK_IO_ARRAYS:
-   case PIPE_CAP_TGSI_ATOMINC_WRAP:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 4/4] nvc0: fix program dumping, use _debug_printf

2019-08-06 Thread Ilia Mirkin

This debug situation is unforunate. debug_printf only does something
with DEBUG set, but in practice all that needs to be moved to !NDEBUG.
For now, use _debug_printf which always prints. However the whole
function is guarded by !NDEBUG.

Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
index 32487248c7a..2f235805436 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
@@ -551,18 +551,18 @@ nvc0_program_dump(struct nvc0_program *prog)
unsigned pos;
 
if (prog->type != PIPE_SHADER_COMPUTE) {
-  debug_printf("dumping HDR for type %i\n", prog->type);
+  _debug_printf("dumping HDR for type %i\n", prog->type);
   for (pos = 0; pos < ARRAY_SIZE(prog->hdr); ++pos)
- debug_printf("HDR[%02"PRIxPTR"] = 0x%08x\n",
+ _debug_printf("HDR[%02"PRIxPTR"] = 0x%08x\n",
   pos * sizeof(prog->hdr[0]), prog->hdr[pos]);
}
-   debug_printf("shader binary code (0x%x bytes):", prog->code_size);
+   _debug_printf("shader binary code (0x%x bytes):", prog->code_size);
for (pos = 0; pos < prog->code_size / 4; ++pos) {
   if ((pos % 8) == 0)
- debug_printf("\n");
-  debug_printf("%08x ", prog->code[pos]);
+ _debug_printf("\n");
+  _debug_printf("%08x ", prog->code[pos]);
}
-   debug_printf("\n");
+   _debug_printf("\n");
 }
 #endif
 
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/4] st/mesa: relax EXT_shader_image_load_store enable

2019-08-06 Thread Ilia Mirkin

There's no reason to bring format-less load requirement into this
extension. It requires a size to be provided, and a compatible format is
computed from the size + data type. For example

  layout(size1x32) uniform iimage1D image;

becomes

  DCL IMAGE[0], 1D, PIPE_FORMAT_R32_SINT, WR

whereas PIPE_CAP_IMAGE_LOAD_FORMATTED is designed to allow
PIPE_FORMAT_NONE to be provided as a format and still enable LOAD
operations to be performed.

So the shader has all the information it needs about the format.

Signed-off-by: Ilia Mirkin 
---
 src/mesa/state_tracker/st_extensions.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index abc816ed0d4..16bfedcdbc4 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -1180,9 +1180,6 @@ void st_init_extensions(struct pipe_screen *screen,
extensions->OES_sample_variables = extensions->ARB_sample_shading &&
   extensions->ARB_gpu_shader5;
 
-   extensions->EXT_shader_image_load_store &=
-  screen->get_param(screen, PIPE_CAP_IMAGE_LOAD_FORMATTED);
-
/* Maximum sample count. */
{
   static const enum pipe_format color_formats[] = {
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/4] gallium: redefine ATOMINC_WRAP to be more hardware-friendly

2019-08-06 Thread Ilia Mirkin

Both AMD and NVIDIA hardware define it this way. Instead of replicating
the logic everywhere, just fix it up in one place.

Signed-off-by: Ilia Mirkin 
---

I'm open to also not doing this. Just seems like it'll make our
collective lives a little easier, since this is what the hw wants (and
presumably other APIs ... PTX definitely defines it this way).

If there's push-back, I can also duplicate the logic in codegen.

 src/gallium/docs/source/tgsi.rst  |  2 +-
 src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c | 12 
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp| 11 ++-
 3 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 17ad097e85e..e72b047dbd5 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -2846,7 +2846,7 @@ These atomic operations may only be used with 32-bit 
integer image formats.
 
   dst_x = resource[offset] + 1
 
-  resource[offset] = dst_x < src_x ? dst_x : 0
+  resource[offset] = dst_x <= src_x ? dst_x : 0
 
 
 .. opcode:: ATOMDEC_WRAP - Atomic decrement + wrap around
diff --git a/src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c 
b/src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c
index 4a4ba43780a..f79ed2c57e1 100644
--- a/src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c
+++ b/src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c
@@ -828,18 +828,6 @@ static void atomic_emit(
args.data[num_data++] =
ac_to_integer(>ac, lp_build_emit_fetch(bld_base, inst, 2, 
0));
 
-   if (inst->Instruction.Opcode == TGSI_OPCODE_ATOMINC_WRAP) {
-   /* ATOMIC_INC instruction does:
-*  value = (value + 1) % (data + 1)
-* but we want:
-*  value = (value + 1) % data
-* So replace 'data' by 'data - 1'.
-*/
-   args.data[0] = LLVMBuildSub(ctx->ac.builder,
-   args.data[0],
-   ctx->ac.i32_1, "");
-   }
-
args.cache_policy = get_cache_policy(ctx, inst, true, false, false);
 
if (inst->Src[0].Register.File == TGSI_FILE_BUFFER) {
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index ff2ec0726e8..9b982569490 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -3938,9 +3938,18 @@ glsl_to_tgsi_visitor::visit_image_intrinsic(ir_call *ir)
   case ir_intrinsic_image_atomic_comp_swap:
  opcode = TGSI_OPCODE_ATOMCAS;
  break;
-  case ir_intrinsic_image_atomic_inc_wrap:
+  case ir_intrinsic_image_atomic_inc_wrap: {
+ /* There's a bit of disagreement between GLSL and the hardware. The
+  * hardware wants to wrap after the given wrap value, while GLSL
+  * wants to wrap at the value. Subtract 1 to make up the difference.
+  */
+ st_src_reg wrap = get_temp(glsl_type::uint_type);
+ emit_asm(ir, TGSI_OPCODE_ADD, st_dst_reg(wrap),
+  arg1, st_src_reg_for_int(-1));
+ arg1 = wrap;
  opcode = TGSI_OPCODE_ATOMINC_WRAP;
  break;
+  }
   case ir_intrinsic_image_atomic_dec_wrap:
  opcode = TGSI_OPCODE_ATOMDEC_WRAP;
  break;
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] EXT_shader_image_load_store requires format-less loads?

2019-08-06 Thread Ilia Mirkin

Hi Pierre-Eric,

I see you recently added EXT_shader_image_load_store - nice. It seems
like you're relying on the format-less read access feature to make it
happen:

+   extensions->EXT_shader_image_load_store &=
+  screen->get_param(screen, PIPE_CAP_IMAGE_LOAD_FORMATTED);

Can you elaborate why this is necessary? From a quick read of the
spec, it seems like the format is required, and maps to a format same
way that ARB_image_load_store does (well, *slightly* different, but in
the end, we get one of those same enum values out). The LOAD_FORMATTED
thing enables you to read from an image with PIPE_FORMAT_NONE (which
normally can only be written to).

I don't see anything in the spec which requires it, but perhaps I
missed something. I'd be a bit surprised though, since this is rather
non-trivial on NVIDIA GPUs.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Re mesa/st: add tgsi-lowering code for depth-clamp

2019-08-03 Thread Ilia Mirkin

I've done the tests to satisfy my own curiousity... it all works
correctly as-is, it would seem. I need to go back and re-look at
barycentric coordinates generation.

Gert - your change should be all good from what I can tell
experimentally. Although I still don't understand why.

  -ilia

On Sat, Aug 3, 2019 at 12:11 AM Marek Olšák  wrote:
>
> Z is 0 at the near plane and 1 at the far plane, but the planes can have an 
> arbitrary distance from the viewer, therefore Z Is irrelevant.
>
> W is the real distance from the viewer. The greater W is, the smaller 
> (farther away) the object is, because W is the divisor (which makes objects 
> smaller as it increases).
>
> The W=0 plane intersects the viewer. It is exactly in the middle of the "eye 
> ball".
>
> W < 0 means behind the viewer.
>
> Marek
>
> On Fri., Aug. 2, 2019, 21:22 Ilia Mirkin,  wrote:
>>
>> Compare
>>
>> generated_tests/spec/glsl-1.30/execution/interpolation/interpolation-smooth-other-smooth-none.shader_test
>> generated_tests/spec/glsl-1.30/execution/interpolation/interpolation-noperspective-other-smooth-none.shader_test
>>
>> as you can see, the first one is perspective-interpolated and has a
>> lot more red and less blue (due to the depth). The second one is even,
>> since depth is ignored. gl_Position.w is the same in both cases, but
>> the "other" varying gets interpolated differently.
>>
>> I think you're right that 1/w plays into it -- you're supposed to
>> divide the interpolated result by w in the frag shader. But I think
>> the ij coefficients themselves are different? Not sure, it just seems
>> like z has to be taken into account *somewhere* otherwise the whole
>> thing can't possibly work.
>>
>>   -ilia
>>
>> On Fri, Aug 2, 2019 at 8:11 PM Marek Olšák  wrote:
>> >
>> > IIRC, perspective interpolation is driven by W, not Z. Interpolating W and 
>> > then computing barycentric coordinates using 1/W is what causes the 
>> > perspective distortion.
>> >
>> > Marek
>> >
>> > On Fri, Aug 2, 2019 at 4:59 PM Ilia Mirkin  wrote:
>> >>
>> >> On Fri, Aug 2, 2019 at 3:46 PM Gert Wollny  
>> >> wrote:
>> >> >
>> >> > Am Freitag, den 02.08.2019, 15:09 -0400 schrieb Ilia Mirkin:
>> >> > > On Fri, Aug 2, 2019 at 1:28 PM Gert Wollny > >> > > > wrote:
>> >> > > > Am Donnerstag, den 01.08.2019, 07:22 -0400 schrieb Ilia Mirkin:
>> >> > > > > Hey Gert,
>> >> > > > >
>> >> > > > > I'm looking at
>> >> > > > > https://cgit.freedesktop.org/mesa/mesa/commit/?id=b048d8bf8f056759d1845a799d4ba2ac84bce30f
>> >> > > > > , which attempts to implement depth clamping (rather than
>> >> > > > > clipping)
>> >> > > > > with shader tricks.
>> >> > > > >
>> >> > > > > You're forcing the final vertex stage's position's depth to 0,
>> >> > > > > and
>> >> > > > > then making up for it in the frag shader with an extra varying.
>> >> > > > > However won't this screw up the barycentric coordinates for
>> >> > > > > perspective interpolation? i.e. won't you effectively always just
>> >> > > > > get
>> >> > > > > noperspective interp everywhere as a result?
>> >> > > > That is probably true, I was following @kumas lead [1], in fact
>> >> > > > he implemented the initial version of this code, and I only fixed
>> >> > > > it up
>> >> > > > for ARB_clip_control and GS and TES shaders.
>> >> > > >
>> >> > > > One fix I could think is maybe clamp the z value to the clip range
>> >> > > > [-1,
>> >> > > > 1] or [0,1] depending on the clip control z accuracy, so that the
>> >> > > > error
>> >> > > > only happens only for the clamped areas where the result is
>> >> > > > distorted
>> >> > > > anyway, what do you think?
>> >> > >
>> >> > > Unfortunately I can't think of a way to generically emulate it
>> >> > > without implementing clipping in a geometry shader.
>> >> > Yeah, when I first looked into this, this is also what I thought of.
>> >> >
>> >> >
>> >> >
>> >> > > B

Re: [Mesa-dev] Re mesa/st: add tgsi-lowering code for depth-clamp

2019-08-02 Thread Ilia Mirkin

Compare

generated_tests/spec/glsl-1.30/execution/interpolation/interpolation-smooth-other-smooth-none.shader_test
generated_tests/spec/glsl-1.30/execution/interpolation/interpolation-noperspective-other-smooth-none.shader_test

as you can see, the first one is perspective-interpolated and has a
lot more red and less blue (due to the depth). The second one is even,
since depth is ignored. gl_Position.w is the same in both cases, but
the "other" varying gets interpolated differently.

I think you're right that 1/w plays into it -- you're supposed to
divide the interpolated result by w in the frag shader. But I think
the ij coefficients themselves are different? Not sure, it just seems
like z has to be taken into account *somewhere* otherwise the whole
thing can't possibly work.

  -ilia

On Fri, Aug 2, 2019 at 8:11 PM Marek Olšák  wrote:
>
> IIRC, perspective interpolation is driven by W, not Z. Interpolating W and 
> then computing barycentric coordinates using 1/W is what causes the 
> perspective distortion.
>
> Marek
>
> On Fri, Aug 2, 2019 at 4:59 PM Ilia Mirkin  wrote:
>>
>> On Fri, Aug 2, 2019 at 3:46 PM Gert Wollny  wrote:
>> >
>> > Am Freitag, den 02.08.2019, 15:09 -0400 schrieb Ilia Mirkin:
>> > > On Fri, Aug 2, 2019 at 1:28 PM Gert Wollny > > > > wrote:
>> > > > Am Donnerstag, den 01.08.2019, 07:22 -0400 schrieb Ilia Mirkin:
>> > > > > Hey Gert,
>> > > > >
>> > > > > I'm looking at
>> > > > > https://cgit.freedesktop.org/mesa/mesa/commit/?id=b048d8bf8f056759d1845a799d4ba2ac84bce30f
>> > > > > , which attempts to implement depth clamping (rather than
>> > > > > clipping)
>> > > > > with shader tricks.
>> > > > >
>> > > > > You're forcing the final vertex stage's position's depth to 0,
>> > > > > and
>> > > > > then making up for it in the frag shader with an extra varying.
>> > > > > However won't this screw up the barycentric coordinates for
>> > > > > perspective interpolation? i.e. won't you effectively always just
>> > > > > get
>> > > > > noperspective interp everywhere as a result?
>> > > > That is probably true, I was following @kumas lead [1], in fact
>> > > > he implemented the initial version of this code, and I only fixed
>> > > > it up
>> > > > for ARB_clip_control and GS and TES shaders.
>> > > >
>> > > > One fix I could think is maybe clamp the z value to the clip range
>> > > > [-1,
>> > > > 1] or [0,1] depending on the clip control z accuracy, so that the
>> > > > error
>> > > > only happens only for the clamped areas where the result is
>> > > > distorted
>> > > > anyway, what do you think?
>> > >
>> > > Unfortunately I can't think of a way to generically emulate it
>> > > without implementing clipping in a geometry shader.
>> > Yeah, when I first looked into this, this is also what I thought of.
>> >
>> >
>> >
>> > > Basically if depth is clipped, the triangle gets split into 2 -- one
>> > > half which is shown, and one half which isn't. When depth is clamped,
>> > > the half which isn't shown appears as a different polygon with the
>> > > clamped depth for all of its vertices instead. An interesting
>> > > question is whether it should be using the original z coords for its
>> > > barycentric coords or not -- I have no idea, and the spec doesn't
>> > > seem to explain it in a manner which is accessible to me.
>> > I think that the spec suggest to use the original z coords and the
>> > clamping only happens when the depth test is executed:
>> >
>> > RESOLUTION:  ...   Eliminating far and near plane clipping and
>> >   clamping *interpolated* depth values to the depth range is much
>> >   simpler to specify.
>> >
>> > > If it should be the original z coords, then perhaps you can just
>> > > disable clipping entirely in that case,
>> > My guess is that hardware that disables clipping completely also
>> > supports clamping the depth values. Our use case (virgl on top of e
>>
>> Yeah, that's how NVIDIA hardware works. There's a DEPTH_CLAMP_NEAR/FAR
>> value which you set to a value between 0 and 1, based on the viewport
>> transform, and that just clamps it prior to being supplied to the frag
>> shader. (And separately, you disable clippi

Re: [Mesa-dev] Re mesa/st: add tgsi-lowering code for depth-clamp

2019-08-02 Thread Ilia Mirkin

On Fri, Aug 2, 2019 at 3:46 PM Gert Wollny  wrote:
>
> Am Freitag, den 02.08.2019, 15:09 -0400 schrieb Ilia Mirkin:
> > On Fri, Aug 2, 2019 at 1:28 PM Gert Wollny  > > wrote:
> > > Am Donnerstag, den 01.08.2019, 07:22 -0400 schrieb Ilia Mirkin:
> > > > Hey Gert,
> > > >
> > > > I'm looking at
> > > > https://cgit.freedesktop.org/mesa/mesa/commit/?id=b048d8bf8f056759d1845a799d4ba2ac84bce30f
> > > > , which attempts to implement depth clamping (rather than
> > > > clipping)
> > > > with shader tricks.
> > > >
> > > > You're forcing the final vertex stage's position's depth to 0,
> > > > and
> > > > then making up for it in the frag shader with an extra varying.
> > > > However won't this screw up the barycentric coordinates for
> > > > perspective interpolation? i.e. won't you effectively always just
> > > > get
> > > > noperspective interp everywhere as a result?
> > > That is probably true, I was following @kumas lead [1], in fact
> > > he implemented the initial version of this code, and I only fixed
> > > it up
> > > for ARB_clip_control and GS and TES shaders.
> > >
> > > One fix I could think is maybe clamp the z value to the clip range
> > > [-1,
> > > 1] or [0,1] depending on the clip control z accuracy, so that the
> > > error
> > > only happens only for the clamped areas where the result is
> > > distorted
> > > anyway, what do you think?
> >
> > Unfortunately I can't think of a way to generically emulate it
> > without implementing clipping in a geometry shader.
> Yeah, when I first looked into this, this is also what I thought of.
>
>
>
> > Basically if depth is clipped, the triangle gets split into 2 -- one
> > half which is shown, and one half which isn't. When depth is clamped,
> > the half which isn't shown appears as a different polygon with the
> > clamped depth for all of its vertices instead. An interesting
> > question is whether it should be using the original z coords for its
> > barycentric coords or not -- I have no idea, and the spec doesn't
> > seem to explain it in a manner which is accessible to me.
> I think that the spec suggest to use the original z coords and the
> clamping only happens when the depth test is executed:
>
> RESOLUTION:  ...   Eliminating far and near plane clipping and
>   clamping *interpolated* depth values to the depth range is much
>   simpler to specify.
>
> > If it should be the original z coords, then perhaps you can just
> > disable clipping entirely in that case,
> My guess is that hardware that disables clipping completely also
> supports clamping the depth values. Our use case (virgl on top of e

Yeah, that's how NVIDIA hardware works. There's a DEPTH_CLAMP_NEAR/FAR
value which you set to a value between 0 and 1, based on the viewport
transform, and that just clamps it prior to being supplied to the frag
shader. (And separately, you disable clipping near/far planes.) Adreno
hw works similarly.

> GLES host that doesn't support EXT_depth_clamp) repesents the opposite:
> a hardware that doesn't support disabling clipping because OpenGL (ES)
> doesn't allow this and also not doesn't support depth clamping.
>
> I still have to think about whether the interpolation really goes
> wrong. I think I need to write another piglit to get an idea.

Should be easy - take one of the interpolation piglit tests, and
enable depth clamp without doing anything else.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Re mesa/st: add tgsi-lowering code for depth-clamp

2019-08-02 Thread Ilia Mirkin

On Fri, Aug 2, 2019 at 1:28 PM Gert Wollny  wrote:
>
> Am Donnerstag, den 01.08.2019, 07:22 -0400 schrieb Ilia Mirkin:
> > Hey Gert,
> >
> > I'm looking at
> > https://cgit.freedesktop.org/mesa/mesa/commit/?id=b048d8bf8f056759d1845a799d4ba2ac84bce30f
> > , which attempts to implement depth clamping (rather than clipping)
> > with shader tricks.
> >
> > You're forcing the final vertex stage's position's depth to 0, and
> > then making up for it in the frag shader with an extra varying.
> > However won't this screw up the barycentric coordinates for
> > perspective interpolation? i.e. won't you effectively always just get
> > noperspective interp everywhere as a result?
> That is probably true, I was following @kumas lead [1], in fact
> he implemented the initial version of this code, and I only fixed it up
> for ARB_clip_control and GS and TES shaders.
>
> One fix I could think is maybe clamp the z value to the clip range [-1,
> 1] or [0,1] depending on the clip control z accuracy, so that the error
> only happens only for the clamped areas where the result is distorted
> anyway, what do you think?

Unfortunately I can't think of a way to generically emulate it without
implementing clipping in a geometry shader. (I believe Marek has done
something like this in a compute shader, actually, to pre-clip
geometry-heavy workloads.)

Basically if depth is clipped, the triangle gets split into 2 -- one
half which is shown, and one half which isn't. When depth is clamped,
the half which isn't shown appears as a different polygon with the
clamped depth for all of its vertices instead. An interesting question
is whether it should be using the original z coords for its
barycentric coords or not -- I have no idea, and the spec doesn't seem
to explain it in a manner which is accessible to me. If it should be
the original z coords, then perhaps you can just disable clipping
entirely in that case, and then in the frag shader, just output
clamp(gl_FragCoord.z, minz, maxz) or something [and those would be set
based on the transform/halfz settings]. You may want to consult
someone with a clearer understanding of depth than me though -- it has
always been a concept just out of reach for me, unfortunately.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] Re mesa/st: add tgsi-lowering code for depth-clamp

2019-08-01 Thread Ilia Mirkin

Hey Gert,

I'm looking at 
https://cgit.freedesktop.org/mesa/mesa/commit/?id=b048d8bf8f056759d1845a799d4ba2ac84bce30f
, which attempts to implement depth clamping (rather than clipping)
with shader tricks.

You're forcing the final vertex stage's position's depth to 0, and
then making up for it in the frag shader with an extra varying.
However won't this screw up the barycentric coordinates for
perspective interpolation? i.e. won't you effectively always just get
noperspective interp everywhere as a result?

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] softpipe: Don't draw when rasterizer_discard is set

2019-07-29 Thread Ilia Mirkin

Hi Gert,

I just saw you pushed commit

https://cgit.freedesktop.org/mesa/mesa/commit/?id=4ee638cd7826e8a4bed76f51c7b73395a2fcdb

which just returns if rasterizer_discard is set, similar to if the
render condition doesn't match. However this breaks TF and any
image/etc writes in vertex stages when GL_RASTERIZER_DISCARD is
enabled, no?

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nvc0/ir: Fix assert accessing null pointer

2019-07-26 Thread Ilia Mirkin

On Wed, Jul 24, 2019 at 9:34 AM Juan A. Suarez Romero
 wrote:
>
> On Wed, 2019-07-24 at 14:27 +0200, Karol Herbst wrote:
> > it's only fixing a crash in a build with asserts enabled, but if
> > somebody wants to apply those to stable, then go ahead.
> >
>
> OK; in that case I will keep it out.

Looks like distros build with debug enabled, sadly:

https://bugs.freedesktop.org/show_bug.cgi?id=111218

Please do include this in stable:

commit 7493fbf032f5bcbf4c48187bc089c9a34f04a1d5
Author: Mark Menzynski 
Date:   Fri Jul 19 13:09:02 2019 +0200

nvc0/ir: Fix assert accessing null pointer

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111007
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67

Signed-off-by: Mark Menzynski 
Reviewed-by: Ilia Mirkin 
Reviewed-by: Tobias Klausmann
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/4] nvc0: allow a non-user buffer to be bound at position 0

2019-07-26 Thread Ilia Mirkin

Thanks! I had to make a small update to the asserts:

assert(nvc0->constbuf[5][0].user || !nvc0->constbuf[5][0].u.buf);

u.buf is not valid to check when .user is set. (in fact it aliases
with the "data" pointer)

Let me know if you want me to resend.

On Fri, Jul 26, 2019 at 5:51 AM Karol Herbst  wrote:
>
> Reviewed-by: Karol Herbst 
>
> On Fri, Jul 26, 2019 at 5:31 AM Ilia Mirkin  wrote:
> >
> > Previously the code only handled it for positions 1 and up (as would be
> > for UBO's in GL). It's not a lot of trouble to handle this, and vl or
> > vdpau want this.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
> > Signed-off-by: Ilia Mirkin 
> > Cc: mesa-sta...@lists.freedesktop.org
> > ---
> >  .../drivers/nouveau/nvc0/nve4_compute.c   | 45 +++
> >  1 file changed, 27 insertions(+), 18 deletions(-)
> >
> > diff --git a/src/gallium/drivers/nouveau/nvc0/nve4_compute.c 
> > b/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
> > index c5e4dec20bd..a1c40d1e6b9 100644
> > --- a/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
> > +++ b/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
> > @@ -393,23 +393,24 @@ nve4_compute_validate_constbufs(struct nvc0_context 
> > *nvc0)
> >  uint64_t address
> > = nvc0->screen->uniform_bo->offset + NVC0_CB_AUX_INFO(s);
> >
> > -assert(i > 0); /* we really only want uniform buffer objects */
> > -
> > -BEGIN_NVC0(push, NVE4_CP(UPLOAD_DST_ADDRESS_HIGH), 2);
> > -PUSH_DATAh(push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
> > -PUSH_DATA (push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
> > -BEGIN_NVC0(push, NVE4_CP(UPLOAD_LINE_LENGTH_IN), 2);
> > -PUSH_DATA (push, 4 * 4);
> > -PUSH_DATA (push, 0x1);
> > -BEGIN_1IC0(push, NVE4_CP(UPLOAD_EXEC), 1 + 4);
> > -PUSH_DATA (push, NVE4_COMPUTE_UPLOAD_EXEC_LINEAR | (0x20 << 
> > 1));
> > -
> > -PUSH_DATA (push, res->address + nvc0->constbuf[s][i].offset);
> > -PUSH_DATAh(push, res->address + nvc0->constbuf[s][i].offset);
> > -PUSH_DATA (push, nvc0->constbuf[5][i].size);
> > -PUSH_DATA (push, 0);
> > -BCTX_REFN(nvc0->bufctx_cp, CP_CB(i), res, RD);
> > +/* constbufs above 0 will are fetched via ubo info in the 
> > shader */
> > +if (i > 0) {
> > +   BEGIN_NVC0(push, NVE4_CP(UPLOAD_DST_ADDRESS_HIGH), 2);
> > +   PUSH_DATAh(push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
> > +   PUSH_DATA (push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
> > +   BEGIN_NVC0(push, NVE4_CP(UPLOAD_LINE_LENGTH_IN), 2);
> > +   PUSH_DATA (push, 4 * 4);
> > +   PUSH_DATA (push, 0x1);
> > +   BEGIN_1IC0(push, NVE4_CP(UPLOAD_EXEC), 1 + 4);
> > +   PUSH_DATA (push, NVE4_COMPUTE_UPLOAD_EXEC_LINEAR | (0x20 << 
> > 1));
> > +
> > +   PUSH_DATA (push, res->address + 
> > nvc0->constbuf[s][i].offset);
> > +   PUSH_DATAh(push, res->address + 
> > nvc0->constbuf[s][i].offset);
> > +   PUSH_DATA (push, nvc0->constbuf[s][i].size);
> > +   PUSH_DATA (push, 0);
> > +}
> >
> > +BCTX_REFN(nvc0->bufctx_cp, CP_CB(i), res, RD);
> >  res->cb_bindings[s] |= 1 << i;
> >   }
> >}
> > @@ -554,9 +555,9 @@ nve4_compute_derive_cache_split(struct nvc0_context 
> > *nvc0, uint32_t shared_size)
> >  static void
> >  nve4_compute_setup_buf_cb(struct nvc0_context *nvc0, bool gp100, void 
> > *desc)
> >  {
> > -   // only user constant buffers 1-6 can be put in the descriptor, the 
> > rest are
> > +   // only user constant buffers 0-6 can be put in the descriptor, the 
> > rest are
> > // loaded through global memory
> > -   for (int i = 1; i <= 6; i++) {
> > +   for (int i = 0; i <= 6; i++) {
> >if (nvc0->constbuf[5][i].user || !nvc0->constbuf[5][i].u.buf)
> >   continue;
> >
> > @@ -609,6 +610,10 @@ nve4_compute_setup_launch_desc(struct nvc0_context 
> > *nvc0,
> > if (nvc0->constbuf[5][0].user || cp->parm_size) {
> >nve4_cp_launch_desc_set_cb(desc, 0, screen->uniform_bo,
> >   NVC0_CB_USR_INFO(5), 1 << 16);
> > +
> > +  // Later logic will attempt to bind

[Mesa-dev] [PATCH] nv50/ir: don't consider the main compute function as taking arguments

2019-07-25 Thread Ilia Mirkin

With OpenCL, kernels can take arguments and return values (?). However
in practice, there is no more TGSI compute implementation, and even if
there were, it would probably have named functions and no explicit main.

This improves RA considerably for compute shaders, since temps are not
kept around as return values.

Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 9d0ab336c75..2dd13e70d0e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -4298,7 +4298,7 @@ Converter::BindArgumentsPass::visit(Function *f)
   }
}
 
-   if (func == prog->main && prog->getType() != Program::TYPE_COMPUTE)
+   if (func == prog->main /* && prog->getType() != Program::TYPE_COMPUTE */)
   return true;
updatePrototype(::get(f->cfg.getRoot())->liveSet,
::buildLiveSets, ::ins);
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nv50/ir: handle insn not being there for definition of CVT arg

2019-07-25 Thread Ilia Mirkin

This can happen if it's e.g. a uniform or a function argument.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217
Signed-off-by: Ilia Mirkin 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 0b3220903b9..bfdb923379b 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -2080,14 +2080,15 @@ void
 AlgebraicOpt::handleCVT_CVT(Instruction *cvt)
 {
Instruction *insn = cvt->getSrc(0)->getInsn();
-   RoundMode rnd = insn->rnd;
 
-   if (insn->saturate ||
+   if (!insn ||
+   insn->saturate ||
insn->subOp ||
insn->dType != insn->sType ||
insn->dType != cvt->sType)
   return;
 
+   RoundMode rnd = insn->rnd;
switch (insn->op) {
case OP_CEIL:
   rnd = ROUND_PI;
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 4/4] nouveau: flip DEBUG -> !NDEBUG

2019-07-25 Thread Ilia Mirkin

The meson conversion chose to change the meaning of DEBUG to "used for
debugging" to be "used for expensive things for debugging", primarily
for nir_validate. Flip things over so that we get nice things with
optimizations enabled.

While we're at it, also kill off nouveau_statebuf.h which is unused (and
has a mention of DEBUG which is how I found it).

Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/Makefile.sources  |  1 -
 .../drivers/nouveau/codegen/nv50_ir_driver.h  |  2 +-
 .../drivers/nouveau/codegen/nv50_ir_inlines.h |  2 +-
 .../drivers/nouveau/codegen/nv50_ir_util.h|  8 ++---
 src/gallium/drivers/nouveau/meson.build   |  1 -
 src/gallium/drivers/nouveau/nouveau_screen.h  |  2 +-
 .../drivers/nouveau/nouveau_statebuf.h| 32 ---
 .../drivers/nouveau/nv50/nv50_program.c   |  2 +-
 .../drivers/nouveau/nvc0/nvc0_program.c   |  8 ++---
 .../drivers/nouveau/nvc0/nve4_compute.c   |  6 ++--
 10 files changed, 15 insertions(+), 49 deletions(-)
 delete mode 100644 src/gallium/drivers/nouveau/nouveau_statebuf.h

diff --git a/src/gallium/drivers/nouveau/Makefile.sources 
b/src/gallium/drivers/nouveau/Makefile.sources
index c6a1aff7110..6c360992a53 100644
--- a/src/gallium/drivers/nouveau/Makefile.sources
+++ b/src/gallium/drivers/nouveau/Makefile.sources
@@ -12,7 +12,6 @@ C_SOURCES := \
nouveau_mm.h \
nouveau_screen.c \
nouveau_screen.h \
-   nouveau_statebuf.h \
nouveau_video.c \
nouveau_video.h \
nouveau_vp3_video_bsp.c \
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
index 95b3d633ee6..322bdd02557 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
@@ -54,7 +54,7 @@ struct nv50_ir_varying
ubyte si; /* TGSI semantic index */
 };
 
-#ifdef DEBUG
+#ifndef NDEBUG
 # define NV50_IR_DEBUG_BASIC (1 << 0)
 # define NV50_IR_DEBUG_VERBOSE   (2 << 0)
 # define NV50_IR_DEBUG_REG_ALLOC (1 << 2)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_inlines.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_inlines.h
index 4cb53ab42ed..b4ca5ed8215 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_inlines.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_inlines.h
@@ -222,7 +222,7 @@ Instruction *Value::getUniqueInsn() const
 return (*it)->getInsn();
   // should be unreachable and trigger assertion at the end
}
-#ifdef DEBUG
+#ifndef NDEBUG
if (reg.data.id < 0) {
   int n = 0;
   for (DefCIterator it = defs.begin(); n < 2 && it != defs.end(); ++it)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h
index affe04a2dd9..307c23d5e03 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_util.h
@@ -36,14 +36,14 @@
 #include "util/u_inlines.h"
 #include "util/u_memory.h"
 
-#define ERROR(args...) debug_printf("ERROR: " args)
-#define WARN(args...) debug_printf("WARNING: " args)
-#define INFO(args...) debug_printf(args)
+#define ERROR(args...) _debug_printf("ERROR: " args)
+#define WARN(args...) _debug_printf("WARNING: " args)
+#define INFO(args...) _debug_printf(args)
 
 #define INFO_DBG(m, f, args...)  \
do {  \
   if (m & NV50_IR_DEBUG_##f) \
- debug_printf(args); \
+ _debug_printf(args); \
} while(0)
 
 #define FATAL(args...)  \
diff --git a/src/gallium/drivers/nouveau/meson.build 
b/src/gallium/drivers/nouveau/meson.build
index 64138212b5b..b3e79bf7089 100644
--- a/src/gallium/drivers/nouveau/meson.build
+++ b/src/gallium/drivers/nouveau/meson.build
@@ -32,7 +32,6 @@ files_libnouveau = files(
   'nouveau_mm.h',
   'nouveau_screen.c',
   'nouveau_screen.h',
-  'nouveau_statebuf.h',
   'nouveau_video.c',
   'nouveau_video.h',
   'nouveau_vp3_video_bsp.c',
diff --git a/src/gallium/drivers/nouveau/nouveau_screen.h 
b/src/gallium/drivers/nouveau/nouveau_screen.h
index 1302c608bec..450c7c466be 100644
--- a/src/gallium/drivers/nouveau/nouveau_screen.h
+++ b/src/gallium/drivers/nouveau/nouveau_screen.h
@@ -6,7 +6,7 @@
 #include "util/u_atomic.h"
 #include "util/u_memory.h"
 
-#ifdef DEBUG
+#ifndef NDEBUG
 # define NOUVEAU_ENABLE_DRIVER_STATISTICS
 #endif
 
diff --git a/src/gallium/drivers/nouveau/nouveau_statebuf.h 
b/src/gallium/drivers/nouveau/nouveau_statebuf.h
deleted file mode 100644
index da5d7972d9c..000
--- a/src/gallium/drivers/nouveau/nouveau_statebuf.h
+++ /dev/null
@@ -1,32 +0,0 @@
-#ifndef __NOUVEAU_STATEBUF_H__
-#define __NOUVEAU_STATEBUF_H__
-
-/* state buffers: lightweight state objects interface */
-/* relocations are not support

[Mesa-dev] [PATCH 3/4] nvc0: allow a non-user buffer to be bound at position 0

2019-07-25 Thread Ilia Mirkin

Previously the code only handled it for positions 1 and up (as would be
for UBO's in GL). It's not a lot of trouble to handle this, and vl or
vdpau want this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Signed-off-by: Ilia Mirkin 
Cc: mesa-sta...@lists.freedesktop.org
---
 .../drivers/nouveau/nvc0/nve4_compute.c   | 45 +++
 1 file changed, 27 insertions(+), 18 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nve4_compute.c 
b/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
index c5e4dec20bd..a1c40d1e6b9 100644
--- a/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
+++ b/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
@@ -393,23 +393,24 @@ nve4_compute_validate_constbufs(struct nvc0_context *nvc0)
 uint64_t address
= nvc0->screen->uniform_bo->offset + NVC0_CB_AUX_INFO(s);
 
-assert(i > 0); /* we really only want uniform buffer objects */
-
-BEGIN_NVC0(push, NVE4_CP(UPLOAD_DST_ADDRESS_HIGH), 2);
-PUSH_DATAh(push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
-PUSH_DATA (push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
-BEGIN_NVC0(push, NVE4_CP(UPLOAD_LINE_LENGTH_IN), 2);
-PUSH_DATA (push, 4 * 4);
-PUSH_DATA (push, 0x1);
-BEGIN_1IC0(push, NVE4_CP(UPLOAD_EXEC), 1 + 4);
-PUSH_DATA (push, NVE4_COMPUTE_UPLOAD_EXEC_LINEAR | (0x20 << 1));
-
-PUSH_DATA (push, res->address + nvc0->constbuf[s][i].offset);
-PUSH_DATAh(push, res->address + nvc0->constbuf[s][i].offset);
-PUSH_DATA (push, nvc0->constbuf[5][i].size);
-PUSH_DATA (push, 0);
-BCTX_REFN(nvc0->bufctx_cp, CP_CB(i), res, RD);
+/* constbufs above 0 will are fetched via ubo info in the shader */
+if (i > 0) {
+   BEGIN_NVC0(push, NVE4_CP(UPLOAD_DST_ADDRESS_HIGH), 2);
+   PUSH_DATAh(push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
+   PUSH_DATA (push, address + NVC0_CB_AUX_UBO_INFO(i - 1));
+   BEGIN_NVC0(push, NVE4_CP(UPLOAD_LINE_LENGTH_IN), 2);
+   PUSH_DATA (push, 4 * 4);
+   PUSH_DATA (push, 0x1);
+   BEGIN_1IC0(push, NVE4_CP(UPLOAD_EXEC), 1 + 4);
+   PUSH_DATA (push, NVE4_COMPUTE_UPLOAD_EXEC_LINEAR | (0x20 << 1));
+
+   PUSH_DATA (push, res->address + nvc0->constbuf[s][i].offset);
+   PUSH_DATAh(push, res->address + nvc0->constbuf[s][i].offset);
+   PUSH_DATA (push, nvc0->constbuf[s][i].size);
+   PUSH_DATA (push, 0);
+}
 
+BCTX_REFN(nvc0->bufctx_cp, CP_CB(i), res, RD);
 res->cb_bindings[s] |= 1 << i;
  }
   }
@@ -554,9 +555,9 @@ nve4_compute_derive_cache_split(struct nvc0_context *nvc0, 
uint32_t shared_size)
 static void
 nve4_compute_setup_buf_cb(struct nvc0_context *nvc0, bool gp100, void *desc)
 {
-   // only user constant buffers 1-6 can be put in the descriptor, the rest are
+   // only user constant buffers 0-6 can be put in the descriptor, the rest are
// loaded through global memory
-   for (int i = 1; i <= 6; i++) {
+   for (int i = 0; i <= 6; i++) {
   if (nvc0->constbuf[5][i].user || !nvc0->constbuf[5][i].u.buf)
  continue;
 
@@ -609,6 +610,10 @@ nve4_compute_setup_launch_desc(struct nvc0_context *nvc0,
if (nvc0->constbuf[5][0].user || cp->parm_size) {
   nve4_cp_launch_desc_set_cb(desc, 0, screen->uniform_bo,
  NVC0_CB_USR_INFO(5), 1 << 16);
+
+  // Later logic will attempt to bind a real buffer at position 0. That
+  // should not happen if we've bound a user buffer.
+  assert(!nvc0->constbuf[5][0].u.buf);
}
nve4_cp_launch_desc_set_cb(desc, 7, screen->uniform_bo,
   NVC0_CB_AUX_INFO(5), 1 << 11);
@@ -649,6 +654,10 @@ gp100_compute_setup_launch_desc(struct nvc0_context *nvc0,
if (nvc0->constbuf[5][0].user || cp->parm_size) {
   gp100_cp_launch_desc_set_cb(desc, 0, screen->uniform_bo,
   NVC0_CB_USR_INFO(5), 1 << 16);
+
+  // Later logic will attempt to bind a real buffer at position 0. That
+  // should not happen if we've bound a user buffer.
+  assert(!nvc0->constbuf[5][0].u.buf);
}
gp100_cp_launch_desc_set_cb(desc, 7, screen->uniform_bo,
NVC0_CB_AUX_INFO(5), 1 << 11);
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/4] nv50, nvc0: update sampler/view bind functions to accept NULL array

2019-07-25 Thread Ilia Mirkin

Apparently vl (or vdpau) wants to pass that in now. Handle it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Signed-off-by: Ilia Mirkin 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/nouveau/nv50/nv50_state.c | 14 --
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 18 ++
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
b/src/gallium/drivers/nouveau/nv50/nv50_state.c
index 8b294be6d86..a4163aa1713 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
@@ -599,19 +599,20 @@ nv50_sampler_state_delete(struct pipe_context *pipe, void 
*hwcso)
 
 static inline void
 nv50_stage_sampler_states_bind(struct nv50_context *nv50, int s,
-   unsigned nr, void **hwcso)
+   unsigned nr, void **hwcsos)
 {
unsigned highest_found = 0;
unsigned i;
 
assert(nr <= PIPE_MAX_SAMPLERS);
for (i = 0; i < nr; ++i) {
+  struct nv50_tsc_entry *hwcso = hwcsos ? nv50_tsc_entry(hwcsos[i]) : NULL;
   struct nv50_tsc_entry *old = nv50->samplers[s][i];
 
-  if (hwcso[i])
+  if (hwcso)
  highest_found = i;
 
-  nv50->samplers[s][i] = nv50_tsc_entry(hwcso[i]);
+  nv50->samplers[s][i] = hwcso;
   if (old)
  nv50_screen_tsc_unlock(nv50->screen, old);
}
@@ -685,12 +686,13 @@ nv50_stage_set_sampler_views(struct nv50_context *nv50, 
int s,
 
assert(nr <= PIPE_MAX_SAMPLERS);
for (i = 0; i < nr; ++i) {
+  struct pipe_sampler_view *view = views ? views[i] : NULL;
   struct nv50_tic_entry *old = nv50_tic_entry(nv50->textures[s][i]);
   if (old)
  nv50_screen_tic_unlock(nv50->screen, old);
 
-  if (views[i] && views[i]->texture) {
- struct pipe_resource *res = views[i]->texture;
+  if (view && view->texture) {
+ struct pipe_resource *res = view->texture;
  if (res->target == PIPE_BUFFER &&
  (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT))
 nv50->textures_coherent[s] |= 1 << i;
@@ -700,7 +702,7 @@ nv50_stage_set_sampler_views(struct nv50_context *nv50, int 
s,
  nv50->textures_coherent[s] &= ~(1 << i);
   }
 
-  pipe_sampler_view_reference(>textures[s][i], views[i]);
+  pipe_sampler_view_reference(>textures[s][i], view);
}
 
assert(nv50->num_textures[s] <= PIPE_MAX_SAMPLERS);
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index a9ee7b784bd..60dcbe3ec39 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -463,22 +463,23 @@ nvc0_sampler_state_delete(struct pipe_context *pipe, void 
*hwcso)
 static inline void
 nvc0_stage_sampler_states_bind(struct nvc0_context *nvc0,
unsigned s,
-   unsigned nr, void **hwcso)
+   unsigned nr, void **hwcsos)
 {
unsigned highest_found = 0;
unsigned i;
 
for (i = 0; i < nr; ++i) {
+  struct nv50_tsc_entry *hwcso = hwcsos ? nv50_tsc_entry(hwcsos[i]) : NULL;
   struct nv50_tsc_entry *old = nvc0->samplers[s][i];
 
-  if (hwcso[i])
+  if (hwcso)
  highest_found = i;
 
-  if (hwcso[i] == old)
+  if (hwcso == old)
  continue;
   nvc0->samplers_dirty[s] |= 1 << i;
 
-  nvc0->samplers[s][i] = nv50_tsc_entry(hwcso[i]);
+  nvc0->samplers[s][i] = hwcso;
   if (old)
  nvc0_screen_tsc_unlock(nvc0->screen, old);
}
@@ -523,14 +524,15 @@ nvc0_stage_set_sampler_views(struct nvc0_context *nvc0, 
int s,
unsigned i;
 
for (i = 0; i < nr; ++i) {
+  struct pipe_sampler_view *view = views ? views[i] : NULL;
   struct nv50_tic_entry *old = nv50_tic_entry(nvc0->textures[s][i]);
 
-  if (views[i] == nvc0->textures[s][i])
+  if (view == nvc0->textures[s][i])
  continue;
   nvc0->textures_dirty[s] |= 1 << i;
 
-  if (views[i] && views[i]->texture) {
- struct pipe_resource *res = views[i]->texture;
+  if (view && view->texture) {
+ struct pipe_resource *res = view->texture;
  if (res->target == PIPE_BUFFER &&
  (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT))
 nvc0->textures_coherent[s] |= 1 << i;
@@ -548,7 +550,7 @@ nvc0_stage_set_sampler_views(struct nvc0_context *nvc0, int 
s,
  nvc0_screen_tic_unlock(nvc0->screen, old);
   }
 
-  pipe_sampler_view_reference(>textures[s][i], views[i]);
+  pipe_sampler_view_reference(>textures[s][i], view);
}
 
for (i = nr; i < nvc0->num_textures[s]; ++i) {
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/4] gallium/vl: fix compute tgsi shaders to not process undefined components

2019-07-25 Thread Ilia Mirkin

This caused nouveau's function handling logic to think that the MAIN
function was due to receive external parameters, and cascaded some
failures after that. Instead avoid having the undefined components in
the first place.

Fixes: f6ac0b5d71 (gallium/auxiliary/vl: Add compute shader to support video 
compositor render)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217
Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/vl/vl_compositor_cs.c | 102 ++--
 1 file changed, 51 insertions(+), 51 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_compositor_cs.c 
b/src/gallium/auxiliary/vl/vl_compositor_cs.c
index 485b4174b8e..d84df7240da 100644
--- a/src/gallium/auxiliary/vl/vl_compositor_cs.c
+++ b/src/gallium/auxiliary/vl/vl_compositor_cs.c
@@ -61,7 +61,7 @@ const char *compute_shader_video_buffer =
   "IMM[0] UINT32 { 8, 8, 1, 0}\n"
   "IMM[1] FLT32 { 1.0, 2.0, 0.0, 0.0}\n"
 
-  "UMAD TEMP[0], SV[1], IMM[0], SV[0]\n"
+  "UMAD TEMP[0].xy, SV[1].xyyy, IMM[0].xyyy, SV[0].xyyy\n"
 
   /* Drawn area check */
   "USGE TEMP[1].xy, TEMP[0].xyxy, CONST[4].xyxy\n"
@@ -70,20 +70,20 @@ const char *compute_shader_video_buffer =
   "AND TEMP[1].x, TEMP[1]., TEMP[1].\n"
   "AND TEMP[1].x, TEMP[1]., TEMP[1].\n"
 
-  "UIF TEMP[1]\n"
+  "UIF TEMP[1].\n"
  /* Translate */
  "UADD TEMP[2].xy, TEMP[0], -CONST[5].xyxy\n"
- "U2F TEMP[2], TEMP[2]\n"
- "DIV TEMP[3], TEMP[2], IMM[1].\n"
+ "U2F TEMP[2].xy, TEMP[2].xyyy\n"
+ "DIV TEMP[3].xy, TEMP[2].xyyy, IMM[1].\n"
 
  /* Scale */
- "DIV TEMP[2], TEMP[2], CONST[3].zwzw\n"
- "DIV TEMP[3], TEMP[3], CONST[3].zwzw\n"
+ "DIV TEMP[2].xy, TEMP[2].xyyy, CONST[3].zwww\n"
+ "DIV TEMP[3].xy, TEMP[3].xyyy, CONST[3].zwww\n"
 
  /* Fetch texels */
- "TEX_LZ TEMP[4].x, TEMP[2], SAMP[0], RECT\n"
- "TEX_LZ TEMP[4].y, TEMP[3], SAMP[1], RECT\n"
- "TEX_LZ TEMP[4].z, TEMP[3], SAMP[2], RECT\n"
+ "TEX_LZ TEMP[4].x, TEMP[2].xyyy, SAMP[0], RECT\n"
+ "TEX_LZ TEMP[4].y, TEMP[3].xyyy, SAMP[1], RECT\n"
+ "TEX_LZ TEMP[4].z, TEMP[3].xyyy, SAMP[2], RECT\n"
 
  "MOV TEMP[4].w, IMM[1].\n"
 
@@ -93,12 +93,12 @@ const char *compute_shader_video_buffer =
  "DP4 TEMP[7].z, CONST[2], TEMP[4]\n"
 
  "MOV TEMP[5].w, TEMP[4].\n"
- "SLE TEMP[6].w, TEMP[5], CONST[3].\n"
- "SGT TEMP[5].w, TEMP[5], CONST[3].\n"
+ "SLE TEMP[6].w, TEMP[5]., CONST[3].\n"
+ "SGT TEMP[5].w, TEMP[5]., CONST[3].\n"
 
- "MAX TEMP[7].w, TEMP[5], TEMP[6]\n"
+ "MAX TEMP[7].w, TEMP[5]., TEMP[6].\n"
 
- "STORE IMAGE[0], TEMP[0], TEMP[7], 2D\n"
+ "STORE IMAGE[0], TEMP[0].xyyy, TEMP[7], 2D\n"
   "ENDIF\n"
 
   "END\n";
@@ -124,7 +124,7 @@ const char *compute_shader_weave =
   "IMM[2] UINT32 { 1, 2, 4, 0}\n"
   "IMM[3] FLT32 { 0.25, 0.5, 0.125, 0.125}\n"
 
-  "UMAD TEMP[0], SV[1], IMM[0], SV[0]\n"
+  "UMAD TEMP[0].xy, SV[1].xyyy, IMM[0].xyyy, SV[0].xyyy\n"
 
   /* Drawn area check */
   "USGE TEMP[1].xy, TEMP[0].xyxy, CONST[4].xyxy\n"
@@ -133,22 +133,22 @@ const char *compute_shader_weave =
   "AND TEMP[1].x, TEMP[1]., TEMP[1].\n"
   "AND TEMP[1].x, TEMP[1]., TEMP[1].\n"
 
-  "UIF TEMP[1]\n"
- "MOV TEMP[2], TEMP[0]\n"
+  "UIF TEMP[1].\n"
+ "MOV TEMP[2].xy, TEMP[0].xyyy\n"
  /* Translate */
- "UADD TEMP[2].xy, TEMP[2], -CONST[5].xyxy\n"
+ "UADD TEMP[2].xy, TEMP[2].xyyy, -CONST[5].xyxy\n"
 
  /* Top Y */
- "U2F TEMP[2], TEMP[2]\n"
+ "U2F TEMP[2].xy, TEMP[2].xyyy\n"
  "DIV TEMP[2].y, TEMP[2]., IMM[1].\n"
  /* Down Y */
- "MOV TEMP[12], TEMP[2]\n"
+ "MOV TEMP[12].xy, TEMP[2].xyyy\n"
 
  /* Top UV */
- "MOV TEMP[3], TEMP[2]\n"
+ "MOV TEMP[3].xy, TEMP[2].xyyy\n"
  "DIV TEMP[3].xy, TEMP[3], IMM[1].\n"
  /* Down UV */
- "MOV TEMP[13], TEMP[3]\n"
+ "MOV TEMP[13].xy, TEMP[3].xyyy\n"
 
  /* Texture offset */
  "ADD TEMP[2].x, TEMP[2]., IMM[3].\n"
@@ -162,10 +162,10 @@ const char *compute_shader_

Re: [Mesa-dev] [PATCH] nv50/ir: Add mul and mod constant optimizations

2019-07-23 Thread Ilia Mirkin

On Tue, Jul 23, 2019 at 11:15 AM Karol Herbst  wrote:
>
> On Tue, Jul 23, 2019 at 4:50 PM Ilia Mirkin  wrote:
> >
> > You handle 1/n but not 1%n? TBH, the 1/n code isn't 100% obvious to
> > me... 1/n = |n|-1 > 0 ?  i forget how SLCT works, but I can't
> > think of a way to finish that expression in terms of |n|-1 and n. And
> > what about n == 0. I'd just as soon drop that case.
> >
>
> is 1/0 actually defined in glsl? I thought that the result is
> undefined and we can basically do whatever, no? At least intel seems
> to return INT_MAX for 1/0

If you guys really like it, just add more comments that cover my questions.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50/ir: Add mul and mod constant optimizations

2019-07-23 Thread Ilia Mirkin

You handle 1/n but not 1%n? TBH, the 1/n code isn't 100% obvious to
me... 1/n = |n|-1 > 0 ?  i forget how SLCT works, but I can't
think of a way to finish that expression in terms of |n|-1 and n. And
what about n == 0. I'd just as soon drop that case.

On Tue, Jul 23, 2019 at 10:20 AM Mark Menzynski  wrote:
>
> Optimizations for 0/n, 1/n and 0%n.
> No changes in shader db tests, because it is never used here, but it
> should become handy.
>
> Signed-off-by: Mark Menzynski 
> ---
>  .../nouveau/codegen/nv50_ir_peephole.cpp  | 30 +--
>  1 file changed, 28 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 0b3220903b9..12069e19808 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -1177,10 +1177,28 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue 
> , int s)
>break;
>
> case OP_DIV:
> -  if (s != 1 || (i->dType != TYPE_S32 && i->dType != TYPE_U32))
> +  if (i->dType != TYPE_S32 && i->dType != TYPE_U32)
>   break;
> +
>bld.setPosition(i, false);
> -  if (imm0.reg.data.u32 == 0) {
> +  if (s == 0) {
> + if (imm0.reg.data.u32 == 0) {
> +i->op = OP_MOV;
> +i->setSrc(1, NULL);
> + }
> + else if (imm0.reg.data.u32 == 1) {
> +Value *tA, *tB;
> +Instruction *slct;
> +
> +tA = bld.mkOp1v(OP_ABS, TYPE_U32, bld.getSSA(), i->getSrc(1));
> +tB = bld.mkOp2v(OP_ADD, TYPE_S32, bld.getSSA(), tA, 
> bld.loadImm(NULL, -1));
> +slct = bld.mkCmp(OP_SLCT, CC_GT, i->dType, bld.getSSA(), 
> TYPE_U32, bld.loadImm(NULL, 0), i->getSrc(1), tB);
> +i->def(0).replace(slct->getDef(0), false);
> + }
> + break;
> +  }
> +
> +  if (s != 1 || imm0.reg.data.u32 == 0) {
>   break;
>} else
>if (imm0.reg.data.u32 == 1) {
> @@ -1259,6 +1277,14 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue 
> , int s)
>break;
>
> case OP_MOD:
> +  if (s == 0) {
> + if (imm0.reg.data.u32 == 0) {
> +i->op = OP_MOV;
> +i->setSrc(1, NULL);
> + }
> + break;
> +  }
> +
>if (s == 1 && imm0.isPow2()) {
>   bld.setPosition(i, false);
>   if (i->sType == TYPE_U32) {
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] boolean usage in gallium

2019-07-22 Thread Ilia Mirkin

PSA - I've pushed the changes which flip the gallium APIs and their
direct users from boolean -> bool. I've tried to take every precaution
against breaking compilation, but it's conceivable something got
missed in the repository-wide update. However fixing it up should be
trivial - let me know if you should have trouble, I'm happy to help. I
have not updated any of the various drivers' internal usages, and
encourage gallium driver and state tracker maintainers to convert

boolean -> bool
TRUE -> true
FALSE -> false

The one caveat I'll note is that the following situation can occur with boolean:

void f(int x) { print(x); }
void g(boolean x) { f(x); }
void h() { g(123); }

This will print 123, since boolean is a char. With bool, it will print
1. IMHO any code which relies on this behavior is a bug, but bugs do
happen, so a blind conversion can be dangerous.

Cheers,

  -ilia

On Sat, Jun 29, 2019 at 12:08 AM Ilia Mirkin  wrote:
>
> Ken pointed out on IRC today that there was still a lot of "boolean"
> (vs bool/_Bool) usage in gallium. In fact, many interfaces are
> specified with boolean.
>
> I had it in my mind that I had at some point removed most boolean
> usage, but that is just not the case - first of all, the interfaces
> remain with it, and I could find no evidence of such a commit. I must
> have imagined it.
>
> Is there any reason to keep boolean around? I know conversions must be
> done carefully (since incorrect-but-working usage would not currently
> be caught by the compiler), but are there any practical reasons to
> avoid C99 _Bool in gallium code?
>
> If not, I may begin converting things over.
>
> Cheers,
>
>   -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] radv/gfx10: correctly determine the number of vertices per primitive

2019-07-22 Thread Ilia Mirkin

On Mon, Jul 22, 2019 at 11:49 AM Samuel Pitoiset
 wrote:
>
> For TES as NGG.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 17 -
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index 336bae28614..6e5a283f923 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -112,6 +112,7 @@ struct radv_shader_context {
> unsigned gs_max_out_vertices;
> unsigned gs_output_prim;
>
> +   unsigned tes_point_mode;
> unsigned tes_primitive_mode;
>
> uint32_t tcs_patch_outputs_read;
> @@ -3304,7 +3305,6 @@ handle_ngg_outputs_post(struct radv_shader_context *ctx)
>  {
> LLVMBuilderRef builder = ctx->ac.builder;
> struct ac_build_if_state if_state;
> -   unsigned num_vertices = 3;
> LLVMValueRef tmp;
>
> assert((ctx->stage == MESA_SHADER_VERTEX ||
> @@ -3322,6 +3322,20 @@ handle_ngg_outputs_post(struct radv_shader_context 
> *ctx)
> ac_unpack_param(>ac, ctx->gs_vtx_offset[2], 0, 16),
> };
>
> +   /* Determine the number of vertices per primitive. */
> +   unsigned num_vertices;
> +
> +   if (ctx->stage == MESA_SHADER_VERTEX) {
> +   num_vertices = 3; /* TODO: optimize for points & lines */
> +   } else {
> +   if (ctx->tes_point_mode)
> +   num_vertices = 1;
> +   else if (ctx->tes_primitive_mode == GL_LINES)
> +   num_vertices = 2;
> +   else
> +   num_vertices = 3;
> +   }
> +
> /* TODO: streamout */
>
> /* Copy Primitive IDs from GS threads to the LDS address corresponding
> @@ -4435,6 +4449,7 @@ LLVMModuleRef ac_translate_nir_to_llvm(struct 
> ac_llvm_compiler *ac_llvm,
> ctx.tcs_num_inputs = 
> util_last_bit64(shader_info->info.vs.ls_outputs_written);
> ctx.tcs_num_patches = get_tcs_num_patches();
> } else if (shaders[i]->info.stage == MESA_SHADER_TESS_EVAL) {
> +   ctx.tes_point_mode = shaders[i]->info.tess.point_mode;

Drive-by-comment without reading the full context...

What if there's e.g. a GS which produces not-points? This bool will be
set, and the logic above will say num_vertices = 1, which presumably
is bad.

  -ilia

> ctx.tes_primitive_mode = 
> shaders[i]->info.tess.primitive_mode;
> ctx.abi.load_tess_varyings = load_tes_input;
> ctx.abi.load_tess_coord = load_tess_coord;
> --
> 2.22.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nvc0/ir: Fix assert accessing null pointer

2019-07-19 Thread Ilia Mirkin

On Fri, Jul 19, 2019 at 12:07 PM Tobias Klausmann
 wrote:
> On 19.07.19 15:39, Eric Engestrom wrote:
> > On Friday, 2019-07-19 13:56:30 +0200, Mark Menzynski wrote:
> >> Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=111007
> >> Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=67
> > `Fixes:` is used to indicate the commit that introduced the code being
> > fixed, such as:
> >Fixes: 1c4e6d7ca83578caf521 ("nvc0/ir: propagate immediates to CALL 
> > input MOVs")
> > This tag is used by our tools to backport fixes to the relevant stable
> > releases.
> >
> > Bugzilla entries are referenced using the `Bugzilla:` prefix.
> >
> >> Signed-off-by: Mark Menzynski 
> >> ---
> >>   src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
> >> b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> >> index aca3b0afb1e..1f702a987d8 100644
> >> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> >> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> >> @@ -51,12 +51,12 @@ NVC0LegalizeSSA::handleDIV(Instruction *i)
> >>  // Generate movs to the input regs for the call we want to generate
> >>  for (int s = 0; i->srcExists(s); ++s) {
> >> Instruction *ld = i->getSrc(s)->getInsn();
> >> -  assert(ld->getSrc(0) != NULL);
> > I'll admit I don't know anything about this code, but it looks like
> > this might be a better fix?
> > assert(ld == NULL || ld->getSrc(0) != NULL)
> >
> > I cc'ed Tobias who wrote this code as he'll probably know best.
>
> Thanks for letting me know, but i'm kind of out of the loop and sadly
> don't remember too much about this one.
>
> Yet while having a look at the code i somehow wonder if there is a
> possibility to have
>
> if (!ld || ld->fixed || (ld->op != OP_LOAD && ld->op != OP_MOV) ||
>   !(ld->src(0).getFile() == FILE_IMMEDIATE))
>
> crash at ld->src(0), as this is only set when a value is added to the 
> instruction. So in the case ld->src(0) is legal, ld->getSrc(0) should be as 
> well and we would need the assert at the beginning...
>
> PS: Did a functional change bring this to the light? (I ran piglit in front 
> of every patch sumbission :/)

Nope. Just weird shaders that divide and/or mod 0 by a non-constant amount.

There's an argument to just remove that assert entirely - not sure
that it's adding a whole lot, except complication.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gm107/ir: Add stg, ldg instructions and function for checking offset length

2019-07-19 Thread Ilia Mirkin

On Fri, Jul 19, 2019 at 11:34 AM Mark Menzynski  wrote:
>
> On Fri, Jul 19, 2019 at 5:04 PM Ilia Mirkin  wrote:
> >
> > On Fri, Jul 19, 2019 at 10:57 AM Mark Menzynski  wrote:
> > >
> > > Nvidia actively uses these instructions, maybe they are better in
> > > something.
> > > Long offset checking function was made because these functions only have 
> > > 24 bit
> > > address offsets.
> > >
> > > Signed-off-by: Mark Menzynski 
> > > ---
> > >  .../nouveau/codegen/nv50_ir_emit_gm107.cpp| 36 +++
> > >  1 file changed, 36 insertions(+)
> > >
> > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
> > > b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> > > index 6eefe8f0025..c01a3017ba9 100644
> > > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> > > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> > > @@ -87,6 +87,7 @@ private:
> > > inline void emitADDR(int, int, int, int, const ValueRef &);
> > > inline void emitCBUF(int, int, int, int, int, const ValueRef &);
> > > inline bool longIMMD(const ValueRef &);
> > > +   inline bool longOffset(const ValueRef &);
> > > inline void emitIMMD(int, int, const ValueRef &);
> > >
> > > void emitCond3(int, CondCode);
> > > @@ -174,9 +175,11 @@ private:
> > > void emitLDC();
> > > void emitLDL();
> > > void emitLDS();
> > > +   void emitLDG();
> > > void emitLD();
> > > void emitSTL();
> > > void emitSTS();
> > > +   void emitSTG();
> > > void emitST();
> > > void emitALD();
> > > void emitAST();
> > > @@ -333,6 +336,17 @@ CodeEmitterGM107::longIMMD(const ValueRef )
> > > return false;
> > >  }
> > >
> > > +bool
> > > +CodeEmitterGM107::longOffset(const ValueRef )
> > > +{
> > > +   // TODO: check for other files as well?
> > > +   if (ref.getFile() != FILE_MEMORY_GLOBAL)
> > > +  return false;
> >
> > I haven't seen the uses (best to send stuff like this as a series),
> > but you're saying that if it's not global memory, then it's not a long
> > offset? I suspect in the caller it should be more like
> >
> > assert(file == global || !long offset) or something.
> >
> This is how I used it for Load. Store was used the same way. I have
> not sent it because we didn't found any noticeable changes with that:
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index c01a3017ba9..f632178138b 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -3603,7 +3603,12 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>case FILE_MEMORY_CONST : emitLDC(); break;
>case FILE_MEMORY_LOCAL : emitLDL(); break;
>case FILE_MEMORY_SHARED: emitLDS(); break;
> -  case FILE_MEMORY_GLOBAL: emitLD(); break;
> +  case FILE_MEMORY_GLOBAL:
> + if (longOffset(insn->src(0)))
> +emitLD();
> + else
> +emitLDG();
> + break;

OK. I'd dump the check from longOffset -- all it says is if an offset
is long. It just seems awkward as-is.

If people feel strongly though, I could be convinced otherwise.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gm107/ir: Add stg, ldg instructions and function for checking offset length

2019-07-19 Thread Ilia Mirkin

On Fri, Jul 19, 2019 at 10:57 AM Mark Menzynski  wrote:
>
> Nvidia actively uses these instructions, maybe they are better in
> something.
> Long offset checking function was made because these functions only have 24 
> bit
> address offsets.
>
> Signed-off-by: Mark Menzynski 
> ---
>  .../nouveau/codegen/nv50_ir_emit_gm107.cpp| 36 +++
>  1 file changed, 36 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index 6eefe8f0025..c01a3017ba9 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -87,6 +87,7 @@ private:
> inline void emitADDR(int, int, int, int, const ValueRef &);
> inline void emitCBUF(int, int, int, int, int, const ValueRef &);
> inline bool longIMMD(const ValueRef &);
> +   inline bool longOffset(const ValueRef &);
> inline void emitIMMD(int, int, const ValueRef &);
>
> void emitCond3(int, CondCode);
> @@ -174,9 +175,11 @@ private:
> void emitLDC();
> void emitLDL();
> void emitLDS();
> +   void emitLDG();
> void emitLD();
> void emitSTL();
> void emitSTS();
> +   void emitSTG();
> void emitST();
> void emitALD();
> void emitAST();
> @@ -333,6 +336,17 @@ CodeEmitterGM107::longIMMD(const ValueRef )
> return false;
>  }
>
> +bool
> +CodeEmitterGM107::longOffset(const ValueRef )
> +{
> +   // TODO: check for other files as well?
> +   if (ref.getFile() != FILE_MEMORY_GLOBAL)
> +  return false;

I haven't seen the uses (best to send stuff like this as a series),
but you're saying that if it's not global memory, then it's not a long
offset? I suspect in the caller it should be more like

assert(file == global || !long offset) or something.

> +
> +   int32_t offset = ref.get()->reg.data.offset;
> +   return offset >  0x7f || offset < -0x80;

You have two spaces after the >. Remove one of them.

> +}
> +
>  void
>  CodeEmitterGM107::emitIMMD(int pos, int len, const ValueRef )
>  {
> @@ -2414,6 +2428,17 @@ CodeEmitterGM107::emitLDS()
> emitGPR  (0x00, insn->def(0));
>  }
>
> +void
> +CodeEmitterGM107::emitLDG()
> +{
> +   emitInsn (0xeed0);
> +   emitLDSTs(0x30, insn->dType);
> +   emitLDSTc(0x2e);
> +   emitField(0x2d, 1, insn->src(0).getIndirect(0)->getSize() == 8);

I didn't look, but we don't do something a bit more subtle on the
other ones, like checking if there's an indirect access in the first
place? With g[], it almost exclusively will be, but still...

> +   emitADDR (0x08, 0x14, 24, 0, insn->src(0));
> +   emitGPR  (0x00, insn->def(0));
> +}
> +
>  void
>  CodeEmitterGM107::emitLD()
>  {
> @@ -2445,6 +2470,17 @@ CodeEmitterGM107::emitSTS()
> emitGPR  (0x00, insn->src(1));
>  }
>
> +void
> +CodeEmitterGM107::emitSTG()
> +{
> +   emitInsn (0xeed8);
> +   emitLDSTs(0x30, insn->dType);
> +   emitLDSTc(0x2e);
> +   emitField(0x2d, 1, insn->src(0).getIndirect(0)->getSize() == 8);
> +   emitADDR (0x08, 0x14, 24, 0, insn->src(0));
> +   emitGPR  (0x00, insn->src(1));
> +}
> +
>  void
>  CodeEmitterGM107::emitST()
>  {
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nvc0/ir: Fix assert accessing null pointer

2019-07-19 Thread Ilia Mirkin

The patch is correct as-is.

Reviewed-by: Ilia Mirkin 

On Fri, Jul 19, 2019 at 9:39 AM Eric Engestrom  wrote:
>
> On Friday, 2019-07-19 13:56:30 +0200, Mark Menzynski wrote:
> > Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=111007
> > Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=67
>
> `Fixes:` is used to indicate the commit that introduced the code being
> fixed, such as:
>   Fixes: 1c4e6d7ca83578caf521 ("nvc0/ir: propagate immediates to CALL input 
> MOVs")
> This tag is used by our tools to backport fixes to the relevant stable
> releases.
>
> Bugzilla entries are referenced using the `Bugzilla:` prefix.
>
> > Signed-off-by: Mark Menzynski 
> > ---
> >  src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
> > b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > index aca3b0afb1e..1f702a987d8 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> > @@ -51,12 +51,12 @@ NVC0LegalizeSSA::handleDIV(Instruction *i)
> > // Generate movs to the input regs for the call we want to generate
> > for (int s = 0; i->srcExists(s); ++s) {
> >Instruction *ld = i->getSrc(s)->getInsn();
> > -  assert(ld->getSrc(0) != NULL);
>
> I'll admit I don't know anything about this code, but it looks like
> this might be a better fix?
>assert(ld == NULL || ld->getSrc(0) != NULL)
>
> I cc'ed Tobias who wrote this code as he'll probably know best.
>
> >// check if we are moving an immediate, propagate it in that case
> >if (!ld || ld->fixed || (ld->op != OP_LOAD && ld->op != OP_MOV) ||
> >  !(ld->src(0).getFile() == FILE_IMMEDIATE))
> >   bld.mkMovToReg(s, i->getSrc(s));
> >else {
> > + assert(ld->getSrc(0) != NULL);
> >   bld.mkMovToReg(s, ld->getSrc(0));
> >   // Clear the src, to make code elimination possible here before we
> >   // delete the instruction i later
> > --
> > 2.21.0
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 03/11] panfrost/midgard: Flush undefineds to zero

2019-07-10 Thread Ilia Mirkin

On Wed, Jul 10, 2019 at 9:25 AM Alyssa Rosenzweig
 wrote:
>
> Fixes a buggy dEQP test.
>
> Signed-off-by: Alyssa Rosenzweig 
> ---
>  .../drivers/panfrost/ci/expected-failures.txt |  6 --
>  src/gallium/drivers/panfrost/meson.build  |  1 +
>  .../drivers/panfrost/midgard/compiler.h   |  6 ++
>  .../panfrost/midgard/midgard_compile.c|  4 +
>  .../panfrost/midgard/nir_undef_to_zero.c  | 89 +++
>  5 files changed, 100 insertions(+), 6 deletions(-)
>  create mode 100644 src/gallium/drivers/panfrost/midgard/nir_undef_to_zero.c
>
> diff --git a/src/gallium/drivers/panfrost/ci/expected-failures.txt 
> b/src/gallium/drivers/panfrost/ci/expected-failures.txt
> index 33059118b49..6f52773cc73 100644
> --- a/src/gallium/drivers/panfrost/ci/expected-failures.txt
> +++ b/src/gallium/drivers/panfrost/ci/expected-failures.txt
> @@ -256,12 +256,6 @@ dEQP-GLES2.functional.rasterization.limits.points
>  dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_w
>  dEQP-GLES2.functional.shaders.preprocessor.predefined_macros.line_2_fragment
>  dEQP-GLES2.functional.shaders.preprocessor.predefined_macros.line_2_vertex
> -dEQP-GLES2.functional.shaders.random.all_features.fragment.0
> -dEQP-GLES2.functional.shaders.random.all_features.fragment.16
> -dEQP-GLES2.functional.shaders.random.all_features.fragment.5
> -dEQP-GLES2.functional.shaders.random.all_features.fragment.6
> -dEQP-GLES2.functional.shaders.random.all_features.vertex.0
> -dEQP-GLES2.functional.shaders.random.all_features.vertex.17
>  
> dEQP-GLES2.functional.shaders.scoping.valid.local_variable_hides_function_parameter_fragment
>  
> dEQP-GLES2.functional.shaders.scoping.valid.local_variable_hides_function_parameter_vertex
>  dEQP-GLES2.functional.shaders.struct.local.dynamic_loop_assignment_fragment

> diff --git a/src/gallium/drivers/panfrost/midgard/nir_undef_to_zero.c 
> b/src/gallium/drivers/panfrost/midgard/nir_undef_to_zero.c
> new file mode 100644
> index 000..aacecc17e9d
> --- /dev/null
> +++ b/src/gallium/drivers/panfrost/midgard/nir_undef_to_zero.c
> + * Flushes undefined SSA values to a zero vector fo the appropriate component
> + * count, to avoid undefined behaviour in the resulting shader. Not required
> + * for conformance as use of uninitialized variables is explicitly left
> + * undefined by the spec.  Works around buggy apps, however.
> + *
> + * Call immediately after nir_opt_undef. If called before, larger 
> optimization
> + * opportunities from the former pass will be missed. If called outside of an
> + * optimization loop, constant propagation and algebraic optimizations won't 
> be
> + * able to kick in to reduce stuff consuming the zero.

I don't think other drivers have had to do this. I've definitely
resisted stuff like that in nouveau in the past. Are you sure this is
necessary? Would be good to check if these tests pass or fail on
nouveau, for example. (Although by coincidence, they could be ending
up with zero values, of course...)

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nvc0: remove nvc0_program.tp.input_patch_size

2019-07-08 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

The tcp input patch size is not a compile-time value, and even if it
were, not sure where we'd use it. The tep input patch size is set at
compile time, but in the code you're removing, we set it to ~0
anyways.

On Mon, Jul 8, 2019 at 3:22 PM Karol Herbst  wrote:
>
> right now that's dead code
>
> Signed-off-by: Karol Herbst 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h | 1 -
>  src/gallium/drivers/nouveau/nvc0/nvc0_program.c  | 4 
>  src/gallium/drivers/nouveau/nvc0/nvc0_program.h  | 1 -
>  3 files changed, 6 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> index 7c835ceab8d..95b3d633ee6 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> @@ -123,7 +123,6 @@ struct nv50_ir_prog_info
>   bool usesDrawParameters;
>} vp;
>struct {
> - uint8_t inputPatchSize;
>   uint8_t outputPatchSize;
>   uint8_t partitioning;/* PIPE_TESS_PART */
>   int8_t winding;  /* +1 (clockwise) / -1 (counter-clockwise) 
> */
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> index 1ff9f19f139..180b31ea893 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> @@ -343,8 +343,6 @@ nvc0_tcp_gen_header(struct nvc0_program *tcp, struct 
> nv50_ir_prog_info *info)
>  {
> unsigned opcs = 6; /* output patch constants (at least the TessFactors) */
>
> -   tcp->tp.input_patch_size = info->prop.tp.inputPatchSize;
> -
> if (info->numPatchConstants)
>opcs = 8 + info->numPatchConstants * 4;
>
> @@ -374,8 +372,6 @@ nvc0_tcp_gen_header(struct nvc0_program *tcp, struct 
> nv50_ir_prog_info *info)
>  static int
>  nvc0_tep_gen_header(struct nvc0_program *tep, struct nv50_ir_prog_info *info)
>  {
> -   tep->tp.input_patch_size = ~0;
> -
> tep->hdr[0] = 0x20061 | (3 << 10);
> tep->hdr[4] = 0xff000;
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.h 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_program.h
> index b73822ea9f7..183b14a42c2 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.h
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.h
> @@ -54,7 +54,6 @@ struct nvc0_program {
> } fp;
> struct {
>uint32_t tess_mode; /* ~0 if defined by the other stage */
> -  uint32_t input_patch_size;
> } tp;
> struct {
>uint32_t lmem_size; /* local memory (TGSI PRIVATE resource) size */
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [MR] Update README to recommend MRs instead of `git send-email`

2019-07-06 Thread Ilia Mirkin

I see this as driving away contributions, esp from new people. MR's
are annoying to create, since they require dealing with the hosting
provider, getting it to host clones, etc. Everyone has email.

Cheers,

  -ilia

On Sat, Jul 6, 2019 at 7:41 AM Eric Engestrom  wrote:
>
> Hi,
>
> I sent an MR to update the README, but as pointed out there I should
> make sure people who don't look at MRs are also aware of it, so here is:
>
> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1223
>
> ---8<---
> diff --git a/README.rst b/README.rst
> index 4ccd72eca38a0845571c..eba097e652edde48a735 100644
> --- a/README.rst
> +++ b/README.rst
> @@ -56,5 +56,6 @@ Contributions are welcome, and step-by-step instructions 
> can be found in our
>  documentation (`docs/submittingpatches.html
>  `_).
>
> -Note that Mesa uses email mailing-lists for patches submission, review and
> -discussions.
> +We recommend using GitLab Merge Requests (MRs) for patch submission and
> +review, and email mailing-lists for discussions, but the old `git send-email`
> +method is still accepted.
> --->8---
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: Set minimum possible GLSL version

2019-07-04 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

On Fri, Jul 5, 2019 at 12:59 AM Ian Romanick  wrote:
>
> From: Ian Romanick 
>
> Set the absolute minimum possible GLSL version.  API_OPENGL_CORE can
> mean an OpenGL 3.0 forward-compatible context, so that implies a minimum
> possible version of 1.30.  Otherwise, the minimum possible version 1.20.
> Since Mesa unconditionally advertises GL_ARB_shading_language_100 and
> GL_ARB_shader_objects, every driver has GLSL 1.20... even if they don't
> advertise any extensions to enable any shader stages (e.g.,
> GL_ARB_vertex_shader).
>
> Converts about 2,500 piglit tests from crash to skip on NV18.
> ---
>  src/mesa/main/context.c | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
> index e5a89d9c2fc..516660d55d2 100644
> --- a/src/mesa/main/context.c
> +++ b/src/mesa/main/context.c
> @@ -616,6 +616,17 @@ _mesa_init_constants(struct gl_constants *consts, gl_api 
> api)
> consts->MaxProgramMatrices = MAX_PROGRAM_MATRICES;
> consts->MaxProgramMatrixStackDepth = MAX_PROGRAM_MATRIX_STACK_DEPTH;
>
> +   /* Set the absolute minimum possible GLSL version.  API_OPENGL_CORE can
> +* mean an OpenGL 3.0 forward-compatible context, so that implies a 
> minimum
> +* possible version of 1.30.  Otherwise, the minimum possible version 
> 1.20.
> +* Since Mesa unconditionally advertises GL_ARB_shading_language_100 and
> +* GL_ARB_shader_objects, every driver has GLSL 1.20... even if they don't
> +* advertise any extensions to enable any shader stages (e.g.,
> +* GL_ARB_vertex_shader).
> +*/
> +   consts->GLSLVersion = api == API_OPENGL_CORE ? 130 : 120;
> +   consts->GLSLVersionCompat = consts->GLSLVersion;
> +
> /* Assume that if GLSL 1.30+ (or GLSL ES 3.00+) is supported that
>  * gl_VertexID is implemented using a native hardware register with OpenGL
>  * semantics.
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: avoid _mesa_problem invocation when running on drivers without glsl

2019-07-04 Thread Ilia Mirkin

On Fri, Jul 5, 2019 at 12:56 AM Ian Romanick  wrote:
>
> On 7/4/19 4:21 PM, Ilia Mirkin wrote:
> > For example wine might query GL_SHADING_LANGUAGE_VERSION on a driver
> > that doesn't support GLSL. This is not a problem in itself, we can just
> > return a INVALID_ENUM error.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109524
> > Signed-off-by: Ilia Mirkin 
> > ---
> >  src/mesa/main/getstring.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/src/mesa/main/getstring.c b/src/mesa/main/getstring.c
> > index 3d5ae0b694b..6c0dd9048da 100644
> > --- a/src/mesa/main/getstring.c
> > +++ b/src/mesa/main/getstring.c
> > @@ -150,6 +150,8 @@ _mesa_GetString( GLenum name )
> >case GL_SHADING_LANGUAGE_VERSION:
> >   if (ctx->API == API_OPENGLES)
> >  break;
> > + if (_mesa_is_desktop_gl(ctx) && ctx->Const.GLSLVersion == 0)
> > +break;
>
> GLSL version should never be zero.  We advertise GL_ARB_shading_language
> in all drivers, so every driver has "GLSL" even if it doesn't have
> vertex shaders or fragment shaders.  I thought I sent out a patch some
> time ago that set GLSLVersion to 120 by default to avoid problems like this.

That may be, but either they weren't merged, or other things changed
in between which nullified their effect. Currently
_mesa_compute_version will leave GLSLVersion blank for pre-GL 2.0
contexts. FWIW this makes wine work ok:

diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
index 5f004ff1dab..948bdbcc891 100644
--- a/src/mesa/main/version.c
+++ b/src/mesa/main/version.c
@@ -634,6 +634,8 @@ _mesa_compute_version(struct gl_context *ctx)
   default:
  if (ctx->Version >= 33)
 ctx->Const.GLSLVersion = ctx->Version * 10;
+ else
+ctx->Const.GLSLVersion = 120;
  break;
   }
}


  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mesa: avoid _mesa_problem invocation when running on drivers without glsl

2019-07-04 Thread Ilia Mirkin

For example wine might query GL_SHADING_LANGUAGE_VERSION on a driver
that doesn't support GLSL. This is not a problem in itself, we can just
return a INVALID_ENUM error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109524
Signed-off-by: Ilia Mirkin 
---
 src/mesa/main/getstring.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/main/getstring.c b/src/mesa/main/getstring.c
index 3d5ae0b694b..6c0dd9048da 100644
--- a/src/mesa/main/getstring.c
+++ b/src/mesa/main/getstring.c
@@ -150,6 +150,8 @@ _mesa_GetString( GLenum name )
   case GL_SHADING_LANGUAGE_VERSION:
  if (ctx->API == API_OPENGLES)
 break;
+ if (_mesa_is_desktop_gl(ctx) && ctx->Const.GLSLVersion == 0)
+break;
 return shading_language_version(ctx);
   case GL_PROGRAM_ERROR_STRING_ARB:
  if (ctx->API == API_OPENGL_COMPAT &&
-- 
2.21.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] gallium: remove boolean from state tracker APIs

2019-07-04 Thread Ilia Mirkin

Signed-off-by: Ilia Mirkin 
---

It's actually a bit tricky to compile all the bits involved, so some of
this is untested. However if it so happens that I missed a spot, it
should be trivial to fix.

 src/gallium/include/state_tracker/graw.h  |  8 +-
 src/gallium/include/state_tracker/st_api.h| 73 +--
 src/gallium/include/state_tracker/sw_winsys.h |  5 +-
 src/gallium/state_trackers/dri/dri_drawable.c | 34 -
 src/gallium/state_trackers/dri/dri_drawable.h |  6 +-
 src/gallium/state_trackers/dri/dri_screen.c   |  8 +-
 src/gallium/state_trackers/glx/xlib/xm_st.c   | 24 +++---
 src/gallium/state_trackers/glx/xlib/xm_st.h   |  2 +-
 src/gallium/state_trackers/hgl/hgl.c  | 14 ++--
 src/gallium/state_trackers/osmesa/osmesa.c|  8 +-
 src/gallium/state_trackers/wgl/stw_st.c   | 14 ++--
 src/gallium/state_trackers/wgl/stw_st.h   |  2 +-
 src/gallium/targets/graw-null/graw_util.c | 14 ++--
 src/gallium/winsys/sw/dri/dri_sw_winsys.c | 12 +--
 src/gallium/winsys/sw/gdi/gdi_sw_winsys.c | 10 +--
 src/gallium/winsys/sw/hgl/hgl_sw_winsys.c |  8 +-
 .../winsys/sw/kms-dri/kms_dri_sw_winsys.c | 12 +--
 src/gallium/winsys/sw/null/null_sw_winsys.c   |  8 +-
 .../winsys/sw/wrapper/wrapper_sw_winsys.c | 10 +--
 src/gallium/winsys/sw/xlib/xlib_sw_winsys.c   | 14 ++--
 src/mesa/state_tracker/st_manager.c   | 50 ++---
 src/mesa/state_tracker/st_manager.h   |  2 +-
 22 files changed, 168 insertions(+), 170 deletions(-)

diff --git a/src/gallium/include/state_tracker/graw.h 
b/src/gallium/include/state_tracker/graw.h
index 78ddf0a87f7..af81cc8871b 100644
--- a/src/gallium/include/state_tracker/graw.h
+++ b/src/gallium/include/state_tracker/graw.h
@@ -79,7 +79,7 @@ PUBLIC void *graw_parse_fragment_shader( struct pipe_context 
*pipe,
  * If an option has been successfully parsed, argi is updated
  * to point just after the option and return TRUE.
  */
-PUBLIC boolean graw_parse_args(int *argi, int argc, char *argv[]);
+PUBLIC bool graw_parse_args(int *argi, int argc, char *argv[]);
 
 /* Saves surface contents to a file.
  *
@@ -89,8 +89,8 @@ PUBLIC boolean graw_parse_args(int *argi, int argc, char 
*argv[]);
  *
  * Returns TRUE if the surface has been saved.
  */
-PUBLIC boolean graw_save_surface_to_file(struct pipe_context *pipe,
- struct pipe_surface *surface,
- const char *filename);
+PUBLIC bool graw_save_surface_to_file(struct pipe_context *pipe,
+  struct pipe_surface *surface,
+  const char *filename);
 
 #endif
diff --git a/src/gallium/include/state_tracker/st_api.h 
b/src/gallium/include/state_tracker/st_api.h
index 2b63b8a3d2a..b2b81b6ebc4 100644
--- a/src/gallium/include/state_tracker/st_api.h
+++ b/src/gallium/include/state_tracker/st_api.h
@@ -27,7 +27,6 @@
 #ifndef _ST_API_H_
 #define _ST_API_H_
 
-#include "pipe/p_compiler.h"
 #include "pipe/p_format.h"
 
 /**
@@ -218,19 +217,19 @@ struct st_visual
  */
 struct st_config_options
 {
-   boolean disable_blend_func_extended;
-   boolean disable_glsl_line_continuations;
-   boolean force_glsl_extensions_warn;
+   bool disable_blend_func_extended;
+   bool disable_glsl_line_continuations;
+   bool force_glsl_extensions_warn;
unsigned force_glsl_version;
-   boolean allow_glsl_extension_directive_midshader;
-   boolean allow_glsl_builtin_const_expression;
-   boolean allow_glsl_relaxed_es;
-   boolean allow_glsl_builtin_variable_redeclaration;
-   boolean allow_higher_compat_version;
-   boolean glsl_zero_init;
-   boolean force_glsl_abs_sqrt;
-   boolean allow_glsl_cross_stage_interpolation_mismatch;
-   boolean allow_glsl_layout_qualifier_on_function_parameters;
+   bool allow_glsl_extension_directive_midshader;
+   bool allow_glsl_builtin_const_expression;
+   bool allow_glsl_relaxed_es;
+   bool allow_glsl_builtin_variable_redeclaration;
+   bool allow_higher_compat_version;
+   bool glsl_zero_init;
+   bool force_glsl_abs_sqrt;
+   bool allow_glsl_cross_stage_interpolation_mismatch;
+   bool allow_glsl_layout_qualifier_on_function_parameters;
unsigned char config_options_sha1[20];
 };
 
@@ -319,9 +318,9 @@ struct st_framebuffer_iface
 *
 * @att is one of the front buffer attachments.
 */
-   boolean (*flush_front)(struct st_context_iface *stctx,
-  struct st_framebuffer_iface *stfbi,
-  enum st_attachment_type statt);
+   bool (*flush_front)(struct st_context_iface *stctx,
+   struct st_framebuffer_iface *stfbi,
+   enum st_attachment_type statt);
 
/**
 * The state tracker asks for the textures it needs.
@@ -340,13 +339,13 @@ struct st_framebuffer_iface
 * the last call might be destroyed.  This behavior might change in the
 * future.
 */
-   boolean (*validate)(struct st_

Re: [Mesa-dev] [PATCH] nouveau: handle new CAPS

2019-07-02 Thread Ilia Mirkin

On Tue, Jul 2, 2019 at 12:04 PM Karol Herbst  wrote:
>
> On Tue, Jul 2, 2019 at 5:54 PM Ilia Mirkin  wrote:
> >
> > Can you check on PIPE_CAP_COMPUTE_SHADER_DERIVATIVES ? I think we
> > should be able to just flip that on for nvc0. Also the
> > CS_DERIVED_SYSTEM_VALUES thing might be useful -- I had wanted to do
> > that a while ago but laziness defeated me. Now that it's there though
> > ... we have sysvals for many of those derived things.
> >
> > Or at least add commentary about each one, like "should be enabled
> > when we get to it" sort of thing.
> >
>
> I added a trello card for the PIPE_CAP_COMPUTE_SHADER_DERIVATIVES one,
> but I could add another one for CS_DERIVED_SYSTEM_VALUES

OK. I have a mild preference for there to be a comment about something
that should be supportable in the code, but I'm good either way.

Reviewed-by: Ilia Mirkin 

>
> > On Tue, Jul 2, 2019 at 11:49 AM Karol Herbst  wrote:
> > >
> > > Signed-off-by: Karol Herbst 
> > > ---
> > >  src/gallium/drivers/nouveau/nv50/nv50_screen.c | 13 +
> > >  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 13 +
> > >  2 files changed, 26 insertions(+)
> > >
> > > diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
> > > b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> > > index b84330b4b38..24796aff1ce 100644
> > > --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> > > +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> > > @@ -320,6 +320,13 @@ nv50_screen_get_param(struct pipe_screen *pscreen, 
> > > enum pipe_cap param)
> > > case PIPE_CAP_NIR_COMPACT_ARRAYS:
> > > case PIPE_CAP_COMPUTE:
> > > case PIPE_CAP_IMAGE_LOAD_FORMATTED:
> > > +   case PIPE_CAP_COMPUTE_SHADER_DERIVATIVES:
> > > +   case PIPE_CAP_ATOMIC_FLOAT_MINMAX:
> > > +   case PIPE_CAP_CONSERVATIVE_RASTER_INNER_COVERAGE:
> > > +   case PIPE_CAP_FRAGMENT_SHADER_INTERLOCK:
> > > +   case PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED:
> > > +   case PIPE_CAP_FBFETCH_COHERENT:
> > > +   case PIPE_CAP_TGSI_SKIP_SHRINK_IO_ARRAYS:
> > >return 0;
> > >
> > > case PIPE_CAP_VENDOR_ID:
> > > @@ -338,8 +345,14 @@ nv50_screen_get_param(struct pipe_screen *pscreen, 
> > > enum pipe_cap param)
> > >return dev->vram_size >> 20;
> > > case PIPE_CAP_UMA:
> > >return 0;
> > > +
> > > default:
> > >debug_printf("%s: unhandled cap %d\n", __func__, param);
> > > +  /* fallthrough */
> > > +   /* caps where we want the default value */
> > > +   case PIPE_CAP_DMABUF:
> > > +   case PIPE_CAP_ESSL_FEATURE_LEVEL:
> > > +   case PIPE_CAP_MAX_FRAMES_IN_FLIGHT:
> > >return u_pipe_screen_get_param_defaults(pscreen, param);
> > > }
> > >  }
> > > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
> > > b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> > > index 3a543e54d1f..bf883631b86 100644
> > > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> > > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> > > @@ -355,6 +355,13 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, 
> > > enum pipe_cap param)
> > > case PIPE_CAP_GLSL_TESS_LEVELS_AS_INPUTS:
> > > case PIPE_CAP_NIR_COMPACT_ARRAYS:
> > > case PIPE_CAP_IMAGE_LOAD_FORMATTED:
> > > +   case PIPE_CAP_COMPUTE_SHADER_DERIVATIVES:
> > > +   case PIPE_CAP_ATOMIC_FLOAT_MINMAX:
> > > +   case PIPE_CAP_CONSERVATIVE_RASTER_INNER_COVERAGE:
> > > +   case PIPE_CAP_FRAGMENT_SHADER_INTERLOCK:
> > > +   case PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED:
> > > +   case PIPE_CAP_FBFETCH_COHERENT:
> > > +   case PIPE_CAP_TGSI_SKIP_SHRINK_IO_ARRAYS:
> > >return 0;
> > >
> > > case PIPE_CAP_VENDOR_ID:
> > > @@ -373,8 +380,14 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, 
> > > enum pipe_cap param)
> > >return dev->vram_size >> 20;
> > > case PIPE_CAP_UMA:
> > >return 0;
> > > +
> > > default:
> > >debug_printf("%s: unhandled cap %d\n", __func__, param);
> > > +  /* fallthrough */
> > > +   /* caps where we want the default value */
> > > +   case PIPE_CAP_DMABUF:
> > > +   case PIPE_CAP_ESSL_FEATURE_LEVEL:
> > > +   case PIPE_CAP_MAX_FRAMES_IN_FLIGHT:
> > >return u_pipe_screen_get_param_defaults(pscreen, param);
> > > }
> > >  }
> > > --
> > > 2.21.0
> > >
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nouveau: handle new CAPS

2019-07-02 Thread Ilia Mirkin

Can you check on PIPE_CAP_COMPUTE_SHADER_DERIVATIVES ? I think we
should be able to just flip that on for nvc0. Also the
CS_DERIVED_SYSTEM_VALUES thing might be useful -- I had wanted to do
that a while ago but laziness defeated me. Now that it's there though
... we have sysvals for many of those derived things.

Or at least add commentary about each one, like "should be enabled
when we get to it" sort of thing.

On Tue, Jul 2, 2019 at 11:49 AM Karol Herbst  wrote:
>
> Signed-off-by: Karol Herbst 
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c | 13 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 13 +
>  2 files changed, 26 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> index b84330b4b38..24796aff1ce 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
> @@ -320,6 +320,13 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_NIR_COMPACT_ARRAYS:
> case PIPE_CAP_COMPUTE:
> case PIPE_CAP_IMAGE_LOAD_FORMATTED:
> +   case PIPE_CAP_COMPUTE_SHADER_DERIVATIVES:
> +   case PIPE_CAP_ATOMIC_FLOAT_MINMAX:
> +   case PIPE_CAP_CONSERVATIVE_RASTER_INNER_COVERAGE:
> +   case PIPE_CAP_FRAGMENT_SHADER_INTERLOCK:
> +   case PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED:
> +   case PIPE_CAP_FBFETCH_COHERENT:
> +   case PIPE_CAP_TGSI_SKIP_SHRINK_IO_ARRAYS:
>return 0;
>
> case PIPE_CAP_VENDOR_ID:
> @@ -338,8 +345,14 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
>return dev->vram_size >> 20;
> case PIPE_CAP_UMA:
>return 0;
> +
> default:
>debug_printf("%s: unhandled cap %d\n", __func__, param);
> +  /* fallthrough */
> +   /* caps where we want the default value */
> +   case PIPE_CAP_DMABUF:
> +   case PIPE_CAP_ESSL_FEATURE_LEVEL:
> +   case PIPE_CAP_MAX_FRAMES_IN_FLIGHT:
>return u_pipe_screen_get_param_defaults(pscreen, param);
> }
>  }
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> index 3a543e54d1f..bf883631b86 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> @@ -355,6 +355,13 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_GLSL_TESS_LEVELS_AS_INPUTS:
> case PIPE_CAP_NIR_COMPACT_ARRAYS:
> case PIPE_CAP_IMAGE_LOAD_FORMATTED:
> +   case PIPE_CAP_COMPUTE_SHADER_DERIVATIVES:
> +   case PIPE_CAP_ATOMIC_FLOAT_MINMAX:
> +   case PIPE_CAP_CONSERVATIVE_RASTER_INNER_COVERAGE:
> +   case PIPE_CAP_FRAGMENT_SHADER_INTERLOCK:
> +   case PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED:
> +   case PIPE_CAP_FBFETCH_COHERENT:
> +   case PIPE_CAP_TGSI_SKIP_SHRINK_IO_ARRAYS:
>return 0;
>
> case PIPE_CAP_VENDOR_ID:
> @@ -373,8 +380,14 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
>return dev->vram_size >> 20;
> case PIPE_CAP_UMA:
>return 0;
> +
> default:
>debug_printf("%s: unhandled cap %d\n", __func__, param);
> +  /* fallthrough */
> +   /* caps where we want the default value */
> +   case PIPE_CAP_DMABUF:
> +   case PIPE_CAP_ESSL_FEATURE_LEVEL:
> +   case PIPE_CAP_MAX_FRAMES_IN_FLIGHT:
>return u_pipe_screen_get_param_defaults(pscreen, param);
> }
>  }
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 7/7] panfrost: Implement instanced rendering

2019-07-01 Thread Ilia Mirkin

On Mon, Jul 1, 2019 at 10:56 AM Alyssa Rosenzweig
 wrote:
> As a side effect, we rework how vertex buffers are handled, duplicating
> them to be 1:1 with vertex descriptors to simplify instancing code paths
> dramatically. This might be a performance regression, but this remains
> to be seen; if so, we can always deduplicate later with some added logic
> in pan_instancing.c

If you're interested, nouveau has logic to optionally go the shared
(1:N) or non-shared (1:1) way of doing it. Have a look at nvc0_vbo.c
if you're curious -- look for mentions of "shared".

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] egl: Make EGL_EXT_device_drm optional.

2019-06-29 Thread Ilia Mirkin

It looks like I was mistaken -- a primary node is always allocated. Sorry!

On Sat, Jun 29, 2019 at 4:28 AM Mathias Fröhlich
 wrote:
>
> Emil,
>
> Ok, I thought Ilja is a reference here. In this case forget that change!
> Thanks!
>
> best
>
> Mathias
>
> On Friday, 28 June 2019 18:26:07 CEST Emil Velikov wrote:
> > On 2019/06/28, mathias.froehl...@gmx.net wrote:
> > > From: Mathias Fröhlich 
> > >
> > > Hi,
> > >
> > > Ilia mentioned that there are drm rendernode only drivers out there.
> > > To support an egl device on those platforms, make the EGL_EXT_device_drm
> > > device extension optional.
> > >
> > Currently DRM core mandates that primary node is available, even for GPU
> > only devices.
> >
> > A while back, we had a chat with kernel people why we do so. IIRC the
> > conclustion was that existing userspace will just work, since it already
> > must handle cases when you unplug your only monitor.
> >
> > I think Ilia was having a pretty reasonable assumption, that's why I did
> > before digging deeper, as opposed to saying there _are_ such cases.
> >
> > HTH
> > Emil
> >
>
>
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] boolean usage in gallium

2019-06-28 Thread Ilia Mirkin

Ken pointed out on IRC today that there was still a lot of "boolean"
(vs bool/_Bool) usage in gallium. In fact, many interfaces are
specified with boolean.

I had it in my mind that I had at some point removed most boolean
usage, but that is just not the case - first of all, the interfaces
remain with it, and I could find no evidence of such a commit. I must
have imagined it.

Is there any reason to keep boolean around? I know conversions must be
done carefully (since incorrect-but-working usage would not currently
be caught by the compiler), but are there any practical reasons to
avoid C99 _Bool in gallium code?

If not, I may begin converting things over.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nouveau: fix frees in unsupported IR error paths.

2019-06-18 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

On Tue, Jun 18, 2019 at 5:14 PM Dave Airlie  wrote:
>
> From: Dave Airlie 
>
> This is pointless in that we won't ever hit those paths in real life,
> but coverity complains.
>
> Fixes: f014ae3c7cce ("nouveau: add support for nir")
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_program.c | 1 +
>  src/gallium/drivers/nouveau/nv50/nv50_state.c   | 2 ++
>  src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 1 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c   | 2 ++
>  4 files changed, 6 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_program.c
> index 940fb9ce25c..a725aedcd8e 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_program.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_program.c
> @@ -346,6 +346,7 @@ nv50_program_translate(struct nv50_program *prog, 
> uint16_t chipset,
>break;
> default:
>assert(!"unsupported IR!");
> +  free(info);
>return false;
> }
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> index 228feced5d1..89558ee442f 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> @@ -768,6 +768,7 @@ nv50_sp_state_create(struct pipe_context *pipe,
>break;
> default:
>assert(!"unsupported IR!");
> +  free(prog);
>return NULL;
> }
>
> @@ -864,6 +865,7 @@ nv50_cp_state_create(struct pipe_context *pipe,
>break;
> default:
>assert(!"unsupported IR!");
> +  free(prog);
>return NULL;
> }
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> index c81d8952c98..1ff9f19f139 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
> @@ -594,6 +594,7 @@ nvc0_program_translate(struct nvc0_program *prog, 
> uint16_t chipset,
>break;
> default:
>assert(!"unsupported IR!");
> +  free(info);
>return false;
> }
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> index 2ab51c8529e..7c0f605dc16 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> @@ -607,6 +607,7 @@ nvc0_sp_state_create(struct pipe_context *pipe,
>break;
> default:
>assert(!"unsupported IR!");
> +  free(prog);
>return NULL;
> }
>
> @@ -739,6 +740,7 @@ nvc0_cp_state_create(struct pipe_context *pipe,
>break;
> default:
>assert(!"unsupported IR!");
> +  free(prog);
>return NULL;
> }
>
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] egl: Don't add hardware device if there is no render node v2.

2019-06-17 Thread Ilia Mirkin

On Mon, Jun 17, 2019 at 6:37 AM  wrote:
>
> From: Mathias Fröhlich 
>
>
> Emil,
>
> that one probably matches your original intent then.
>
> please review
> Thanks
>
> Mathias
>
>
> Do not offer a hardware drm backed egl device if no render node
> is available. The current implementation will fail on this
> egl device. On top it issues a warning that is actually missleading.
> There are finally more error paths that can fail on the way to a
> hardware backed egl device. Fixing all of them would kind of require
> opening the drm device and see if there is a usable driver associated
> with the device. The taken approach avoids a full probe and fixes at
> least this kind of problem on kvm virtualization hosts I observe here.
>
> Signed-off-by: Mathias Fröhlich 
> ---
>  src/egl/main/egldevice.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/egl/main/egldevice.c b/src/egl/main/egldevice.c
> index 76b8960fa5b..99d8a6c1886 100644
> --- a/src/egl/main/egldevice.c
> +++ b/src/egl/main/egldevice.c
> @@ -108,9 +108,9 @@ static int
>  _eglAddDRMDevice(drmDevicePtr device, _EGLDevice **out_dev)
>  {
> _EGLDevice *dev;
> +   const int wanted_nodes = 1 << DRM_NODE_RENDER | 1 << DRM_NODE_PRIMARY;

Why check for primary? Why require both at this point? I guess the
original intent was to require render *or* primary... but some devices
will not have a primary node if they don't have any outputs...

>
> -   if ((device->available_nodes & (1 << DRM_NODE_PRIMARY |
> -   1 << DRM_NODE_RENDER)) == 0)
> +   if ((device->available_nodes & wanted_nodes) != wanted_nodes)
>return -1;
>
> dev = _eglGlobal.DeviceList;
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] ac/nir: mark some texture intrinsics as convergent

2019-05-30 Thread Ilia Mirkin

txf supplies an lod, but tg4's is implicitly always 0.

On Thu, May 30, 2019 at 2:26 PM Bas Nieuwenhuizen
 wrote:
>
> On Thu, May 30, 2019 at 6:50 PM Rhys Perry  wrote:
> >
> > Otherwise LLVM can sink them and their texture coordinate calculations
> > into divergent branches.
> >
> > Cc: 
> > Signed-off-by: Rhys Perry 
> > ---
> >  src/amd/common/ac_nir_to_llvm.c | 29 +
> >  1 file changed, 29 insertions(+)
> >
> > diff --git a/src/amd/common/ac_nir_to_llvm.c 
> > b/src/amd/common/ac_nir_to_llvm.c
> > index 265e3b636c4..d2dc617de36 100644
> > --- a/src/amd/common/ac_nir_to_llvm.c
> > +++ b/src/amd/common/ac_nir_to_llvm.c
> > @@ -1316,6 +1316,30 @@ static nir_deref_instr *get_tex_texture_deref(const 
> > nir_tex_instr *instr)
> > return texture_deref_instr;
> >  }
> >
> > +static bool has_implicit_derivatives(const nir_tex_instr *instr)
> > +{
> > +   switch (instr->op) {
> > +   case nir_texop_txs:
> > +   case nir_texop_query_levels:
> > +   case nir_texop_texture_samples:
> > +   case nir_texop_samples_identical:
> > +   return false;
> > +   default:
> > +   break;
> > +   }
> > +   for (unsigned i = 0; i < instr->num_srcs; i++) {
> > +   switch (instr->src[i].src_type) {
> > +   case nir_tex_src_lod:
> > +   case nir_tex_src_ddx:
> > +   case nir_tex_src_ddy:
> > +   return false;
> > +   default:
> > +   break;
> > +   }
> > +   }
> > +   return true;
> > +}
>
> txf, tg4 and friends do not provide any of lod/ddx/ddy do they?
>
> > +
> >  static LLVMValueRef build_tex_intrinsic(struct ac_nir_context *ctx,
> > const nir_tex_instr *instr,
> > struct ac_image_args *args)
> > @@ -1394,6 +1418,11 @@ static LLVMValueRef build_tex_intrinsic(struct 
> > ac_nir_context *ctx,
> > }
> >
> > args->attributes = AC_FUNC_ATTR_READNONE;
> > +   /* Prevent texture instructions with implicit derivatives from being
> > +* sinked into branches. */
> > +   if (has_implicit_derivatives(instr))
> > +   args->attributes |= AC_FUNC_ATTR_CONVERGENT;
> > +
> > return ac_build_image_opcode(>ac, args);
> >  }
> >
> > --
> > 2.21.0
> >
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] list: add some iterator debug

2019-05-25 Thread Ilia Mirkin

Why not just do it in a way that works for everyone? Both the do/while
method and the ifdef-based method that I suggested work everywhere. Or
is there another reason you prefer to use those statement expressions?

On Sat, May 25, 2019 at 6:21 PM Rob Clark  wrote:
>
> Is there a convenient #ifdef I can use to guard the list_assert()
> macro..  I don't really mind if MSVC can't have this, but would rather
> not let it prevent the rest of us from having nice things
>
> BR,
> -R
>
> On Sat, May 25, 2019 at 1:23 PM Jason Ekstrand  wrote:
> >
> > Yeah, that's a GNU extension. It also works in clang but not MSVC which is
> > used to build NIR.
> >
> > On May 25, 2019 13:30:29 Rob Clark  wrote:
> >
> > > On Sat, May 25, 2019 at 11:13 AM Ilia Mirkin  wrote:
> > >>
> > >> On Sat, May 25, 2019 at 2:03 PM Rob Clark  wrote:
> > >> >
> > >> > From: Rob Clark 
> > >> >
> > >> > Debugging use of unsafe iterators when you should have used the _safe
> > >> > version sucks.  Add some DEBUG build support to catch and assert if
> > >> > someone does that.
> > >> >
> > >> > I didn't update the UPPERCASE verions of the iterators.  They should
> > >> > probably be deprecated/removed.
> > >> >
> > >> > Signed-off-by: Rob Clark 
> > >> > ---
> > >> >  src/util/list.h | 23 ++-
> > >> >  1 file changed, 18 insertions(+), 5 deletions(-)
> > >> >
> > >> > diff --git a/src/util/list.h b/src/util/list.h
> > >> > index 09d1b4cae64..6d89a42b226 100644
> > >> > --- a/src/util/list.h
> > >> > +++ b/src/util/list.h
> > >> > @@ -43,6 +43,13 @@
> > >> >  #include 
> > >> >  #include "c99_compat.h"
> > >> >
> > >> > +#ifdef DEBUG
> > >> > +#  define LIST_DEBUG 1
> > >> > +#else
> > >> > +#  define LIST_DEBUG 0
> > >> > +#endif
> > >> > +
> > >> > +#define list_assert(cond, msg)  ({ if (LIST_DEBUG) assert((cond) && 
> > >> > msg); })
> > >>
> > >> Not sure if it's worth worrying about, but this style of macro
> > >> definition can be dangerous. One might use it as
> > >>
> > >> if (x) list_assert()
> > >> else blah;
> > >>
> > >> With the macro defined as-is, the "else blah" will get attached to the
> > >> if in the macro. I believe the common style is to do do {}while(0) to
> > >> avoid such issues (or to use an inline function). Alternatively, just
> > >> define it differently for LIST_DEBUG vs not.
> > >>
> > >
> > > I think the ({ ... }) should save the day..
> > >
> > > (hmm, is that c99 or a gnu thing?  I've it isn't avail on some
> > > compilers I guess we should disable list_assert() for those?)
> > >
> > > BR,
> > > -R
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
> >
> >
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] list: add some iterator debug

2019-05-25 Thread Ilia Mirkin

On Sat, May 25, 2019 at 2:03 PM Rob Clark  wrote:
>
> From: Rob Clark 
>
> Debugging use of unsafe iterators when you should have used the _safe
> version sucks.  Add some DEBUG build support to catch and assert if
> someone does that.
>
> I didn't update the UPPERCASE verions of the iterators.  They should
> probably be deprecated/removed.
>
> Signed-off-by: Rob Clark 
> ---
>  src/util/list.h | 23 ++-
>  1 file changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/src/util/list.h b/src/util/list.h
> index 09d1b4cae64..6d89a42b226 100644
> --- a/src/util/list.h
> +++ b/src/util/list.h
> @@ -43,6 +43,13 @@
>  #include 
>  #include "c99_compat.h"
>
> +#ifdef DEBUG
> +#  define LIST_DEBUG 1
> +#else
> +#  define LIST_DEBUG 0
> +#endif
> +
> +#define list_assert(cond, msg)  ({ if (LIST_DEBUG) assert((cond) && msg); })

Not sure if it's worth worrying about, but this style of macro
definition can be dangerous. One might use it as

if (x) list_assert()
else blah;

With the macro defined as-is, the "else blah" will get attached to the
if in the macro. I believe the common style is to do do {}while(0) to
avoid such issues (or to use an inline function). Alternatively, just
define it differently for LIST_DEBUG vs not.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC PATCH] nir/algebraic: Simplify max(abs(a), 0.0) -> abs(a)

2019-05-24 Thread Ilia Mirkin

On Fri, May 24, 2019 at 2:46 PM Alyssa Rosenzweig  wrote:
>
> > I /think/ that should be adequate here too.
>
> Do inexact values not need to handle NaNs strictly, then? I'm not sure
> what this corresponds to in the various GLs/CLs/Vulkan specs, hence
> labeling this RFC.

I don't know about Vulkan, but GL has a very childish approach to
NaN's -- you can pretty much do whatever and be in-spec.

Applications fall into two categories -- ones that expect NaN's and
carefully deal with them (maybe DX10, definitely DX11+), and ones that
are from a time before NaN on GPUs existed (DX9 and earlier).

I'm generally a lot less concerned about inexact opts that remove
NaN's (e.g. a - a -> 0, but Inf - Inf = NaN, or a * 0 -> 0, but Inf *
0 = NaN) than ones that add NaN's such as this one. I don't have a
specific example, but I can well imagine an application that would
expect this to have cleared out any NaN's, and you'll suddenly remove
their op.

But I also see the benefit of doing it since 99.9% of the time
it's perfectly safe :)

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC PATCH] nir/algebraic: Simplify max(abs(a), 0.0) -> abs(a)

2019-05-23 Thread Ilia Mirkin

How does max(NaN, 0) work? IIRC there's some provision that this
becomes 0, while abs(NaN) = NaN.

On Thu, May 23, 2019 at 10:47 PM Alyssa Rosenzweig  wrote:
>
> I noticed this pattern in glmark's jellyfish scene.
>
> Assuming this is correct (it should be...?), could someone do a
> shader-db run? Thank you!
>
> Signed-off-by: Alyssa Rosenzweig 
> Cc: Ian Romanick 
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
> b/src/compiler/nir/nir_opt_algebraic.py
> index 89d07aa1261..abd0b6591ce 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -377,6 +377,7 @@ optimizations = [
> (('imax', a, a), a),
> (('umin', a, a), a),
> (('umax', a, a), a),
> +   (('fmax', ('fabs', a), 0.0), ('fabs', a)),
> (('fmax', ('fmax', a, b), b), ('fmax', a, b)),
> (('umax', ('umax', a, b), b), ('umax', a, b)),
> (('imax', ('imax', a, b), b), ('imax', a, b)),
> --
> 2.20.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/4] glsl: init non-static class member in link uniforms.

2019-05-16 Thread Ilia Mirkin

On Thu, May 16, 2019 at 10:22 PM Dave Airlie  wrote:
>
> From: Dave Airlie 
>
> link_uniforms.cpp:477: uninit_member: Non-static class member 
> "shader_storage_blocks_write_access" is not initialized in this constructor 
> nor in any functions that it calls.
>
> Reported by coverity.
> ---
>  src/compiler/glsl/link_uniforms.cpp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/compiler/glsl/link_uniforms.cpp 
> b/src/compiler/glsl/link_uniforms.cpp
> index aa96227a7e1..641d38689bf 100644
> --- a/src/compiler/glsl/link_uniforms.cpp
> +++ b/src/compiler/glsl/link_uniforms.cpp
> @@ -472,7 +472,8 @@ public:
>bool use_std430_as_default)
>: prog(prog), map(map), uniforms(uniforms),
>  use_std430_as_default(use_std430_as_default), values(values),
> -bindless_targets(NULL), bindless_access(NULL)
> +bindless_targets(NULL), bindless_access(NULL),
> +shader_storage_blocks_write_access(9)

Probably meant 0 here. With that, the series is

Acked-by: Ilia Mirkin 

> {
> }
>
> --
> 2.21.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 1/2] nir: Add is_divergent_vector search helper

2019-05-10 Thread Ilia Mirkin

Perhaps change the subject?

On Fri, May 10, 2019 at 12:08 PM Alyssa Rosenzweig  wrote:
>
> This allows algebraic optimizations to check if the argument accesses
> multiple distinct components of a vector. So a swizzle like "xyz" will
> return true, but "yyy" will return false, as will a scalar. This can be
> useful for optimizations on vector processors, where a convergent
> swizzle can be done in one clock (replicating as if a scalar) but a
> divergent one must be scalarized. In these cases, it is useful to
> optimize differently based on whether the swizzle diverges. (Use case is
> the "csel" condition on Midgard).

I might rephrase it as

"""
csel on Midgard requires a scalar condition, so add a function to
enable lowering csel for cases where the condition is non-scalar.
"""

Your call. (But either way, get rid of the divergent/convergent references.)




>
> Signed-off-by: Alyssa Rosenzweig 
> Cc: Jason Ekstrand 
> Cc: Ilia Mirkin 
> ---
>  src/compiler/nir/nir_search_helpers.h | 17 +
>  1 file changed, 17 insertions(+)
>
> diff --git a/src/compiler/nir/nir_search_helpers.h 
> b/src/compiler/nir/nir_search_helpers.h
> index 1624508993d..8e26739a3ce 100644
> --- a/src/compiler/nir/nir_search_helpers.h
> +++ b/src/compiler/nir/nir_search_helpers.h
> @@ -143,6 +143,23 @@ is_not_const(nir_alu_instr *instr, unsigned src, UNUSED 
> unsigned num_components,
> return !nir_src_is_const(instr->src[src].src);
>  }
>
> +/* I.e. the vector's swizzle actually accesses multiple channels. True for
> + * xyzw, false for , false for w */
> +
> +static inline bool
> +is_non_scalar_swizzle(nir_alu_instr *instr, UNUSED unsigned src,
> + unsigned num_components, const uint8_t *swizzle)
> +{
> +   unsigned first_component = swizzle[0];
> +
> +   for (unsigned i = 1; i < num_components; ++i) {
> +  if (swizzle[i] != first_component)
> + return true;
> +   }
> +
> +   return false;
> +}
> +
>  static inline bool
>  is_used_more_than_once(nir_alu_instr *instr)
>  {
> --
> 2.20.1
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC PATCH 03/17] eir: add live ranges pass

2019-05-10 Thread Ilia Mirkin

On Fri, May 10, 2019 at 5:47 AM Connor Abbott  wrote:
>
> On Fri, May 10, 2019 at 11:09 AM Christian Gmeiner
>  wrote:
> >
> > Signed-off-by: Christian Gmeiner 
> > ---
> >  src/etnaviv/compiler/eir.h|   3 +
> >  src/etnaviv/compiler/eir_live_variables.c | 258 ++
> >  src/etnaviv/compiler/meson.build  |   1 +
> >  .../compiler/tests/eir_live_variables.cpp | 162 +++
> >  src/etnaviv/compiler/tests/meson.build|  11 +
> >  5 files changed, 435 insertions(+)
> >  create mode 100644 src/etnaviv/compiler/eir_live_variables.c
> >  create mode 100644 src/etnaviv/compiler/tests/eir_live_variables.cpp
> >
> > diff --git a/src/etnaviv/compiler/eir.h b/src/etnaviv/compiler/eir.h
> > index a05b12de94b..38c6af4e07e 100644
> > --- a/src/etnaviv/compiler/eir.h
> > +++ b/src/etnaviv/compiler/eir.h
> > @@ -151,6 +151,9 @@ struct eir
> > /* keep track of number of allocated uniforms */
> > struct util_dynarray uniform_alloc;
> > unsigned uniform_offset;
> > +
> > +   /* Live ranges of temp registers */
> > +   int *temp_start, *temp_end;
>
> This way of representing liveness, and then using a coloring register
> allocator, is a common anti-pattern in Mesa, that was initially copied
> from i965 which dates back to before we knew any better. I really
> don't want to see it spread to yet another driver :(.
>
> Representing live ranges like this is imprecise. If I have a program like 
> this:
>
> foo = ...
> if (...) {
>bar = ...
>... = bar; /* last use of "bar" */
> }
> ... = foo;
>
> Then it will say that foo and bar interfere, even when they don't.
>
> Now, this approximation does make things a bit simpler. But, it turns
> out that if you're willing to make it, then the interference graph is
> trivially colorable via a simple linear-time algorithm. This is the
> basis of "linear-scan" register allocators, including the one in LLVM.
> If you want to go down this route, you can, but this hybrid is just
> useless as it gives you the worst of both worlds.
>
> If you want to properly build up the interference graph, it's actually
> not that hard. After doing the inter-basic-block liveness analysis,
> for each block, you initialize a bitset to the live-out bitset. Then
> you walk the block backwards, updating it at each instruction exactly
> as in liveness analysis, so that it always represents the live
> registers before each instruction. Then you add interferences between
> all of the live registers and the register(s) defined at the
> instruction.
>
> One last pitfall I'll mention is that in the real world, you'll also
> need to use reachability. If you have something like
>
> if (...)
>foo = ... /* only definition of "foo" */
>
> ... = foo;
>
> where foo is only partially defined, then the liveness of foo will
> "leak" through the if. To fix this you need to consider what's called
> "reachability," i.e. something is only live if, in addition to
> potentially being used sometime later, it is reachable (potentially
> defined sometime earlier). Reachability analysis is exactly like
> liveness analysis, but everything is backwards. i965 does this
> properly nowadays, and the change had a huge effect on spilling/RA.

One more word on the reachability thing... watch out for code like

while() {
  use(foo);
  if ()
foo = bar;
  more code that doesn't use foo
}

In SSA, this becomes like

foo1 = undef
while() {
  foo = phi(foo1, foo2)
  use(foo)
  if ()
foo2 = bar
  more code that doesn't use foo2
}

And so you have to extend the live range of foo2 until the end of the
loop. This becomes even more fun with various nested control flow
scenarios. (I haven't reviewed whether this series handles this sort
of thing appropriately, but Connor's comment reminded me of it.)

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] nir: Add is_divergent_vector search helper

2019-05-07 Thread Ilia Mirkin

Sigh ... given that there's both "is_used_by_if" and
"is_not_used_by_if" ... gonna go with "no".

On Tue, May 7, 2019 at 6:42 PM Alyssa Rosenzweig  wrote:
>
> Gotcha. I wasn't sure negations in the NIR search rule were possible...?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] nir: Add is_divergent_vector search helper

2019-05-07 Thread Ilia Mirkin

On Tue, May 7, 2019 at 5:45 PM Alyssa Rosenzweig  wrote:
>
> > IMO better names might be is_scalar_swizzle or something.
>
> Ah, yes, that would be a better name! is_not_scalar_swizzle in this case
> (logic is flipped).
>
> > Can num_components be 1? If so, then this will return false, whereas
> > you probably wanted it to return true.
>
> I think that's the correct behaviour...? It should return true if
> multiple distinct channels are accessed. For nr_components=1, that's
> false by definition.

Right, that's my bad -- I was thinking is_scalar, but this is
is_non_scalar, so you're good. (My personal preference would be to
make it is_scalar and then stick a negation into the nir rule, but I
don't have anything solid to back up that preference. Negations in
names are confusing to me, I guess.)

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 5859 matches

Mail list logo