Re: [Mesa-dev] [PATCH 5/5] i965/chv: Display proper branding

2016-03-07 Thread Rob Clark
On Mon, Mar 7, 2016 at 8:39 PM, Ben Widawsky
 wrote:
> "Braswell" is a Cherryview based *thing*. It unfortunately requires extra
> information to determine its marketing name. Unlike all previous products, and
> hopefully all future ones, there is no unique 1:1 mapping of PCI device ID to
> brand string.
>
> I put up a fight about adding any complexity to our GL renderer string code for
> a very long time. However, a wise man made a comment to me that I couldn't argue
> with: if a user installs Windows on their hardware, the brand string should be
> the same as what we display in Linux. The Windows driver apparently does this
> check, so we should too.
>
> Note that I did manage to find a good use for this info anyway in the computer
> shader thread counts.

s/computer/compute/ I assume?

BR,
-R

> Cc: Kaveh Nasri 
> Signed-off-by: Ben Widawsky 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [android-x86-devel] Re: gralloc_drm_pipe

2016-03-07 Thread Rob Clark
On Tue, Mar 8, 2016 at 2:22 AM, Rob Herring  wrote:
> On Sun, Mar 6, 2016 at 10:32 PM, Rob Clark  wrote:
>> On Sun, Mar 6, 2016 at 9:29 PM, Chih-Wei Huang  
>> wrote:
>>> 2016-03-05 3:53 GMT+08:00 Rob Clark :
>>>> On Fri, Mar 4, 2016 at 2:43 PM, Thomas Hellstrom wrote:
>>>>> On 03/04/2016 07:07 PM, Rob Clark wrote:
>>>>>> On Fri, Mar 4, 2016 at 12:59 PM, Rob Clark  wrote:
>>>>>>> So, I've been advocating that for android, gallium drivers use
>>>>>>> gralloc_drm_pipe, since with android it seems like you end up with
>>>>>>> both gralloc and libGL in the same process, and having both share the
>>>>>>> same pipe_screen avoids lots of headaches with multiple gem handles
>>>>>>> pointing to same underlying buffer.
>>>>>>>
>>>>>>> But the awkward thing is that gralloc_drm_pipe is using gallium APIs
>>>>>>> that aren't particularly intended to be used out-of-tree.  Ie. not
>>>>>>> really stable APIs.  At the time, the thing that made sense to me was
>>>>>>> to pull drm_gralloc into mesa.  But at the time, there were no
>>>>>>> non-mesa users of drm_gralloc, which isn't really true anymore.
>>>>>>>
>>>>>>> Maybe what makes more sense now is to implement a gralloc state
>>>>>>> tracker, which exposes a stable API for drm_gralloc?  It would mostly
>>>>>>> be a shim to expose gallium import/export/transfer APIs in a stable
>>>>>>> way, but would also be where the code that figures out which driver to
>>>>>>> use to create/get the pipe_screen.
>>>>>> and actually, we might just be able to use XA state tracker for this..
>>>>>> I think it exposes all the necessary import/export/etc stuff that
>>>>>> gralloc would need..
>>>>>>
>>>>>> BR,
>>>>>> -R
>>>>>>
>>>>> and it was created for a very similar purpose, except that we also
>>>>> needed some render functionality, enough to composite surfaces.
>>>>
>>>> right, and since we have the ability to import/export dmabuf handles,
>>>> I think it is a superset of what is needed.  (gralloc is using blits
>>>> instead of flips for vmwgfx, for reasons I don't fully understand..
>>>> but XA can do these blits and more, so we are still good there)
>>>
>>> Hi Rob,
>>> Thank you for raising the problem though
>>> I don't fully understand the technical details.
>>>
>>> So you are planning to modify (or re-implement?)
>>> gralloc_drm_pipe to use the APIs of XA state tracker.
>>
>> well, unless I can talk someone else into doing it before I find time ;-)
>>
>> But yeah, if no one else does it (or comes up with a better idea
>> first), I will eventually do it.
>>
>> fwiw, the XA API is:
>>
>> https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/state_trackers/xa/xa_tracker.h
>> https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/state_trackers/xa/xa_context.h
>> (ignoring the composite bits which we shouldn't need)
>>
>> I think that should provide everything needed for gralloc, but with a
>> stable API rather than using the gallium pipe screen/context APIs
>> directly.
>
> This seems reasonable from what I've looked at. Does supporting XA
> require any h/w specific bits for gallium drivers, and do the classic
> drivers already support XA?

Any of the gallium drivers should support XA[1]..  but not classic
(non-gallium) drivers, which really only matters for intel.  So
basically the intel backend in drm_gralloc would have to remain, but
the nouveau/radeon/vmwgfx ones could be dropped[2]

[1] maybe at one point I had to make some makefile changes or
something like that?  Not sure, it was a long time ago when I first
started using XA in xf86-video-freedreno.  But if that was the case it
pre-dated pipe_loader_create_screen().. at least from point of view of
what features XA needs from gallium pipe driver, it is very minimal

[2] there is still the unrelated issue of how to deal w/ drm drivers
which do not (yet?) support atomic..

> One of the problems I've been looking at is dma-buf fds are not
> exposed by gralloc in any standard way although various
> implementations happen to be aligned. So mesa EGL has a dependency on
> drm_gralloc to retrieve the fd from native_handle_t. Would we be able
> to remove that dependency and retrieve the dma-buf fds directly from
> XA?

Hmm.. not 100% sure..  it is no problem to import/export dma-buf
handles (xa_handle_type_fd) from an xa_surface.  I guess *somewhere*
you need to be able to go from a native_handle_t <-> fd (and then <->
xa_surface).  So I'm not sure if that decouples mesa EGL from
drm_gralloc..

BR,
-R

> Rob


Re: [Mesa-dev] [android-x86-devel] Re: gralloc_drm_pipe

2016-03-07 Thread Rob Herring
On Sun, Mar 6, 2016 at 10:32 PM, Rob Clark  wrote:
> On Sun, Mar 6, 2016 at 9:29 PM, Chih-Wei Huang  
> wrote:
>> 2016-03-05 3:53 GMT+08:00 Rob Clark :
>>> On Fri, Mar 4, 2016 at 2:43 PM, Thomas Hellstrom  
>>> wrote:
>>>> On 03/04/2016 07:07 PM, Rob Clark wrote:
>>>>> On Fri, Mar 4, 2016 at 12:59 PM, Rob Clark  wrote:
>>>>>> So, I've been advocating that for android, gallium drivers use
>>>>>> gralloc_drm_pipe, since with android it seems like you end up with
>>>>>> both gralloc and libGL in the same process, and having both share the
>>>>>> same pipe_screen avoids lots of headaches with multiple gem handles
>>>>>> pointing to same underlying buffer.
>>>>>>
>>>>>> But the awkward thing is that gralloc_drm_pipe is using gallium APIs
>>>>>> that aren't particularly intended to be used out-of-tree.  Ie. not
>>>>>> really stable APIs.  At the time, the thing that made sense to me was
>>>>>> to pull drm_gralloc into mesa.  But at the time, there were no
>>>>>> non-mesa users of drm_gralloc, which isn't really true anymore.
>>>>>>
>>>>>> Maybe what makes more sense now is to implement a gralloc state
>>>>>> tracker, which exposes a stable API for drm_gralloc?  It would mostly
>>>>>> be a shim to expose gallium import/export/transfer APIs in a stable
>>>>>> way, but would also be where the code that figures out which driver to
>>>>>> use to create/get the pipe_screen.
>>>>> and actually, we might just be able to use XA state tracker for this..
>>>>> I think it exposes all the necessary import/export/etc stuff that
>>>>> gralloc would need..
>>>>>
>>>>> BR,
>>>>> -R
>>>>>
>>>> and it was created for a very similar purpose, except that we also
>>>> needed some render functionality, enough to composite surfaces.
>>>
>>> right, and since we have the ability to import/export dmabuf handles,
>>> I think it is a superset of what is needed.  (gralloc is using blits
>>> instead of flips for vmwgfx, for reasons I don't fully understand..
>>> but XA can do these blits and more, so we are still good there)
>>
>> Hi Rob,
>> Thank you for raising the problem though
>> I don't fully understand the technical details.
>>
>> So you are planning to modify (or re-implement?)
>> gralloc_drm_pipe to use the APIs of XA state tracker.
>
> well, unless I can talk someone else into doing it before I find time ;-)
>
> But yeah, if no one else does it (or comes up with a better idea
> first), I will eventually do it.
>
> fwiw, the XA API is:
>
> https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/state_trackers/xa/xa_tracker.h
> https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/state_trackers/xa/xa_context.h
> (ignoring the composite bits which we shouldn't need)
>
> I think that should provide everything needed for gralloc, but with a
> stable API rather than using the gallium pipe screen/context APIs
> directly.

This seems reasonable from what I've looked at. Does supporting XA
require any h/w specific bits for gallium drivers, and do the classic
drivers already support XA?

One of the problems I've been looking at is dma-buf fds are not
exposed by gralloc in any standard way although various
implementations happen to be aligned. So mesa EGL has a dependency on
drm_gralloc to retrieve the fd from native_handle_t. Would we be able
to remove that dependency and retrieve the dma-buf fds directly from
XA?

Rob


Re: [Mesa-dev] [PATCH 1/4] glcpp: Implicitly resolve version after the first non-space/hash token.

2016-03-07 Thread Kenneth Graunke
On Monday, March 7, 2016 5:23:26 PM PST Ian Romanick wrote:
> On 03/04/2016 07:33 PM, Kenneth Graunke wrote:
> > We resolved the implicit version directive when processing control lines,
> > such as #ifdef, to ensure any built-in macros exist.  However, we failed
> > to resolve it when handling ordinary text.
> > 
> > For example,
> > 
> > int x = __VERSION__;
> > 
> > should resolve __VERSION__ to 110, but since we never resolved the implicit
> > version, none of the built-in macros exist, so it was left as is.
> > 
> > This also meant we allowed the following shader to slop through:
> > 
> > 123
> > #version 120
> > 
> > Nothing would cause the implicit version to take effect, so when we saw
> > the #version directive, we thought everything was peachy.
> > 
> > This patch makes the lexer's per-token action resolve the implicit
> > version on the first non-space/newline/hash token that isn't part of
> > a #version directive, fulfilling the GLSL language spec:
> > 
> > "The #version directive must occur in a shader before anything else,
> >  except for comments and white space."
> > 
> > Because we emit #version as HASH_TOKEN then VERSION_TOKEN, we have to
> > allow HASH_TOKEN to slop through as well, so we don't resolve the
> > implicit version as soon as we see the # character.  However, this is
> > fine, because the parser's HASH_TOKEN NEWLINE rule does resolve the
> > version, disallowing cases like:
> > 
> > #
> > #version 120
> > 
> > This patch also adds the above shaders as new glcpp tests.
> 
> Does any of this interfere with various workarounds that we have to
> allow mixed #version lines, #extension lines, and code?  Assuming all
> that still works, this series is
> 
> Reviewed-by: Ian Romanick 

Good call.  I verified that Unigine Heaven's shaders fail to compile
normally, but work with allow_glsl_extension_directive_midshader=true,
even after my patch series.  The app itself also continues running fine.

Thanks for the review!
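
For anyone following along, the rule described in the commit message can be sketched as a tiny state machine. This is only an illustration under simplifying assumptions, not the glcpp lexer: the token names and version_placement_ok() are made up here, and comments are assumed to have been stripped already.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified token kinds (not glcpp's real tokens). */
enum token { TOK_SPACE, TOK_NEWLINE, TOK_HASH, TOK_VERSION, TOK_OTHER };

/*
 * Returns false if a "#version" directive shows up after the version has
 * already been resolved (implicitly or explicitly), true otherwise.
 */
static bool
version_placement_ok(const enum token *toks, size_t n)
{
   bool resolved = false;    /* has a version been decided yet? */
   bool after_hash = false;  /* was the last non-space token a '#'? */

   for (size_t i = 0; i < n; i++) {
      switch (toks[i]) {
      case TOK_SPACE:
         /* White space never resolves the version. */
         break;
      case TOK_NEWLINE:
         /* A bare "#" followed by a newline resolves the version. */
         if (after_hash)
            resolved = true;
         after_hash = false;
         break;
      case TOK_HASH:
         /* Let '#' through: it may be the start of "#version". */
         after_hash = true;
         break;
      case TOK_VERSION:
         if (after_hash && resolved)
            return false;   /* "#version" arrived too late */
         resolved = true;
         after_hash = false;
         break;
      case TOK_OTHER:
         /* Any ordinary token resolves the implicit version. */
         resolved = true;
         after_hash = false;
         break;
      }
   }
   return true;
}
```

With this sketch, a stream for "#version 120" is accepted, while streams for "123" followed by "#version 120", or a bare "#" line followed by "#version 120", are rejected.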




Re: [Mesa-dev] [PATCH 1/3] gallium/vl: Fix brightness usage v2

2016-03-07 Thread Thomas Hellstrom
On 03/04/2016 12:52 PM, Thomas Hellstrom wrote:
> Multiplying the contrast- and brightness matrices it becomes obvious that
> brightness should be multiplied by contrast in the procamp matrix. Fix this.
>
> v2: Fixed a couple of typos, one of them affecting the results.

Actually, when also looking at the MSDN document,

https://msdn.microsoft.com/en-us/library/windows/hardware/ff569191%28v=vs.85%29.aspx

it turns out that the brightness matrix described was incorrect, whereas the
procamp matrix and the actual calculation were indeed correct.

So this patch will turn out to be a documentation change only. I'll
respin the other ones as well.

/Thomas
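
As a generic illustration of why the multiplication order matters here, the sketch below multiplies a contrast matrix and a brightness matrix both ways. It does not claim which order the MSDN document or vl_csc.c uses; the helper names and the translation parameter t are made up for this example.

```c
/* out = a * b for 4x4 matrices; out must not alias a or b. */
static void
mat4_mul(float out[4][4], float a[4][4], float b[4][4])
{
   for (int i = 0; i < 4; i++) {
      for (int j = 0; j < 4; j++) {
         float sum = 0.0f;
         for (int k = 0; k < 4; k++)
            sum += a[i][k] * b[k][j];
         out[i][j] = sum;
      }
   }
}

static void
mat4_identity(float m[4][4])
{
   for (int i = 0; i < 4; i++)
      for (int j = 0; j < 4; j++)
         m[i][j] = (i == j) ? 1.0f : 0.0f;
}

/* Contrast: scales the three colour channels by c. */
static void
make_contrast(float m[4][4], float c)
{
   mat4_identity(m);
   m[0][0] = m[1][1] = m[2][2] = c;
}

/* Brightness: translates the first (luma) channel by t. */
static void
make_brightness(float m[4][4], float t)
{
   mat4_identity(m);
   m[0][3] = t;
}
```

Multiplying contrast * brightness yields a luma offset of c*t, while brightness * contrast yields t. So if contrast is applied after brightness, a procamp offset of b corresponds to a brightness translation of b/c; if it is applied before, the translation is b itself.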


>
> Cc: "11.0 11.1 11.2" 
> Signed-off-by: Thomas Hellstrom 
> ---
>  src/gallium/auxiliary/vl/vl_csc.c | 22 +++---
>  1 file changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/src/gallium/auxiliary/vl/vl_csc.c 
> b/src/gallium/auxiliary/vl/vl_csc.c
> index c8efe28..fc11f73 100644
> --- a/src/gallium/auxiliary/vl/vl_csc.c
> +++ b/src/gallium/auxiliary/vl/vl_csc.c
> @@ -77,10 +77,10 @@
>   * [ 0,   0,  0, 1]
>   *
>   * procamp
> - * [ c,   0,  0, b]
> - * [ 0,  c*s*cos(h), c*s*sin(h), 0]
> - * [ 0, -c*s*sin(h), c*s*cos(h), 0]
> - * [ 0,   0,  0, 1]
> + * [ c,   0,  0, cb]
> + * [ 0,  c*s*cos(h), c*s*sin(h),  0]
> + * [ 0, -c*s*sin(h), c*s*cos(h),  0]
> + * [ 0,   0,  0,  1]
>   *
>   * bias
>   * [ 1, 0, 0,  ybias]
> @@ -89,10 +89,10 @@
>   * [ 0, 0, 0,  1]
>   *
>   * csc
> - * [ c*cstd[ 0], c*cstd[ 1]*s*cos(h) - c*cstd[ 2]*s*sin(h), c*cstd[ 
> 2]*s*cos(h) + c*cstd[ 1]*s*sin(h), cstd[ 3] + cstd[ 0]*(b + c*ybias) + cstd[ 
> 1]*(c*cbbias*s*cos(h) + c*crbias*s*sin(h)) + cstd[ 2]*(c*crbias*s*cos(h) - 
> c*cbbias*s*sin(h))]
> - * [ c*cstd[ 4], c*cstd[ 5]*s*cos(h) - c*cstd[ 6]*s*sin(h), c*cstd[ 
> 6]*s*cos(h) + c*cstd[ 5]*s*sin(h), cstd[ 7] + cstd[ 4]*(b + c*ybias) + cstd[ 
> 5]*(c*cbbias*s*cos(h) + c*crbias*s*sin(h)) + cstd[ 6]*(c*crbias*s*cos(h) - 
> c*cbbias*s*sin(h))]
> - * [ c*cstd[ 8], c*cstd[ 9]*s*cos(h) - c*cstd[10]*s*sin(h), 
> c*cstd[10]*s*cos(h) + c*cstd[ 9]*s*sin(h), cstd[11] + cstd[ 8]*(b + c*ybias) 
> + cstd[ 9]*(c*cbbias*s*cos(h) + c*crbias*s*sin(h)) + 
> cstd[10]*(c*crbias*s*cos(h) - c*cbbias*s*sin(h))]
> - * [ c*cstd[12], c*cstd[13]*s*cos(h) - c*cstd[14]*s*sin(h), 
> c*cstd[14]*s*cos(h) + c*cstd[13]*s*sin(h), cstd[15] + cstd[12]*(b + c*ybias) 
> + cstd[13]*(c*cbbias*s*cos(h) + c*crbias*s*sin(h)) + 
> cstd[14]*(c*crbias*s*cos(h) - c*cbbias*s*sin(h))]
> + * [ c*cstd[ 0], c*cstd[ 1]*s*cos(h) - c*cstd[ 2]*s*sin(h), c*cstd[ 
> 2]*s*cos(h) + c*cstd[ 1]*s*sin(h), cstd[ 3] + cstd[ 0]*c(b + ybias) + cstd[ 
> 1]*(c*cbbias*s*cos(h) + c*crbias*s*sin(h)) + cstd[ 2]*(c*crbias*s*cos(h) - 
> c*cbbias*s*sin(h))]
> + * [ c*cstd[ 4], c*cstd[ 5]*s*cos(h) - c*cstd[ 6]*s*sin(h), c*cstd[ 
> 6]*s*cos(h) + c*cstd[ 5]*s*sin(h), cstd[ 7] + cstd[ 4]*c*(b + ybias) + cstd[ 
> 5]*(c*cbbias*s*cos(h) + c*crbias*s*sin(h)) + cstd[ 6]*(c*crbias*s*cos(h) - 
> c*cbbias*s*sin(h))]
> + * [ c*cstd[ 8], c*cstd[ 9]*s*cos(h) - c*cstd[10]*s*sin(h), 
> c*cstd[10]*s*cos(h) + c*cstd[ 9]*s*sin(h), cstd[11] + cstd[ 8]*c*(b + ybias) 
> + cstd[ 9]*(c*cbbias*s*cos(h) + c*crbias*s*sin(h)) + 
> cstd[10]*(c*crbias*s*cos(h) - c*cbbias*s*sin(h))]
> + * [ c*cstd[12], c*cstd[13]*s*cos(h) - c*cstd[14]*s*sin(h), 
> c*cstd[14]*s*cos(h) + c*cstd[13]*s*sin(h), cstd[15] + cstd[12]*c*(b + ybias) 
> + cstd[13]*(c*cbbias*s*cos(h) + c*crbias*s*sin(h)) + 
> cstd[14]*(c*crbias*s*cos(h) - c*cbbias*s*sin(h))]
>   */
>  
>  /*
> @@ -210,21 +210,21 @@ void vl_csc_get_matrix(enum VL_CSC_COLOR_STANDARD cs,
> (*matrix)[0][0] = c * (*cstd)[0][0];
> (*matrix)[0][1] = c * (*cstd)[0][1] * s * cosf(h) - c * (*cstd)[0][2] * s 
> * sinf(h);
> (*matrix)[0][2] = c * (*cstd)[0][2] * s * cosf(h) + c * (*cstd)[0][1] * s 
> * sinf(h);
> -   (*matrix)[0][3] = (*cstd)[0][3] + (*cstd)[0][0] * (b + c * ybias) +
> +   (*matrix)[0][3] = (*cstd)[0][3] + (*cstd)[0][0] * c * (b + ybias) +
>   (*cstd)[0][1] * (c * cbbias * s * cosf(h) + c * crbias 
> * s * sinf(h)) +
>   (*cstd)[0][2] * (c * crbias * s * cosf(h) - c * cbbias 
> * s * sinf(h));
>  
> (*matrix)[1][0] = c * (*cstd)[1][0];
> (*matrix)[1][1] = c * (*cstd)[1][1] * s * cosf(h) - c * (*cstd)[1][2] * s 
> * sinf(h);
> (*matrix)[1][2] = c * (*cstd)[1][2] * s * cosf(h) + c * (*cstd)[1][1] * s 
> * sinf(h);
> -   (*matrix)[1][3] = (*cstd)[1][3] + (*cstd)[1][0] * (b + c * ybias) +
> +   (*matrix)[1][3] = (*cstd)[1][3] + (*cstd)[1][0] * c * (b + ybias) +
>   (*cstd)[1][1] * (c * cbbias * s * cosf(h) + c * crbias 
> * s * sinf(h)) +
>   (*cstd)[1][2] * (c * crbias * s * cosf(h) - c * cbbias 
> * s * sinf(h));
>  
> (*matrix)[2][0] = c * (*cstd)[2][0];
> (*matrix)[2][1] = c * (*cstd)[2][1] * s * cosf(h) - c * (*cstd)[2][2] * s 
> * sinf(h);
> (*matrix)[2][2] = c * (*cstd)[2][2] * s * c

Re: [Mesa-dev] [PATCH 2/5] i965/chv: Use kernel provided info for max_cs_threads

2016-03-07 Thread Matt Turner
On Mon, Mar 7, 2016 at 5:39 PM, Ben Widawsky
 wrote:
> With the previous patches, the code can find out the actual number of available
> compute threads. It is enabled only for Cherryview since that is the only
> platform I know for a fact has shipped devices which can benefit from this.  It
> seems like other platforms /might/ benefit from this because of fused
> configurations which /might/ have shipped. Fallback code is still there.
>
> Cc: Jordan Justen 
> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index c66dd13..6e3e97b 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -929,7 +929,13 @@ brwCreateContext(gl_api api,
> brw->max_ds_threads = devinfo->max_ds_threads;
> brw->max_gs_threads = devinfo->max_gs_threads;
> brw->max_wm_threads = devinfo->max_wm_threads;
> -   brw->max_cs_threads = devinfo->max_cs_threads;
> +   /* FINISHME: Do this for all platforms that the kernel supports */
> +   if (brw->is_cherryview &&
> +   screen->subslice_total > 0 && screen->eu_total > 0)
> +  /* Logical CS threads = n EUs per subslice * 7 threads per EU */

"EUs per subslice" is fine without the 'n' abbreviation, IMO.

> +  brw->max_cs_threads = screen->eu_total / screen->subslice_total * 7;
> +   else
> +  brw->max_cs_threads = devinfo->max_cs_threads;

This would be a lot more readable with braces, and would simplify the
next patch if you just added them here.

> brw->urb.size = devinfo->urb.size;
> brw->urb.min_vs_entries = devinfo->urb.min_vs_entries;
> brw->urb.max_vs_entries = devinfo->urb.max_vs_entries;
> --
> 2.7.1
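
To make the two suggestions concrete, here is a rough standalone sketch of the selection logic with braces and the reworded comment. The struct, the function name and the THREADS_PER_EU macro are invented for this example; only the arithmetic comes from the patch.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the values stored in the screen. */
struct gpu_topology {
   int subslice_total;   /* -1 when the kernel did not report it */
   int eu_total;         /* -1 when the kernel did not report it */
};

#define THREADS_PER_EU 7

/*
 * Picks the compute thread limit: use the kernel-provided topology on
 * Cherryview when it is available, otherwise fall back to the static value.
 */
static int
pick_max_cs_threads(bool is_cherryview, const struct gpu_topology *topo,
                    int fallback_threads)
{
   if (is_cherryview && topo->subslice_total > 0 && topo->eu_total > 0) {
      /* Logical CS threads = EUs per subslice * 7 threads per EU */
      return topo->eu_total / topo->subslice_total * THREADS_PER_EU;
   } else {
      return fallback_threads;
   }
}
```

For example, 16 EUs across 2 subslices gives 8 * 7 = 56 threads, and 12 EUs across 2 subslices gives 6 * 7 = 42.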


Re: [Mesa-dev] [PATCH 1/5] i965: Query and store GPU properties from kernel

2016-03-07 Thread Matt Turner
On Mon, Mar 7, 2016 at 5:39 PM, Ben Widawsky
 wrote:
> Certain products are not uniquely identifiable based on device id alone. The
> kernel exports an interface to help deal with this. This patch merely 
> introduces
> the consumer of the interface and makes sure nothing breaks.
>
> It is also possible to use these values for programming GPGPU mode, and I plan
> to do that as well.
>
> The interface was introduced in libdrm 2.4.60, which is already required, so 
> it
> should all be fine.
>
> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/intel_screen.c | 21 +
>  src/mesa/drivers/dri/i965/intel_screen.h | 12 +++-
>  2 files changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> b/src/mesa/drivers/dri/i965/intel_screen.c
> index ee7c1d7..343b497 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -1082,6 +1082,7 @@ static bool
>  intel_init_bufmgr(struct intel_screen *intelScreen)
>  {
> __DRIscreen *spriv = intelScreen->driScrnPriv;
> +   bool devid_override = getenv("INTEL_DEVID_OVERRIDE") != NULL;
>
> intelScreen->no_hw = getenv("INTEL_NO_HW") != NULL;
>
> @@ -1099,6 +1100,26 @@ intel_init_bufmgr(struct intel_screen *intelScreen)
>return false;
> }
>
> +   intelScreen->subslice_total = -1;
> +   intelScreen->eu_total = -1;
> +
> +   /* Everything below this is for real hardware only */
> +   if (intelScreen->no_hw || devid_override)
> +  return true;
> +
> +   intel_get_param(spriv, I915_PARAM_SUBSLICE_TOTAL,
> +   &intelScreen->subslice_total);
> +   intel_get_param(spriv, I915_PARAM_EU_TOTAL, &intelScreen->eu_total);
> +
> +   /* Without this information, we cannot get the right Braswell 
> brandstrings,
> +* and we have to use conservative numbers for GPGPU on many platforms, 
> but
> +* otherwise, things will just work.
> +*/
> +   if (intelScreen->subslice_total == -1 ||
> +   intelScreen->eu_total == -1)

I think this condition will fit on one line.

> +  _mesa_warning(NULL,
> +"Kernel 4.1 required to properly query GPU 
> properties.\n");
> +
> return true;
>  }
>
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.h 
> b/src/mesa/drivers/dri/i965/intel_screen.h
> index 3a5f22c..695ed50 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.h
> +++ b/src/mesa/drivers/dri/i965/intel_screen.h
> @@ -81,7 +81,17 @@ struct intel_screen
>  * I915_PARAM_CMD_PARSER_VERSION parameter
>  */
> int cmd_parser_version;
> - };
> +
> +   /**
> +* Best effort attempt to get system information. Needed for GPGPU, and 
> brand
> +* strings (sigh)

The comment doesn't really describe the fields. Maybe

/**
 * Number of subslices reported by the I915_PARAM_SUBSLICE_TOTAL parameter
 */
int subslice_total;

/**
 * Number of EUs reported by the I915_PARAM_EU_TOTAL parameter
 */
int eu_total;

(Might have to linewrap the comments, not sure)

> +* I915_PARAM_SUBSLICE_TOTAL, and I915_PARAM_EU_TOTAL
> +*/
> +   struct {

Do these need to be together in a struct?

> +  int subslice_total;
> +  int eu_total;
> +   };
> +};
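
Putting the review comments together, the query could look roughly like the sketch below: two separately documented fields instead of an anonymous struct, and the -1 check on one line. Everything here is a stand-in (the parameter IDs, the callback type and the fake kernels are invented for the example); it is not the intel_screen code.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical parameter IDs; the real ones come from the i915 uAPI header. */
enum { PARAM_SUBSLICE_TOTAL = 1, PARAM_EU_TOTAL = 2 };

struct screen_info {
   /** Number of subslices reported by the kernel, or -1 if unknown. */
   int subslice_total;

   /** Number of EUs reported by the kernel, or -1 if unknown. */
   int eu_total;
};

/* Callback that fills *value and returns true on success. */
typedef bool (*get_param_fn)(int param, int *value);

/* Returns true if both values could be queried. */
static bool
query_gpu_properties(struct screen_info *info, get_param_fn get_param)
{
   info->subslice_total = -1;
   info->eu_total = -1;

   /* On failure the callback leaves the -1 defaults untouched. */
   get_param(PARAM_SUBSLICE_TOTAL, &info->subslice_total);
   get_param(PARAM_EU_TOTAL, &info->eu_total);

   if (info->subslice_total == -1 || info->eu_total == -1) {
      fprintf(stderr, "GPU properties unavailable, using defaults.\n");
      return false;
   }

   return true;
}

/* Fake "new kernel": reports 2 subslices and 16 EUs. */
static bool
fake_new_kernel(int param, int *value)
{
   *value = (param == PARAM_SUBSLICE_TOTAL) ? 2 : 16;
   return true;
}

/* Fake "old kernel": knows neither parameter. */
static bool
fake_old_kernel(int param, int *value)
{
   (void) param;
   (void) value;
   return false;
}
```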


Re: [Mesa-dev] [PATCH 5/5] i965/chv: Display proper branding

2016-03-07 Thread Matt Turner
On Mon, Mar 7, 2016 at 5:39 PM, Ben Widawsky
 wrote:
> "Braswell" is a Cherryview based *thing*. It unfortunately requires extra
> information to determine its marketing name. Unlike all previous products, and
> hopefully all future ones, there is no unique 1:1 mapping of PCI device ID to
> brand string.
>
> I put up a fight about adding any complexity to our GL renderer string code for
> a very long time. However, a wise man made a comment to me that I couldn't argue
> with: if a user installs Windows on their hardware, the brand string should be
> the same as what we display in Linux. The Windows driver apparently does this
> check, so we should too.
>
> Note that I did manage to find a good use for this info anyway in the computer
> shader thread counts.
>
> Cc: Kaveh Nasri 
> Signed-off-by: Ben Widawsky 
> ---
>  include/pci_ids/i965_pci_ids.h   |  4 ++--
>  src/mesa/drivers/dri/i965/brw_context.c  | 33 
> +---
>  src/mesa/drivers/dri/i965/brw_context.h  |  3 ++-
>  src/mesa/drivers/dri/i965/intel_screen.c |  2 +-
>  4 files changed, 35 insertions(+), 7 deletions(-)
>
> diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
> index bdfbefe..d783e39 100644
> --- a/include/pci_ids/i965_pci_ids.h
> +++ b/include/pci_ids/i965_pci_ids.h
> @@ -156,8 +156,8 @@ CHIPSET(0x5932, kbl_gt4, "Intel(R) Kabylake GT4")
>  CHIPSET(0x593A, kbl_gt4, "Intel(R) Kabylake GT4")
>  CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")
>  CHIPSET(0x593D, kbl_gt4, "Intel(R) Kabylake GT4")
> -CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherryview)")
> -CHIPSET(0x22B1, chv, "Intel(R) HD Graphics (Cherryview)")
> +CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
> +CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overriden 
> in brw_get_renderer_string */

Typo: Overridden

>  CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
>  CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
>  CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)")
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index df0f6bb..f57184f 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -77,13 +77,27 @@
>
>  const char *const brw_vendor_string = "Intel Open Source Technology Center";
>
> +static const char *
> +get_bsw_model(const struct intel_screen *intelScreen)
> +{
> +   switch (intelScreen->eu_total) {
> +   case 16:
> +  return "405";
> +   case 12:
> +  return "400";
> +   default:

I think this is safe to just mark unreachable(), right?

> +  return "   ";
> +   }
> +}
> +
>  const char *
> -brw_get_renderer_string(unsigned deviceID)
> +brw_get_renderer_string(const struct intel_screen *intelScreen)
>  {
> const char *chipset;
> static char buffer[128];

Not your fault, but driGetRendererString() into this static buffer
isn't thread-safe. I ran into a similar problem in EGL with
shader-db's run.c last year.

> +   char *bsw = NULL;

Thought the initialization wasn't necessary at first, but indeed it is
if you want to unconditionally call free().

>
> -   switch (deviceID) {
> +   switch (intelScreen->deviceID) {
>  #undef CHIPSET
>  #define CHIPSET(id, symbol, str) case id: chipset = str; break;
>  #include "pci_ids/i965_pci_ids.h"
> @@ -92,7 +106,20 @@ brw_get_renderer_string(unsigned deviceID)
>break;
> }
>
> +   /* Braswell branding is funny, so we have to fix it up here */
> +   if (intelScreen->deviceID == 0x22B1) {
> +  char *needle;
> +
> +  bsw = strdup(chipset);
> +  needle = strstr(bsw, "XXX");

Could declare char *needle here and initialize on one line if you wanted.

> +  if (needle) {
> + strncpy(needle, get_bsw_model(intelScreen), strlen("XXX"));

Don't actually need (or want) any of the features of strncpy. Should
just use memcpy.

> + chipset = bsw;
> +  }
> +   }
> +
> (void) driGetRendererString(buffer, chipset, 0);
> +   free(bsw);
> return buffer;
>  }
>
> @@ -107,7 +134,7 @@ intel_get_string(struct gl_context * ctx, GLenum name)
>
> case GL_RENDERER:
>return
> - (GLubyte *) brw_get_renderer_string(brw->intelScreen->deviceID);
> + (GLubyte *) brw_get_renderer_string(brw->intelScreen);
>

With the couple things fixed,

Reviewed-by: Matt Turner 
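
For reference, a rough sketch of the placeholder fix-up using memcpy and a caller-provided buffer, which also sidesteps the static-buffer concern mentioned above. bsw_model() mirrors the switch in the patch; format_brand() and its signature are invented for this example and are not the driver's API.

```c
#include <stdio.h>
#include <string.h>

/* Maps an EU count to a model number, mirroring the switch in the patch. */
static const char *
bsw_model(int eu_total)
{
   switch (eu_total) {
   case 16:
      return "405";
   case 12:
      return "400";
   default:
      return "   ";
   }
}

/*
 * Copies tmpl into out and replaces the first "XXX" with the model number.
 * The caller owns the buffer, so no static storage is involved.
 */
static void
format_brand(char *out, size_t out_size, const char *tmpl, int eu_total)
{
   if (out_size == 0)
      return;

   snprintf(out, out_size, "%s", tmpl);

   char *needle = strstr(out, "XXX");
   if (needle) {
      /* Exactly three bytes are replaced; the existing terminator stays. */
      memcpy(needle, bsw_model(eu_total), strlen("XXX"));
   }
}
```

Calling format_brand() with the template "Intel(R) HD Graphics XXX (Braswell)" and 16 EUs produces "Intel(R) HD Graphics 405 (Braswell)".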


Re: [Mesa-dev] [PATCH] mesa: fix malformed assertion in _image_format_class_to_glenum()

2016-03-07 Thread Vinson Lee
On Mon, Mar 7, 2016 at 5:58 PM, Brian Paul  wrote:
> ---
>  src/mesa/main/shaderimage.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/shaderimage.c b/src/mesa/main/shaderimage.c
> index fa967a2..fd5934f 100644
> --- a/src/mesa/main/shaderimage.c
> +++ b/src/mesa/main/shaderimage.c
> @@ -360,7 +360,7 @@ _image_format_class_to_glenum(enum image_format_class 
> class)
> case IMAGE_FORMAT_CLASS_2_10_10_10:
>return GL_IMAGE_CLASS_10_10_10_2;
> default:
> -  assert("Invalid image_format_class");
> +  assert(!"Invalid image_format_class");
>return GL_NONE;
> }
>  }
> --
> 1.9.1
>

Reviewed-by: Vinson Lee 
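
For readers wondering why the one-character change matters: a string literal decays to a non-null pointer, so assert("...") can never fire, whereas !"..." evaluates to 0. A minimal sketch, with an enum and function name made up for illustration (not Mesa code):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enum standing in for a class with a limited set of values. */
enum shape { SHAPE_CIRCLE, SHAPE_SQUARE };

/* Returns the number of corners, or -1 for an unknown shape. */
static int
corner_count(enum shape s)
{
   switch (s) {
   case SHAPE_CIRCLE:
      return 0;
   case SHAPE_SQUARE:
      return 4;
   default:
      /* "Unknown shape" is a non-null pointer, so assert("Unknown shape")
       * would always pass.  Negating it yields 0, so the assertion fails in
       * debug builds; with NDEBUG defined, control reaches the return below.
       */
      assert(!"Unknown shape");
      return -1;
   }
}
```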


Re: [Mesa-dev] [PATCH 00/26] RadeonSI getting ready for interop with HSA/OpenCL

2016-03-07 Thread Michel Dänzer
On 03.03.2016 01:36, Marek Olšák wrote:
> Hi,
> 
> This patch series contains necessary radeonsi changes in order to support 
> OpenGL-OpenCL interop. This only covers buffer and texture sharing.
> 
> The changes can be summarized to:
> - write an image descriptor to amdgpu buffer metadata for OpenCL to use it
> - buffer map/unmap optimizations need to be aware of shared buffers
> - disable CMASK and DCC when needed
> - allow texture sharing with DCC enabled
> - get the PCI device location (group:bus:dev:func) for OpenCL to query it 
> later
> 
> Dependent gallium patches sent separately:
> - resource_from(get)_handle getting new flags
> - new PCI device location CAPs
> 
> This radeonsi series and dependent gallium patches are required for being 
> able to implement the already-proposed public Mesa interop API in st/dri, 
> libGL, and libEGL.

Patches 1-16 & 18-26 are

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH 17/26] gallium/radeon: disable CMASK on handle export if sharing doesn't allow it

2016-03-07 Thread Michel Dänzer
On 03.03.2016 01:36, Marek Olšák wrote:
> From: Marek Olšák 
> 
> The disabling of CMASK is simple, but notifying all contexts about it is not:
> - The screen must have a list of all contexts.
> - Each context must have a monotonic counter that is incremented only when
>   the screen wants to re-emit framebuffer states.
> - Each context must check in draw_vbo if the counter has been changed and
>   re-emit the framebuffer state accordingly.

The list seems a bit overkill. How about having dirty_fb_counter in the
screen and last_dirty_fb_counter in the context, incrementing the former
in r600_dirty_all_framebuffer_states and emitting the framebuffer state
if the two counters don't match?
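
Something along these lines, as a rough sketch only. The struct layouts and the check function are simplified stand-ins rather than the radeon code, and a real version would presumably need atomic access to the screen counter if contexts live on different threads.

```c
#include <stdbool.h>

/* Hypothetical, simplified screen: one counter shared by all contexts. */
struct screen {
   unsigned dirty_fb_counter;       /* bumped when every context must re-emit */
};

/* Hypothetical, simplified context. */
struct context {
   struct screen *screen;
   unsigned last_dirty_fb_counter;  /* screen counter seen at the last re-emit */
   unsigned fb_emit_count;          /* counts re-emits, for demonstration only */
};

/* Requests a framebuffer state re-emit in every context of the screen. */
static void
dirty_all_framebuffer_states(struct screen *s)
{
   s->dirty_fb_counter++;
}

/* Called at the start of a draw; returns true if the state was re-emitted. */
static bool
check_framebuffer_dirty(struct context *ctx)
{
   unsigned counter = ctx->screen->dirty_fb_counter;

   if (counter == ctx->last_dirty_fb_counter)
      return false;

   ctx->last_dirty_fb_counter = counter;
   ctx->fb_emit_count++;   /* stand-in for re-emitting the framebuffer state */
   return true;
}
```

No list of contexts is needed: each context notices the change on its next draw, and several bumps between two draws collapse into a single re-emit.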


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH 09/26] gallium/radeon: buffer valid range tracking only works with unshared buffers

2016-03-07 Thread Michel Dänzer
On 03.03.2016 01:36, Marek Olšák wrote:
> From: Marek Olšák 
> 
> ---
>  src/gallium/drivers/radeon/r600_buffer_common.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
> b/src/gallium/drivers/radeon/r600_buffer_common.c
> index 439a3cb..fb3a80e 100644
> --- a/src/gallium/drivers/radeon/r600_buffer_common.c
> +++ b/src/gallium/drivers/radeon/r600_buffer_common.c
> @@ -294,6 +294,7 @@ static void *r600_buffer_transfer_map(struct pipe_context 
> *ctx,
>* in which case it can be mapped unsynchronized. */
>   if (!(usage & PIPE_TRANSFER_UNSYNCHRONIZED) &&
>   usage & PIPE_TRANSFER_WRITE &&
> + !rbuffer->is_shared &&
>   !util_ranges_intersect(&rbuffer->valid_buffer_range, box->x, box->x 
> + box->width)) {
>   usage |= PIPE_TRANSFER_UNSYNCHRONIZED;
>   }
> 

Maybe this could be a bit more thorough, e.g. also guarding the
util_range_add calls for shared buffers and clearing/destroying the
valid range when a buffer gets shared. Can be done in a followup change
though.
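
A rough sketch of what that followup might look like, using simplified stand-ins for the buffer and its valid range. None of these names are the actual r600/util_range API; they are invented to show the three pieces: the map check from the patch, a guarded range add, and clearing the range on share.

```c
#include <stdbool.h>

/* Half-open range [start, end); empty when start >= end. */
struct range {
   unsigned start;
   unsigned end;
};

/* Hypothetical, simplified buffer. */
struct buffer {
   bool is_shared;
   struct range valid;   /* part of the buffer known to hold valid data */
};

static void
range_set_empty(struct range *r)
{
   r->start = ~0u;
   r->end = 0;
}

static void
buffer_init(struct buffer *buf)
{
   buf->is_shared = false;
   range_set_empty(&buf->valid);
}

static bool
range_intersects(const struct range *r, unsigned start, unsigned end)
{
   return start < r->end && end > r->start;
}

/* The check from the patch: only unshared buffers may skip synchronization. */
static bool
can_map_unsynchronized(const struct buffer *buf, unsigned start, unsigned end)
{
   return !buf->is_shared && !range_intersects(&buf->valid, start, end);
}

/* Guarded range add: tracking is pointless once the buffer is shared. */
static void
buffer_mark_valid(struct buffer *buf, unsigned start, unsigned end)
{
   if (buf->is_shared)
      return;

   if (start < buf->valid.start)
      buf->valid.start = start;
   if (end > buf->valid.end)
      buf->valid.end = end;
}

/* Sharing the buffer drops whatever range information was gathered so far. */
static void
buffer_share(struct buffer *buf)
{
   buf->is_shared = true;
   range_set_empty(&buf->valid);
}
```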


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [PATCH] mesa: fix malformed assertion in _image_format_class_to_glenum()

2016-03-07 Thread Brian Paul
---
 src/mesa/main/shaderimage.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/shaderimage.c b/src/mesa/main/shaderimage.c
index fa967a2..fd5934f 100644
--- a/src/mesa/main/shaderimage.c
+++ b/src/mesa/main/shaderimage.c
@@ -360,7 +360,7 @@ _image_format_class_to_glenum(enum image_format_class class)
case IMAGE_FORMAT_CLASS_2_10_10_10:
   return GL_IMAGE_CLASS_10_10_10_2;
default:
-  assert("Invalid image_format_class");
+  assert(!"Invalid image_format_class");
   return GL_NONE;
}
 }
-- 
1.9.1



[Mesa-dev] [PATCH 2/5] i965/chv: Use kernel provided info for max_cs_threads

2016-03-07 Thread Ben Widawsky
With the previous patches, the code can find out the actual number of available
compute threads. It is enabled only for Cherryview since that is the only
platform I know for a fact has shipped devices which can benefit from this.  It
seems like other platforms /might/ benefit from this because of fused
configurations which /might/ have shipped. Fallback code is still there.

Cc: Jordan Justen 
Signed-off-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/brw_context.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index c66dd13..6e3e97b 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -929,7 +929,13 @@ brwCreateContext(gl_api api,
brw->max_ds_threads = devinfo->max_ds_threads;
brw->max_gs_threads = devinfo->max_gs_threads;
brw->max_wm_threads = devinfo->max_wm_threads;
-   brw->max_cs_threads = devinfo->max_cs_threads;
+   /* FINISHME: Do this for all platforms that the kernel supports */
+   if (brw->is_cherryview &&
+   screen->subslice_total > 0 && screen->eu_total > 0)
+  /* Logical CS threads = n EUs per subslice * 7 threads per EU */
+  brw->max_cs_threads = screen->eu_total / screen->subslice_total * 7;
+   else
+  brw->max_cs_threads = devinfo->max_cs_threads;
brw->urb.size = devinfo->urb.size;
brw->urb.min_vs_entries = devinfo->urb.min_vs_entries;
brw->urb.max_vs_entries = devinfo->urb.max_vs_entries;
-- 
2.7.1



[Mesa-dev] [PATCH 4/5] i965/chv: Update lower min for CS threads

2016-03-07 Thread Ben Widawsky
We have better information now, and 28 was not a valid thing to support. 6 EUs
per subslice with 7 threads per EU is the minimum supported config.

Cc: Jordan Justen 
Signed-off-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/brw_device_info.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c b/src/mesa/drivers/dri/i965/brw_device_info.c
index 55c4d36..c703fb5 100644
--- a/src/mesa/drivers/dri/i965/brw_device_info.c
+++ b/src/mesa/drivers/dri/i965/brw_device_info.c
@@ -312,7 +312,7 @@ static const struct brw_device_info brw_device_info_chv = {
.max_ds_threads = 80,
.max_gs_threads = 80,
.max_wm_threads = 128,
-   .max_cs_threads = 28,
+   .max_cs_threads = 6 * 7,
.urb = {
   .size = 192,
   .min_vs_entries = 34,
-- 
2.7.1



[Mesa-dev] [PATCH 3/5] i965/chv: Check that compute threads are above threshold

2016-03-07 Thread Ben Widawsky
The way we are organizing this code, the statically configured max_cs_threads
should always be the minimum value we actually support (ie. are aware of). As a
result, we can fall back to that if we get invalid numbers from the kernel (ie.
when the query succeeds, but the result is lower than expected).

I was originally planning to use an assert, but there is no reason to be so
mean.

Cc: Jordan Justen 
Signed-off-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/brw_context.c | 8 ++--
 src/mesa/drivers/dri/i965/brw_device_info.h | 5 +
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 6e3e97b..df0f6bb 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -931,10 +931,14 @@ brwCreateContext(gl_api api,
brw->max_wm_threads = devinfo->max_wm_threads;
/* FINISHME: Do this for all platforms that the kernel supports */
if (brw->is_cherryview &&
-   screen->subslice_total > 0 && screen->eu_total > 0)
+   screen->subslice_total > 0 && screen->eu_total > 0) {
   /* Logical CS threads = n EUs per subslice * 7 threads per EU */
   brw->max_cs_threads = screen->eu_total / screen->subslice_total * 7;
-   else
+
+  /* Fuse configurations may give more threads than expected, never less. */
+  if (brw->max_cs_threads < devinfo->max_cs_threads)
+ brw->max_cs_threads = devinfo->max_cs_threads;
+   } else
   brw->max_cs_threads = devinfo->max_cs_threads;
brw->urb.size = devinfo->urb.size;
brw->urb.min_vs_entries = devinfo->urb.min_vs_entries;
diff --git a/src/mesa/drivers/dri/i965/brw_device_info.h b/src/mesa/drivers/dri/i965/brw_device_info.h
index 73d6820..5c9517e 100644
--- a/src/mesa/drivers/dri/i965/brw_device_info.h
+++ b/src/mesa/drivers/dri/i965/brw_device_info.h
@@ -71,6 +71,11 @@ struct brw_device_info
/**
 * Total number of slices present on the device whether or not they've been
 * fused off.
+*
+* XXX: CS thread counts are limited by the inability to do cross subslice
+* communication. It is effectively the number of logical threads which
+* can be executed in a subslice. Fuse configurations may cause this number
+* to change, so we program @max_cs_threads as the lower maximum.
 */
unsigned num_slices;
unsigned max_vs_threads;
-- 
2.7.1
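
The selection logic from patches 2 and 3 can be sketched as a standalone
function. This is only an illustration under stated assumptions: the name
pick_max_cs_threads and its parameters are hypothetical, the Cherryview-only
platform check is left out, and the 7 threads per EU figure is taken from the
patch comment.

```c
#include <assert.h>

/* Hypothetical helper, not part of the patch.
 * eu_total / subslice_total: values reported by the kernel, or <= 0 when
 *                            the query is unavailable.
 * static_min: the statically configured lower bound (devinfo->max_cs_threads
 *             in the patch). */
static int
pick_max_cs_threads(int eu_total, int subslice_total, int static_min)
{
   const int threads_per_eu = 7; /* figure quoted in the patch comment */
   int threads;

   assert(static_min > 0);

   /* No usable kernel data: fall back to the static value. */
   if (subslice_total <= 0 || eu_total <= 0)
      return static_min;

   /* Logical CS threads = EUs per subslice * threads per EU. */
   threads = eu_total / subslice_total * threads_per_eu;

   /* Never go below the statically configured minimum. */
   return threads < static_min ? static_min : threads;
}
```

With a static minimum of 6 * 7 = 42, a part reporting 16 EUs over 2 subslices
would get 56 threads under this sketch, while 12 EUs over 2 subslices stays at
42.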



[Mesa-dev] [PATCH 1/5] i965: Query and store GPU properties from kernel

2016-03-07 Thread Ben Widawsky
Certain products are not uniquely identifiable based on device id alone. The
kernel exports an interface to help deal with this. This patch merely introduces
the consumer of the interface and makes sure nothing breaks.

It is also possible to use these values for programming GPGPU mode, and I plan
to do that as well.

The interface was introduced in libdrm 2.4.60, which is already required, so it
should all be fine.

Signed-off-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/intel_screen.c | 21 +
 src/mesa/drivers/dri/i965/intel_screen.h | 12 +++-
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index ee7c1d7..343b497 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1082,6 +1082,7 @@ static bool
 intel_init_bufmgr(struct intel_screen *intelScreen)
 {
__DRIscreen *spriv = intelScreen->driScrnPriv;
+   bool devid_override = getenv("INTEL_DEVID_OVERRIDE") != NULL;
 
intelScreen->no_hw = getenv("INTEL_NO_HW") != NULL;
 
@@ -1099,6 +1100,26 @@ intel_init_bufmgr(struct intel_screen *intelScreen)
   return false;
}
 
+   intelScreen->subslice_total = -1;
+   intelScreen->eu_total = -1;
+
+   /* Everything below this is for real hardware only */
+   if (intelScreen->no_hw || devid_override)
+  return true;
+
+   intel_get_param(spriv, I915_PARAM_SUBSLICE_TOTAL,
+   &intelScreen->subslice_total);
+   intel_get_param(spriv, I915_PARAM_EU_TOTAL, &intelScreen->eu_total);
+
+   /* Without this information, we cannot get the right Braswell brandstrings,
+* and we have to use conservative numbers for GPGPU on many platforms, but
+* otherwise, things will just work.
+*/
+   if (intelScreen->subslice_total == -1 ||
+   intelScreen->eu_total == -1)
+  _mesa_warning(NULL,
+"Kernel 4.1 required to properly query GPU properties.\n");
+
return true;
 }
 
diff --git a/src/mesa/drivers/dri/i965/intel_screen.h b/src/mesa/drivers/dri/i965/intel_screen.h
index 3a5f22c..695ed50 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.h
+++ b/src/mesa/drivers/dri/i965/intel_screen.h
@@ -81,7 +81,17 @@ struct intel_screen
 * I915_PARAM_CMD_PARSER_VERSION parameter
 */
int cmd_parser_version;
- };
+
+   /**
+* Best effort attempt to get system information. Needed for GPGPU, and brand
+* strings (sigh)
+* I915_PARAM_SUBSLICE_TOTAL, and I915_PARAM_EU_TOTAL
+*/
+   struct {
+  int subslice_total;
+  int eu_total;
+   };
+};
 
 extern void intelDestroyContext(__DRIcontext * driContextPriv);
 
-- 
2.7.1



[Mesa-dev] [PATCH 5/5] i965/chv: Display proper branding

2016-03-07 Thread Ben Widawsky
"Braswell" is a Cherryview based *thing*. It unfortunately requires extra
information to determine its marketing name. Unlike all previous products, and
hopefully all future ones, there is no unique 1:1 mapping of PCI device ID to
brand string.

I put up a fight about adding any complexity to our GL renderer string code for
a very long time. However, a wise man made a comment to me that I couldn't argue
with: if a user installs Windows on their hardware, the brand string should be
the same as what we display in Linux. The Windows driver apparently does this
check, so we should too.

Note that I did manage to find a good use for this info anyway in the compute
shader thread counts.

Cc: Kaveh Nasri 
Signed-off-by: Ben Widawsky 
---
 include/pci_ids/i965_pci_ids.h   |  4 ++--
 src/mesa/drivers/dri/i965/brw_context.c  | 33 +---
 src/mesa/drivers/dri/i965/brw_context.h  |  3 ++-
 src/mesa/drivers/dri/i965/intel_screen.c |  2 +-
 4 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
index bdfbefe..d783e39 100644
--- a/include/pci_ids/i965_pci_ids.h
+++ b/include/pci_ids/i965_pci_ids.h
@@ -156,8 +156,8 @@ CHIPSET(0x5932, kbl_gt4, "Intel(R) Kabylake GT4")
 CHIPSET(0x593A, kbl_gt4, "Intel(R) Kabylake GT4")
 CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")
 CHIPSET(0x593D, kbl_gt4, "Intel(R) Kabylake GT4")
-CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherryview)")
-CHIPSET(0x22B1, chv, "Intel(R) HD Graphics (Cherryview)")
+CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
+CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overridden in brw_get_renderer_string */
 CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
 CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
 CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)")
diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index df0f6bb..f57184f 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -77,13 +77,27 @@
 
 const char *const brw_vendor_string = "Intel Open Source Technology Center";
 
+static const char *
+get_bsw_model(const struct intel_screen *intelScreen)
+{
+   switch (intelScreen->eu_total) {
+   case 16:
+  return "405";
+   case 12:
+  return "400";
+   default:
+  return "   ";
+   }
+}
+
 const char *
-brw_get_renderer_string(unsigned deviceID)
+brw_get_renderer_string(const struct intel_screen *intelScreen)
 {
const char *chipset;
static char buffer[128];
+   char *bsw = NULL;
 
-   switch (deviceID) {
+   switch (intelScreen->deviceID) {
 #undef CHIPSET
 #define CHIPSET(id, symbol, str) case id: chipset = str; break;
 #include "pci_ids/i965_pci_ids.h"
@@ -92,7 +106,20 @@ brw_get_renderer_string(unsigned deviceID)
   break;
}
 
+   /* Braswell branding is funny, so we have to fix it up here */
+   if (intelScreen->deviceID == 0x22B1) {
+  char *needle;
+
+  bsw = strdup(chipset);
+  needle = strstr(bsw, "XXX");
+  if (needle) {
+ strncpy(needle, get_bsw_model(intelScreen), strlen("XXX"));
+ chipset = bsw;
+  }
+   }
+
(void) driGetRendererString(buffer, chipset, 0);
+   free(bsw);
return buffer;
 }
 
@@ -107,7 +134,7 @@ intel_get_string(struct gl_context * ctx, GLenum name)
 
case GL_RENDERER:
   return
- (GLubyte *) brw_get_renderer_string(brw->intelScreen->deviceID);
+ (GLubyte *) brw_get_renderer_string(brw->intelScreen);
 
default:
   return NULL;
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 88f0d49..a953745 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1341,7 +1341,8 @@ extern void intelInitClearFuncs(struct dd_function_table *functions);
  */
 extern const char *const brw_vendor_string;
 
-extern const char *brw_get_renderer_string(unsigned deviceID);
+extern const char *
+brw_get_renderer_string(const struct intel_screen *intelScreen);
 
 enum {
DRI_CONF_BO_REUSE_DISABLED,
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index 343b497..97aa877 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -891,7 +891,7 @@ brw_query_renderer_string(__DRIscreen *psp, int param, const char **value)
   value[0] = brw_vendor_string;
   return 0;
case __DRI2_RENDERER_DEVICE_ID:
-  value[0] = brw_get_renderer_string(intelScreen->deviceID);
+  value[0] = brw_get_renderer_string(intelScreen);
   return 0;
default:
   break;
-- 
2.7.1
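
The placeholder substitution can also be sketched without the strdup/strncpy
pair, formatting directly into a caller-supplied buffer with snprintf. This is
a different technique from the one in the patch, shown only for illustration;
format_bsw_renderer and bsw_model are hypothetical names, and the EU count to
model mapping is copied from get_bsw_model() above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same mapping as get_bsw_model() in the patch: EU count -> model number,
 * with three spaces when the count is unknown. */
static const char *
bsw_model(int eu_total)
{
   switch (eu_total) {
   case 16: return "405";
   case 12: return "400";
   default: return "   ";
   }
}

/* Hypothetical helper: copy tmpl into out, replacing the first "XXX" with
 * the model string. Templates without the placeholder are copied as-is. */
static void
format_bsw_renderer(char *out, size_t out_size, const char *tmpl, int eu_total)
{
   const char *needle = strstr(tmpl, "XXX");

   assert(out != NULL && out_size > 0);

   if (!needle) {
      snprintf(out, out_size, "%s", tmpl);
      return;
   }

   /* prefix up to the placeholder, then the model, then the remainder */
   snprintf(out, out_size, "%.*s%s%s",
            (int)(needle - tmpl), tmpl,
            bsw_model(eu_total),
            needle + strlen("XXX"));
}
```

Because the output buffer is bounded by out_size, the sketch needs no heap
allocation and no matching free().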



Re: [Mesa-dev] [PATCH 02/10] i965: Use foreach_in_list_reverse_safe() macro.

2016-03-07 Thread Ian Romanick
I don't really have any basis to comment on the rest of the series, but
this patch is

Reviewed-by: Ian Romanick 

On 03/04/2016 08:04 PM, Matt Turner wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 14 ++
>  1 file changed, 2 insertions(+), 12 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> index 2c7e4f7..51d9ce1 100644
> --- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> @@ -1057,12 +1057,7 @@ fs_instruction_scheduler::calculate_deps()
> last_accumulator_write = NULL;
> last_fixed_grf_write = NULL;
>  
> -   exec_node *node;
> -   exec_node *prev;
> -   for (node = instructions.get_tail(), prev = node->prev;
> -!node->is_head_sentinel();
> -node = prev, prev = node->prev) {
> -  schedule_node *n = (schedule_node *)node;
> +   foreach_in_list_reverse_safe(schedule_node, n, &instructions) {
>fs_inst *inst = (fs_inst *)n->inst;
>  
>/* write-after-read deps. */
> @@ -1284,12 +1279,7 @@ vec4_instruction_scheduler::calculate_deps()
> last_accumulator_write = NULL;
> last_fixed_grf_write = NULL;
>  
> -   exec_node *node;
> -   exec_node *prev;
> -   for (node = instructions.get_tail(), prev = node->prev;
> -!node->is_head_sentinel();
> -node = prev, prev = node->prev) {
> -  schedule_node *n = (schedule_node *)node;
> +   foreach_in_list_reverse_safe(schedule_node, n, &instructions) {
>vec4_instruction *inst = (vec4_instruction *)n->inst;
>  
>/* write-after-read deps. */
> 
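
The open-coded loop removed above saves the previous node before the body
runs, so the body may unlink the current node. A minimal sketch of that
pattern in plain C follows. It is an assumption-laden illustration: the list
type, the macro LIST_FOREACH_REVERSE_SAFE and all helper names are
hypothetical, and it uses a single circular sentinel rather than Mesa's
exec_list layout.

```c
#include <stddef.h>

struct node {
   struct node *prev, *next;
   int value;
};

/* Circular doubly linked list with one sentinel node (hypothetical type). */
struct list {
   struct node head;
};

static void
list_init(struct list *l)
{
   l->head.prev = l->head.next = &l->head;
}

static void
list_push_tail(struct list *l, struct node *n)
{
   n->prev = l->head.prev;
   n->next = &l->head;
   l->head.prev->next = n;
   l->head.prev = n;
}

static void
list_remove(struct node *n)
{
   n->prev->next = n->next;
   n->next->prev = n->prev;
   n->prev = n->next = NULL;
}

/* Walk tail to head; "saved" holds the predecessor before the body runs,
 * so the body may remove n without breaking the traversal. */
#define LIST_FOREACH_REVERSE_SAFE(n, saved, l)            \
   for ((n) = (l)->head.prev, (saved) = (n)->prev;        \
        (n) != &(l)->head;                                \
        (n) = (saved), (saved) = (n)->prev)

/* Example body: unlink even values, sum the odd ones. */
static int
remove_even_and_sum_odd(struct list *l)
{
   struct node *n, *saved;
   int sum = 0;

   LIST_FOREACH_REVERSE_SAFE(n, saved, l) {
      if (n->value % 2 == 0)
         list_remove(n);
      else
         sum += n->value;
   }
   return sum;
}
```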



Re: [Mesa-dev] [PATCH 1/4] glcpp: Implicitly resolve version after the first non-space/hash token.

2016-03-07 Thread Ian Romanick
On 03/04/2016 07:33 PM, Kenneth Graunke wrote:
> We resolved the implicit version directive when processing control lines,
> such as #ifdef, to ensure any built-in macros exist.  However, we failed
> to resolve it when handling ordinary text.
> 
> For example,
> 
> int x = __VERSION__;
> 
> should resolve __VERSION__ to 110, but since we never resolved the implicit
> version, none of the built-in macros exist, so it was left as is.
> 
> This also meant we allowed the following shader to slop through:
> 
> 123
> #version 120
> 
> Nothing would cause the implicit version to take effect, so when we saw
> the #version directive, we thought everything was peachy.
> 
> This patch makes the lexer's per-token action resolve the implicit
> version on the first non-space/newline/hash token that isn't part of
> a #version directive, fulfilling the GLSL language spec:
> 
> "The #version directive must occur in a shader before anything else,
>  except for comments and white space."
> 
> Because we emit #version as HASH_TOKEN then VERSION_TOKEN, we have to
> allow HASH_TOKEN to slop through as well, so we don't resolve the
> implicit version as soon as we see the # character.  However, this is
> fine, because the parser's HASH_TOKEN NEWLINE rule does resolve the
> version, disallowing cases like:
> 
> #
> #version 120
> 
> This patch also adds the above shaders as new glcpp tests.

Does any of this interfere with various workarounds that we have to
allow mixed #version lines, #extension lines, and code?  Assuming all
that still works, this series is

Reviewed-by: Ian Romanick 
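
The per-token rule described in the quoted commit message can be sketched as
a small state machine. This is a simplified illustration, not the glcpp code:
the token enum, struct lexer_state and both functions are hypothetical, the
version-directive flag is set inside the same function instead of in a lexer
rule, and 110 is the implicit version mentioned in the message.

```c
#include <stdbool.h>

/* Hypothetical token kinds standing in for the glcpp tokens. */
enum token { TOK_NEWLINE, TOK_SPACE, TOK_HASH, TOK_VERSION, TOK_OTHER };

struct lexer_state {
   bool lexing_version_directive; /* inside "#version ..." until newline */
   bool version_resolved;         /* implicit version already applied */
   int version;
};

static void
resolve_implicit_version(struct lexer_state *s)
{
   if (!s->version_resolved) {
      s->version = 110; /* implicit version named in the commit message */
      s->version_resolved = true;
   }
}

static void
update_state_per_token(struct lexer_state *s, enum token tok)
{
   /* Simplification: the real lexer sets this flag in its version rule. */
   if (tok == TOK_VERSION)
      s->lexing_version_directive = true;

   /* Any token other than space, newline or '#', seen outside a #version
    * directive, forces the implicit version to take effect. */
   if (tok != TOK_NEWLINE && tok != TOK_SPACE && tok != TOK_HASH &&
       !s->lexing_version_directive)
      resolve_implicit_version(s);

   /* The directive ends at the newline. */
   if (tok == TOK_NEWLINE)
      s->lexing_version_directive = false;
}
```

Feeding the tokens of "123" followed by "#version 120" resolves the implicit
version on the first token, which is what lets the later directive be
rejected; a shader that starts with "#version 120" leaves it unresolved.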

> Fixes dEQP-GLES2.functional.shaders.preprocessor.predefined_macros.
> {gl_es_1_vertex,gl_es_1_fragment}.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/compiler/glsl/glcpp/glcpp-lex.l | 8 
>  src/compiler/glsl/glcpp/glcpp.h | 1 +
>  src/compiler/glsl/glcpp/tests/144-implicit-version.c| 1 +
>  src/compiler/glsl/glcpp/tests/144-implicit-version.c.expected   | 1 +
>  src/compiler/glsl/glcpp/tests/145-version-first.c   | 2 ++
>  src/compiler/glsl/glcpp/tests/145-version-first.c.expected  | 3 +++
>  src/compiler/glsl/glcpp/tests/146-version-first-hash.c  | 2 ++
>  src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected | 3 +++
>  8 files changed, 21 insertions(+)
>  create mode 100644 src/compiler/glsl/glcpp/tests/144-implicit-version.c
>  create mode 100644 src/compiler/glsl/glcpp/tests/144-implicit-version.c.expected
>  create mode 100644 src/compiler/glsl/glcpp/tests/145-version-first.c
>  create mode 100644 src/compiler/glsl/glcpp/tests/145-version-first.c.expected
>  create mode 100644 src/compiler/glsl/glcpp/tests/146-version-first-hash.c
>  create mode 100644 src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected
> 
> diff --git a/src/compiler/glsl/glcpp/glcpp-lex.l b/src/compiler/glsl/glcpp/glcpp-lex.l
> index fa9aa50..071918e 100644
> --- a/src/compiler/glsl/glcpp/glcpp-lex.l
> +++ b/src/compiler/glsl/glcpp/glcpp-lex.l
> @@ -120,6 +120,11 @@ void glcpp_set_column (int  column_no , yyscan_t yyscanner);
>  static int
>  glcpp_lex_update_state_per_token (glcpp_parser_t *parser, int token)
>  {
> + if (token != NEWLINE && token != SPACE && token != HASH_TOKEN &&
> + !parser->lexing_version_directive) {
> + glcpp_parser_resolve_implicit_version(parser);
> + }
> +
>   /* After the first non-space token in a line, we won't
>* allow any '#' to introduce a directive. */
>   if (token == NEWLINE) {
> @@ -285,6 +290,7 @@ HEXADECIMAL_INTEGER   0[xX][0-9a-fA-F]+[uU]?
>  version{HSPACE}+ {
>   BEGIN INITIAL;
>   yyextra->space_tokens = 0;
> + yyextra->lexing_version_directive = 1;
>   RETURN_STRING_TOKEN (VERSION_TOKEN);
>  }
>  
> @@ -536,6 +542,7 @@ HEXADECIMAL_INTEGER   0[xX][0-9a-fA-F]+[uU]?
>   }
>   yyextra->space_tokens = 1;
>   yyextra->lexing_directive = 0;
> + yyextra->lexing_version_directive = 0;
>   yylineno++;
>   yycolumn = 0;
>   RETURN_TOKEN_NEVER_SKIP (NEWLINE);
> @@ -546,6 +553,7 @@ HEXADECIMAL_INTEGER   0[xX][0-9a-fA-F]+[uU]?
>   glcpp_error(yylloc, yyextra, "Unterminated comment");
>   BEGIN DONE; /* Don't keep matching this rule forever. */
>   yyextra->lexing_directive = 0;
> + yyextra->lexing_version_directive = 0;
>   if (! parser->last_token_was_newline)
>   RETURN_TOKEN (NEWLINE);
>  }
> diff --git a/src/compiler/glsl/glcpp/glcpp.h b/src/compiler/glsl/glcpp/glcpp.h
> index 70aa14b..d87e6b7 100644
> --- a/src/compiler/glsl/glcpp/glcpp.h
> +++ b/src/compiler/glsl/glcpp/glcpp.h
> @@ -176,6 +176,7 @@ struct glcpp_parser {
>   struct hash_table *defines;
>   active_list_t *active;
>   int lexing_directive;
> + int lexing_version_directive;
>   int space_tokens;
>   int last_token_was_newline;
> 

Re: [Mesa-dev] [PATCH] mesa: Fix error code for GetFramebufferAttachmentParameter in ES 3.0+.

2016-03-07 Thread Ian Romanick
Reviewed-by: Ian Romanick 

On 03/07/2016 04:55 PM, Kenneth Graunke wrote:
> The ES 3.0+ specifications contain the exact same text as the OpenGL
> specification, which says that we should return GL_INVALID_OPERATION.
> 
> ES 2.0 contains different text saying we should return GL_INVALID_ENUM.
> 
> Previously, Mesa chose the error code based on API (GL vs. ES).
> This patch makes ES 3.0+ follow the GL behavior.  ES 2 remains as is.
> 
> Fixes dEQP-GLES3.functional.fbo.api.attachment_query_empty_fbo.
> However, breaks the dEQP-GLES2 variant of the same test for drivers
> which silently promote to ES 3.0.  This can be worked around by
> exporting MESA_GLES_VERSION_OVERRIDE=2.0, but is a bug in dEQP-GLES2.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/main/fbobject.c | 18 --
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
> index feab86c..d490918 100644
> --- a/src/mesa/main/fbobject.c
> +++ b/src/mesa/main/fbobject.c
> @@ -3580,8 +3580,22 @@ _mesa_get_framebuffer_attachment_parameter(struct gl_context *ctx,
> const struct gl_renderbuffer_attachment *att;
> GLenum err;
>  
> -   /* The error differs in GL and GLES. */
> -   err = _mesa_is_desktop_gl(ctx) ? GL_INVALID_OPERATION : GL_INVALID_ENUM;
> +   /* The error code for an attachment type of GL_NONE differs between APIs.
> +*
> +* From the ES 2.0.25 specification, page 127:
> +* "If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE, then
> +*  querying any other pname will generate INVALID_ENUM."
> +*
> +* From the OpenGL 3.0 specification, page 337, or identically,
> +* the OpenGL ES 3.0.4 specification, page 240:
> +*
> +* "If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE, no
> +*  framebuffer is bound to target.  In this case querying pname
> +*  FRAMEBUFFER_ATTACHMENT_OBJECT_NAME will return zero, and all other
> +*  queries will generate an INVALID_OPERATION error."
> +*/
> +   err = ctx->API == API_OPENGLES2 && ctx->Version < 30 ?
> +  GL_INVALID_ENUM : GL_INVALID_OPERATION;
>  
> if (_mesa_is_winsys_fbo(buffer)) {
>/* Page 126 (page 136 of the PDF) of the OpenGL ES 2.0.25 spec
> 



[Mesa-dev] [PATCH] mesa: Fix error code for GetFramebufferAttachmentParameter in ES 3.0+.

2016-03-07 Thread Kenneth Graunke
The ES 3.0+ specifications contain the exact same text as the OpenGL
specification, which says that we should return GL_INVALID_OPERATION.

ES 2.0 contains different text saying we should return GL_INVALID_ENUM.

Previously, Mesa chose the error code based on API (GL vs. ES).
This patch makes ES 3.0+ follow the GL behavior.  ES 2 remains as is.

Fixes dEQP-GLES3.functional.fbo.api.attachment_query_empty_fbo.
However, breaks the dEQP-GLES2 variant of the same test for drivers
which silently promote to ES 3.0.  This can be worked around by
exporting MESA_GLES_VERSION_OVERRIDE=2.0, but is a bug in dEQP-GLES2.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/main/fbobject.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index feab86c..d490918 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -3580,8 +3580,22 @@ _mesa_get_framebuffer_attachment_parameter(struct gl_context *ctx,
const struct gl_renderbuffer_attachment *att;
GLenum err;
 
-   /* The error differs in GL and GLES. */
-   err = _mesa_is_desktop_gl(ctx) ? GL_INVALID_OPERATION : GL_INVALID_ENUM;
+   /* The error code for an attachment type of GL_NONE differs between APIs.
+*
+* From the ES 2.0.25 specification, page 127:
+* "If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE, then
+*  querying any other pname will generate INVALID_ENUM."
+*
+* From the OpenGL 3.0 specification, page 337, or identically,
+* the OpenGL ES 3.0.4 specification, page 240:
+*
+* "If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE, no
+*  framebuffer is bound to target.  In this case querying pname
+*  FRAMEBUFFER_ATTACHMENT_OBJECT_NAME will return zero, and all other
+*  queries will generate an INVALID_OPERATION error."
+*/
+   err = ctx->API == API_OPENGLES2 && ctx->Version < 30 ?
+  GL_INVALID_ENUM : GL_INVALID_OPERATION;
 
if (_mesa_is_winsys_fbo(buffer)) {
   /* Page 126 (page 136 of the PDF) of the OpenGL ES 2.0.25 spec
-- 
2.7.2
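
The error selection in this patch can be sketched on its own. The enum and
function names below are hypothetical stand-ins for the Mesa context fields;
the version is encoded as major * 10 + minor, as in the patch, and the two
error values are the standard GL enum values.

```c
/* Hypothetical API tags; the patch reads ctx->API and ctx->Version. */
enum sketch_api { SKETCH_API_DESKTOP_GL, SKETCH_API_GLES2 };

#define SKETCH_GL_INVALID_ENUM      0x0500
#define SKETCH_GL_INVALID_OPERATION 0x0502

/* Error to raise when FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE and a
 * pname other than the object name is queried.
 * version: major * 10 + minor, e.g. 20 for ES 2.0, 30 for ES 3.0. */
static unsigned
attachment_none_error(enum sketch_api api, int version)
{
   /* Only ES 2.x keeps INVALID_ENUM; ES 3.0+ follows desktop GL. */
   return (api == SKETCH_API_GLES2 && version < 30)
             ? SKETCH_GL_INVALID_ENUM
             : SKETCH_GL_INVALID_OPERATION;
}
```

Under this sketch a context that was silently promoted from ES 2.0 to ES 3.0
reports INVALID_OPERATION, which matches the dEQP-GLES2 breakage described in
the commit message.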



[Mesa-dev] [Bug 94437] [swrast] piglit glx-shader-sharing regression

2016-03-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94437

Bug ID: 94437
   Summary: [swrast] piglit glx-shader-sharing regression
   Product: Mesa
   Version: 11.2
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Keywords: bisected, regression
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org
CC: imir...@alum.mit.edu, lem...@gmail.com,
plamena.manol...@intel.com

mesa: 90f9df3210b5b66585007ec4836bfca498fd45f0 (master 11.3.0-devel)

$ ./bin/glx-shader-sharing -auto
Failed to link: error: count of uniform locations >
MAX_UNIFORM_LOCATIONS(4294967295 > 98304)
PIGLIT: {"result": "fail" }


65dfb3048e8291675ca33581aeff8921f7ea509d is the first bad commit
commit 65dfb3048e8291675ca33581aeff8921f7ea509d
Author: Plamena Manolova 
Date:   Thu Feb 11 15:00:02 2016 +0200

compiler/glsl: Fix uniform location counting.

This patch moves the calculation of current uniforms to
link_uniforms, which makes use of UniformRemapTable which
stores all the reserved uniform locations.

Location assignment for implicit uniforms now tries to use
any gaps left in the table after the location assignment
for explicit uniforms. This gives us more space to store more
uniforms.

Patch is based on earlier patch with following changes/additions:

   1: Move the counting of explicit locations to
  check_explicit_uniform_locations and then pass
  the number to link_assign_uniform_locations.
   2: Count the number of empty slots in UniformRemapTable
  and store them in a list_head.
   3: Try to find an empty slot for implicit locations from
  the list, if that fails resize UniformRemapTable.

Fixes following CTS tests:
   ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max
   ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-array

Signed-off-by: Tapani Pälli 
Signed-off-by: Plamena Manolova 
Reviewed-by: Ilia Mirkin 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93696

:040000 040000 5848c556c369c2c798c1c1e036c70c740b56a97a 25915fac71a54954aafd0139a55045ba394969e6 M	src
bisect run success

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v2] i965/vec4: pass the correct src_sz to emit_send at emit_untyped_atomic

2016-03-07 Thread Francisco Jerez
Alejandro Piñeiro  writes:

> If the src is invalid, so src size is zero, the src_sz passed to emit
> send should be zero too, instead of a default 1 if we are in a simd4x2
> case. This can happen if using emit_untyped_atomic for an atomic
> dec/inc.
>
> v2: use the proper src_sz when calling emit_send, instead of just
> avoid loading src at emit_send if BAD_FILE (Francisco Jerez)

Thanks!

Reviewed-by: Francisco Jerez 

> ---
>  src/mesa/drivers/dri/i965/brw_vec4_surface_builder.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_surface_builder.cpp b/src/mesa/drivers/dri/i965/brw_vec4_surface_builder.cpp
> index 28002c5..1db349a 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_surface_builder.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_surface_builder.cpp
> @@ -221,7 +221,7 @@ namespace brw {
>emit_insert(bld, addr, dims, has_simd4x2),
>has_simd4x2 ? 1 : dims,
>emit_insert(bld, src_reg(srcs), size, has_simd4x2),
> -  has_simd4x2 ? 1 : size,
> +  has_simd4x2 && size ? 1 : size,
>surface, op, rsize, pred);
>}
>  
> -- 
> 2.5.0




Re: [Mesa-dev] [PATCH 3/3] gallium/vl: Parameter substitution in the csc matrix computation

2016-03-07 Thread Brian Paul

On 03/04/2016 04:52 AM, Thomas Hellstrom wrote:

Makes the code significantly more readable.

Signed-off-by: Thomas Hellstrom 
---
  src/gallium/auxiliary/vl/vl_csc.c | 36 +---
  1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_csc.c b/src/gallium/auxiliary/vl/vl_csc.c
index bb2cc58..66cedd9 100644
--- a/src/gallium/auxiliary/vl/vl_csc.c
+++ b/src/gallium/auxiliary/vl/vl_csc.c
@@ -153,6 +153,7 @@ void vl_csc_get_matrix(enum VL_CSC_COLOR_STANDARD cs,
 float s = p->saturation;
 float b = p->brightness;
 float h = p->hue;
+   float x, y, z;

 const vl_csc_matrix *cstd;

@@ -162,6 +163,11 @@ void vl_csc_get_matrix(enum VL_CSC_COLOR_STANDARD cs,
s *= 1.138f/1.164f; /* Adjust for the chroma range */
 }

+   /* Parameter substitutions */
+   x = c * s * cosf(h);
+   y = c * s * sinf(h);
+   z = c * b;
+
 assert(matrix);

 switch (cs) {
@@ -182,23 +188,23 @@ void vl_csc_get_matrix(enum VL_CSC_COLOR_STANDARD cs,
 }

 (*matrix)[0][0] = c * (*cstd)[0][0];
-   (*matrix)[0][1] = c * (*cstd)[0][1] * s * cosf(h) - c * (*cstd)[0][2] * s * sinf(h);
-   (*matrix)[0][2] = c * (*cstd)[0][2] * s * cosf(h) + c * (*cstd)[0][1] * s * sinf(h);
-   (*matrix)[0][3] = (*cstd)[0][3] + (*cstd)[0][0] * c * b +
- (*cstd)[0][1] * (c * cbbias * s * cosf(h) + c * crbias * s * sinf(h)) +
- (*cstd)[0][2] * (c * crbias * s * cosf(h) - c * cbbias * s * sinf(h));
+   (*matrix)[0][1] = (*cstd)[0][1] * x - (*cstd)[0][2] * y;
+   (*matrix)[0][2] = (*cstd)[0][2] * x + (*cstd)[0][1] * y;
+   (*matrix)[0][3] = (*cstd)[0][3] + (*cstd)[0][0] * z +
+ (*cstd)[0][1] * (x * cbbias + y * crbias) +
+ (*cstd)[0][2] * (x * crbias - y * cbbias);

 (*matrix)[1][0] = c * (*cstd)[1][0];
-   (*matrix)[1][1] = c * (*cstd)[1][1] * s * cosf(h) - c * (*cstd)[1][2] * s * sinf(h);
-   (*matrix)[1][2] = c * (*cstd)[1][2] * s * cosf(h) + c * (*cstd)[1][1] * s * sinf(h);
-   (*matrix)[1][3] = (*cstd)[1][3] + (*cstd)[1][0] * c * b +
- (*cstd)[1][1] * (c * cbbias * s * cosf(h) + c * crbias * s * sinf(h)) +
- (*cstd)[1][2] * (c * crbias * s * cosf(h) - c * cbbias * s * sinf(h));
+   (*matrix)[1][1] = (*cstd)[1][1] * x - (*cstd)[1][2] * y;
+   (*matrix)[1][2] = (*cstd)[1][2] * x + (*cstd)[1][1] * y;
+   (*matrix)[1][3] = (*cstd)[1][3] + (*cstd)[1][0] * z +
+ (*cstd)[1][1] * (x * cbbias + y * crbias) +
+ (*cstd)[1][2] * (x * crbias - y * cbbias);

 (*matrix)[2][0] = c * (*cstd)[2][0];
-   (*matrix)[2][1] = c * (*cstd)[2][1] * s * cosf(h) - c * (*cstd)[2][2] * s * sinf(h);
-   (*matrix)[2][2] = c * (*cstd)[2][2] * s * cosf(h) + c * (*cstd)[2][1] * s * sinf(h);
-   (*matrix)[2][3] = (*cstd)[2][3] + (*cstd)[2][0] * c * b +
- (*cstd)[2][1] * (c * cbbias * s * cosf(h) + c * crbias * s * sinf(h)) +
- (*cstd)[2][2] * (c * crbias * s * cosf(h) - c * cbbias * s * sinf(h));
+   (*matrix)[2][1] = (*cstd)[2][1] * x - (*cstd)[2][2] * y;
+   (*matrix)[2][2] = (*cstd)[2][2] * x + (*cstd)[2][1] * y;
+   (*matrix)[2][3] = (*cstd)[2][3] + (*cstd)[2][0] * z +
+ (*cstd)[2][1] * (x * cbbias + y * crbias) +
+ (*cstd)[2][2] * (x * crbias - y * cbbias);
  }



For the series, Reviewed-by: Brian Paul 
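
Written out as math, the substitution in the patch reads as follows. The
notation is an assumption made for this note: C stands for the selected
standard matrix (*cstd), M for the output (*matrix), row index i runs over
0..2, and c, s, b, h are contrast, saturation, brightness and hue as in the
code.

```latex
\begin{aligned}
x &= c\,s\cos h, \qquad y = c\,s\sin h, \qquad z = c\,b \\
M_{i0} &= c\,C_{i0} \\
M_{i1} &= C_{i1}\,x - C_{i2}\,y \\
M_{i2} &= C_{i2}\,x + C_{i1}\,y \\
M_{i3} &= C_{i3} + C_{i0}\,z
          + C_{i1}\,(x\,\mathrm{cbbias} + y\,\mathrm{crbias})
          + C_{i2}\,(x\,\mathrm{crbias} - y\,\mathrm{cbbias})
\end{aligned}
```

Expanding x, y and z gives back the removed lines term by term, for example
M_{i1} = c C_{i1} s cos h - c C_{i2} s sin h.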



[Mesa-dev] [Bug 94383] build error on i386 when enabling swr

2016-03-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94383

Tim Rowley  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #2 from Tim Rowley  ---
Change committed.

Commit: 90f9df3210b5b66585007ec4836bfca498fd45f0

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] gallium/swr: remove use of UINT64 from swr_fence

2016-03-07 Thread Kenneth Graunke
On Monday, March 7, 2016 2:59:34 PM PST Tim Rowley wrote:
> Remove use of a win32-style type leaked from the swr rasterizer.
> ---
>  src/gallium/drivers/swr/swr_fence.cpp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/drivers/swr/swr_fence.cpp b/src/gallium/drivers/swr/swr_fence.cpp
> index f97ea22..e80faee 100644
> --- a/src/gallium/drivers/swr/swr_fence.cpp
> +++ b/src/gallium/drivers/swr/swr_fence.cpp
> @@ -37,7 +37,7 @@
>   * to SwrSync call.
>   */
>  static void
> -swr_sync_cb(UINT64 userData, UINT64 userData2, UINT64 userData3)
> +swr_sync_cb(uint64_t userData, uint64_t userData2, uint64_t userData3)
>  {
> struct swr_fence *fence = (struct swr_fence *)userData;
>  
> @@ -53,7 +53,7 @@ swr_fence_submit(struct swr_context *ctx, struct pipe_fence_handle *fh)
> struct swr_fence *fence = swr_fence(fh);
>  
> fence->write++;
> -   SwrSync(ctx->swrContext, swr_sync_cb, (UINT64)fence, 0, 0);
> +   SwrSync(ctx->swrContext, swr_sync_cb, (uint64_t)fence, 0, 0);
>  }
>  
>  /*
> 

Reviewed-by: Kenneth Graunke 
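
For illustration, a minimal sketch of the pattern the patch touches: a pointer carried through a fixed-width integer callback argument. This is not the swr code. demo_fence, demo_sync and demo_sync_done are hypothetical stand-ins for struct swr_fence, SwrSync() and swr_sync_cb(), and the stand-in sync call simply invokes the callback right away. The extra cast through uintptr_t is an addition here, not something the patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct swr_fence; only the counters matter. */
struct demo_fence {
   uint64_t read;
   uint64_t write;
};

/* Callback type using the C99 fixed-width type rather than a win32-style
 * UINT64 typedef. */
typedef void (*demo_sync_cb)(uint64_t user_data, uint64_t user_data2,
                             uint64_t user_data3);

static void
demo_sync_done(uint64_t user_data, uint64_t user_data2, uint64_t user_data3)
{
   (void)user_data2;
   (void)user_data3;

   /* Recover the pointer that was smuggled through the integer argument. */
   struct demo_fence *fence = (struct demo_fence *)(uintptr_t)user_data;
   fence->read++;
}

/* Hypothetical stand-in for SwrSync(): calls the callback immediately. */
static void
demo_sync(demo_sync_cb cb, uint64_t d0, uint64_t d1, uint64_t d2)
{
   cb(d0, d1, d2);
}

static void
demo_fence_submit(struct demo_fence *fence)
{
   fence->write++;
   demo_sync(demo_sync_done, (uint64_t)(uintptr_t)fence, 0, 0);
}
```

Going through uintptr_t first keeps the conversion well defined on targets where pointers are narrower than 64 bits.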




Re: [Mesa-dev] [PATCH] anv/device: Up device limits for 3D and array texture dimensions

2016-03-07 Thread Anuj Phogat
On Mon, Mar 7, 2016 at 1:40 PM, Nanley Chery  wrote:
>
> From: Nanley Chery 
>
> The limit for these textures is 2048, not 1024.
>
> Signed-off-by: Nanley Chery 
> ---
>  src/intel/vulkan/anv_device.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 8aa1e61..5367375 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -421,9 +421,9 @@ void anv_GetPhysicalDeviceProperties(
> VkPhysicalDeviceLimits limits = {
>.maxImageDimension1D  = (1 << 14),
>.maxImageDimension2D  = (1 << 14),
> -  .maxImageDimension3D  = (1 << 10),
> +  .maxImageDimension3D  = (1 << 11),
>.maxImageDimensionCube= (1 << 14),
> -  .maxImageArrayLayers  = (1 << 10),
> +  .maxImageArrayLayers  = (1 << 11),
>.maxTexelBufferElements   = 128 * 1024 * 1024,
>.maxUniformBufferRange= UINT32_MAX,
>.maxStorageBufferRange= UINT32_MAX,
> --
> 2.7.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reviewed-by: Anuj Phogat 
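
As a quick check of the arithmetic, (1 << 11) is 2048 and (1 << 10) is 1024. A small sketch of how an application might test an extent against the new limits follows. demo_limits is a hypothetical two-field subset used for illustration; it is not VkPhysicalDeviceLimits and not anv code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-field subset of the device limits, illustration only. */
struct demo_limits {
   uint32_t max_image_dimension_3d;
   uint32_t max_image_array_layers;
};

static struct demo_limits
demo_get_limits(void)
{
   struct demo_limits limits = {
      .max_image_dimension_3d = (1u << 11), /* 2048, as in the patch */
      .max_image_array_layers = (1u << 11), /* 2048, as in the patch */
   };
   return limits;
}

/* True if a 3D extent fits within the advertised 3D dimension limit. */
static bool
demo_extent_3d_fits(const struct demo_limits *limits,
                    uint32_t width, uint32_t height, uint32_t depth)
{
   return width <= limits->max_image_dimension_3d &&
          height <= limits->max_image_dimension_3d &&
          depth <= limits->max_image_dimension_3d;
}
```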


[Mesa-dev] [PATCH] anv/device: Up device limits for 3D and array texture dimensions

2016-03-07 Thread Nanley Chery
From: Nanley Chery 

The limit for these textures is 2048, not 1024.

Signed-off-by: Nanley Chery 
---
 src/intel/vulkan/anv_device.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 8aa1e61..5367375 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -421,9 +421,9 @@ void anv_GetPhysicalDeviceProperties(
VkPhysicalDeviceLimits limits = {
   .maxImageDimension1D  = (1 << 14),
   .maxImageDimension2D  = (1 << 14),
-  .maxImageDimension3D  = (1 << 10),
+  .maxImageDimension3D  = (1 << 11),
   .maxImageDimensionCube= (1 << 14),
-  .maxImageArrayLayers  = (1 << 10),
+  .maxImageArrayLayers  = (1 << 11),
   .maxTexelBufferElements   = 128 * 1024 * 1024,
   .maxUniformBufferRange= UINT32_MAX,
   .maxStorageBufferRange= UINT32_MAX,
-- 
2.7.2



[Mesa-dev] [PATCH] gallium/swr: remove use of UINT64 from swr_fence

2016-03-07 Thread Tim Rowley
Remove use of a win32-style type leaked from the swr rasterizer.
---
 src/gallium/drivers/swr/swr_fence.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_fence.cpp b/src/gallium/drivers/swr/swr_fence.cpp
index f97ea22..e80faee 100644
--- a/src/gallium/drivers/swr/swr_fence.cpp
+++ b/src/gallium/drivers/swr/swr_fence.cpp
@@ -37,7 +37,7 @@
  * to SwrSync call.
  */
 static void
-swr_sync_cb(UINT64 userData, UINT64 userData2, UINT64 userData3)
+swr_sync_cb(uint64_t userData, uint64_t userData2, uint64_t userData3)
 {
struct swr_fence *fence = (struct swr_fence *)userData;
 
@@ -53,7 +53,7 @@ swr_fence_submit(struct swr_context *ctx, struct pipe_fence_handle *fh)
struct swr_fence *fence = swr_fence(fh);
 
fence->write++;
-   SwrSync(ctx->swrContext, swr_sync_cb, (UINT64)fence, 0, 0);
+   SwrSync(ctx->swrContext, swr_sync_cb, (uint64_t)fence, 0, 0);
 }
 
 /*
-- 
2.5.0



Re: [Mesa-dev] [PATCH RFC 0/2] GBM API extension to support fusing KMS and render devices

2016-03-07 Thread Emil Velikov
On 7 March 2016 at 10:35, Daniel Stone  wrote:
> Hi,
>
> On 7 March 2016 at 10:19, Thierry Reding  wrote:
>> On Mon, Mar 07, 2016 at 10:46:52AM +0100, Lucas Stach wrote:
>>> On Friday, 04.03.2016 at 18:34 +, Emil Velikov wrote:
>>> > While I'm more inclined to Daniel's suggestion, I wonder why people
>>> > moved away from Thierry's approach - creating a composite/wrapped dri
>>> > module ? Is there anything wrong with it - be that from technical or
>>> > conceptual POV ?
>>> >
>>> The wrapped driver takes away the ability of the application to decide
>>> which GPUs to bind together - at least if you want to keep things
>>> tightly coupled at that level.
>>
>> That was actually the prime objective of the patches I posted back at
>> the time. =)
>
> Which, for the single-GPU/single-scanout case, is perfect. But it's
> completely backwards for the multi-GPU case, and they _are_ the exact
> same problem, so I'd rather not encourage separate solutions for both.
> I really, really, really want to get rid of $DRI_PRIME.
>
Nuking DRI_PRIME and straightening things out is a very worthy goal
imho. Although... how are these two new approaches likely to work with
X11? Afaics there is no GBM/EGL in xserver, outside of glamor that is.
Even if we add the bits, how are things going to work in GLX?

>>> The point of the explicit application control is that we not only solve
>>> the "SoCs have split render/scanout devices" issue, but gain an API for
>>> compositors to work properly on PRIME laptop configurations with
>>> render/render/scanout. We don't want any autodetection to happen there,
>>> a compositor may well decide to use the Intel GPU as scanout only and do
>>> all composition on the discrete GPU. Having a tightly coupled wrapped
>>> driver for every device combination is not really where we want to go,
>>> right?
>>
>> To be honest, I don't think we have much of a choice. Most bare-metal
>> applications don't make a distinction between render and scanout. They
>> will simply assume that you can do both on the same device, because
>> that's what their development machine happens to have. So unless we
>> make a deliberate decision not to support most applications out there,
>> what other options do we have?
>>
>> While I agree it's good to have an API to allow explicit control over
>> association of render to scanout nodes, I think that we really want
>> both. In addition to giving users the flexibility if they request it,
>> I think we want to give them a sensible default if they don't care.
>>
>> Especially on systems where there usually isn't a reason to care. Most
>> modern SoCs would never want explicit control over the association
>> because there usually is only a single render node and a single scanout
>> node in the system.
>
> I don't think anyone's arguing otherwise! If there's only one scanout
> node, then the application cannot possibly discover any other DRM
> device to give to GBM. If there's only one render node, then the EGL
> implementation cannot possibly discover anything with EGLDevice. And
> there'll always have to be a default render node selection for apps
> who don't use EGLDevice (currently all of them), so, no problem.
>
I believe that the series is inspired by device(s) where both parts of
the GPU/SoC are closely coupled. If I recall correctly, the display
engine cannot scan out from the tiled format, so the GPU has to be used
prior to that. This information can be communicated more easily via the
wrapped approach; otherwise we will need an API to pass it through the
layers. That in itself is great, although I'd imagine we won't get it
right the first (X?) time(s). Meanwhile the original approach will just
work ;-)

> I think the only disagreement is how to implement the API internally.
> Lucas is heading in a more generic direction, whereas the
> wrapper-driver approach requires you to write one driver for literally
> every possible combination of render/scanout. Again, fine for
> platforms like Tegra and i.MX where there will only ever be one
> combination ever, but it doesn't scale.
>
Indeed there would be some duplication/boilerplate. The render-only
idea by Christian Gmeiner mitigates it. I believe Thierry was unhappy
with it, due to the possible permutations as new users come along.

Personally, I would use it for now, and worry about/redesign things as
we get (m)any other users. No offence, but it seems like we're trying to
tackle a problem which does not exist yet?

>>> As said above: if you want to bind arbitrary combinations of drivers
>>> together you need to move away from tight coupling to a shared interface
>>> anyway. I don't see how having this interface inside a wrapped driver
>>> instead of GBM helps in any way, it's a Mesa internal interface anyway.
>>>
>>> We don't need any of this for GLX. Etnaviv is working fine with GLX on
>>> both imx-drm and armada-drm, as the DDX does all the work when binding
>>> devices together in that case.
>>
>> In this case DDX wil

[Mesa-dev] [PATCH 2/2] vc4: reorder instructions that exclusively use a vpm value

2016-03-07 Thread Varad Gautam
...to directly read from the vpm, saving a handful of QPU cycles.
The order of reads is preserved.

Signed-off-by: Varad Gautam 
---
 src/gallium/drivers/vc4/vc4_opt_vpm.c | 74 ---
 src/gallium/drivers/vc4/vc4_qir.c |  2 +-
 src/gallium/drivers/vc4/vc4_qir.h |  2 +-
 3 files changed, 71 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/vc4/vc4_opt_vpm.c b/src/gallium/drivers/vc4/vc4_opt_vpm.c
index 0fcf1e5..277b345 100644
--- a/src/gallium/drivers/vc4/vc4_opt_vpm.c
+++ b/src/gallium/drivers/vc4/vc4_opt_vpm.c
@@ -24,22 +24,26 @@
 /**
  * @file vc4_opt_vpm.c
  *
- * This modifies instructions that generate the value consumed by a VPM write
- * to write directly into the VPM.
+ * This modifies instructions that:
+ * 1. exclusively consume a value read from the VPM to directly read the VPM if
+ *other operands allow it.
+ * 2. generate the value consumed by a VPM write to write directly into the VPM.
  */
 
 #include "vc4_qir.h"
 
 bool
-qir_opt_vpm_writes(struct vc4_compile *c)
+qir_opt_vpm(struct vc4_compile *c)
 {
 if (c->stage == QSTAGE_FRAG)
 return false;
 
 bool progress = false;
 struct qinst *vpm_writes[64] = { 0 };
+struct qinst *vpm_reads[64] = { 0 };
 uint32_t use_count[c->num_temps];
 uint32_t vpm_write_count = 0;
+uint32_t vpm_read_count = 0;
 memset(&use_count, 0, sizeof(use_count));
 
 list_for_each_entry(struct qinst, inst, &c->instructions, link) {
@@ -52,8 +56,68 @@ qir_opt_vpm_writes(struct vc4_compile *c)
 }
 
 for (int i = 0; i < qir_get_op_nsrc(inst->op); i++) {
-if (inst->src[i].file == QFILE_TEMP)
-use_count[inst->src[i].index]++;
+if (inst->src[i].file == QFILE_TEMP) {
+uint32_t temp = inst->src[i].index;
+use_count[temp]++;
+
+struct qinst *mov = c->defs[temp];
+if (!mov ||
+(mov->op != QOP_MOV &&
+mov->op != QOP_FMOV &&
+mov->op != QOP_MMOV)) {
+continue;
+}
+
+if (mov->src[0].file == QFILE_VPM)
+vpm_reads[vpm_read_count++] = inst;
+}
+}
+}
+
+for (int i = 0; i < vpm_read_count; i++) {
+struct qinst *inst = vpm_reads[i];
+
+if (!inst || qir_is_multi_instruction(inst))
+continue;
+
+if (qir_depends_on_flags(inst) || inst->sf)
+continue;
+
+if (qir_has_side_effects(c, inst) ||
+qir_has_side_effect_reads(c, inst))
+continue;
+
+for (int j = 0; j < qir_get_op_nsrc(inst->op); j++) {
+if(inst->src[j].file != QFILE_TEMP)
+continue;
+
+uint32_t temp = inst->src[j].index;
+if (use_count[temp] != 1)
+continue;
+
+struct qinst *mov = c->defs[temp];
+
+if (mov->src[0].file != QFILE_VPM)
+continue;
+
+uint32_t temps = 0;
+for (int k = 0; k < qir_get_op_nsrc(inst->op); k++) {
+if (inst->src[k].file == QFILE_TEMP)
+temps++;
+}
+
+/* The instruction is safe to reorder if its other
+ * sources are independent of previous instructions
+ */
+if (temps == 1 ) {
+list_del(&inst->link);
+inst->src[j] = mov->src[0];
+list_replace(&mov->link, &inst->link);
+c->defs[temp] = NULL;
+free(mov);
+}
+
+progress = true;
 }
 }
 
diff --git a/src/gallium/drivers/vc4/vc4_qir.c b/src/gallium/drivers/vc4/vc4_qir.c
index f9eb0e1..65f0067 100644
--- a/src/gallium/drivers/vc4/vc4_qir.c
+++ b/src/gallium/drivers/vc4/vc4_qir.c
@@ -526,7 +526,7 @@ qir_optimize(struct vc4_compile *c)
 OPTPASS(qir_opt_copy_propagation);
 OPTPASS(qir_opt_dead_code);
 OPTPASS(qir_opt_small_immediates);
-OPTPASS(qir_opt_vpm_writes);
+OPTPASS(qir_opt_vpm);
 
 if (!progress)
 break;
diff --git a/src/gal
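
For readers following the read-side optimization, here is a rough, self-contained sketch of the idea on a toy array-based IR. It is not the vc4 code: the demo_* types, opcodes and register files are hypothetical, and moving the consumer into the MOV's slot stands in for the list_del()/list_replace() pair in the patch. An instruction whose only temp source is a single-use MOV from the VPM takes over the MOV's position and reads the VPM directly, so the order of VPM reads is unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy register files and opcodes; all hypothetical. */
enum demo_file { DEMO_FILE_NULL, DEMO_FILE_TEMP, DEMO_FILE_VPM, DEMO_FILE_UNIF };
enum demo_op { DEMO_OP_NOP, DEMO_OP_MOV, DEMO_OP_FADD };

struct demo_reg { enum demo_file file; uint32_t index; };
struct demo_inst { enum demo_op op; struct demo_reg dst; struct demo_reg src[2]; };

#define DEMO_MAX_TEMPS 16

static int
demo_nsrc(enum demo_op op)
{
   switch (op) {
   case DEMO_OP_MOV:  return 1;
   case DEMO_OP_FADD: return 2;
   default:           return 0;
   }
}

/* Returns true if at least one instruction was rewritten. */
static bool
demo_opt_vpm_reads(struct demo_inst *insts, int count)
{
   uint32_t use_count[DEMO_MAX_TEMPS] = { 0 };
   int def[DEMO_MAX_TEMPS];
   bool progress = false;

   for (int t = 0; t < DEMO_MAX_TEMPS; t++)
      def[t] = -1;

   /* Pass 1: record where each temp is defined and how often it is read. */
   for (int i = 0; i < count; i++) {
      if (insts[i].dst.file == DEMO_FILE_TEMP) {
         assert(insts[i].dst.index < DEMO_MAX_TEMPS);
         def[insts[i].dst.index] = i;
      }
      for (int j = 0; j < demo_nsrc(insts[i].op); j++) {
         if (insts[i].src[j].file == DEMO_FILE_TEMP) {
            assert(insts[i].src[j].index < DEMO_MAX_TEMPS);
            use_count[insts[i].src[j].index]++;
         }
      }
   }

   /* Pass 2: fold single-use VPM MOVs into their only consumer. */
   for (int i = 0; i < count; i++) {
      struct demo_inst *inst = &insts[i];
      int temps = 0, slot = -1;

      for (int j = 0; j < demo_nsrc(inst->op); j++) {
         if (inst->src[j].file == DEMO_FILE_TEMP) {
            temps++;
            slot = j;
         }
      }
      /* Only safe to hoist when no other source is a temp. */
      if (temps != 1)
         continue;

      uint32_t temp = inst->src[slot].index;
      int d = def[temp];
      if (use_count[temp] != 1 || d < 0 || d >= i)
         continue;

      struct demo_inst *mov = &insts[d];
      if (mov->op != DEMO_OP_MOV || mov->src[0].file != DEMO_FILE_VPM)
         continue;

      /* Read the VPM directly and take over the MOV's position. */
      inst->src[slot] = mov->src[0];
      *mov = *inst;
      memset(inst, 0, sizeof(*inst)); /* old slot becomes a NOP */
      def[temp] = -1;
      if (mov->dst.file == DEMO_FILE_TEMP)
         def[mov->dst.index] = d;
      progress = true;
   }

   return progress;
}
```

The sketch reports progress only when a rewrite actually happened, and it skips temps that have no recorded definition.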

[Mesa-dev] [PATCH 1/2] vc4: rename file to group vpm optimizations together

2016-03-07 Thread Varad Gautam
This file will contain optimization passes for both vpm reads
and writes.

Signed-off-by: Varad Gautam 
---
 src/gallium/drivers/vc4/Makefile.sources |  2 +-
 src/gallium/drivers/vc4/vc4_opt_vpm.c| 98 
 src/gallium/drivers/vc4/vc4_opt_vpm_writes.c | 98 
 3 files changed, 99 insertions(+), 99 deletions(-)
 create mode 100644 src/gallium/drivers/vc4/vc4_opt_vpm.c
 delete mode 100644 src/gallium/drivers/vc4/vc4_opt_vpm_writes.c

diff --git a/src/gallium/drivers/vc4/Makefile.sources b/src/gallium/drivers/vc4/Makefile.sources
index a9a2742..c5df0f1 100644
--- a/src/gallium/drivers/vc4/Makefile.sources
+++ b/src/gallium/drivers/vc4/Makefile.sources
@@ -28,7 +28,7 @@ C_SOURCES := \
vc4_opt_cse.c \
vc4_opt_dead_code.c \
vc4_opt_small_immediates.c \
-   vc4_opt_vpm_writes.c \
+   vc4_opt_vpm.c \
vc4_program.c \
vc4_qir.c \
vc4_qir_lower_uniforms.c \
diff --git a/src/gallium/drivers/vc4/vc4_opt_vpm.c b/src/gallium/drivers/vc4/vc4_opt_vpm.c
new file mode 100644
index 000..0fcf1e5
--- /dev/null
+++ b/src/gallium/drivers/vc4/vc4_opt_vpm.c
@@ -0,0 +1,98 @@
+/*
+ * Copyright © 2014 Broadcom
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+/**
+ * @file vc4_opt_vpm.c
+ *
+ * This modifies instructions that generate the value consumed by a VPM write
+ * to write directly into the VPM.
+ */
+
+#include "vc4_qir.h"
+
+bool
+qir_opt_vpm_writes(struct vc4_compile *c)
+{
+if (c->stage == QSTAGE_FRAG)
+return false;
+
+bool progress = false;
+struct qinst *vpm_writes[64] = { 0 };
+uint32_t use_count[c->num_temps];
+uint32_t vpm_write_count = 0;
+memset(&use_count, 0, sizeof(use_count));
+
+list_for_each_entry(struct qinst, inst, &c->instructions, link) {
+switch (inst->dst.file) {
+case QFILE_VPM:
+vpm_writes[vpm_write_count++] = inst;
+break;
+default:
+break;
+}
+
+for (int i = 0; i < qir_get_op_nsrc(inst->op); i++) {
+if (inst->src[i].file == QFILE_TEMP)
+use_count[inst->src[i].index]++;
+}
+}
+
+for (int i = 0; i < vpm_write_count; i++) {
+if (!qir_is_raw_mov(vpm_writes[i]) ||
+vpm_writes[i]->src[0].file != QFILE_TEMP) {
+continue;
+}
+
+uint32_t temp = vpm_writes[i]->src[0].index;
+if (use_count[temp] != 1)
+continue;
+
+struct qinst *inst = c->defs[temp];
+if (!inst || qir_is_multi_instruction(inst))
+continue;
+
+if (qir_depends_on_flags(inst) || inst->sf)
+continue;
+
+if (qir_has_side_effects(c, inst) ||
+qir_has_side_effect_reads(c, inst)) {
+continue;
+}
+
+/* Move the generating instruction to the end of the program
+ * to maintain the order of the VPM writes.
+ */
+assert(!vpm_writes[i]->sf);
+list_del(&inst->link);
+list_addtail(&inst->link, &vpm_writes[i]->link);
+qir_remove_instruction(c, vpm_writes[i]);
+
+c->defs[inst->dst.index] = NULL;
+inst->dst.file = QFILE_VPM;
+inst->dst.index = 0;
+
+progress = true;
+}
+
+return progress;
+}
diff --git a/src/gallium/drivers/vc4/vc4_opt_vpm_writes.c b/src/gallium/drivers/vc4/vc4_opt_vpm_writes.c
deleted file mode 100644
index 73ded76..000
--- a/src/gallium/drivers/vc4/vc4_opt_vpm

Re: [Mesa-dev] [PATCH] gm107/ir: add emission for ATOMS

2016-03-07 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

On Mon, Mar 7, 2016 at 12:57 PM, Samuel Pitoiset
 wrote:
> This allows performing atomic operations on shared memory.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 45 +-
>  1 file changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index 0e621e0..e079a57 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -177,6 +177,7 @@ private:
> void emitAL2P();
> void emitIPA();
> void emitATOM();
> +   void emitATOMS();
> void emitCCTL();
>
> void emitPIXLD();
> @@ -2374,6 +2375,45 @@ CodeEmitterGM107::emitATOM()
>  }
>
>  void
> +CodeEmitterGM107::emitATOMS()
> +{
> +   unsigned dType, subOp;
> +
> +   if (insn->subOp == NV50_IR_SUBOP_ATOM_CAS) {
> +  switch (insn->dType) {
> +  case TYPE_U32: dType = 0; break;
> +  case TYPE_U64: dType = 1; break;
> +  default: assert(!"unexpected dType"); dType = 0; break;
> +  }
> +  subOp = 4;
> +
> +  emitInsn (0xee00);
> +  emitField(0x34, 1, dType);
> +   } else {
> +  switch (insn->dType) {
> +  case TYPE_U32: dType = 0; break;
> +  case TYPE_S32: dType = 1; break;
> +  case TYPE_U64: dType = 2; break;
> +  case TYPE_S64: dType = 3; break;
> +  default: assert(!"unexpected dType"); dType = 0; break;
> +  }
> +
> +  if (insn->subOp == NV50_IR_SUBOP_ATOM_EXCH)
> + subOp = 8;
> +  else
> + subOp = insn->subOp;
> +
> +  emitInsn (0xec00);
> +  emitField(0x1c, 3, dType);
> +   }
> +
> +   emitField(0x34, 4, subOp);
> +   emitGPR  (0x14, insn->src(1));
> +   emitADDR (0x08, 0x12, 22, 0, insn->src(0));
> +   emitGPR  (0x00, insn->def(0));
> +}
> +
> +void
>  CodeEmitterGM107::emitCCTL()
>  {
> unsigned width;
> @@ -2967,7 +3007,10 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>}
>break;
> case OP_ATOM:
> -  emitATOM();
> +  if (insn->src(0).getFile() == FILE_MEMORY_SHARED)
> + emitATOMS();
> +  else
> + emitATOM();
>break;
> case OP_CCTL:
>emitCCTL();
> --
> 2.7.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] tgsi: fix parsing of shared memory declarations

2016-03-07 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

On Mon, Mar 7, 2016 at 12:52 PM, Samuel Pitoiset
 wrote:
> The SHARED TGSI keyword is only allowed with TGSI_FILE_MEMORY and not
> with TGSI_FILE_BUFFER. I found this by running nouveau_compiler from
> the command line.
>
> Signed-off-by: Samuel Pitoiset 
> Cc: "11.2" 
> ---
>  src/gallium/auxiliary/tgsi/tgsi_text.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c b/src/gallium/auxiliary/tgsi/tgsi_text.c
> index 91baa01..77598d2 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_text.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
> @@ -1388,7 +1388,9 @@ static boolean parse_declaration( struct translate_ctx *ctx )
>   if (str_match_nocase_whole(&cur, "ATOMIC")) {
>  decl.Declaration.Atomic = 1;
>  ctx->cur = cur;
> - } else if (str_match_nocase_whole(&cur, "SHARED")) {
> + }
> +  } else if (file == TGSI_FILE_MEMORY) {
> + if (str_match_nocase_whole(&cur, "SHARED")) {
>  decl.Declaration.Shared = 1;
>  ctx->cur = cur;
>   }
> --
> 2.7.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/16] Add infrastructure for GL_OES_texture_compression_astc

2016-03-07 Thread Anuj Phogat
On Fri, Mar 4, 2016 at 6:39 PM, Ilia Mirkin  wrote:
> Not that I'm against this, but is there actual HW that supports the full 3d
> stuff? From what I gather, no proprietary drivers expose this ext.
>
Right. I realized it after I wrote these patches. I'm not planning to post
more patches for this extension. We can either land the series now or
wait until it is actually usable on any hardware running Mesa.

> On Mar 4, 2016 8:17 PM, "Anuj Phogat"  wrote:
>>
>> Anuj Phogat (16):
>>   mesa: Add block depth field in struct gl_format_info
>>   mesa: Add support to query block depth using
>> _mesa_get_format_block_size()
>>   mesa: Add error conditions for compressed textures with 3D blocks
>>   mesa: Account for block depth in _mesa_format_image_size()
>>   glapi: Update dispatch XML files for OES_texture_compression_astc.xml
>>   mesa: Add mesa formats for astc 3d formats
>>   mesa: Add entries for astc 3d formats initializing struct
>> gl_format_info
>>   mesa: Add OES_texture_compression_astc to extension table and
>> gl_extensions
>>   mesa: Align the values of #define's in glheader.h
>>   mesa: Add the missing defines for GL_OES_texture_compression_astc
>>   mesa: Add a helper function is_astc_3d_format()
>>   mesa: Account for astc 3d formats in _mesa_is_astc_format()
>>   mesa: Handle astc 3d formats in _mesa_base_tex_format()
>>   mesa: Handle astc 3d formats in _mesa_get_compressed_formats()
>>   mesa: Enable translation between astc 3d gl formats and mesa formats
>>   swrast: Add texfetch_funcs entries for astc 3d formats
>>
>>  src/mapi/glapi/gen/Makefile.am |   1 +
>>  .../glapi/gen/OES_texture_compression_astc.xml |  61 +++
>>  src/mapi/glapi/gen/gl_API.xml  |   2 +
>>  src/mesa/drivers/dri/i915/intel_mipmap_tree.c  |   8 +-
>>  src/mesa/drivers/dri/i915/intel_tex_layout.c   |   4 +-
>>  src/mesa/drivers/dri/i965/brw_tex_layout.c |  21 +-
>>  src/mesa/drivers/dri/i965/intel_copy_image.c   |  14 +-
>>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c  |   8 +-
>>  src/mesa/drivers/dri/nouveau/nouveau_util.h|   8 +-
>>  src/mesa/drivers/dri/radeon/radeon_mipmap_tree.c   |  11 +-
>>  src/mesa/drivers/dri/radeon/radeon_texture.c   |   4 +-
>>  src/mesa/main/copyimage.c  |   6 +-
>>  src/mesa/main/extensions_table.h   |   1 +
>>  src/mesa/main/format_info.py   |   5 +-
>>  src/mesa/main/format_parser.py |  15 +-
>>  src/mesa/main/formatquery.c|   4 +-
>>  src/mesa/main/formats.c|  52 +-
>>  src/mesa/main/formats.csv  | 550 +++--
>>  src/mesa/main/formats.h|  24 +-
>>  src/mesa/main/glformats.c  |  54 +-
>>  src/mesa/main/glheader.h   |  81 +--
>>  src/mesa/main/mtypes.h |   1 +
>>  src/mesa/main/texcompress.c| 117 -
>>  src/mesa/main/texgetimage.c|   4 +-
>>  src/mesa/main/teximage.c   |  17 +-
>>  src/mesa/main/texstore.c   |   4 +-
>>  src/mesa/swrast/s_texfetch.c   |  27 +-
>>  src/mesa/swrast/s_texture.c|   4 +-
>>  28 files changed, 719 insertions(+), 389 deletions(-)
>>  create mode 100644 src/mapi/glapi/gen/OES_texture_compression_astc.xml
>>
>> --
>> 2.5.0
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
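
To make the "block depth" part of the series concrete, here is a small sketch of how a compressed image size can be computed once blocks have a depth as well as a width and height. The helper names are hypothetical and this is not the Mesa implementation of _mesa_format_image_size(). The only outside fact relied on is that an ASTC block is always 128 bits (16 bytes).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of blocks needed to cover `size` texels. */
static uint32_t
demo_blocks_for(uint32_t size, uint32_t block_size)
{
   assert(block_size > 0);
   return (size + block_size - 1) / block_size;
}

/* Size in bytes of a compressed image whose blocks are bw x bh x bd texels.
 * A 2D format is simply the case bd == 1.
 */
static uint64_t
demo_compressed_image_size(uint32_t width, uint32_t height, uint32_t depth,
                           uint32_t bw, uint32_t bh, uint32_t bd,
                           uint32_t bytes_per_block)
{
   uint64_t wblocks = demo_blocks_for(width, bw);
   uint64_t hblocks = demo_blocks_for(height, bh);
   uint64_t dblocks = demo_blocks_for(depth, bd);

   return wblocks * hblocks * dblocks * bytes_per_block;
}
```

For example, an 8x8x8 image with 4x4x4 blocks of 16 bytes needs 2 * 2 * 2 blocks, i.e. 128 bytes. Partial blocks are rounded up, so a 5x5x5 image needs the same amount.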


Re: [Mesa-dev] [PATCH 02/16] mesa: Add support to query block depth using _mesa_get_format_block_size()

2016-03-07 Thread Anuj Phogat
On Sat, Mar 5, 2016 at 8:02 AM, Brian Paul  wrote:
> On 03/04/2016 06:29 PM, Anuj Phogat wrote:
>>
>> Signed-off-by: Anuj Phogat 
>> ---
>>   src/mesa/drivers/dri/i915/intel_mipmap_tree.c|  8 
>>   src/mesa/drivers/dri/i915/intel_tex_layout.c |  4 ++--
>>   src/mesa/drivers/dri/i965/brw_tex_layout.c   | 21 +++--
>>   src/mesa/drivers/dri/i965/intel_copy_image.c | 14 +++---
>>   src/mesa/drivers/dri/i965/intel_mipmap_tree.c|  8 
>>   src/mesa/drivers/dri/nouveau/nouveau_util.h  |  8 ++--
>>   src/mesa/drivers/dri/radeon/radeon_mipmap_tree.c | 11 +++
>>   src/mesa/drivers/dri/radeon/radeon_texture.c |  4 ++--
>>   src/mesa/main/copyimage.c|  6 +++---
>>   src/mesa/main/formatquery.c  |  4 ++--
>>   src/mesa/main/formats.c  |  4 +++-
>>   src/mesa/main/formats.h  |  3 ++-
>>   src/mesa/main/texcompress.c  |  8 
>>   src/mesa/main/texgetimage.c  |  4 ++--
>>   src/mesa/main/teximage.c |  4 ++--
>>   src/mesa/main/texstore.c |  4 ++--
>>   src/mesa/swrast/s_texfetch.c |  4 ++--
>>   src/mesa/swrast/s_texture.c  |  4 ++--
>>   18 files changed, 67 insertions(+), 56 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i915/intel_mipmap_tree.c b/src/mesa/drivers/dri/i915/intel_mipmap_tree.c
>> index 5cbf763..947a556 100644
>> --- a/src/mesa/drivers/dri/i915/intel_mipmap_tree.c
>> +++ b/src/mesa/drivers/dri/i915/intel_mipmap_tree.c
>> @@ -90,8 +90,8 @@ intel_miptree_create_layout(struct intel_context *intel,
>>  /* The cpp is bytes per (1, blockheight)-sized block for compressed
>>   * textures.  This is why you'll see divides by blockheight all over
>>   */
>> -   unsigned bw, bh;
>> -   _mesa_get_format_block_size(format, &bw, &bh);
>> +   unsigned bw, bh, bd;
>> +   _mesa_get_format_block_size(format, &bw, &bh, &bd);
>>  assert(_mesa_get_format_bytes(mt->format) % bw == 0);
>>  mt->cpp = _mesa_get_format_bytes(mt->format) / bw;
>>
>> @@ -726,7 +726,7 @@ intel_miptree_map_gtt(struct intel_context *intel,
>>   struct intel_miptree_map *map,
>>   unsigned int level, unsigned int slice)
>>   {
>> -   unsigned int bw, bh;
>> +   unsigned int bw, bh, bd;
>>  void *base;
>>  unsigned int image_x, image_y;
>>  int x = map->x;
>> @@ -736,7 +736,7 @@ intel_miptree_map_gtt(struct intel_context *intel,
>>   * row of blocks.  intel_miptree_get_image_offset() already does
>>   * the divide.
>>   */
>> -   _mesa_get_format_block_size(mt->format, &bw, &bh);
>> +   _mesa_get_format_block_size(mt->format, &bw, &bh, &bd);
>>  assert(y % bh == 0);
>>  y /= bh;
>>
>> diff --git a/src/mesa/drivers/dri/i915/intel_tex_layout.c b/src/mesa/drivers/dri/i915/intel_tex_layout.c
>> index 01ea165..401282c 100644
>> --- a/src/mesa/drivers/dri/i915/intel_tex_layout.c
>> +++ b/src/mesa/drivers/dri/i915/intel_tex_layout.c
>> @@ -69,8 +69,8 @@ intel_horizontal_texture_alignment_unit(struct intel_context *intel,
>>  /* The hardware alignment requirements for compressed textures
>>   * happen to match the block boundaries.
>>   */
>> -  unsigned int i, j;
>> -  _mesa_get_format_block_size(format, &i, &j);
>> +  unsigned int i, j, k;
>> +  _mesa_get_format_block_size(format, &i, &j, &k);
>> return i;
>>   }
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c b/src/mesa/drivers/dri/i965/brw_tex_layout.c
>> index a294829..67923e9 100644
>> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
>> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
>> @@ -296,9 +296,9 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
>>  unsigned width = mt->physical_width0;
>>  unsigned height = mt->physical_height0;
>>  unsigned depth = mt->physical_depth0; /* number of array layers. */
>> -   unsigned int bw, bh;
>> +   unsigned int bw, bh, bd;
>>
>> -   _mesa_get_format_block_size(mt->format, &bw, &bh);
>> +   _mesa_get_format_block_size(mt->format, &bw, &bh, &bd);
>>
>>  mt->total_width = mt->physical_width0;
>>
>> @@ -515,9 +515,9 @@ brw_miptree_layout_texture_3d(struct brw_context *brw,
>>  mt->total_height = 0;
>>
>>  unsigned ysum = 0;
>> -   unsigned bh, bw;
>> +   unsigned bh, bw, bd;
>>
>> -   _mesa_get_format_block_size(mt->format, &bw, &bh);
>> +   _mesa_get_format_block_size(mt->format, &bw, &bh, &bd);
>>
>>  for (unsigned level = mt->first_level; level <= mt->last_level; level++) {
>> unsigned WL = MAX2(mt->physical_width0 >> level, 1);
>> @@ -745,10 +745,11 @@ intel_miptree_set_alignment(struct brw_context *brw,
>>mt->valign = 32;
>> }
>>  } else if (mt->compressed) {
>> -   /* The hardware alignment requirements for compre

[Mesa-dev] [PATCH] gm107/ir: add emission for ATOMS

2016-03-07 Thread Samuel Pitoiset
This allows performing atomic operations on shared memory.

Signed-off-by: Samuel Pitoiset 
---
 .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 45 +-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index 0e621e0..e079a57 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -177,6 +177,7 @@ private:
void emitAL2P();
void emitIPA();
void emitATOM();
+   void emitATOMS();
void emitCCTL();
 
void emitPIXLD();
@@ -2374,6 +2375,45 @@ CodeEmitterGM107::emitATOM()
 }
 
 void
+CodeEmitterGM107::emitATOMS()
+{
+   unsigned dType, subOp;
+
+   if (insn->subOp == NV50_IR_SUBOP_ATOM_CAS) {
+  switch (insn->dType) {
+  case TYPE_U32: dType = 0; break;
+  case TYPE_U64: dType = 1; break;
+  default: assert(!"unexpected dType"); dType = 0; break;
+  }
+  subOp = 4;
+
+  emitInsn (0xee00);
+  emitField(0x34, 1, dType);
+   } else {
+  switch (insn->dType) {
+  case TYPE_U32: dType = 0; break;
+  case TYPE_S32: dType = 1; break;
+  case TYPE_U64: dType = 2; break;
+  case TYPE_S64: dType = 3; break;
+  default: assert(!"unexpected dType"); dType = 0; break;
+  }
+
+  if (insn->subOp == NV50_IR_SUBOP_ATOM_EXCH)
+ subOp = 8;
+  else
+ subOp = insn->subOp;
+
+  emitInsn (0xec00);
+  emitField(0x1c, 3, dType);
+   }
+
+   emitField(0x34, 4, subOp);
+   emitGPR  (0x14, insn->src(1));
+   emitADDR (0x08, 0x12, 22, 0, insn->src(0));
+   emitGPR  (0x00, insn->def(0));
+}
+
+void
 CodeEmitterGM107::emitCCTL()
 {
unsigned width;
@@ -2967,7 +3007,10 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
   }
   break;
case OP_ATOM:
-  emitATOM();
+  if (insn->src(0).getFile() == FILE_MEMORY_SHARED)
+ emitATOMS();
+  else
+ emitATOM();
   break;
case OP_CCTL:
   emitCCTL();
-- 
2.7.1
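
As a reading aid for the emitField() calls above, a minimal sketch of bit-field packing into a 64-bit instruction word. It assumes emitField(pos, size, value) ORs the low `size` bits of `value` into the word at bit `pos`, which is how the calls in the patch read. demo_emit_field and demo_pack_atoms are hypothetical names, not part of the nouveau code emitter. The field positions (0x1c and 0x34) are the ones used on the non-CAS path of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* OR the low `size` bits of `value` into `word` starting at bit `pos`. */
static uint64_t
demo_emit_field(uint64_t word, unsigned pos, unsigned size, uint32_t value)
{
   assert(size >= 1 && size <= 32 && pos + size <= 64);

   uint64_t mask = (1ULL << size) - 1;
   return word | (((uint64_t)value & mask) << pos);
}

/* Pack only the dType and subOp fields as laid out on the non-CAS path:
 * a 3-bit dType at bit 0x1c and a 4-bit subOp at bit 0x34.
 */
static uint64_t
demo_pack_atoms(uint32_t dtype, uint32_t subop)
{
   uint64_t word = 0;

   word = demo_emit_field(word, 0x1c, 3, dtype);
   word = demo_emit_field(word, 0x34, 4, subop);
   return word;
}
```

With dType 1 (TYPE_S32 in the patch) and subOp 8 (the value the patch uses for EXCH), this sets bit 0x1c and bit 0x37 and leaves the rest of the word clear.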



[Mesa-dev] [PATCH] tgsi: fix parsing of shared memory declarations

2016-03-07 Thread Samuel Pitoiset
The SHARED TGSI keyword is only allowed with TGSI_FILE_MEMORY and not
with TGSI_FILE_BUFFER. I found this by running nouveau_compiler from
the command line.

Signed-off-by: Samuel Pitoiset 
Cc: "11.2" 
---
 src/gallium/auxiliary/tgsi/tgsi_text.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c b/src/gallium/auxiliary/tgsi/tgsi_text.c
index 91baa01..77598d2 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_text.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
@@ -1388,7 +1388,9 @@ static boolean parse_declaration( struct translate_ctx *ctx )
  if (str_match_nocase_whole(&cur, "ATOMIC")) {
 decl.Declaration.Atomic = 1;
 ctx->cur = cur;
- } else if (str_match_nocase_whole(&cur, "SHARED")) {
+ }
+  } else if (file == TGSI_FILE_MEMORY) {
+ if (str_match_nocase_whole(&cur, "SHARED")) {
 decl.Declaration.Shared = 1;
 ctx->cur = cur;
  }
-- 
2.7.1
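
The rule this patch enforces (ATOMIC is a buffer qualifier, SHARED is a memory qualifier) can be sketched outside of Mesa as follows. This is only an illustration under assumptions: RegisterFile, DeclFlags, matchKeyword and parseQualifier are hypothetical names and are not the tgsi_text.c API.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical stand-ins for the two TGSI register files discussed above.
enum RegisterFile { FILE_BUFFER, FILE_MEMORY };

struct DeclFlags {
   bool atomic = false;
   bool shared = false;
};

// Case-insensitive match of a whole keyword at *cur; advances *cur on success.
static bool matchKeyword(const char **cur, const char *keyword)
{
   const char *p = *cur;
   for (; *keyword; ++keyword, ++p) {
      if (std::toupper((unsigned char)*p) != std::toupper((unsigned char)*keyword))
         return false;
   }
   // Reject partial matches such as "SHAREDX".
   if (std::isalnum((unsigned char)*p) || *p == '_')
      return false;
   *cur = p;
   return true;
}

// Accept only the qualifier that is valid for the given register file.
static DeclFlags parseQualifier(RegisterFile file, const char **cur)
{
   DeclFlags flags;
   if (file == FILE_BUFFER) {
      flags.atomic = matchKeyword(cur, "ATOMIC");
   } else if (file == FILE_MEMORY) {
      flags.shared = matchKeyword(cur, "SHARED");
   }
   return flags;
}
```

With this structure, "SHARED" after a buffer declaration is simply left unconsumed, so the caller can report it as unexpected text.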

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gm107/ir: add emission for BAR

2016-03-07 Thread Samuel Pitoiset



On 03/06/2016 11:37 PM, Ilia Mirkin wrote:

On Tue, Mar 1, 2016 at 12:44 PM, Samuel Pitoiset
 wrote:

Signed-off-by: Samuel Pitoiset 
---
  .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 52 ++
  1 file changed, 52 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index a383c53..0e621e0 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -194,6 +194,7 @@ private:
 void emitKIL();
 void emitOUT();

+   void emitBAR();
 void emitMEMBAR();

 void emitVOTE();
@@ -2649,6 +2650,54 @@ CodeEmitterGM107::emitOUT()
  }

  void
+CodeEmitterGM107::emitBAR()
+{
+   uint8_t subop;
+
+   emitInsn (0xf0a8);
+
+   switch (insn->subOp) {
+   case NV50_IR_SUBOP_BAR_RED_POPC: subop = 0x02; break;
+   case NV50_IR_SUBOP_BAR_RED_AND:  subop = 0x0a; break;
+   case NV50_IR_SUBOP_BAR_RED_OR:   subop = 0x12; break;
+   case NV50_IR_SUBOP_BAR_ARRIVE:   subop = 0x81; break;
+   default:
+  subop = 0x80;
+  assert(insn->subOp == NV50_IR_SUBOP_BAR_SYNC);
+  break;
+   }
+
+   emitField(0x20, 8, subop);
+
+   // barrier id
+   if (insn->src(0).getFile() == FILE_GPR) {
+  emitGPR(0x08, insn->src(0));
+   } else {
+  ImmediateValue *imm = insn->getSrc(0)->asImm();
+  assert(imm);
+  emitField(0x08, 8, imm->reg.data.u32);
+  emitField(0x2b, 1, 1);
+   }
+
+   // thread count
+   if (insn->src(1).getFile() == FILE_GPR) {
+  emitGPR(0x14, insn->src(1));
+   } else {
+  ImmediateValue *imm = insn->getSrc(1)->asImm();
+  assert(imm);
+  emitField(0x14, 12, imm->reg.data.u32);
+  emitField(0x2c, 1, 1);
+   }
+
+   if (insn->srcExists(2) && (insn->predSrc != 2)) {
+  emitPRED (0x27, insn->src(2));
+  emitField(0x2a, 1, insn->src(2).mod == Modifier(NV50_IR_MOD_NOT));
+   } else {
+  emitField(0x27, 3, 7);
+   }


Can a bar be predicated? If so, you probably want emitPredicate(i)
somewhere in there.


The predicate is added by emitInsn() when the second parameter is true 
which is the default behaviour, and emitField() already takes care of that.




Also please assert that the barrier id/thread count immediates fit
within the specified field widths (or does emitField take care of
that?)

With those resolved, this is

Reviewed-by: Ilia Mirkin 


+}
+
+void
  CodeEmitterGM107::emitMEMBAR()
  {
 emitInsn (0xef98);
@@ -2978,6 +3027,9 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
 case OP_RESTART:
emitOUT();
break;
+   case OP_BAR:
+  emitBAR();
+  break;
 case OP_MEMBAR:
emitMEMBAR();
break;
--
2.7.1
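
On the question of whether immediates fit their field widths: a minimal sketch of a field emitter that checks the width itself is shown below. The names InsnWord, emitFieldChecked and readField are hypothetical, and this is not claimed to be how the gm107 emitter's emitField() behaves; it only illustrates the kind of check being asked for.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit instruction word, stored as two 32-bit halves.
struct InsnWord {
   uint32_t data[2] = {0, 0};
};

// OR `value` into the `size`-bit field that starts at bit `pos`.
// The second assert rejects values that do not fit in the field.
static void emitFieldChecked(InsnWord &w, unsigned pos, unsigned size, uint32_t value)
{
   assert(size >= 1 && size <= 32 && pos + size <= 64);
   const uint64_t mask = (1ULL << size) - 1;
   assert((value & ~mask) == 0 && "immediate does not fit in field");
   const uint64_t bits = (uint64_t)value << pos;
   w.data[0] |= (uint32_t)bits;
   w.data[1] |= (uint32_t)(bits >> 32);
}

// Read a field back, mainly useful for testing the encoder.
static uint32_t readField(const InsnWord &w, unsigned pos, unsigned size)
{
   const uint64_t all = ((uint64_t)w.data[1] << 32) | w.data[0];
   return (uint32_t)((all >> pos) & ((1ULL << size) - 1));
}
```

Putting the check in the helper means every call site gets it for free, instead of relying on a separate assert next to each immediate.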

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



Re: [Mesa-dev] [PATCH] nvc0/ir: make sure that thread count immediate for BAR fit

2016-03-07 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

On Mon, Mar 7, 2016 at 12:29 PM, Samuel Pitoiset
 wrote:
> The limit of the thread count immediate value is 12 bits.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> index f172b72..d61109f 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> @@ -1482,6 +1482,7 @@ CodeEmitterNVC0::emitBAR(const Instruction *i)
> } else {
>ImmediateValue *imm = i->getSrc(1)->asImm();
>assert(imm);
> +  assert(imm->reg.data.u32 <= 0xfff);
>code[0] |= imm->reg.data.u32 << 26;
>code[1] |= imm->reg.data.u32 >> 6;
>code[1] |= 0x4000;
> --
> 2.7.1
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nvc0/ir: make sure that thread count immediate for BAR fit

2016-03-07 Thread Samuel Pitoiset
The limit of the thread count immediate value is 12 bits.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
index f172b72..d61109f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
@@ -1482,6 +1482,7 @@ CodeEmitterNVC0::emitBAR(const Instruction *i)
} else {
   ImmediateValue *imm = i->getSrc(1)->asImm();
   assert(imm);
+  assert(imm->reg.data.u32 <= 0xfff);
   code[0] |= imm->reg.data.u32 << 26;
   code[1] |= imm->reg.data.u32 >> 6;
   code[1] |= 0x4000;
-- 
2.7.1
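
For readers wondering why the limit is 12 bits: the two shifts in the patch context place the low 6 bits of the value at bits 26..31 of the first word and the remaining bits at the bottom of the second word. The sketch below reproduces only that split, assuming the field is 6 + 6 bits wide as the 0xfff limit implies. Encoding, packThreadCount and unpackThreadCount are hypothetical names, and the extra flag bit set by the real code is not modeled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the emitter's two-word instruction buffer.
struct Encoding {
   uint32_t code[2] = {0, 0};
};

// True when `value` is representable in an unsigned field of `bits` bits.
static bool fitsUnsigned(uint32_t value, unsigned bits)
{
   return bits >= 32 || value < (1u << bits);
}

// Split a 12-bit count across the word boundary.
static void packThreadCount(Encoding &enc, uint32_t count)
{
   assert(fitsUnsigned(count, 12));
   enc.code[0] |= count << 26;   // low 6 bits land in bits 26..31 of word 0
   enc.code[1] |= count >> 6;    // high 6 bits land in bits 0..5 of word 1
}

// Reassemble the count from both words.
static uint32_t unpackThreadCount(const Encoding &enc)
{
   return (enc.code[0] >> 26) | ((enc.code[1] & 0x3fu) << 6);
}
```

Without the range check, a larger value would silently spill into neighbouring bits of the second word, which is what the added assert guards against.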

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] program: add no-op add_uniform_to_shader::set_buffer_offset() method

2016-03-07 Thread Brian Paul

On 03/05/2016 02:01 PM, Jose Fonseca wrote:


On 05/03/16 17:40, Brian Paul wrote:

Fixes VMware MSVC, MinGW builds:

build/windows-x86-debug/mesa/libmesa.a(ir_to_mesa.o):
ir_to_mesa.cpp:(.rdata+0xf9c): undefined reference to
`program_resource_visitor::set_buffer_offset(unsigned int)'

This doesn't seem to be needed for the libgl-gdi target, however.
---
  src/mesa/program/ir_to_mesa.cpp | 4 
  1 file changed, 4 insertions(+)

diff --git a/src/mesa/program/ir_to_mesa.cpp
b/src/mesa/program/ir_to_mesa.cpp
index 10d931c..d9338e0 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -2318,6 +2318,10 @@ public:
 }

  private:
+   virtual void set_buffer_offset(unsigned offset)
+   {
+   }
+
 virtual void visit_field(const glsl_type *type, const char *name,
  bool row_major);




program_resource_visitor::set_buffer_offset is not pure virtual.   There
shouldn't be a need to implement it on derived classes.  The
program_resource_visitor::set_buffer_offset implementation from
src/compiler/glsl/link_uniforms.cpp should normally be picked up.

So, somehow link_uniforms.cpp's symbol is being picked up when building
the libgl-gdi target, but not when building the failing targets.

Usually I'd say link order is the cause of this sort of issue.

But the odd thing is that MSVC is failing too, and unlike GCC's, MSVC's
linker usually is not sensitive to build order.  So the problem might be
more subtle...

Still, it's probably worth checking whether tweaking the build order helps in
any way.


The problem was that our non-clean build was picking up a stale copy of
libglsl.a from before the glsl/ -> compiler/glsl/ move.  Thanks for
finding that, Jose.


This patch is not needed.

-Brian
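
Jose's point about non-pure virtual methods can be illustrated with a small standalone sketch. The class names below are hypothetical, and the base method is defined inline here, whereas in Mesa the definition lives in a separate translation unit (which is why a stale library could leave the symbol undefined at link time).

```cpp
#include <cassert>

// Hypothetical visitor base class with one non-pure and one pure virtual.
class ResourceVisitor {
public:
   virtual ~ResourceVisitor() {}

   int lastOffset = -1;

   // Non-pure virtual: the base class supplies a definition, so derived
   // classes are not required to override it.
   virtual void setBufferOffset(unsigned offset) { lastOffset = (int)offset; }

   // Pure virtual: every concrete derived class must implement this.
   virtual void visitField(const char *name) = 0;
};

// Derived class that only implements the pure virtual method.
class UniformAdder : public ResourceVisitor {
public:
   int fields = 0;
   void visitField(const char *) override { ++fields; }
};
```

The derived class's vtable still refers to the base implementation of setBufferOffset(), so the object file that defines it has to be present at link time even though no derived class mentions it.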



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium/swr: fix issues preventing 32-bit build

2016-03-07 Thread Rowley, Timothy O

> On Mar 4, 2016, at 3:26 PM, Emil Velikov  wrote:
> 
> On 4 March 2016 at 19:28, Tim Rowley  wrote:
>> 
>> diff --git a/src/gallium/drivers/swr/rasterizer/common/os.h 
>> b/src/gallium/drivers/swr/rasterizer/common/os.h
>> index 736d298..522ae0d 100644
>> --- a/src/gallium/drivers/swr/rasterizer/common/os.h
>> +++ b/src/gallium/drivers/swr/rasterizer/common/os.h
>> @@ -81,7 +81,6 @@ typedef CARD8 BOOL;
>> typedef wchar_t    WCHAR;
>> typedef uint16_t   UINT16;
>> typedef int        INT;
>> -typedef int INT32;
>> typedef unsigned int   UINT;
>> typedef uint32_t   UINT32;
>> typedef uint64_t   UINT64;
> If you can remove this abstraction and use plain C types that will be
> amazing. With future commits of course.

There was a pass over the tree removing these types a while back, but
unfortunately the typedefs remained and some uses crept back in.  Working on
cleaning this up.

-Tim
 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeon/uvd: increase max height to 4096 for VI and newer

2016-03-07 Thread Alex Deucher
On Mon, Mar 7, 2016 at 9:51 AM, Leo Liu  wrote:
> From: Tamil velan 
>
> Currently 'mpv --hwdec=vdpau --vo=vdpau ' fails
> for VDPAU decode if the stream height is 4096. VDPAU decode of
> heights up to 4096 is a necessary use case on the amdgpu driver for VI
> and newer platforms.
>
> The fix is in the driver-specific implementation of the "Decoder
> Query Capabilities" API, which now returns 4096 for VI and newer
> platforms. With this fix vdpauinfo reports height support as
> 4096, and mpv VDPAU decode works for streams of height 4096.
>
> Signed-off-by: Tamil velan 
> Reviewed-by: Christian König 
> Cc: "11.0 11.1 11.2" 

Reviewed-by: Alex Deucher 


> ---
>  src/gallium/drivers/radeon/radeon_video.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeon/radeon_video.c 
> b/src/gallium/drivers/radeon/radeon_video.c
> index ec29d8c..e24bbf8 100644
> --- a/src/gallium/drivers/radeon/radeon_video.c
> +++ b/src/gallium/drivers/radeon/radeon_video.c
> @@ -257,7 +257,7 @@ int rvid_get_video_param(struct pipe_screen *screen,
> case PIPE_VIDEO_CAP_MAX_WIDTH:
> return (rscreen->family < CHIP_TONGA) ? 2048 : 4096;
> case PIPE_VIDEO_CAP_MAX_HEIGHT:
> -   return (rscreen->family < CHIP_TONGA) ? 1152 : 2304;
> +   return (rscreen->family < CHIP_TONGA) ? 1152 : 4096;
> case PIPE_VIDEO_CAP_PREFERED_FORMAT:
> return PIPE_FORMAT_NV12;
> case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
> --
> 2.5.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radeon/uvd: increase max height to 4096 for VI and newer

2016-03-07 Thread Leo Liu
From: Tamil velan 

Currently 'mpv --hwdec=vdpau --vo=vdpau ' fails
for VDPAU decode if the stream height is 4096. VDPAU decode of
heights up to 4096 is a necessary use case on the amdgpu driver for VI
and newer platforms.

The fix is in the driver-specific implementation of the "Decoder
Query Capabilities" API, which now returns 4096 for VI and newer
platforms. With this fix vdpauinfo reports height support as
4096, and mpv VDPAU decode works for streams of height 4096.

Signed-off-by: Tamil velan 
Reviewed-by: Christian König 
Cc: "11.0 11.1 11.2" 
---
 src/gallium/drivers/radeon/radeon_video.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/radeon_video.c b/src/gallium/drivers/radeon/radeon_video.c
index ec29d8c..e24bbf8 100644
--- a/src/gallium/drivers/radeon/radeon_video.c
+++ b/src/gallium/drivers/radeon/radeon_video.c
@@ -257,7 +257,7 @@ int rvid_get_video_param(struct pipe_screen *screen,
case PIPE_VIDEO_CAP_MAX_WIDTH:
return (rscreen->family < CHIP_TONGA) ? 2048 : 4096;
case PIPE_VIDEO_CAP_MAX_HEIGHT:
-   return (rscreen->family < CHIP_TONGA) ? 1152 : 2304;
+   return (rscreen->family < CHIP_TONGA) ? 1152 : 4096;
case PIPE_VIDEO_CAP_PREFERED_FORMAT:
return PIPE_FORMAT_NV12;
case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/9] Skip automatic execsize for instructions with a width of 4

2016-03-07 Thread Samuel Iglesias Gonsálvez
On Mon, 2016-03-07 at 16:03 +0200, Pohjolainen, Topi wrote:
> On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias Gonsálvez
> wrote:
> > Hello,
> > 
> > There is only one patch from this series that has been reviewed
> > (patch 1).
> > 
> > Our plan is to start sending patches adding fp64 support to the i965
> > driver in the coming weeks, but they depend on these patches.
> > 
> > Can someone take a look at them? ;)
> 
> I'm interested, although we may also need to involve other people in
> the end.
> Do you have a branch somewhere I could clone?
> 

Yes, we have this one:

https://github.com/Igalia/mesa/commits/i965-fix-execsize

To clone it:

$ git clone -b i965-fix-execsize g...@github.com:Igalia/mesa.git

Thanks,

Sam

> > 
> > Sam
> > 
> > 
> > On Thu, 2015-12-17 at 14:44 +0100, Samuel Iglesias Gonsálvez wrote:
> > > Hello,
> > > 
> > > This patch series is an updated version of the one Iago sent last
> > > week [0] that includes patches for gen6 too, as suggested by
> > > Jason.
> > > 
> > > We checked the gen9 code paths that work with a horizontal width
> > > of 4 and we think there won't be any regression on gen9... but we
> > > don't have any gen9 machine to run piglit with these patches. Can
> > > someone check it?
> > > 
> > > Please read the original cover letter [0] for more information.
> > > 
> > > Sam
> > > 
> > > [0] http://lists.freedesktop.org/archives/mesa-dev/2015-December/102746.html
> > > 
> > > Iago Toral Quiroga (5):
> > >   i965/eu: set correct execution size in brw_NOP
> > >   i965/fs: set execution size for SEND messages in
> > >     generate_uniform_pull_constant_load_gen7
> > >   i965/eu: set execution size for SEND message in
> > >     brw_send_indirect_message
> > >   i965: set correct execsize for MOVS with a width of 4 in
> > >     brw_find_live_channel
> > >   i965: Skip execution size adjustment for instructions of width 4
> > > 
> > > Samuel Iglesias Gonsálvez (4):
> > >   i965/gs/gen6: fix execsize for instructions with width of 4 in
> > >     gen6_sol_program()
> > >   i965/vec4/gen6: fix exec_size for instructions with width of 4 in
> > >     generate_gs_svb_write()
> > >   i965/vec4/gen6: fix exec_size for instructions with destination
> > >     width of 4
> > >   i965/vec4/gen6: fix exec_size for MOV with a width of 4 in
> > >     generate_gs_ff_sync()
> > > 
> > >  src/mesa/drivers/dri/i965/brw_eu_emit.c          | 25 +---
> > >  src/mesa/drivers/dri/i965/brw_ff_gs_emit.c       |  9 -
> > >  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  2 ++
> > >  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 13 +++-
> > >  4 files changed, 44 insertions(+), 5 deletions(-)
> > > 
> 
> 
> 
> 
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/9] Skip automatic execsize for instructions with a width of 4

2016-03-07 Thread Pohjolainen, Topi
On Mon, Mar 07, 2016 at 10:48:49AM +0100, Samuel Iglesias Gonsálvez wrote:
> Hello,
> 
> There is only one patch from this series that has been reviewed (patch
> 1).
> 
> Our plan is to start sending patches adding fp64 support to the i965
> driver in the coming weeks, but they depend on these patches.
> 
> Can someone take a look at them? ;)

I'm interested, although we may also need to involve other people in the end.
Do you have a branch somewhere I could clone?

> 
> Sam
> 
> 
> On Thu, 2015-12-17 at 14:44 +0100, Samuel Iglesias Gonsálvez wrote:
> > Hello,
> > 
> > This patch series is an updated version of the one Iago sent last
> > week [0] that includes patches for gen6 too, as suggested by Jason.
> > 
> > We checked the gen9 code paths that work with a horizontal width of 4
> > and we think there won't be any regression on gen9... but we don't
> > have any gen9 machine to run piglit with these patches. Can someone
> > check it?
> > 
> > Please read the original cover letter [0] for more information.
> > 
> > Sam
> > 
> > [0] http://lists.freedesktop.org/archives/mesa-dev/2015-December/102746.html
> > 
> > Iago Toral Quiroga (5):
> >   i965/eu: set correct execution size in brw_NOP
> >   i965/fs: set execution size for SEND messages in
> >     generate_uniform_pull_constant_load_gen7
> >   i965/eu: set execution size for SEND message in
> >     brw_send_indirect_message
> >   i965: set correct execsize for MOVS with a width of 4 in
> >     brw_find_live_channel
> >   i965: Skip execution size adjustment for instructions of width 4
> > 
> > Samuel Iglesias Gonsálvez (4):
> >   i965/gs/gen6: fix execsize for instructions with width of 4 in
> >     gen6_sol_program()
> >   i965/vec4/gen6: fix exec_size for instructions with width of 4 in
> >     generate_gs_svb_write()
> >   i965/vec4/gen6: fix exec_size for instructions with destination
> >     width of 4
> >   i965/vec4/gen6: fix exec_size for MOV with a width of 4 in
> >     generate_gs_ff_sync()
> > 
> >  src/mesa/drivers/dri/i965/brw_eu_emit.c          | 25 +---
> >  src/mesa/drivers/dri/i965/brw_ff_gs_emit.c       |  9 -
> >  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  2 ++
> >  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 13 +++-
> >  4 files changed, 44 insertions(+), 5 deletions(-)
> > 




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Removing R600_BIG_ENDIAN and using #ifdef instead

2016-03-07 Thread Oded Gabbay
On Mar 6, 2016 2:26 PM, "Marek Olšák"  wrote:
>
> Hi Oded,
>
> I prefer "if" over #ifdef. The idea is that everybody should be able
> to test if the compilation succeeds without a BE machine. #ifdef
> disallows that.
>
> R600_BIG_ENDIAN can be moved to r600_pipe_common.h.
>
> Marek
>
OK, no problem.
I'll move the define as part of the next patch set I'll send.
Oded

>
> On Sun, Mar 6, 2016 at 9:01 AM, Oded Gabbay  wrote:
> > Hi,
> >
> > Do you mind if I totally remove R600_BIG_ENDIAN global variable and
> > instead use in all places #ifdef PIPE_ARCH_BIG_ENDIAN ?
> >
> > It's just that:
> >
> > 1. Checking for R600_BIG_ENDIAN is an extra check which can be
> > eliminated using #ifdef
> >
> > 2. Some files, e.g r600_texture.c, don't know R600_BIG_ENDIAN so I
> > need to use the #ifdef anyway.
> >
> > 3. Other drivers in mesa use #ifdef
> >
> > Thanks,
> >
> >  Oded
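
Marek's preference for a plain "if" over #ifdef can be sketched as below. The assumption here is that PIPE_ARCH_BIG_ENDIAN is defined by the build only on big-endian targets and that R600_BIG_ENDIAN is a 0/1 constant derived from it; byteSwap32 and toDeviceOrder are hypothetical helpers used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the build defines PIPE_ARCH_BIG_ENDIAN on big-endian targets.
#ifdef PIPE_ARCH_BIG_ENDIAN
#define R600_BIG_ENDIAN 1
#else
#define R600_BIG_ENDIAN 0
#endif

// Reverse the byte order of a 32-bit value.
static uint32_t byteSwap32(uint32_t v)
{
   return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
          ((v << 8) & 0x00ff0000u) | (v << 24);
}

// Both branches are parsed and type-checked on every host, so a
// little-endian machine still catches build breaks in the big-endian
// path. The compiler can drop the dead branch since the condition is
// a compile-time constant.
static uint32_t toDeviceOrder(uint32_t v)
{
   if (R600_BIG_ENDIAN)
      return byteSwap32(v);
   return v;
}
```

With #ifdef around the swap, the big-endian code would never be seen by the compiler on a little-endian development machine, which is the drawback raised in the reply.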
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH RFC 0/2] GBM API extension to support fusing KMS and render devices

2016-03-07 Thread Lucas Stach
On Monday, 07.03.2016, 11:19 +0100, Thierry Reding wrote:
> On Mon, Mar 07, 2016 at 10:46:52AM +0100, Lucas Stach wrote:
> > On Friday, 04.03.2016, 18:34 +, Emil Velikov wrote:
> > > On 4 March 2016 at 17:38, Lucas Stach  wrote:
> > > > On Friday, 04.03.2016, 17:20 +, Daniel Stone wrote:
> > > >> Hi,
> > > >>
> > > >> On 4 March 2016 at 16:08, Lucas Stach  wrote:
> > > >> > On Friday, 04.03.2016, 15:09 +, Daniel Stone wrote:
> > > >> >> Thanks for taking this on, it looks really good! I just have the one
> > > >> >> question though - did you look at the EGLDevice extension? Using that
> > > >> >> to enumerate the GPUs, we could create the gbm_device using the KMS
> > > >> >> device and pass that in to the EGLDisplay, with an additional attrib
> > > >> >> to pass in an EGLDevice handle to eglGetPlatformDisplay. This could
> > > >> >> possibly be better since it is more independent of DRM as the API, and
> > > >> >> also allows people to share device enumeration/selection code with
> > > >> >> other platforms (e.g. choosing between multiple GPUs when using a
> > > >> >> winsys like Wayland or X11).
> > > >> >>
> > > >> > I have not looked at this in detail yet, but I think it's just an
> > > >> > extension to the interface outlined by this series.
> > > >> >
> > > >> > If we require the KMS device to have a DRI2/Gallium driver it should be
> > > >> > easy to hook up the EGLDevice discovery for them.
> > > >> > Passing in a second device handle for the KMS device is then just the
> > > >> > EGL implementation calling gbm_device_set_kms_provider() on the render
> > > >> > GBM device, instead of the application doing it manually.
> > > >>
> > > >> It turns the API backwards a bit though ...
> > > >>
> > > >> Right now, what we require is that the GBM device passed in is the KMS
> > > >> device, not the GPU device; what you're suggesting is that we discover
> > > >> the GPU device and then add the KMS device.
> > > >>
> > > >> So, with your proposal:
> > > >> gbm_gpu = gbm_device_create("/dev/dri/renderD128");
> > > >> egl_dpy = eglGetDisplay(gbm_gpu);
> > > >> gbm_kms = gbm_device_create("/dev/dri/card0");
> > > >> gbm_device_set_kms_provider(gbm_gpu, gbm_kms);
> > > >>
> > > >> i.e. the device the user creates first is the GPU device.
> > > >>
> > > >> With EGLDevice, we would have:
> > > >> gbm_kms = gbm_device_create("/dev/dri/card0");
> > > >> egl_gpus = eglGetDevicesEXT();
> > > >> egl_dpy = eglGetPlatformDisplay(gbm_kms, { EGL_TARGET_DEVICE, egl_gpus[0] });
> > > >>
> > > >> So, the first/main device the user deals with is the KMS device - same
> > > >> as today. This makes sense, since GBM is the allocation API for KMS,
> > > >> and EGL should be the one dealing with the GPU ...
> > > >>
> > > > Right, my API design was from my view of GBM being the API to bootstrap
> > > > EGL rendering, but defining it as the KMS allocation API makes a lot
> > > > more sense, when you think about it.
> > > >
> > > >> Maybe it would make sense to reverse the API, so rather than creating
> > > >> a GBM device for the GPU and then linking that to the KMS device -
> > > >> requiring users to make different calls, e.g. gbm_bo_get_kms_bo(),
> > > >> which makes it harder to use and means we need to port current users -
> > > >> we create a GBM device for KMS and then link that to a GPU device.
> > > >> This would then mean that eglGetPlatformDisplay could do the linkage
> > > >> internally, and then existing users using gbm_bo_get_handle() etc
> > > >> would still work without needing any different codepaths.
> > > >
> > > > Yes, this will make the implementation inside GBM a bit more involved,
> > > > but it seems more natural this way around when thinking about hooking it
> > > > up to EGLDevice. I'll try it out and send an updated RFC after the
> > > > weekend.
> > > >
> > > While I'm more inclined to Daniel's suggestion, I wonder why people
> > > moved away from Thierry's approach - creating a composite/wrapped dri
> > > module ? Is there anything wrong with it - be that from technical or
> > > conceptual POV ?
> > > 
> > The wrapped driver takes away the ability of the application to decide
> > which GPUs to bind together - at least if you want to keep things
> > tightly coupled at that level.
> 
> That was actually the prime objective of the patches I posted back at
> the time. =)
> 
> > The point of the explicit application control is that we not only solve
> > the "SoCs have split render/scanout devices" issue, but gain an API for
> > compositors to work properly on PRIME laptop configurations with
> > render/render/scanout. We don't want any autodetection to happen there,
> > a compositor may well decide to use the Intel GPU as scanout only and do
> > all composition on the discrete GPU. Having a tightly coupled wrapped
> > driver for every device combination is not really where we want to go,
> > right?
> 
> To be honest,

Re: [Mesa-dev] [PATCH RFC 0/2] GBM API extension to support fusing KMS and render devices

2016-03-07 Thread Daniel Stone
Hi,

On 7 March 2016 at 10:19, Thierry Reding  wrote:
> On Mon, Mar 07, 2016 at 10:46:52AM +0100, Lucas Stach wrote:
>> On Friday, 04.03.2016, 18:34 +, Emil Velikov wrote:
>> > While I'm more inclined to Daniel's suggestion, I wonder why people
>> > moved away from Thierry's approach - creating a composite/wrapped dri
>> > module ? Is there anything wrong with it - be that from technical or
>> > conceptual POV ?
>> >
>> The wrapped driver takes away the ability of the application to decide
>> which GPUs to bind together - at least if you want to keep things
>> tightly coupled at that level.
>
> That was actually the prime objective of the patches I posted back at
> the time. =)

Which, for the single-GPU/single-scanout case, is perfect. But it's
completely backwards for the multi-GPU case, and they _are_ the exact
same problem, so I'd rather not encourage separate solutions for both.
I really, really, really want to get rid of $DRI_PRIME.

>> The point of the explicit application control is that we not only solve
>> the "SoCs have split render/scanout devices" issue, but gain an API for
>> compositors to work properly on PRIME laptop configurations with
>> render/render/scanout. We don't want any autodetection to happen there,
>> a compositor may well decide to use the Intel GPU as scanout only and do
>> all composition on the discrete GPU. Having a tightly coupled wrapped
>> driver for every device combination is not really where we want to go,
>> right?
>
> To be honest, I don't think we have much of a choice. Most bare-metal
> applications don't make a distinction between render and scanout. They
> will simply assume that you can do both on the same device, because
> that's what their development machine happens to have. So unless we
> make a deliberate decision not to support most applications out there,
> what other options do we have?
>
> While I agree it's good to have an API to allow explicit control over
> association of render to scanout nodes, I think that we really want
> both. In addition to giving users the flexibility if they request it,
> I think we want to give them a sensible default if they don't care.
>
> Especially on systems where there usually isn't a reason to care. Most
> modern SoCs would never want explicit control over the association
> because there usually is only a single render node and a single scanout
> node in the system.

I don't think anyone's arguing otherwise! If there's only one scanout
node, then the application cannot possibly discover any other DRM
device to give to GBM. If there's only one render node, then the EGL
implementation cannot possibly discover anything with EGLDevice. And
there'll always have to be a default render node selection for apps
who don't use EGLDevice (currently all of them), so, no problem.

I think the only disagreement is how to implement the API internally.
Lucas is heading in a more generic direction, whereas the
wrapper-driver approach requires you to write one driver for literally
every possible combination of render/scanout. Again, fine for
platforms like Tegra and i.MX where there will only ever be one
combination ever, but it doesn't scale.

>> As said above: if you want to bind arbitrary combinations of drivers
>> together you need to move away from tight coupling to a shared interface
>> anyway. I don't see how having this interface inside a wrapped driver
>> instead of GBM helps in any way; it's a Mesa-internal interface anyway.
>>
>> We don't need any of this for GLX. Etnaviv is working fine with GLX on
>> both imx-drm and armada-drm, as the DDX does all the work when binding
>> devices together in that case.
>
> In this case DDX will take the role of the wrapped driver. So you'd end
> up with duplication of the "glue" in both Mesa and the DDX, don't you?

The DDX doesn't exist outside X11, thankfully.

Cheers,
Daniel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH RFC 0/2] GBM API extension to support fusing KMS and render devices

2016-03-07 Thread Thierry Reding
On Mon, Mar 07, 2016 at 10:46:52AM +0100, Lucas Stach wrote:
> On Friday, 04.03.2016, 18:34 +, Emil Velikov wrote:
> > On 4 March 2016 at 17:38, Lucas Stach  wrote:
> > > On Friday, 04.03.2016, 17:20 +, Daniel Stone wrote:
> > >> Hi,
> > >>
> > >> On 4 March 2016 at 16:08, Lucas Stach  wrote:
> > >> > On Friday, 04.03.2016, 15:09 +, Daniel Stone wrote:
> > >> >> Thanks for taking this on, it looks really good! I just have the one
> > >> >> question though - did you look at the EGLDevice extension? Using that
> > >> >> to enumerate the GPUs, we could create the gbm_device using the KMS
> > >> >> device and pass that in to the EGLDisplay, with an additional attrib
> > >> >> to pass in an EGLDevice handle to eglGetPlatformDisplay. This could
> > >> >> possibly be better since it is more independent of DRM as the API, and
> > >> >> also allows people to share device enumeration/selection code with
> > >> >> other platforms (e.g. choosing between multiple GPUs when using a
> > >> >> winsys like Wayland or X11).
> > >> >>
> > >> > I have not looked at this in detail yet, but I think it's just an
> > >> > extension to the interface outlined by this series.
> > >> >
> > >> > If we require the KMS device to have a DRI2/Gallium driver it should be
> > >> > easy to hook up the EGLDevice discovery for them.
> > >> > Passing in a second device handle for the KMS device is then just the
> > >> > EGL implementation calling gbm_device_set_kms_provider() on the render
> > >> > GBM device, instead of the application doing it manually.
> > >>
> > >> It turns the API backwards a bit though ...
> > >>
> > >> Right now, what we require is that the GBM device passed in is the KMS
> > >> device, not the GPU device; what you're suggesting is that we discover
> > >> the GPU device and then add the KMS device.
> > >>
> > >> So, with your proposal:
> > >> gbm_gpu = gbm_device_create("/dev/dri/renderD128");
> > >> egl_dpy = eglGetDisplay(gbm_gpu);
> > >> gbm_kms = gbm_device_create("/dev/dri/card0");
> > >> gbm_device_set_kms_provider(gbm_gpu, gbm_kms);
> > >>
> > >> i.e. the device the user creates first is the GPU device.
> > >>
> > >> With EGLDevice, we would have:
> > >> gbm_kms = gbm_device_create("/dev/dri/card0");
> > >> egl_gpus = eglGetDevicesEXT();
> > >> egl_dpy = eglGetPlatformDisplay(gbm_kms, { EGL_TARGET_DEVICE, egl_gpus[0] });
> > >>
> > >> So, the first/main device the user deals with is the KMS device - same
> > >> as today. This makes sense, since GBM is the allocation API for KMS,
> > >> and EGL should be the one dealing with the GPU ...
> > >>
> > > Right, my API design was from my view of GBM being the API to bootstrap
> > > EGL rendering, but defining it as the KMS allocation API makes a lot
> > > more sense, when you think about it.
> > >
> > >> Maybe it would make sense to reverse the API, so rather than creating
> > >> a GBM device for the GPU and then linking that to the KMS device -
> > >> requiring users to make different calls, e.g. gbm_bo_get_kms_bo(),
> > >> which makes it harder to use and means we need to port current users -
> > >> we create a GBM device for KMS and then link that to a GPU device.
> > >> This would then mean that eglGetPlatformDisplay could do the linkage
> > >> internally, and then existing users using gbm_bo_get_handle() etc
> > >> would still work without needing any different codepaths.
> > >
> > > Yes, this will make the implementation inside GBM a bit more involved,
> > > but it seems more natural this way around when thinking about hooking it
> > > up to EGLDevice. I'll try it out and send an updated RFC after the
> > > weekend.
> > >
> > While I'm more inclined to Daniel's suggestion, I wonder why people
> > moved away from Thierry's approach - creating a composite/wrapped dri
> > module ? Is there anything wrong with it - be that from technical or
> > conceptual POV ?
> > 
> The wrapped driver takes away the ability of the application to decide
> which GPUs to bind together - at least if you want to keep things
> tightly coupled at that level.

That was actually the prime objective of the patches I posted back at
the time. =)

> The point of the explicit application control is that we not only solve
> the "SoCs have split render/scanout devices" issue, but gain an API for
> compositors to work properly on PRIME laptop configurations with
> render/render/scanout. We don't want any autodetection to happen there,
> a compositor may well decide to use the Intel GPU as scanout only and do
> all composition on the discrete GPU. Having a tightly coupled wrapped
> driver for every device combination is not really where we want to go,
> right?

To be honest, I don't think we have much of a choice. Most bare-metal
applications don't make a distinction between render and scanout. They
will simply assume that you can do both on the same device, because
that's what their development machine happens to have. So unless we
make a deliberate decision not 

Re: [Mesa-dev] [PATCH 0/9] Skip automatic execsize for instructions with a width of 4

2016-03-07 Thread Samuel Iglesias Gonsálvez
Hello,

There is only one patch from this series that has been reviewed (patch
1).

Our plan is to start sending patches that add fp64 support to the i965
driver in the coming weeks, but they depend on these patches.

Can someone take a look at them? ;)

Sam


On Thu, 2015-12-17 at 14:44 +0100, Samuel Iglesias Gonsálvez wrote:
> Hello,
> 
> This patch series is an updated version of the one Iago sent last
> week [0] that includes patches for gen6 too, as suggested by Jason.
> 
> We checked the gen9 code paths that work with a horizontal width of 4
> and we think there won't be any regression on gen9... but we don't
> have any gen9 machine to run piglit with these patches. Can someone
> check it?
> 
> Please read the original cover letter [0] for more information.
> 
> Sam
> 
> [0] http://lists.freedesktop.org/archives/mesa-dev/2015-December/102746.html
> 
> Iago Toral Quiroga (5):
>   i965/eu: set correct execution size in brw_NOP
>   i965/fs: set execution size for SEND messages in
> generate_uniform_pull_constant_load_gen7
>   i965/eu: set execution size for SEND message in
> brw_send_indirect_message
>   i965: set correct execsize for MOVS with a width of 4 in
> brw_find_live_channel
>   i965: Skip execution size adjustment for instructions of width 4
> 
> Samuel Iglesias Gonsálvez (4):
>   i965/gs/gen6: fix execsize for instructions with width of 4 in
> gen6_sol_program()
>   i965/vec4/gen6: fix exec_size for instructions with width of 4 in
> generate_gs_svb_write()
>   i965/vec4/gen6: fix exec_size for instructions with destination
> width
> of 4
>   i965/vec4/gen6: fix exec_size for MOV with a width of 4 in
> generate_gs_ff_sync()
> 
>  src/mesa/drivers/dri/i965/brw_eu_emit.c  | 25
> +---
>  src/mesa/drivers/dri/i965/brw_ff_gs_emit.c   |  9 -
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  2 ++
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 13 +++-
>  4 files changed, 44 insertions(+), 5 deletions(-)
> 



Re: [Mesa-dev] [PATCH RFC 0/2] GBM API extension to support fusing KMS and render devices

2016-03-07 Thread Lucas Stach
On Friday, 04.03.2016, at 18:34 +0000, Emil Velikov wrote:
> On 4 March 2016 at 17:38, Lucas Stach  wrote:
> > On Friday, 04.03.2016, at 17:20 +0000, Daniel Stone wrote:
> >> Hi,
> >>
> >> On 4 March 2016 at 16:08, Lucas Stach  wrote:
> >> > On Friday, 04.03.2016, at 15:09 +0000, Daniel Stone wrote:
> >> >> Thanks for taking this on, it looks really good! I just have the one
> >> >> question though - did you look at the EGLDevice extension? Using that
> >> >> to enumerate the GPUs, we could create the gbm_device using the KMS
> >> >> device and pass that in to the EGLDisplay, with an additional attrib
> >> >> to pass in an EGLDevice handle to eglGetPlatformDisplay. This could
> >> >> possibly be better since it is more independent of DRM as the API, and
> >> >> also allows people to share device enumeration/selection code with
> >> >> other platforms (e.g. choosing between multiple GPUs when using a
> >> >> winsys like Wayland or X11).
> >> >>
> >> > I have not looked at this in detail yet, but I think it's just an
> >> > extension to the interface outlined by this series.
> >> >
> >> > If we require the KMS device to have a DRI2/Gallium driver it should be
> >> > easy to hook up the EGLDevice discovery for them.
> >> > Passing in a second device handle for the KMS device is then just the
> >> > EGL implementation calling gbm_device_set_kms_provider() on the render
> >> > GBM device, instead of the application doing it manually.
> >>
> >> It turns the API backwards a bit though ...
> >>
> >> Right now, what we require is that the GBM device passed in is the KMS
> >> device, not the GPU device; what you're suggesting is that we discover
> >> the GPU device and then add the KMS device.
> >>
> >> So, with your proposal:
> >> gbm_gpu = gbm_device_create("/dev/dri/renderD128");
> >> egl_dpy = eglGetDisplay(gbm_gpu);
> >> gbm_kms = gbm_device_create("/dev/dri/card0");
> >> gbm_device_set_kms_provider(gbm_gpu, gbm_kms);
> >>
> >> i.e. the device the user creates first is the GPU device.
> >>
> >> With EGLDevice, we would have:
> >> gbm_kms = gbm_device_create("/dev/dri/card0");
> >> egl_gpus = eglGetDevicesEXT();
> >> egl_dpy = eglGetPlatformDisplay(gbm_kms, { EGL_TARGET_DEVICE, egl_gpus[0] 
> >> });
> >>
> >> So, the first/main device the user deals with is the KMS device - same
> >> as today. This makes sense, since GBM is the allocation API for KMS,
> >> and EGL should be the one dealing with the GPU ...
> >>
> > Right, my API design was from my view of GBM being the API to bootstrap
> > EGL rendering, but defining it as the KMS allocation API makes a lot
> > more sense, when you think about it.
> >
> >> Maybe it would make sense to reverse the API, so rather than creating
> >> a GBM device for the GPU and then linking that to the KMS device -
> >> requiring users to make different calls, e.g. gbm_bo_get_kms_bo(),
> >> which makes it harder to use and means we need to port current users -
> >> we create a GBM device for KMS and then link that to a GPU device.
> >> This would then mean that eglGetPlatformDisplay could do the linkage
> >> internally, and then existing users using gbm_bo_get_handle() etc
> >> would still work without needing any different codepaths.
> >
> > Yes, this will make the implementation inside GBM a bit more involved,
> > but it seems more natural this way around when thinking about hooking it
> > up to EGLDevice. I'll try it out and send an updated RFC after the
> > weekend.
> >
> While I'm more inclined to Daniel's suggestion, I wonder why people
> moved away from Thierry's approach - creating a composite/wrapped dri
> module ? Is there anything wrong with it - be that from technical or
> conceptual POV ?
> 
The wrapped driver takes away the ability of the application to decide
which GPUs to bind together - at least if you want to keep things
tightly coupled at that level.

The point of the explicit application control is that we not only solve
the "SoCs have split render/scanout devices" issue, but gain an API for
compositors to work properly on PRIME laptop configurations with
render/render/scanout. We don't want any autodetection to happen there,
a compositor may well decide to use the Intel GPU as scanout only and do
all composition on the discrete GPU. Having a tightly coupled wrapped
driver for every device combination is not really where we want to go,
right?

Also the wrapped approach obscures resource usage from the backing GPU
drivers. We have a much better resource usage tracking on Etnaviv if we
get rid of the wrapping driver. This allows us to skip some of the
resource flush requests from the state tracker, when the resource has
not changed. Flushing a resource might mean to copy a 1080p (or possibly
even bigger) frame around, so having better control over resource usage
is quite a win.

> I believe it has a few advantages over the above two proposals - it
> allows greater flexibility as both drivers will be tightly coupled and
> can communicate directly, does not

Re: [Mesa-dev] [PATCH] gallium/radeon: don't use temporary buffers for persistent mappings

2016-03-07 Thread Michel Dänzer
On 02.03.2016 06:26, Marek Olšák wrote:
> From: Marek Olšák 
> 
> Cc: 11.1 11.2 
> ---
>  src/gallium/drivers/radeon/r600_buffer_common.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
> b/src/gallium/drivers/radeon/r600_buffer_common.c
> index b384baa..81409ce 100644
> --- a/src/gallium/drivers/radeon/r600_buffer_common.c
> +++ b/src/gallium/drivers/radeon/r600_buffer_common.c
> @@ -314,7 +314,8 @@ static void *r600_buffer_transfer_map(struct pipe_context 
> *ctx,
>   }
>   }
>   else if ((usage & PIPE_TRANSFER_DISCARD_RANGE) &&
> -  !(usage & PIPE_TRANSFER_UNSYNCHRONIZED) &&
> +  !(usage & (PIPE_TRANSFER_UNSYNCHRONIZED |
> + PIPE_TRANSFER_PERSISTENT)) &&
>!(rscreen->debug_flags & DBG_NO_DISCARD_RANGE) &&
>r600_can_dma_copy_buffer(rctx, box->x, 0, box->width)) {
>   assert(usage & PIPE_TRANSFER_WRITE);
> @@ -341,7 +342,8 @@ static void *r600_buffer_transfer_map(struct pipe_context 
> *ctx,
>   }
>   /* Using a staging buffer in GTT for larger reads is much faster. */
>   else if ((usage & PIPE_TRANSFER_READ) &&
> -  !(usage & PIPE_TRANSFER_WRITE) &&
> +  !(usage & (PIPE_TRANSFER_WRITE |
> + PIPE_TRANSFER_PERSISTENT)) &&
>rbuffer->domains == RADEON_DOMAIN_VRAM &&
>r600_can_dma_copy_buffer(rctx, 0, box->x, box->width)) {
>   struct r600_resource *staging;
> 

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer



[Mesa-dev] [PATCH 09/14] vc4: adapt to new sized alu types

2016-03-07 Thread Samuel Iglesias Gonsálvez
From: Iago Toral Quiroga 

CC: Eric Anholt 
---
 src/gallium/drivers/vc4/vc4_program.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/vc4/vc4_program.c 
b/src/gallium/drivers/vc4/vc4_program.c
index 5c91c02..6f27665 100644
--- a/src/gallium/drivers/vc4/vc4_program.c
+++ b/src/gallium/drivers/vc4/vc4_program.c
@@ -885,7 +885,10 @@ ntq_emit_comparison(struct vc4_compile *c, struct qreg 
*dest,
 struct qreg src0 = ntq_get_alu_src(c, compare_instr, 0);
 struct qreg src1 = ntq_get_alu_src(c, compare_instr, 1);
 
-if (nir_op_infos[compare_instr->op].input_types[0] == nir_type_float)
+unsigned unsized_type =
+nir_op_infos[compare_instr->op].input_types[0] &
+NIR_ALU_TYPE_BASE_TYPE_MASK;
+if (unsized_type == nir_type_float)
 qir_SF(c, qir_FSUB(c, src0, src1));
 else
 qir_SF(c, qir_SUB(c, src0, src1));
-- 
2.7.0



[Mesa-dev] [PATCH 10/14] nir: add nir_src_bit_size() helper

2016-03-07 Thread Samuel Iglesias Gonsálvez
From: Connor Abbott 

---
 src/compiler/nir/nir.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index d2fd23d..39aad02 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -557,6 +557,15 @@ nir_dest_for_reg(nir_register *reg)
return dest;
 }
 
+static inline unsigned
+nir_src_bit_size(nir_src src)
+{
+   if (src.is_ssa)
+  return src.ssa->bit_size;
+
+   return src.reg.reg->bit_size;
+}
+
 void nir_src_copy(nir_src *dest, const nir_src *src, void *instr_or_if);
 void nir_dest_copy(nir_dest *dest, const nir_dest *src, nir_instr *instr);
 
-- 
2.7.0



[Mesa-dev] [PATCH 13/14] nir: propagate bitsize information in nir_search

2016-03-07 Thread Samuel Iglesias Gonsálvez
From: Connor Abbott 

When we replace an expression we have to compute bitsize information for the
replacement. We do this in two passes to validate that bitsize information
is consistent and correct: first we propagate bitsize from child nodes to
parent, then we do it the other way around, starting from the original
instruction's destination bitsize.
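
The two passes described above can be sketched roughly as follows. This is
only an illustration, not the code in nir_search.c: struct sketch_node and
the sketch_* functions are hypothetical names, and the sketch assumes that
every operand of a node shares that node's bit size (real opcodes such as
comparisons do not follow that rule).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical expression node; not the real nir_search_value. */
struct sketch_node {
   unsigned bit_size;              /* 0 means "not known yet" */
   unsigned num_children;
   struct sketch_node *children[2];
};

/* Pass 1: propagate sizes from child nodes to their parent.
 * Returns false if two known sizes disagree. */
static bool
sketch_propagate_up(struct sketch_node *n)
{
   for (unsigned i = 0; i < n->num_children; i++) {
      struct sketch_node *c = n->children[i];

      if (!sketch_propagate_up(c))
         return false;
      if (c->bit_size == 0)
         continue;                 /* child still unknown, nothing to learn */
      if (n->bit_size == 0)
         n->bit_size = c->bit_size;
      else if (n->bit_size != c->bit_size)
         return false;
   }
   return true;
}

/* Pass 2: push the size back down, starting from the bit size of the
 * destination of the instruction being replaced. */
static bool
sketch_propagate_down(struct sketch_node *n, unsigned bit_size)
{
   if (n->bit_size == 0)
      n->bit_size = bit_size;
   else if (n->bit_size != bit_size)
      return false;

   for (unsigned i = 0; i < n->num_children; i++) {
      if (!sketch_propagate_down(n->children[i], n->bit_size))
         return false;
   }
   return true;
}

/* Run both passes; true means every node ended up with a consistent size. */
static bool
sketch_infer_bit_sizes(struct sketch_node *root, unsigned dest_bit_size)
{
   return sketch_propagate_up(root) &&
          sketch_propagate_down(root, dest_bit_size);
}
```

Running the upward pass first lets a sized leaf (say a 64-bit constant)
constrain its parent before the destination size is imposed, so a mismatch
shows up as a failure instead of being silently overwritten.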

v2 (Iago):
- Always use nir_type_bool32 instead of nir_type_bool when generating
  algebraic optimizations. Before we used nir_type_bool32 with constants
  and nir_type_bool with variables.
- Fix bool comparisons in nir_search.c to account for bitsized types.

v3 (Sam):
- Unpack the double constant value as unsigned long long (8 bytes) in
  nir_algebraic.py.
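
For readers more at home in C than in Python's struct module, the v3 change
(packing as 'd' and unpacking as 'Q' instead of 'f' and 'I') amounts to
reading the raw 64-bit pattern of a double instead of the 32-bit pattern of
a float. A minimal sketch, assuming IEEE 754 floating point; the sketch_*
helpers are hypothetical and not part of Mesa:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a double, like unpack('Q', pack('d', value)). */
static uint64_t
sketch_double_bits(double value)
{
   uint64_t bits;
   memcpy(&bits, &value, sizeof(bits));   /* avoids aliasing problems */
   return bits;
}

/* Raw bit pattern of a float, like unpack('I', pack('f', value)). */
static uint32_t
sketch_float_bits(float value)
{
   uint32_t bits;
   memcpy(&bits, &value, sizeof(bits));
   return bits;
}
```

The two encodings of the same number differ, which is why the generated
table has to agree with the field that nir_search.c compares against.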

Signed-off-by: Iago Toral Quiroga 
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/compiler/nir/nir_algebraic.py |  22 +++-
 src/compiler/nir/nir_search.c | 244 ++
 src/compiler/nir/nir_search.h |   8 +-
 3 files changed, 247 insertions(+), 27 deletions(-)

diff --git a/src/compiler/nir/nir_algebraic.py 
b/src/compiler/nir/nir_algebraic.py
index 2357b57..1818877 100644
--- a/src/compiler/nir/nir_algebraic.py
+++ b/src/compiler/nir/nir_algebraic.py
@@ -63,11 +63,11 @@ class Value(object):
 static const ${val.c_type} ${val.name} = {
{ ${val.type_enum} },
 % if isinstance(val, Constant):
-   { ${hex(val)} /* ${val.value} */ },
+   ${val.type()}, { ${hex(val)} /* ${val.value} */ },
 % elif isinstance(val, Variable):
${val.index}, /* ${val.var_name} */
${'true' if val.is_constant else 'false'},
-   nir_type_${ val.required_type or 'invalid' },
+   ${val.type() or 'nir_type_invalid' },
 % elif isinstance(val, Expression):
nir_op_${val.opcode},
{ ${', '.join(src.c_ptr for src in val.sources)} },
@@ -107,10 +107,18 @@ class Constant(Value):
   if isinstance(self.value, (int, long)):
  return hex(self.value)
   elif isinstance(self.value, float):
- return hex(struct.unpack('I', struct.pack('f', self.value))[0])
+ return hex(struct.unpack('Q', struct.pack('d', self.value))[0])
   else:
  assert False
 
+   def type(self):
+  if isinstance(self.value, (bool)):
+ return "nir_type_bool32"
+  elif isinstance(self.value, (int, long)):
+ return "nir_type_int"
+  elif isinstance(self.value, float):
+ return "nir_type_float"
+
 _var_name_re = re.compile(r"(?P<const>#)?(?P<name>\w+)(?:@(?P<type>\w+))?")
 
 class Variable(Value):
@@ -129,6 +137,14 @@ class Variable(Value):
 
   self.index = varset[self.var_name]
 
+   def type(self):
+  if self.required_type == 'bool':
+ return "nir_type_bool32"
+  elif self.required_type in ('int', 'unsigned'):
+ return "nir_type_int"
+  elif self.required_type == 'float':
+ return "nir_type_float"
+
 class Expression(Value):
def __init__(self, expr, name_base, varset):
   Value.__init__(self, name_base, "expression")
diff --git a/src/compiler/nir/nir_search.c b/src/compiler/nir/nir_search.c
index f509ce6..e874c79 100644
--- a/src/compiler/nir/nir_search.c
+++ b/src/compiler/nir/nir_search.c
@@ -62,7 +62,8 @@ alu_instr_is_bool(nir_alu_instr *instr)
case nir_op_inot:
   return src_is_bool(instr->src[0].src);
default:
-  return nir_op_infos[instr->op].output_type == nir_type_bool;
+  return (nir_op_infos[instr->op].output_type & 
NIR_ALU_TYPE_BASE_TYPE_MASK)
+ == nir_type_bool;
}
 }
 
@@ -125,8 +126,10 @@ match_value(const nir_search_value *value, nir_alu_instr 
*instr, unsigned src,
 nir_alu_instr *src_alu =
nir_instr_as_alu(instr->src[src].src.ssa->parent_instr);
 
-if (nir_op_infos[src_alu->op].output_type != var->type &&
-!(var->type == nir_type_bool && alu_instr_is_bool(src_alu)))
+if ((nir_op_infos[src_alu->op].output_type &
+ NIR_ALU_TYPE_BASE_TYPE_MASK) != var->type &&
+!((var->type & NIR_ALU_TYPE_BASE_TYPE_MASK) == nir_type_bool &&
+  alu_instr_is_bool(src_alu)))
return false;
  }
 
@@ -158,21 +161,65 @@ match_value(const nir_search_value *value, nir_alu_instr 
*instr, unsigned src,
   nir_load_const_instr *load =
  nir_instr_as_load_const(instr->src[src].src.ssa->parent_instr);
 
-  switch (nir_op_infos[instr->op].input_types[src]) {
+  switch (const_val->type) {
   case nir_type_float:
  for (unsigned i = 0; i < num_components; ++i) {
-if (load->value.f[new_swizzle[i]] != const_val->data.f)
+double val;
+switch (load->def.bit_size) {
+case 32:
+   val = load->value.f[new_swizzle[i]];
+   break;
+case 64:
+   val = load->value.d[new_swizzle[i]];
+   break;
+default:
+   unreachable("unknown bit size");
+}
+
+if (val != const_val->data.d)
   

[Mesa-dev] [PATCH 12/14] nir: add a bit_size parameter to nir_ssa_dest_init

2016-03-07 Thread Samuel Iglesias Gonsálvez
From: Connor Abbott 

v2: Squash multiple commits addressing the new parameter in different
files so we don't break the build (Iago)

v3: Fix tgsi (Samuel)

v4: Fix nir_clone.c (Samuel)

v5: Fix vc4 and freedreno (Iago)

CC: Eric Anholt 
CC: Rob Clark 

Signed-off-by: Iago Toral Quiroga 
Signed-off-by: Samuel Iglesias Gonsalvez 
Tested-by: Rob Clark 
---
 src/compiler/nir/glsl_to_nir.cpp   | 22 +--
 src/compiler/nir/nir.c | 14 +
 src/compiler/nir/nir.h |  6 ++--
 src/compiler/nir/nir_builder.h | 33 ++
 src/compiler/nir/nir_clone.c   |  3 +-
 src/compiler/nir/nir_from_ssa.c|  6 ++--
 src/compiler/nir/nir_lower_alu_to_scalar.c | 10 ---
 src/compiler/nir/nir_lower_atomics.c   |  6 ++--
 src/compiler/nir/nir_lower_clip.c  |  2 +-
 src/compiler/nir/nir_lower_io.c|  3 +-
 src/compiler/nir/nir_lower_locals_to_regs.c|  7 +++--
 src/compiler/nir/nir_lower_phis_to_scalar.c| 10 +--
 src/compiler/nir/nir_lower_tex.c   |  2 +-
 src/compiler/nir/nir_lower_two_sided_color.c   |  2 +-
 src/compiler/nir/nir_lower_var_copies.c|  5 +++-
 src/compiler/nir/nir_lower_vars_to_ssa.c   | 12 ++--
 src/compiler/nir/nir_opt_peephole_select.c |  3 +-
 src/compiler/nir/nir_search.c  |  5 ++--
 src/compiler/nir/nir_to_ssa.c  |  6 ++--
 src/gallium/auxiliary/nir/tgsi_to_nir.c| 14 -
 .../drivers/freedreno/ir3/ir3_nir_lower_if_else.c  |  2 +-
 src/gallium/drivers/vc4/vc4_nir_lower_blend.c  |  4 +--
 src/gallium/drivers/vc4/vc4_nir_lower_io.c |  6 ++--
 src/gallium/drivers/vc4/vc4_nir_lower_txf_ms.c |  2 +-
 src/gallium/drivers/vc4/vc4_program.c  |  2 +-
 .../drivers/dri/i965/brw_nir_opt_peephole_ffma.c   |  7 +++--
 src/mesa/program/prog_to_nir.c | 10 +++
 27 files changed, 131 insertions(+), 73 deletions(-)

diff --git a/src/compiler/nir/glsl_to_nir.cpp b/src/compiler/nir/glsl_to_nir.cpp
index a23fba7..5ca81de 100644
--- a/src/compiler/nir/glsl_to_nir.cpp
+++ b/src/compiler/nir/glsl_to_nir.cpp
@@ -759,7 +759,7 @@ nir_visitor::visit(ir_call *ir)
  ir_dereference *param =
 (ir_dereference *) ir->actual_parameters.get_head();
  instr->variables[0] = evaluate_deref(&instr->instr, param);
- nir_ssa_dest_init(&instr->instr, &instr->dest, 1, NULL);
+ nir_ssa_dest_init(&instr->instr, &instr->dest, 1, 32, NULL);
  nir_builder_instr_insert(&b, &instr->instr);
  break;
   }
@@ -793,7 +793,7 @@ nir_visitor::visit(ir_call *ir)
 const nir_intrinsic_info *info =
 &nir_intrinsic_infos[instr->intrinsic];
 nir_ssa_dest_init(&instr->instr, &instr->dest,
-  info->dest_components, NULL);
+  info->dest_components, 32, NULL);
  }
 
  if (op == nir_intrinsic_image_size ||
@@ -854,7 +854,7 @@ nir_visitor::visit(ir_call *ir)
  nir_builder_instr_insert(&b, &instr->instr);
  break;
   case nir_intrinsic_shader_clock:
- nir_ssa_dest_init(&instr->instr, &instr->dest, 1, NULL);
+ nir_ssa_dest_init(&instr->instr, &instr->dest, 1, 32, NULL);
  nir_builder_instr_insert(&b, &instr->instr);
  break;
   case nir_intrinsic_store_ssbo: {
@@ -895,7 +895,7 @@ nir_visitor::visit(ir_call *ir)
 
  /* Setup destination register */
  nir_ssa_dest_init(&instr->instr, &instr->dest,
-   type->vector_elements, NULL);
+   type->vector_elements, 32, NULL);
 
  /* Insert the created nir instruction now since in the case of boolean
   * result we will need to emit another instruction after it
@@ -918,7 +918,7 @@ nir_visitor::visit(ir_call *ir)
load_ssbo_compare->src[1].swizzle[i] = 0;
 nir_ssa_dest_init(&load_ssbo_compare->instr,
   &load_ssbo_compare->dest.dest,
-  type->vector_elements, NULL);
+  type->vector_elements, 32, NULL);
 load_ssbo_compare->dest.write_mask = (1 << type->vector_elements) 
- 1;
 nir_builder_instr_insert(&b, &load_ssbo_compare->instr);
 dest = &load_ssbo_compare->dest.dest;
@@ -964,7 +964,7 @@ nir_visitor::visit(ir_call *ir)
  /* Atomic result */
  assert(ir->return_deref);
  nir_ssa_dest_init(&instr->instr, &instr->dest,
-   ir->return_deref->type->vector_elements, NULL);
+   ir->return_deref->type->vector_elements, 32, NULL);
  nir_builder_instr_insert(&b, &instr->instr);
  break;
   }
@@ -979,8 +979,9 @@ ni

[Mesa-dev] [PATCH 14/14] i965/nir: fix check to resolve booleans to work with sized nir_alu_type

2016-03-07 Thread Samuel Iglesias Gonsálvez
Now that nir_alu_type embeds the data size, the check for the
instruction's output type (to see if a boolean resolve is required)
should ignore the data size part.
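
A small sketch of why the mask is needed. The enum below is a stand-in for
nir_alu_type, with values assumed from the layout introduced in patch 01/14
(base type in the low three bits, size in the bits above); all sketch_* and
SKETCH_* names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for nir_alu_type; the numeric values are an assumption. */
enum sketch_alu_type {
   SKETCH_TYPE_INVALID = 0,
   SKETCH_TYPE_FLOAT,
   SKETCH_TYPE_INT,
   SKETCH_TYPE_UINT,
   SKETCH_TYPE_BOOL,
   SKETCH_TYPE_BOOL32  = 32 | SKETCH_TYPE_BOOL,
   SKETCH_TYPE_FLOAT64 = 64 | SKETCH_TYPE_FLOAT,
};

#define SKETCH_SIZE_MASK      0xfffffff8u
#define SKETCH_BASE_TYPE_MASK 0x00000007u

/* Size part of the type: 0 for unsized types, otherwise 8/16/32/64. */
static unsigned
sketch_type_size(enum sketch_alu_type type)
{
   return type & SKETCH_SIZE_MASK;
}

/* True for both the unsized bool and any sized bool. */
static bool
sketch_is_bool(enum sketch_alu_type type)
{
   return (type & SKETCH_BASE_TYPE_MASK) == SKETCH_TYPE_BOOL;
}
```

A plain `type == SKETCH_TYPE_BOOL` comparison would miss
SKETCH_TYPE_BOOL32, which is the case this patch guards against.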

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c 
b/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c
index 56e15ef..bd986cb 100644
--- a/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c
+++ b/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c
@@ -165,7 +165,7 @@ analyze_boolean_resolves_block(nir_block *block, void 
*void_state)
  }
 
  default:
-if (nir_op_infos[alu->op].output_type == nir_type_bool) {
+if ((nir_op_infos[alu->op].output_type & 
NIR_ALU_TYPE_BASE_TYPE_MASK) == nir_type_bool) {
/* This instructions will turn into a CMP when we actually emit
 * them so the result will have to be resolved before it can be
 * used.
-- 
2.7.0



[Mesa-dev] [PATCH 11/14] nir: add nir_dest_bit_size() helper

2016-03-07 Thread Samuel Iglesias Gonsálvez
From: Connor Abbott 

---
 src/compiler/nir/nir.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 39aad02..c7e4dcc 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -566,6 +566,15 @@ nir_src_bit_size(nir_src src)
return src.reg.reg->bit_size;
 }
 
+static inline unsigned
+nir_dest_bit_size(nir_dest dest)
+{
+   if (dest.is_ssa)
+  return dest.ssa.bit_size;
+
+   return dest.reg.reg->bit_size;
+}
+
 void nir_src_copy(nir_src *dest, const nir_src *src, void *instr_or_if);
 void nir_dest_copy(nir_dest *dest, const nir_dest *src, nir_instr *instr);
 
-- 
2.7.0



[Mesa-dev] [PATCH 08/14] i965: fix brw_glsl_base_type_for_nir_type() for sized types

2016-03-07 Thread Samuel Iglesias Gonsálvez
From: Connor Abbott 

This should only see sized types, but we can't do that until we have fixed NIR
and the driver to make this happen. A later commit will address this.
---
 src/mesa/drivers/dri/i965/brw_nir.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index c9472af..ea3450e 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -648,12 +648,18 @@ brw_glsl_base_type_for_nir_type(nir_alu_type type)
 {
switch (type) {
case nir_type_float:
+   case nir_type_float32:
   return GLSL_TYPE_FLOAT;
 
+   case nir_type_float64:
+  return GLSL_TYPE_DOUBLE;
+
case nir_type_int:
+   case nir_type_int32:
   return GLSL_TYPE_INT;
 
case nir_type_uint:
+   case nir_type_uint32:
   return GLSL_TYPE_UINT;
 
default:
-- 
2.7.0



[Mesa-dev] [PATCH 03/14] nir: Add a bit_size to nir_register and nir_ssa_def

2016-03-07 Thread Samuel Iglesias Gonsálvez
From: Jason Ekstrand 

This really hacky commit adds a bit size to registers and SSA values.  It
also adds rules in the validator to validate that they do the right things.

It's still an open question as to whether or not we want a bit_size in
nir_alu_instr or if we just want to let it inherit from the destination.
I'm inclined to just let it inherit from the destination.  A similar
question needs to be asked about intrinsics.

v2 (Connor):
  - Relax validation: comparisons have explicit destination sizes
and implicit source sizes.
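
As a rough, standalone restatement of the source-size rule that the
validator change below enforces (a sketch only; the SKETCH_* constants and
sketch_src_size_ok() are hypothetical, and the type encoding is assumed from
patch 01/14):

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed encoding: base type in the low 3 bits, bit size above them. */
#define SKETCH_TYPE_FLOAT     1u
#define SKETCH_TYPE_INT       2u
#define SKETCH_TYPE_BOOL      4u
#define SKETCH_SIZE_MASK      0xfffffff8u
#define SKETCH_BASE_TYPE_MASK 0x00000007u

/* Returns true if a source of src_bit_size bits is acceptable for an
 * opcode whose source and output types are src_type and output_type. */
static bool
sketch_src_size_ok(unsigned src_type, unsigned output_type,
                   unsigned src_bit_size, unsigned dest_bit_size)
{
   /* Floats only exist in 16, 32 and 64 bits. */
   if ((src_type & SKETCH_BASE_TYPE_MASK) == SKETCH_TYPE_FLOAT &&
       src_bit_size != 16 && src_bit_size != 32 && src_bit_size != 64)
      return false;

   /* An explicitly sized source must match that size exactly. */
   unsigned explicit_size = src_type & SKETCH_SIZE_MASK;
   if (explicit_size)
      return explicit_size == src_bit_size;

   /* Unsized source and unsized output: both follow the destination. */
   if (!(output_type & SKETCH_SIZE_MASK))
      return dest_bit_size == src_bit_size;

   /* Unsized source, sized output (e.g. comparisons): no constraint. */
   return true;
}
```

The last branch is the v2 relaxation: a comparison of two 64-bit floats may
write a 32-bit boolean, so the source and destination sizes are allowed to
differ there.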
---
 src/compiler/nir/nir.c  |  2 ++
 src/compiler/nir/nir.h  |  6 ++
 src/compiler/nir/nir_validate.c | 42 +
 3 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index df40a55..de17305 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -68,6 +68,7 @@ reg_create(void *mem_ctx, struct exec_list *list)
list_inithead(&reg->if_uses);
 
reg->num_components = 0;
+   reg->bit_size = 32;
reg->num_array_elems = 0;
reg->is_packed = false;
reg->name = NULL;
@@ -1286,6 +1287,7 @@ nir_ssa_def_init(nir_instr *instr, nir_ssa_def *def,
list_inithead(&def->uses);
list_inithead(&def->if_uses);
def->num_components = num_components;
+   def->bit_size = 32; /* FIXME: Add an input parameter or guess? */
 
if (instr->block) {
   nir_function_impl *impl =
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 659e98c..d493186 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -337,6 +337,9 @@ typedef struct nir_register {
unsigned num_components; /** < number of vector components */
unsigned num_array_elems; /** < size of array (0 for no array) */
 
+   /* The bit-size of each channel; must be one of 8, 16, 32, or 64 */
+   uint8_t bit_size;
+
/** generic register index. */
unsigned index;
 
@@ -444,6 +447,9 @@ typedef struct nir_ssa_def {
struct list_head if_uses;
 
uint8_t num_components;
+
+   /* The bit-size of each channel; must be one of 8, 16, 32, or 64 */
+   uint8_t bit_size;
 } nir_ssa_def;
 
 struct nir_src;
diff --git a/src/compiler/nir/nir_validate.c b/src/compiler/nir/nir_validate.c
index d1a9048..e6e8347 100644
--- a/src/compiler/nir/nir_validate.c
+++ b/src/compiler/nir/nir_validate.c
@@ -179,9 +179,12 @@ validate_alu_src(nir_alu_instr *instr, unsigned index, 
validate_state *state)
nir_alu_src *src = &instr->src[index];
 
unsigned num_components;
-   if (src->src.is_ssa)
+   unsigned src_bit_size;
+   if (src->src.is_ssa) {
+  src_bit_size = src->src.ssa->bit_size;
   num_components = src->src.ssa->num_components;
-   else {
+   } else {
+  src_bit_size = src->src.reg.reg->bit_size;
   if (src->src.reg.reg->is_packed)
  num_components = 4; /* can't check anything */
   else
@@ -194,6 +197,24 @@ validate_alu_src(nir_alu_instr *instr, unsigned index, 
validate_state *state)
  assert(src->swizzle[i] < num_components);
}
 
+   nir_alu_type src_type = nir_op_infos[instr->op].input_types[index];
+
+   /* 8-bit float isn't a thing */
+   if ((src_type & NIR_ALU_TYPE_BASE_TYPE_MASK) == nir_type_float)
+  assert(src_bit_size == 16 || src_bit_size == 32 || src_bit_size == 64);
+
+   if (src_type & NIR_ALU_TYPE_SIZE_MASK) {
+  /* This source has an explicit bit size */
+  assert((src_type & NIR_ALU_TYPE_SIZE_MASK) == src_bit_size);
+   } else {
+  if (!(nir_op_infos[instr->op].output_type & NIR_ALU_TYPE_SIZE_MASK)) {
+ unsigned dest_bit_size =
+instr->dest.dest.is_ssa ? instr->dest.dest.ssa.bit_size
+: instr->dest.dest.reg.reg->bit_size;
+ assert(dest_bit_size == src_bit_size);
+  }
+   }
+
validate_src(&src->src, state);
 }
 
@@ -263,8 +284,10 @@ validate_dest(nir_dest *dest, validate_state *state)
 }
 
 static void
-validate_alu_dest(nir_alu_dest *dest, validate_state *state)
+validate_alu_dest(nir_alu_instr *instr, validate_state *state)
 {
+   nir_alu_dest *dest = &instr->dest;
+
unsigned dest_size =
   dest->dest.is_ssa ? dest->dest.ssa.num_components
 : dest->dest.reg.reg->num_components;
@@ -282,6 +305,17 @@ validate_alu_dest(nir_alu_dest *dest, validate_state 
*state)
assert(nir_op_infos[alu->op].output_type == nir_type_float ||
   !dest->saturate);
 
+   unsigned bit_size = dest->dest.is_ssa ? dest->dest.ssa.bit_size
+ : dest->dest.reg.reg->bit_size;
+   nir_alu_type type = nir_op_infos[instr->op].output_type;
+
+   /* 8-bit float isn't a thing */
+   if ((type & NIR_ALU_TYPE_BASE_TYPE_MASK) == nir_type_float)
+  assert(bit_size == 16 || bit_size == 32 || bit_size == 64);
+
+   assert((type & NIR_ALU_TYPE_SIZE_MASK) == 0 ||
+  (type & NIR_ALU_TYPE_SIZE_MASK) == bit_size);
+
validate_dest(&dest->dest, state);
 }
 
@@ -294,7 +328,

[Mesa-dev] [PATCH 04/14] nir: add double constant types

2016-03-07 Thread Samuel Iglesias Gonsálvez
From: Connor Abbott 

---
 src/compiler/nir/nir.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index d493186..d2fd23d 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -99,6 +99,7 @@ union nir_constant_data {
int i[16];
float f[16];
bool b[16];
+   double d[16];
 };
 
 typedef struct nir_constant {
@@ -1177,8 +1178,11 @@ nir_tex_instr_src_index(nir_tex_instr *instr, 
nir_tex_src_type type)
 typedef struct {
union {
   float f[4];
+  double d[4];
   int32_t i[4];
   uint32_t u[4];
+  int64_t l[4];
+  uint64_t ul[4];
};
 } nir_const_value;
 
-- 
2.7.0



[Mesa-dev] [PATCH 07/14] i965: fix brw_type_for_nir_type() for sized types

2016-03-07 Thread Samuel Iglesias Gonsálvez
From: Connor Abbott 

This should only see sized types, but we can't make that change
until we make sure that nir uses the sized versions in all the
relevant places. A later commit will address this.
---
 src/mesa/drivers/dri/i965/brw_nir.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index a5949d5..c9472af 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -615,12 +615,24 @@ brw_type_for_nir_type(nir_alu_type type)
 {
switch (type) {
case nir_type_uint:
+   case nir_type_uint32:
   return BRW_REGISTER_TYPE_UD;
case nir_type_bool:
case nir_type_int:
+   case nir_type_bool32:
+   case nir_type_int32:
   return BRW_REGISTER_TYPE_D;
case nir_type_float:
+   case nir_type_float32:
   return BRW_REGISTER_TYPE_F;
+   case nir_type_float64:
+  return BRW_REGISTER_TYPE_DF;
+   case nir_type_int64:
+   case nir_type_uint64:
+  /* TODO we should only see these in moves, so for now it's ok, but when
+   * we add actual 64-bit integer support we should fix this.
+   */
+  return BRW_REGISTER_TYPE_DF;
default:
   unreachable("unknown type");
}
-- 
2.7.0



[Mesa-dev] [PATCH 01/14] nir: Add explicitly sized types

2016-03-07 Thread Samuel Iglesias Gonsálvez
From: Jason Ekstrand 

v2: Fix size/type mask to properly handle 8-bit types.

Signed-off-by: Juan A. Suarez Romero 
---
 src/compiler/nir/nir.h | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index cccb3a4..659e98c 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -605,9 +605,24 @@ typedef enum {
    nir_type_float,
    nir_type_int,
    nir_type_uint,
-   nir_type_bool
+   nir_type_bool,
+   nir_type_bool32 =    32 | nir_type_bool,
+   nir_type_int8 =      8  | nir_type_int,
+   nir_type_int16 =     16 | nir_type_int,
+   nir_type_int32 =     32 | nir_type_int,
+   nir_type_int64 =     64 | nir_type_int,
+   nir_type_uint8 =     8  | nir_type_uint,
+   nir_type_uint16 =    16 | nir_type_uint,
+   nir_type_uint32 =    32 | nir_type_uint,
+   nir_type_uint64 =    64 | nir_type_uint,
+   nir_type_float16 =   16 | nir_type_float,
+   nir_type_float32 =   32 | nir_type_float,
+   nir_type_float64 =   64 | nir_type_float,
 } nir_alu_type;
 
+#define NIR_ALU_TYPE_SIZE_MASK 0xfffffff8
+#define NIR_ALU_TYPE_BASE_TYPE_MASK 0x00000007
+
 typedef enum {
    NIR_OP_IS_COMMUTATIVE = (1 << 0),
    NIR_OP_IS_ASSOCIATIVE = (1 << 1),
-- 
2.7.0
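
For readers following the encoding in the patch above: a sized type is the bit size OR'ed with the base type, which works because every size used (8, 16, 32, 64) is a multiple of 8 and so leaves the low three bits free. Below is a minimal standalone sketch of splitting such a value back apart. All demo_* names are hypothetical, and the numeric base values 1..4 are an assumption made for the sketch; nothing here is taken from nir.h beyond the general scheme.

```c
#include <assert.h>

/* Hypothetical mirror of the scheme in the patch. The base values 1..4
 * are an assumption; they only need to fit in the low three bits. */
typedef enum {
   demo_type_float = 1,
   demo_type_int = 2,
   demo_type_uint = 3,
   demo_type_bool = 4,
   demo_type_int8 = 8 | demo_type_int,
   demo_type_uint32 = 32 | demo_type_uint,
   demo_type_float64 = 64 | demo_type_float,
} demo_alu_type;

/* Low three bits hold the base type, everything above holds the size. */
#define DEMO_BASE_TYPE_MASK 0x7u
#define DEMO_SIZE_MASK (~DEMO_BASE_TYPE_MASK)

/* Bit size of a type, or 0 for an unsized type. */
static inline unsigned
demo_type_bit_size(demo_alu_type type)
{
   return (unsigned) type & DEMO_SIZE_MASK;
}

/* Base type with the size bits stripped. */
static inline demo_alu_type
demo_base_type(demo_alu_type type)
{
   return (demo_alu_type) ((unsigned) type & DEMO_BASE_TYPE_MASK);
}

/* Small self-check showing the intended use. */
static inline void
demo_check(void)
{
   assert(demo_type_bit_size(demo_type_uint32) == 32);
   assert(demo_base_type(demo_type_uint32) == demo_type_uint);
   assert(demo_type_bit_size(demo_type_bool) == 0);
}
```

A backend that does not care about sizes yet could switch on the base type alone and ignore the size bits.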



[Mesa-dev] [PATCH 02/14] nir/types: add a function to get the bitsize of a base type

2016-03-07 Thread Samuel Iglesias Gonsálvez
From: Connor Abbott 

v2: fix it for GLSL_TYPE_SUBROUTINE (Iago)

Signed-off-by: Iago Toral Quiroga 
---
 src/compiler/nir_types.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/src/compiler/nir_types.h b/src/compiler/nir_types.h
index 18d64b7..0748783 100644
--- a/src/compiler/nir_types.h
+++ b/src/compiler/nir_types.h
@@ -77,6 +77,27 @@ enum glsl_base_type glsl_get_sampler_result_type(const struct glsl_type *type);
 unsigned glsl_get_record_location_offset(const struct glsl_type *type,
                                          unsigned length);
 
+static inline unsigned
+glsl_get_bit_size(enum glsl_base_type type)
+{
+   switch (type) {
+   case GLSL_TYPE_INT:
+   case GLSL_TYPE_UINT:
+   case GLSL_TYPE_BOOL:
+   case GLSL_TYPE_FLOAT: /* TODO handle mediump */
+   case GLSL_TYPE_SUBROUTINE:
+      return 32;
+
+   case GLSL_TYPE_DOUBLE:
+      return 64;
+
+   default:
+      unreachable("unknown base type");
+   }
+
+   return 0;
+}
+
 bool glsl_type_is_void(const struct glsl_type *type);
 bool glsl_type_is_error(const struct glsl_type *type);
 bool glsl_type_is_vector(const struct glsl_type *type);
-- 
2.7.0



[Mesa-dev] [PATCH 06/14] nir: handle different bit sizes when constant folding

2016-03-07 Thread Samuel Iglesias Gonsálvez
From: Connor Abbott 

v2: Use the bit-size information from the opcode information if defined (Iago)

Signed-off-by: Iago Toral Quiroga 

FIXME: This should be squashed into the previous commit so we don't break
the build. The break happens because the python script that generates the
constant folding pass does not know how to handle the sized types introduced
by the previous commit until this patch, so it ends up generating code with
invalid types. Keep it separated for review purposes.
---
 src/compiler/nir/nir_constant_expressions.h  |   2 +-
 src/compiler/nir/nir_constant_expressions.py | 246 +--
 src/compiler/nir/nir_opt_constant_folding.c  |  24 ++-
 3 files changed, 182 insertions(+), 90 deletions(-)

diff --git a/src/compiler/nir/nir_constant_expressions.h b/src/compiler/nir/nir_constant_expressions.h
index 97997f2..201f278 100644
--- a/src/compiler/nir/nir_constant_expressions.h
+++ b/src/compiler/nir/nir_constant_expressions.h
@@ -28,4 +28,4 @@
 #include "nir.h"
 
 nir_const_value nir_eval_const_opcode(nir_op op, unsigned num_components,
-                                      nir_const_value *src);
+                                      unsigned bit_size, nir_const_value *src);
diff --git a/src/compiler/nir/nir_constant_expressions.py b/src/compiler/nir/nir_constant_expressions.py
index 32784f6..972d281 100644
--- a/src/compiler/nir/nir_constant_expressions.py
+++ b/src/compiler/nir/nir_constant_expressions.py
@@ -1,4 +1,43 @@
 #! /usr/bin/python2
+
+def type_has_size(type_):
+    return type_[-1:].isdigit()
+
+def type_sizes(type_):
+    if type_.endswith("8"):
+        return [8]
+    elif type_.endswith("16"):
+        return [16]
+    elif type_.endswith("32"):
+        return [32]
+    elif type_.endswith("64"):
+        return [64]
+    else:
+        return [32, 64]
+
+def type_add_size(type_, size):
+    if type_has_size(type_):
+        return type_
+    return type_ + str(size)
+
+def get_const_field(type_):
+    if type_ == "int32":
+        return "i"
+    if type_ == "uint32":
+        return "u"
+    if type_ == "int64":
+        return "l"
+    if type_ == "uint64":
+        return "ul"
+    if type_ == "bool32":
+        return "b"
+    if type_ == "float32":
+        return "f"
+    if type_ == "float64":
+        return "d"
+    raise Exception(str(type_))
+    assert(0)
+
 template = """\
 /*
  * Copyright (C) 2014 Intel Corporation
@@ -205,110 +244,140 @@ unpack_half_1x16(uint16_t u)
 }
 
 /* Some typed vector structures to make things like src0.y work */
-% for type in ["float", "int", "uint", "bool"]:
-struct ${type}_vec {
-   ${type} x;
-   ${type} y;
-   ${type} z;
-   ${type} w;
+typedef float float32_t;
+typedef double float64_t;
+typedef bool bool32_t;
+% for type in ["float", "int", "uint"]:
+% for width in [32, 64]:
+struct ${type}${width}_vec {
+   ${type}${width}_t x;
+   ${type}${width}_t y;
+   ${type}${width}_t z;
+   ${type}${width}_t w;
 };
 % endfor
+% endfor
+
+struct bool32_vec {
+bool x;
+bool y;
+bool z;
+bool w;
+};
 
 % for name, op in sorted(opcodes.iteritems()):
 static nir_const_value
-evaluate_${name}(unsigned num_components, nir_const_value *_src)
+evaluate_${name}(unsigned num_components, unsigned bit_size,
+ nir_const_value *_src)
 {
nir_const_value _dst_val = { { {0, 0, 0, 0} } };
 
-   ## For each non-per-component input, create a variable srcN that
-   ## contains x, y, z, and w elements which are filled in with the
-   ## appropriately-typed values.
-   % for j in range(op.num_inputs):
-  % if op.input_sizes[j] == 0:
- <% continue %>
-  % elif "src" + str(j) not in op.const_expr:
- ## Avoid unused variable warnings
- <% continue %>
-  %endif
-
-  struct ${op.input_types[j]}_vec src${j} = {
-  % for k in range(op.input_sizes[j]):
- % if op.input_types[j] == "bool":
-_src[${j}].u[${k}] != 0,
- % else:
-_src[${j}].${op.input_types[j][:1]}[${k}],
- % endif
-  % endfor
-  };
-   % endfor
+   switch (bit_size) {
+   % for bit_size in [32, 64]:
+   case ${bit_size}: {
+  <%
+  output_type = type_add_size(op.output_type, bit_size)
+  input_types = [type_add_size(type_, bit_size) for type_ in op.input_types]
+  %>
+
+  ## For each non-per-component input, create a variable srcN that
+  ## contains x, y, z, and w elements which are filled in with the
+  ## appropriately-typed values.
+  % for j in range(op.num_inputs):
+ % if op.input_sizes[j] == 0:
+<% continue %>
+ % elif "src" + str(j) not in op.const_expr:
+## Avoid unused variable warnings
+<% continue %>
+ %endif
 
-   % if op.output_size == 0:
-  ## For per-component instructions, we need to iterate over the
-  ## components and apply the constant expression one component
-  ## at a time.
-  for (unsigned _i = 0; _i < num_componen

[Mesa-dev] [PATCH 05/14] nir: update opcode definitions for different bit sizes

2016-03-07 Thread Samuel Iglesias Gonsálvez
From: Connor Abbott 

Some opcodes need explicit bitsizes, and sometimes we need to use the
double version when constant folding.

v2: fix output type for u2f (Iago)

v3: do not change vecN opcodes to be float. The next commit will add
infrastructure to enable 64-bit integer constant folding so this isn't
really necessary. Also, that created problems with source modifiers in
some cases (Iago)

Signed-off-by: Iago Toral Quiroga 
---
 src/compiler/nir/nir_opcodes.py | 144 +---
 1 file changed, 74 insertions(+), 70 deletions(-)

diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
index a37fe2d..0c91c03 100644
--- a/src/compiler/nir/nir_opcodes.py
+++ b/src/compiler/nir/nir_opcodes.py
@@ -90,8 +90,12 @@ class Opcode(object):
 # helper variables for strings
 tfloat = "float"
 tint = "int"
-tbool = "bool"
+tbool = "bool32"
 tuint = "uint"
+tfloat32 = "float32"
+tint32 = "int32"
+tuint32 = "uint32"
+tfloat64 = "float64"
 
 commutative = "commutative "
 associative = "associative "
@@ -155,56 +159,56 @@ unop("frsq", tfloat, "1.0f / sqrtf(src0)")
 unop("fsqrt", tfloat, "sqrtf(src0)")
 unop("fexp2", tfloat, "exp2f(src0)")
 unop("flog2", tfloat, "log2f(src0)")
-unop_convert("f2i", tint, tfloat, "src0") # Float-to-integer conversion.
-unop_convert("f2u", tuint, tfloat, "src0") # Float-to-unsigned conversion
-unop_convert("i2f", tfloat, tint, "src0") # Integer-to-float conversion.
+unop_convert("f2i", tint32, tfloat32, "src0") # Float-to-integer conversion.
+unop_convert("f2u", tuint32, tfloat32, "src0") # Float-to-unsigned conversion
+unop_convert("i2f", tfloat32, tint32, "src0") # Integer-to-float conversion.
 # Float-to-boolean conversion
-unop_convert("f2b", tbool, tfloat, "src0 != 0.0f")
+unop_convert("f2b", tbool, tfloat32, "src0 != 0.0f")
 # Boolean-to-float conversion
-unop_convert("b2f", tfloat, tbool, "src0 ? 1.0f : 0.0f")
+unop_convert("b2f", tfloat32, tbool, "src0 ? 1.0f : 0.0f")
 # Int-to-boolean conversion
-unop_convert("i2b", tbool, tint, "src0 != 0")
-unop_convert("b2i", tint, tbool, "src0 ? 1 : 0") # Boolean-to-int conversion
-unop_convert("u2f", tfloat, tuint, "src0") # Unsigned-to-float conversion.
+unop_convert("i2b", tbool, tint32, "src0 != 0")
+unop_convert("b2i", tint32, tbool, "src0 ? 1 : 0") # Boolean-to-int conversion
+unop_convert("u2f", tfloat32, tuint32, "src0") # Unsigned-to-float conversion.
 
 # Unary floating-point rounding operations.
 
 
-unop("ftrunc", tfloat, "truncf(src0)")
-unop("fceil", tfloat, "ceilf(src0)")
-unop("ffloor", tfloat, "floorf(src0)")
-unop("ffract", tfloat, "src0 - floorf(src0)")
-unop("fround_even", tfloat, "_mesa_roundevenf(src0)")
+unop("ftrunc", tfloat, "bit_size == 64 ? trunc(src0) : truncf(src0)")
+unop("fceil", tfloat, "bit_size == 64 ? ceil(src0) : ceilf(src0)")
+unop("ffloor", tfloat, "bit_size == 64 ? floor(src0) : floorf(src0)")
+unop("ffract", tfloat, "src0 - (bit_size == 64 ? floor(src0) : floorf(src0))")
+unop("fround_even", tfloat, "bit_size == 64 ? _mesa_roundeven(src0) : _mesa_roundevenf(src0)")
 
 
 # Trigonometric operations.
 
 
-unop("fsin", tfloat, "sinf(src0)")
-unop("fcos", tfloat, "cosf(src0)")
+unop("fsin", tfloat, "bit_size == 64 ? sin(src0) : sinf(src0)")
+unop("fcos", tfloat, "bit_size == 64 ? cos(src0) : cosf(src0)")
 
 
 # Partial derivatives.
 
 
-unop("fddx", tfloat, "0.0f") # the derivative of a constant is 0.
-unop("fddy", tfloat, "0.0f")
-unop("fddx_fine", tfloat, "0.0f")
-unop("fddy_fine", tfloat, "0.0f")
-unop("fddx_coarse", tfloat, "0.0f")
-unop("fddy_coarse", tfloat, "0.0f")
+unop("fddx", tfloat, "0.0") # the derivative of a constant is 0.
+unop("fddy", tfloat, "0.0")
+unop("fddx_fine", tfloat, "0.0")
+unop("fddy_fine", tfloat, "0.0")
+unop("fddx_coarse", tfloat, "0.0")
+unop("fddy_coarse", tfloat, "0.0")
 
 
 # Floating point pack and unpack operations.
 
 def pack_2x16(fmt):
-   unop_horiz("pack_" + fmt + "_2x16", 1, tuint, 2, tfloat, """
+   unop_horiz("pack_" + fmt + "_2x16", 1, tuint32, 2, tfloat32, """
 dst.x = (uint32_t) pack_fmt_1x16(src0.x);
 dst.x |= ((uint32_t) pack_fmt_1x16(src0.y)) << 16;
 """.replace("fmt", fmt))
 
 def pack_4x8(fmt):
-   unop_horiz("pack_" + fmt + "_4x8", 1, tuint, 4, tfloat, """
+   unop_horiz("pack_" + fmt + "_4x8", 1, tuint32, 4, tfloat32, """
 dst.x = (uint32_t) pack_fmt_1x8(src0.x);
 dst.x |= ((uint32_t) pack_fmt_1x8(src0.y)) << 8;
 dst.x |= ((uint32_t) pack_fmt_1x8(src0.z)) << 16;
@@ -212,13 +216,13 @@ dst.x |= ((uint32_t) pack_fmt_1x8(src0.w)) << 24;
 """.replace("fmt", fmt))
 
 def unpack_2x16(fmt):
-   unop_horiz("unpack_" + fmt + "_2x16", 2, tfloat, 1, tuint, """
+   unop_horiz("unpack_" + fmt + "_2x16", 2, tfloat32, 1, tuint32, """
 dst.x = unpack_fmt_1x16((uint16_t)(src0.x & 0xffff));
 dst.y = unpack_fmt_1x16((uint16_t)(src0.x << 16));
 """.replace("fmt", fmt))
 
 def unpack_4x8(fmt):
-   unop_horiz("unpack_" + fmt + "_4x8", 4, tfloat, 1, tuint, """
+   unop_horiz("unpack_" + fmt + "_4x8", 4, tfloat32, 1,
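
As a note on the "bit_size == 64 ? ... : ..." expressions in the patch above: the point is that a 64-bit constant has to be folded in double precision, since going through float would change the result. Here is a rough standalone sketch of that idea using a reciprocal. demo_const and demo_fold_frcp are hypothetical names invented for this illustration; the generated Mesa code operates on nir_const_value and is not reproduced here.

```c
#include <assert.h>

/* Hypothetical stand-in for one component of a constant value. */
typedef union {
   float f;
   double d;
} demo_const;

/* Fold a reciprocal at the requested bit size: the 32-bit case uses
 * float arithmetic, the 64-bit case uses double arithmetic. */
static demo_const
demo_fold_frcp(unsigned bit_size, demo_const src)
{
   demo_const dst;
   dst.d = 0.0;

   switch (bit_size) {
   case 32:
      dst.f = 1.0f / src.f;
      break;
   case 64:
      dst.d = 1.0 / src.d;
      break;
   default:
      assert(!"unsupported bit size");
   }

   return dst;
}
```

Folding 3.0 at 64 bits this way yields 1.0 / 3.0 in double precision, which is not the same value as the float result widened to double.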

[Mesa-dev] [PATCH 00/14] nir: add bit-size information in data types

2016-03-07 Thread Samuel Iglesias Gonsálvez
Hello,

Iago and I are working on adding FP64 support to i965 drivers [0] with
the help of Connor and Jason (Thanks!). That support requires changes
in NIR to include the bit-size information in the data types and also
modifications in the opcodes to use sized types as needed.

This means that all NIR clients should take that information into
consideration as well, since backends are now expected to see sized NIR
types and they need to handle them properly.

This batch is the smallest set of NIR patches required to
incorporate sized types together with the minimum set of changes
required to i965, freedreno and vc4 that we have identified. Rob Clark
and Eric Anholt checked that freedreno and vc4 drivers respectively
work fine with these changes.

We have verified that with these changes there are no regressions in
Piglit for i965, which does the minimum necessary to deal with sized
types at this point. We tested on gen5, gen6, gen7 and gen8. We don't
have gen4 hardware available, so trying this would also require
involvement from other people with access to this hardware.

With this series, drivers should be able to work normally with both
sized and unsized types (mostly by ignoring the size aspect of the
type).

The rest of the fp64 work that we hope to send soon for review will add
further changes to ensure that we get correct bit-sized types wherever
we need them, but we will postpone this until we actually need to care
about different bit-sizes, when we send the fp64 series for review, since
that requires a lot more changes.

The reason we are sending this part ahead is that the inclusion of sized
types in NIR together with the bit-size information affects all drivers
using NIR, and since it affects a lot of NIR opcodes (that now need to
be defined using the corresponding bit-sized types) it is much easier to
review and land this ahead of the rest of the series and have everyone
be aware of this change as soon as possible.

Because of this it would also be great if new code checked in after this
series also tries to incorporate bit-size information into the types in
NIR and the drivers, even if the drivers can eat the unsized types too
for now.  We will try to fix anything that slips in as we rebase our
fp64 branch though.

We would like to land this batch of patches in NIR ahead of the rest
of the fp64 changes, which will use sized types extensively. This is
important because the change is significant and it is important that
new code landing in master is aware of this as soon as possible.

Our idea is to squash these patches together into a single commit before
pushing them to master. We keep them separate here to facilitate the
review.

Thanks,

Sam

[0] https://bugs.freedesktop.org/show_bug.cgi?id=92760

Connor Abbott (10):
  nir/types: add a function to get the bitsize of a base type
  nir: add double constant types
  nir: update opcode definitions for different bit sizes
  nir: handle different bit sizes when constant folding
  i965: fix brw_type_for_nir_type() for sized types
  i965: fix brw_glsl_base_type_for_nir_type() for sized types
  nir: add nir_src_bit_size() helper
  nir: add nir_dest_bit_size() helper
  nir: add a bit_size parameter to nir_ssa_dest_init
  nir: propagate bitsize information in nir_search

Iago Toral Quiroga (1):
  vc4: adapt to new sized alu types

Jason Ekstrand (2):
  nir: Add explicitly sized types
  nir: Add a bit_size to nir_register and nir_ssa_def

Samuel Iglesias Gonsálvez (1):
  i965/nir: fix check to resolve booleans to work with sized
nir_alu_type

 src/compiler/nir/glsl_to_nir.cpp   |  22 +-
 src/compiler/nir/nir.c |  14 +-
 src/compiler/nir/nir.h |  51 -
 src/compiler/nir/nir_algebraic.py  |  22 +-
 src/compiler/nir/nir_builder.h |  33 ++-
 src/compiler/nir/nir_clone.c   |   3 +-
 src/compiler/nir/nir_constant_expressions.h|   2 +-
 src/compiler/nir/nir_constant_expressions.py   | 246 
 src/compiler/nir/nir_from_ssa.c|   6 +-
 src/compiler/nir/nir_lower_alu_to_scalar.c |  10 +-
 src/compiler/nir/nir_lower_atomics.c   |   6 +-
 src/compiler/nir/nir_lower_clip.c  |   2 +-
 src/compiler/nir/nir_lower_io.c|   3 +-
 src/compiler/nir/nir_lower_locals_to_regs.c|   7 +-
 src/compiler/nir/nir_lower_phis_to_scalar.c|  10 +-
 src/compiler/nir/nir_lower_tex.c   |   2 +-
 src/compiler/nir/nir_lower_two_sided_color.c   |   2 +-
 src/compiler/nir/nir_lower_var_copies.c|   5 +-
 src/compiler/nir/nir_lower_vars_to_ssa.c   |  12 +-
 src/compiler/nir/nir_opcodes.py| 144 ++--
 src/compiler/nir/nir_opt_constant_folding.c|  24 +-
 src/compiler/nir/nir_opt_peephole_select.c |   3 +-
 src/compiler/nir/nir_search.c  | 247 +

[Mesa-dev] [Bug 93667] Crash in eglCreateImageKHR with huge texture size

2016-03-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93667

Fabian Vogt  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Fabian Vogt  ---
(In reply to Emil Velikov from comment #1)
> Hi Fabian, I just sent out a patch for this case. Can you please test it?
> 
> The only other case that I've spotted has already been addressed with commit
> 5d87a7c894d "egl_dri2: NULL check for xcb_dri2_get_buffers_reply()". Can you
> let me know if we've missed any others?
> 
> -Emil

Patch tested and confirmed to work :)
The other places seem to be fixed now.
Thanks!

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH RFC 0/2] GBM API extension to support fusing KMS and render devices

2016-03-07 Thread Daniel Vetter
On Fri, Mar 04, 2016 at 06:34:37PM +0000, Emil Velikov wrote:
> On 4 March 2016 at 17:38, Lucas Stach  wrote:
> > On Friday, 04.03.2016 at 17:20 +0000, Daniel Stone wrote:
> >> Hi,
> >>
> >> On 4 March 2016 at 16:08, Lucas Stach  wrote:
> >> > On Friday, 04.03.2016 at 15:09 +0000, Daniel Stone wrote:
> >> >> Thanks for taking this on, it looks really good! I just have the one
> >> >> question though - did you look at the EGLDevice extension? Using that
> >> >> to enumerate the GPUs, we could create the gbm_device using the KMS
> >> >> device and pass that in to the EGLDisplay, with an additional attrib
> >> >> to pass in an EGLDevice handle to eglGetPlatformDisplay. This could
> >> >> possibly be better since it is more independent of DRM as the API, and
> >> >> also allows people to share device enumeration/selection code with
> >> >> other platforms (e.g. choosing between multiple GPUs when using a
> >> >> winsys like Wayland or X11).
> >> >>
> >> > I have not looked at this in detail yet, but I think it's just an
> >> > extension to the interface outlined by this series.
> >> >
> >> > If we require the KMS device to have a DRI2/Gallium driver it should be
> >> > easy to hook up the EGLDevice discovery for them.
> >> > Passing in a second device handle for the KMS device is then just the
> >> > EGL implementation calling gbm_device_set_kms_provider() on the render
> >> > GBM device, instead of the application doing it manually.
> >>
> >> It turns the API backwards a bit though ...
> >>
> >> Right now, what we require is that the GBM device passed in is the KMS
> >> device, not the GPU device; what you're suggesting is that we discover
> >> the GPU device and then add the KMS device.
> >>
> >> So, with your proposal:
> >> gbm_gpu = gbm_device_create("/dev/dri/renderD128");
> >> egl_dpy = eglGetDisplay(gbm_gpu);
> >> gbm_kms = gbm_device_create("/dev/dri/card0");
> >> gbm_device_set_kms_provider(gbm_gpu, gbm_kms);
> >>
> >> i.e. the device the user creates first is the GPU device.
> >>
> >> With EGLDevice, we would have:
> >> gbm_kms = gbm_device_create("/dev/dri/card0");
> >> egl_gpus = eglGetDevicesEXT();
> >> egl_dpy = eglGetPlatformDisplay(gbm_kms, { EGL_TARGET_DEVICE, egl_gpus[0] });
> >>
> >> So, the first/main device the user deals with is the KMS device - same
> >> as today. This makes sense, since GBM is the allocation API for KMS,
> >> and EGL should be the one dealing with the GPU ...
> >>
> > Right, my API design was from my view of GBM being the API to bootstrap
> > EGL rendering, but defining it as the KMS allocation API makes a lot
> > more sense, when you think about it.
> >
> >> Maybe it would make sense to reverse the API, so rather than creating
> >> a GBM device for the GPU and then linking that to the KMS device -
> >> requiring users to make different calls, e.g. gbm_bo_get_kms_bo(),
> >> which makes it harder to use and means we need to port current users -
> >> we create a GBM device for KMS and then link that to a GPU device.
> >> This would then mean that eglGetPlatformDisplay could do the linkage
> >> internally, and then existing users using gbm_bo_get_handle() etc
> >> would still work without needing any different codepaths.
> >
> > Yes, this will make the implementation inside GBM a bit more involved,
> > but it seems more natural this way around when thinking about hooking it
> > up to EGLDevice. I'll try it out and send an updated RFC after the
> > weekend.
> >
> While I'm more inclined to Daniel's suggestion, I wonder why people
> moved away from Thierry's approach - creating a composite/wrapped dri
> module ? Is there anything wrong with it - be that from technical or
> conceptual POV ?
> 
> I believe it has a few advantages over the above two proposals - it
> allows greater flexibility as both drivers will be tightly coupled and
> can communicate directly, does not expand the internal/hidden ABI that
> we currently have between GBM and EGL, could (in theory) work with
> GLX.

I think composite driver makes sense if you e.g. have some 2d blitter on
the display block and a 3d gpu and want to make them one thing (and use
the blitter to shuffle bytes around for uploads). Wrt the internal API I'm
not concerned too much, since in the end all that coordination is the
exact same thing we need to add for compositor/client communication too.
They need to agree on what is most suitable as the frontbuffer format and
how/where to allocate it, otherwise you'll suffer tons of unnecessary
copies. In short we need much more powerful "what buffers can you
support/prefer" interfaces anyway, not just for kms/gpu dual drm device
support in gbm. And I actually think prototyping those in gbm is a great
idea, gets rid of the wayland/X proto complexities.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Mesa-dev] [PATCH v2] r600g: Fix ARB_texture_rgb10_a2ui support in big-endian

2016-03-07 Thread Oded Gabbay
On Tue, Mar 1, 2016 at 2:11 PM, Oded Gabbay  wrote:
> This patch enables the correct detection of PIPE_FORMAT_R10G10B10A2_UINT
> and PIPE_FORMAT_B10G10R10A2_UINT formats in r600g in big-endian mode.
>
> Because the swapping doesn't happen on component boundaries for these
> formats, the GPU H/W needs to be configured differently for LE/BE.
> Therefore, we need to use a different color format for BE - 
> V_0280A0_COLOR_10_10_10_2
>
> This enables support for ARB_texture_rgb10_a2ui, which otherwise is not
> detected as supported.
>
> Tested using piglit texwrap with GL_ARB_texture_rgb10_a2ui.
>
> v2:
>
> - Used the correct color format for R10G10B10A2 on
>   BE (V_0280A0_COLOR_10_10_10_2) to configure the GPU
>
> - Added detection of this color format in endian swap function
>
> - removed blank line
>
> Signed-off-by: Oded Gabbay 
> Cc: "11.1 11.2" 
> ---
>  src/gallium/drivers/r600/r600_state_common.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/src/gallium/drivers/r600/r600_state_common.c 
> b/src/gallium/drivers/r600/r600_state_common.c
> index aa3a085..53cf972 100644
> --- a/src/gallium/drivers/r600/r600_state_common.c
> +++ b/src/gallium/drivers/r600/r600_state_common.c
> @@ -2464,6 +2464,14 @@ uint32_t r600_translate_texformat(struct pipe_screen 
> *screen,
> result = FMT_2_10_10_10;
> goto out_word4;
> }
> +   if (R600_BIG_ENDIAN &&
> +   desc->channel[0].size == 2 &&
> +   desc->channel[1].size == 10 &&
> +   desc->channel[2].size == 10 &&
> +   desc->channel[3].size == 10) {
> +   result = FMT_10_10_10_2;
> +   goto out_word4;
> +   }
> goto out_unknown;
> }
> goto out_unknown;
> @@ -2685,6 +2693,8 @@ uint32_t r600_translate_colorformat(enum chip_class 
> chip, enum pipe_format forma
> return V_0280A0_COLOR_1_5_5_5;
> } else if (HAS_SIZE(10,10,10,2)) {
> return V_0280A0_COLOR_2_10_10_10;
> +   } else if (R600_BIG_ENDIAN && HAS_SIZE(2,10,10,10)) {
> +   return V_0280A0_COLOR_10_10_10_2;
> }
> break;
> }
> @@ -2717,6 +2727,7 @@ uint32_t r600_colorformat_endian_swap(uint32_t 
> colorformat)
>  */
> return ENDIAN_NONE;
>
> +   case V_0280A0_COLOR_10_10_10_2:
> case V_0280A0_COLOR_2_10_10_10:
> case V_0280A0_COLOR_8_24:
> case V_0280A0_COLOR_24_8:
> --
> 2.5.0
>

Hi Michel,
Please disregard this patch.
I'm going to fix BE issues methodically, as Marek suggested. Basically
removing all GL functionality (for debug) and testing the whole piglit
suite for GL 1.3 (lowest I could go), 1.4, ... and sending patches in
batches for each GL version.

Sorry for wasting your time.
Oded
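
For anyone reading the quoted commit message later, the remark that the swap "doesn't happen on component boundaries" can be checked with a few lines of standalone C. This is only a sketch of the bit arithmetic, not of how r600g or the hardware is programmed; all demo_* helpers are hypothetical, and the channel order (first argument in the lowest bits) is an assumption made for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the four bytes of a 32-bit word. */
static uint32_t
demo_bswap32(uint32_t v)
{
   return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
          ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Four 8-bit channels, first argument in the lowest bits. */
static uint32_t
demo_pack_8888(uint32_t c0, uint32_t c1, uint32_t c2, uint32_t c3)
{
   return (c0 & 0xffu) | ((c1 & 0xffu) << 8) |
          ((c2 & 0xffu) << 16) | ((c3 & 0xffu) << 24);
}

/* Three 10-bit channels followed by a 2-bit channel, r in the lowest bits. */
static uint32_t
demo_pack_rgb10_a2(uint32_t r, uint32_t g, uint32_t b, uint32_t a)
{
   return (r & 0x3ffu) | ((g & 0x3ffu) << 10) |
          ((b & 0x3ffu) << 20) | ((a & 0x3u) << 30);
}

/* The same channels in reversed order: a in the lowest two bits. */
static uint32_t
demo_pack_a2_bgr10(uint32_t a, uint32_t b, uint32_t g, uint32_t r)
{
   return (a & 0x3u) | ((b & 0x3ffu) << 2) |
          ((g & 0x3ffu) << 12) | ((r & 0x3ffu) << 22);
}
```

With 8-bit channels, byte-swapping the packed word is the same as packing the channels in reverse order. With the 10/10/10/2 layout it is not: the 10-bit fields straddle byte boundaries, so the swapped word matches neither layout.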