Re: [Mesa-dev] [Mesa-stable] New stable-branch 11.0 candidate pushed

2015-11-08 Thread Emil Velikov
[Dropping Jason, Ken, adding Roland to CC list]

Hi Oded,

On 8 November 2015 at 09:22, Oded Gabbay  wrote:

> Hi Emil,
>
> I would like to propose the following patch to be added to 11.0.5:
> 39b4dfe6ab1003863778a25c091c080e098833ec llvmpipe: use simple coeffs
> calc for 128bit vectors
>
> It fixes 3 piglit tests in ppc64le (and doesn't cause regressions of course).
>
If you don't mind, I'd rather leave it for 11.0.6. That is, unless
Roland and/or others feel that it's a bad idea to have in the stable
branch altogether ?

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH mesa 0/5] nouveau: codegen: Make use of double immediates

2015-11-08 Thread Hans de Goede

Hi,

On 07-11-15 01:59, Ilia Mirkin wrote:

> Hi Hans,
>
> All pushed. I made a few additional fixes and improvements to fp64
> immediate handling along the way, but all your commits were fine
> as-is. (Except that they enabled fp64 immediates on nv50 implicitly
> which is wrong -- there are no immediate-taking variants on nv50, so I
> fixed that glitch. But only the G200 can do fp64 in the first place,
> and nouveau doesn't actually expose it. Corner case of a corner case
> :) )


Right, I did actually think about that one a bit since Compute capability
1.3 does include doubles, but I figured that since we do not support doubles
on nv50 at all, that would not be an issue. I guess I should have
mentioned this in one of the commit messages.


> Thanks for taking care of this... it was a small bit of fp64 which I
> always felt bad about not having finished up. (But not bad enough to
> actually finish it myself.)


You're welcome, this was a fun learning experience for me and I look
forward to doing more work on the codegen bits in the future. But for now
I will be spending my time on a tgsi backend for llvm, so sorry I will
not be looking into:

https://trello.com/c/DX357llE/71-fold-immediates-into-const-load-offsets

anytime soon, but I do plan to work more on the codegen code in the
future. I will make sure to coordinate with you when I have time to
work on codegen again to avoid doing double work.

Regards,

Hans


Re: [Mesa-dev] New stable-branch 11.0 candidate pushed

2015-11-08 Thread Emil Velikov
On 7 November 2015 at 19:18, Kenneth Graunke  wrote:
> On Saturday, November 07, 2015 04:37:05 PM Emil Velikov wrote:
>> Jason, Kenneth,
>>
>> I had to pick a few extra commits as prerequisites to the "nir:
>> Properly invalidate metadata in nir_foo" patches. Can you let me know
>> if any of these should not be in 11.0. Thanks !
>>
>> commit 800217a1654ab7932870b1510981f5e38712d58b
>> Author: Kenneth Graunke 
>>
>> nir: Report progress from nir_split_var_copies().
>>
>> (cherry picked from commit dc18b9357b553a972ea439facfbc55e376f1179f)
>>
>>
>> commit 2cc4e973962c1d5ea0357685036879c7bf9575ce
>> Author: Jason Ekstrand 
>>
>> nir/lower_vec_to_movs: Pass the shader around directly
>>
>> (cherry picked from commit b7eeced3c724bf5de05290551ced8621ce2c7c52)
>>
>>
>> commit ef4e862396ae81b0d59f172d0d5273a4e6b5992d
>> Author: Jason Ekstrand 
>>
>> nir: Report progress from lower_vec_to_movs().
>>
>> (cherry picked from commit 9f5e7ae9d83ce6de761936b95cd0b7ba4c1219c4)
>
> Looks reasonable to me - thanks, and sorry for the trouble!
>
No real trouble, mostly shortage of experience with NIR, so I'm trying
to be extra careful and not break things.

Thanks
Emil


Re: [Mesa-dev] [Mesa-stable] New stable-branch 11.0 candidate pushed

2015-11-08 Thread Oded Gabbay
On Sun, Nov 8, 2015 at 1:39 PM, Emil Velikov 
wrote:

> [Dropping Jason, Ken, adding Roland to CC list]
>
> Hi Oded,
>
> On 8 November 2015 at 09:22, Oded Gabbay  wrote:
>
> > Hi Emil,
> >
> > I would like to propose the following patch to be added to 11.0.5:
> > 39b4dfe6ab1003863778a25c091c080e098833ec llvmpipe: use simple coeffs
> > calc for 128bit vectors
> >
> > It fixes 3 piglit tests in ppc64le (and doesn't cause regressions of
> > course).
> >
> If you don't mind, I'd rather leave it for 11.0.6. That is, unless
> Roland and/or others feel that it's a bad idea to have in the stable
> branch altogether ?
>
> Thanks
> Emil
>

I don't mind at all. 11.0.6 is perfectly fine by me.
I do think it should be in stable, though.

Thanks,
Oded


Re: [Mesa-dev] [Mesa-stable] New stable-branch 11.0 candidate pushed

2015-11-08 Thread Emil Velikov
Hi Mark,

On 7 November 2015 at 23:26, Mark Janes  wrote:
> Hi Emil,
>
> I get this regression testing the new branch:
>
> piglit.spec.oes_compressed_paletted_texture.basic api
>
> /tmp/build_root/m64/lib/piglit/bin/oes_compressed_paletted_texture-api -auto -fbo
> Trying glTexImage2D...
> Trying glCompressedTexImage2D...
> Unexpected GL error: GL_INVALID_ENUM 0x500
> (Error at /home/jenkins/workspace/Leeroy/repos/piglit/tests/spec/oes_compressed_paletted_texture/oes_compressed_paletted_texture-api.c:135)
> Expected GL error: GL_INVALID_VALUE 0x501
>
> Do you see the same thing?
>
I'm afraid I don't see it here. Any chance you can bisect which
commit causes it, so we can revert it? I can wait a few days
for the results, in case you're busy.

Thanks
Emil


Re: [Mesa-dev] [PATCH 3/6] glsl: move layout qualifier validation out of the parser

2015-11-08 Thread Emil Velikov
On 6 November 2015 at 21:13, Timothy Arceri  wrote:
> On Fri, 2015-11-06 at 13:16 +, Emil Velikov wrote:
>> On 5 November 2015 at 11:17, Timothy Arceri  wrote:
>> > From: Timothy Arceri 
>> >
>> > This is in preperation for compile-time constant support.
>> typo "preparation"
>>
>> >
>> > Also fix up the locations for some of the extension checking
>> > error messages in the parser. We now correctly give the location
>> > of the layout qualifier identifier rather than the integer constant.
>> >
>> > The validation is moved to two locations, for validation on variables the
>> > checks are moved to the ast to hir pass and for qualifiers that apply to
>> > the
>> > shader the validation is moved into glsl_parser_extras.cpp.
>> >
>> > In order to do validation at the later stage in glsl_parser_extras.cpp we
>> > need to temporarily add a field in ast_type_qualifier to keep track of the
>> > parser location, this will be removed in a following patch when we
>> > introduce a new type for storing the compile-time qualifiers.
>> >
>> > Also as the set_shader_inout_layout() function in glsl parser extras is
>> > normally called after all validation is done we need to move the code that
>> > sets CompileStatus and InfoLog otherwise the newly moved error messages
>> > will
>> > be ignored.
>> Personally I would split the validate_layout_qualifiers() introduction
>> and the CompileStatus/InfoLog movement into separate patches.
>
> The reason for not doing this in a new patch is that this is existing
> functionality not new functionality, doing so would regress a bunch of piglit
> tests.
>
> I can do it if it makes things easier to review but it should all be pushed as
> one.
>
Fair enough - I'd just keep it as is then. I'll take a closer look at
some time today/tomorrow.


Imho if one needs to do a few different things at once, this is a
clear indication that things are more convoluted than they should be.


Cheers,
Emil


Re: [Mesa-dev] [PATCH] st/va: add mpeg4 startcode workaround

2015-11-08 Thread Ilia Mirkin
On Sun, Nov 8, 2015 at 6:41 AM, Christian König  wrote:
> On 06.11.2015 20:28, Ilia Mirkin wrote:
>>
>> On Fri, Nov 6, 2015 at 2:15 PM, Christian König 
>> wrote:
>>>
>>> On 06.11.2015 20:10, Ilia Mirkin wrote:
>>>>
>>>> On Fri, Nov 6, 2015 at 1:48 PM, Zhang, Boyuan
>>>> wrote:
>>>>>
>>>>> Hi Emil,
>>>>>
>>>>> Please see the following information about this patch.
>>>>>
>>>>> - Issue: For Mpeg4, the VOP and GOV headers were truncated. With the
>>>>> existing workaround in st/va, playback shows massive corruptions.
>>>>> - This Patch: Provide another way to get the truncated headers back.
>>>>> Massive corruptions are gone with this patch. At the same time, add an
>>>>> environmental variable to allow user to decide whether to use this
>>>>> patch.
>>>>
>>>> Why would the user not want to use this? Sounds like a correctness
>>>> fix, no? Or is it some thing that a hypothetical gallium driver might
>>>> not need but the radeon uvd-based ones do? In that case it should be
>>>> behind a PIPE_VIDEO_CAP_bla (sorry, I'm still not too clear on what
>>>> "bla" is here...)
>>>
>>>
>>> The problem is that this is a rather extreme hack.
>>>
>>> As you probably know, VA-API didn't correctly specify which start code
>>> should be included and which shouldn't for MPEG-4. This is an issue for
>>> AMD as well as NVidia hardware and pretty much everybody who sticks
>>> close to an elementary stream.
>>>
>>> What we do in this hack is just search the bytes *before* the pointer
>>> and size we got from the application for the stuff that's missing. I.e.
>>> we access memory the application didn't tell us to access.
>>>
>>> This is rather speculative, but works surprisingly well with a lot of
>>> applications.
>>
>> Hm, that is a little dodgy indeed. But making user-selectable options
>> (provided via env var) for correct decoding... doesn't seem ideal
>> either. Is there some "correct" way to resolve this without changing
>> the va api?
>
>
> Unfortunately no. I wasn't involved in everything but we had a couple of
> people working on this who have more knowledge about MPEG-4 part 2 than
> me.
>
> A couple of months back somebody from a different team at AMD even tried to
> convince Intel to fix this, but as far as I know without success.
>
> The overall conclusion is that the interface definition of VA-API for
> MPEG-4 part 2 is just a bloody mess.

I see. Since everything I knew about MPEG4 has been forgotten many
years ago (and I've never figured out the va api in the first place),
I'll take the above as a given. So it sounds like there are a few
options --

(a) enable workaround by default, provide a way to disable
(b) no workaround by default, provide a way to enable, people have
buggy rendering with no great way to find out about the enable
(c) no workaround by default, don't expose MPEG4 in the va endpoint.
provide a way to enable workaround, which in turn enables MPEG4

I kinda like (c). That way end users don't get broken rendering, and
ones who really want to use va-api for mpeg4 can enable it if they
know what they're doing. What do you think?
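
A rough sketch of what option (c) could look like, purely for
illustration: MPEG-4 is advertised only when the user has explicitly
opted into the workaround. The function name and the accepted values are
assumptions made for this example; the thread does not name the
environment variable, and nothing here is taken from st/va.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of st/va: decide whether the MPEG-4
 * profile should be advertised. `opt_in` is the value of some
 * user-facing switch (for example the result of getenv() on an
 * environment variable whose name the thread leaves open), or NULL
 * when that switch is unset. */
bool
mpeg4_profile_exposed(const char *opt_in)
{
   /* Option (c): no workaround and no MPEG-4 by default. */
   if (opt_in == NULL)
      return false;

   /* Treat "1" and "true" as an explicit opt-in; anything else keeps
    * the profile hidden, so users never hit the corrupted playback. */
   return strcmp(opt_in, "1") == 0 || strcmp(opt_in, "true") == 0;
}
```

A caller would pass in whatever getenv() returns for the chosen variable
and enable both the workaround and the profile from the same result, so
the two can never get out of sync.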

  -ilia


Re: [Mesa-dev] [PATCH] android: fix LOCAL_C_INCLUDES to find glsl_types.h

2015-11-08 Thread Mauro Rossi
Hi,

Sending an update: with the export, the android_x86 target builds OK,
but I'm again getting the "glsl_types.h not found" build error with the
android_x86_64 target (specifically for 64-bit modules).

I'll report back as soon as I understand what's going on; I've
added other android-x86 developers in CC.

Mauro

2015-11-07 1:29 GMT+01:00 Mauro Rossi :

> Hi Emil,
>
> by exporting the path of glsl nir headers, mesa builds without problems.
>
> You can find in the attachment the formatted patch.
> Thanks
>
> Mauro
>
>
>
> 2015-11-06 18:26 GMT+01:00 Emil Velikov :
>
>> Hi Mauro
>>
>> On 6 November 2015 at 03:31, Mauro Rossi  wrote:
>> > These changes are necessary to avoid building errors in glsl and i965
>> > ---
>> >  src/glsl/Android.mk                  | 6 ++++--
>> >  src/mesa/drivers/dri/i965/Android.mk | 3 ++-
>> >  2 files changed, 6 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/src/glsl/Android.mk b/src/glsl/Android.mk
>> > index f63b7da..6902ea4 100644
>> > --- a/src/glsl/Android.mk
>> > +++ b/src/glsl/Android.mk
>> > @@ -42,7 +42,8 @@ LOCAL_C_INCLUDES := \
>> > $(MESA_TOP)/src/mapi \
>> > $(MESA_TOP)/src/mesa \
>> > $(MESA_TOP)/src/gallium/include \
>> > -   $(MESA_TOP)/src/gallium/auxiliary
>> > +   $(MESA_TOP)/src/gallium/auxiliary \
>> > +   $(MESA_TOP)/src/glsl/nir
>> >
>> >  LOCAL_MODULE := libmesa_glsl
>> >
>> > @@ -63,7 +64,8 @@ LOCAL_C_INCLUDES := \
>> > $(MESA_TOP)/src/mapi \
>> > $(MESA_TOP)/src/mesa \
>> > $(MESA_TOP)/src/gallium/include \
>> > -   $(MESA_TOP)/src/gallium/auxiliary
>> > +   $(MESA_TOP)/src/gallium/auxiliary \
>> > +   $(MESA_TOP)/src/glsl/nir
>> >
>> >  LOCAL_STATIC_LIBRARIES := libmesa_glsl libmesa_glsl_utils libmesa_util
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/Android.mk b/src/mesa/drivers/dri/i965/Android.mk
>> > index d30a053..f9a914a 100644
>> > --- a/src/mesa/drivers/dri/i965/Android.mk
>> > +++ b/src/mesa/drivers/dri/i965/Android.mk
>> > @@ -45,7 +45,8 @@ LOCAL_CFLAGS += \
>> >  endif
>> >
>> >  LOCAL_C_INCLUDES := \
>> > -   $(MESA_DRI_C_INCLUDES)
>> > +   $(MESA_DRI_C_INCLUDES) \
>> > +   $(MESA_TOP)/src/glsl/nir
>> >
>> >  LOCAL_SRC_FILES := \
>> > $(i965_compiler_FILES) \
>>
>> Following the Android way of exporting includes I believe you want the
>> following
>>
>> diff --git a/src/glsl/Android.gen.mk b/src/glsl/Android.gen.mk
>> index 6898fb0..59cc857 100644
>> --- a/src/glsl/Android.gen.mk
>> +++ b/src/glsl/Android.gen.mk
>> @@ -38,7 +38,8 @@ LOCAL_C_INCLUDES += \
>>   $(MESA_TOP)/src/glsl/nir
>>
>>  LOCAL_EXPORT_C_INCLUDE_DIRS += \
>> - $(intermediates)/nir
>> + $(intermediates)/nir \
>> + $(MESA_TOP)/src/glsl/nir
>>
>>  LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \
>>   $(LIBGLCPP_GENERATED_FILES) \
>>
>>
>> Formatting might be broken (thanks gmail), although the gist is there.
>> Can you give it a try (note the order is important)
>>
>> Thanks
>> Emil
>>
>
>


Re: [Mesa-dev] [PATCH] st/va: add mpeg4 startcode workaround

2015-11-08 Thread Emil Velikov
On 8 November 2015 at 11:41, Christian König  wrote:
> On 06.11.2015 20:28, Ilia Mirkin wrote:
>>
>> On Fri, Nov 6, 2015 at 2:15 PM, Christian König 
>> wrote:
>>>
>>> On 06.11.2015 20:10, Ilia Mirkin wrote:
>>>>
>>>> On Fri, Nov 6, 2015 at 1:48 PM, Zhang, Boyuan
>>>> wrote:
>>>>>
>>>>> Hi Emil,
>>>>>
>>>>> Please see the following information about this patch.
>>>>>
>>>>> - Issue: For Mpeg4, the VOP and GOV headers were truncated. With the
>>>>> existing workaround in st/va, playback shows massive corruptions.
>>>>> - This Patch: Provide another way to get the truncated headers back.
>>>>> Massive corruptions are gone with this patch. At the same time, add an
>>>>> environmental variable to allow user to decide whether to use this
>>>>> patch.
>>>>
>>>> Why would the user not want to use this? Sounds like a correctness
>>>> fix, no? Or is it some thing that a hypothetical gallium driver might
>>>> not need but the radeon uvd-based ones do? In that case it should be
>>>> behind a PIPE_VIDEO_CAP_bla (sorry, I'm still not too clear on what
>>>> "bla" is here...)
>>>
>>>
>>> The problem is that this is a rather extreme hack.
>>>
>>> As you probably know, VA-API didn't correctly specify which start code
>>> should be included and which shouldn't for MPEG-4. This is an issue for
>>> AMD as well as NVidia hardware and pretty much everybody who sticks
>>> close to an elementary stream.
>>>
>>> What we do in this hack is just search the bytes *before* the pointer
>>> and size we got from the application for the stuff that's missing. I.e.
>>> we access memory the application didn't tell us to access.
>>>
>>> This is rather speculative, but works surprisingly well with a lot of
>>> applications.
>>
>> Hm, that is a little dodgy indeed. But making user-selectable options
>> (provided via env var) for correct decoding... doesn't seem ideal
>> either. Is there some "correct" way to resolve this without changing
>> the va api?
>
>
> Unfortunately no. I wasn't involved in everything but we had a couple of
> people working on this who have more knowledge about MPEG-4 part 2 than
> me.
>
> A couple of months back somebody from a different team at AMD even tried to
> convince Intel to fix this, but as far as I know without success.
>
> The overall conclusion is that the interface definition of VA-API for
> MPEG-4 part 2 is just a bloody mess.
>
And precisely for the above reasons, I keep going on like an old
hag about writing something in the commit logs.



Adding workarounds is fine, as long as they are documented - what it
does (if not obvious), why we need it and even references when
available. Otherwise the next person will just come and remove this
code and/or do something that completely breaks things, as there is
no information that can prevent (educate) them from doing so.
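
To make the point concrete, here is a minimal sketch of the kind of
search Christian describes above (looking at the bytes *before* the
pointer the application handed over), written with the "what" and "why"
spelled out in comments. It is not the st/va code: the function name, the
explicit lower bound and the demo bytes are assumptions for illustration.
The start code values are the standard MPEG-4 part 2 ones.

```c
#include <stddef.h>
#include <stdint.h>

/* MPEG-4 part 2 start codes are the prefix 00 00 01 followed by one
 * code byte; 0xB6 marks a VOP header and 0xB3 a GOV header. */
#define VOP_START_CODE_BYTE 0xB6
#define GOV_START_CODE_BYTE 0xB3

/* Hypothetical helper: find the nearest start code with the given code
 * byte that lies entirely before `data`, never reading below `floor`.
 * Returns a pointer to the 00 00 01 xx sequence, or NULL if none.
 *
 * Why: VA-API does not say which start codes the application must
 * include for MPEG-4, so headers can be cut off in front of the
 * buffer. Unlike the hack discussed in the thread, which cannot know
 * how far back it may safely read, this sketch takes the lower bound
 * as an explicit argument. */
const uint8_t *
find_start_code_before(const uint8_t *floor, const uint8_t *data,
                       uint8_t code)
{
   if (data - floor < 4)
      return NULL;

   for (const uint8_t *s = data - 4; ; s--) {
      if (s[0] == 0x00 && s[1] == 0x00 && s[2] == 0x01 && s[3] == code)
         return s;
      if (s == floor)
         break;
   }
   return NULL;
}

/* Made-up bytes for trying the helper: a VOP start code at offset 1,
 * with the application's slice data assumed to start at offset 5. */
const uint8_t demo_stream[8] = {
   0xAA, 0x00, 0x00, 0x01, 0xB6, 0x12, 0x34, 0x56
};
```

Calling find_start_code_before(demo_stream, demo_stream + 5,
VOP_START_CODE_BYTE) returns demo_stream + 1, the start of the header
that the application's pointer skipped.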



Cheers
Emil


Re: [Mesa-dev] [Mesa-stable] New stable-branch 11.0 candidate pushed

2015-11-08 Thread Roland Scheidegger
I don't have anything against including it in stable (it is pretty
harmless), albeit I didn't tag it because I didn't think it was
important enough. While technically it makes things more accurate, it
doesn't fix anything known except a couple of piglit tests.

Roland


On 08.11.2015 at 12:39, Emil Velikov wrote:
> [Dropping Jason, Ken, adding Roland to CC list]
> 
> Hi Oded,
> 
> On 8 November 2015 at 09:22, Oded Gabbay  wrote:
> 
>> Hi Emil,
>>
>> I would like to propose the following patch to be added to 11.0.5:
>> 39b4dfe6ab1003863778a25c091c080e098833ec llvmpipe: use simple coeffs
>> calc for 128bit vectors
>>
>> It fixes 3 piglit tests in ppc64le (and doesn't cause regressions of course).
>>
> If you don't mind, I'd rather leave it for 11.0.6. That is, unless
> Roland and/or others feel that it's a bad idea to have in the stable
> branch altogether ?
> 
> Thanks
> Emil
> 



[Mesa-dev] [PATCH v2 1/2] main: Don't restrict several KHR_debug enum to desktop GL

2015-11-08 Thread Boyan Ding
In preparation for supporting GL_KHR_debug in OpenGL ES

v2: add a missing hunk in _mesa_IsEnabled (Emil)

Signed-off-by: Boyan Ding 
Reviewed-by: Emil Velikov 
---
 src/mesa/main/enable.c    | 10 ++--------
 src/mesa/main/getstring.c |  5 +----
 2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/src/mesa/main/enable.c b/src/mesa/main/enable.c
index 42f6799..a8a667e 100644
--- a/src/mesa/main/enable.c
+++ b/src/mesa/main/enable.c
@@ -369,10 +369,7 @@ _mesa_set_enable(struct gl_context *ctx, GLenum cap, GLboolean state)
  break;
   case GL_DEBUG_OUTPUT:
   case GL_DEBUG_OUTPUT_SYNCHRONOUS_ARB:
- if (!_mesa_is_desktop_gl(ctx))
-goto invalid_enum_error;
- else
-_mesa_set_debug_state_int(ctx, cap, state);
+ _mesa_set_debug_state_int(ctx, cap, state);
  break;
   case GL_DITHER:
  if (ctx->Color.DitherFlag == state)
@@ -1225,10 +1222,7 @@ _mesa_IsEnabled( GLenum cap )
  return ctx->Polygon.CullFlag;
   case GL_DEBUG_OUTPUT:
   case GL_DEBUG_OUTPUT_SYNCHRONOUS_ARB:
- if (!_mesa_is_desktop_gl(ctx))
-goto invalid_enum_error;
- else
-return (GLboolean) _mesa_get_debug_state_int(ctx, cap);
+ return (GLboolean) _mesa_get_debug_state_int(ctx, cap);
   case GL_DEPTH_TEST:
  return ctx->Depth.Test;
   case GL_DITHER:
diff --git a/src/mesa/main/getstring.c b/src/mesa/main/getstring.c
index 9873fdb..2bca88c 100644
--- a/src/mesa/main/getstring.c
+++ b/src/mesa/main/getstring.c
@@ -268,10 +268,7 @@ _mesa_GetPointerv( GLenum pname, GLvoid **params )
  break;
   case GL_DEBUG_CALLBACK_FUNCTION_ARB:
   case GL_DEBUG_CALLBACK_USER_PARAM_ARB:
- if (!_mesa_is_desktop_gl(ctx))
-goto invalid_pname;
- else
-*params = _mesa_get_debug_state_ptr(ctx, pname);
+ *params = _mesa_get_debug_state_ptr(ctx, pname);
  break;
   default:
  goto invalid_pname;
-- 
2.6.2



Re: [Mesa-dev] [PATCH 1/6] mesa: add ARB_enhanced_layouts

2015-11-08 Thread Emil Velikov
Hi Tim,

On 6 November 2015 at 21:09, Timothy Arceri  wrote:
> On Fri, 2015-11-06 at 13:02 +, Emil Velikov wrote:
>> Hi Tim,
>>
>> A few comments below
>>
>> On 5 November 2015 at 11:17, Timothy Arceri  wrote:
>> > From: Timothy Arceri 
>> >
>> > Set to dummy_false until the remaining features are added.
>> > ---
>> >  src/glsl/glcpp/glcpp-parse.y| 1 +
>> >  src/glsl/glsl_parser_extras.cpp | 1 +
>> >  src/glsl/glsl_parser_extras.h   | 2 ++
>> >  src/mesa/main/extensions.c  | 1 +
>> >  4 files changed, 5 insertions(+)
>> >
>> > diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
>> > index 4acccf7..6aa7abe 100644
>> > --- a/src/glsl/glcpp/glcpp-parse.y
>> > +++ b/src/glsl/glcpp/glcpp-parse.y
>> > @@ -2387,6 +2387,7 @@
>> > _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t
>> > versio
>> >}
>> > } else {
>> >add_builtin_define(parser, "GL_ARB_draw_buffers", 1);
>> > +   add_builtin_define(parser, "GL_ARB_enhanced_layouts", 1);
>> > add_builtin_define(parser, "GL_ARB_separate_shader_objects",
>> > 1);
>> >add_builtin_define(parser, "GL_ARB_texture_rectangle", 1);
>> > add_builtin_define(parser, "GL_AMD_shader_trinary_minmax", 1);
>> > diff --git a/src/glsl/glsl_parser_extras.cpp
>> > b/src/glsl/glsl_parser_extras.cpp
>> > index 14cb9fc..be344d6 100644
>> > --- a/src/glsl/glsl_parser_extras.cpp
>> > +++ b/src/glsl/glsl_parser_extras.cpp
>> > @@ -594,6 +594,7 @@ static const _mesa_glsl_extension
>> > _mesa_glsl_supported_extensions[] = {
>> > EXT(ARB_derivative_control,   true,  false,
>> >  ARB_derivative_control),
>> > EXT(ARB_draw_buffers, true,  false, dummy_true),
>> > EXT(ARB_draw_instanced,   true,  false,
>> >  ARB_draw_instanced),
>> > +   EXT(ARB_enhanced_layouts, true,  false, dummy_true),
>> > EXT(ARB_explicit_attrib_location, true,  false,
>> >  ARB_explicit_attrib_location),
>> > EXT(ARB_explicit_uniform_location,true,  false,
>> >  ARB_explicit_uniform_location),
>> > EXT(ARB_fragment_coord_conventions,   true,  false,
>> >  ARB_fragment_coord_conventions),
>> > diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
>> > index b54c535..684b917 100644
>> > --- a/src/glsl/glsl_parser_extras.h
>> > +++ b/src/glsl/glsl_parser_extras.h
>> > @@ -499,6 +499,8 @@ struct _mesa_glsl_parse_state {
>> > bool ARB_draw_buffers_warn;
>> > bool ARB_draw_instanced_enable;
>> > bool ARB_draw_instanced_warn;
>> > +   bool ARB_enhanced_layouts_enable;
>> > +   bool ARB_enhanced_layouts_warn;
>> > bool ARB_explicit_attrib_location_enable;
>> > bool ARB_explicit_attrib_location_warn;
>> > bool ARB_explicit_uniform_location_enable;
>> > diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
>> > index bdc6817..b8556aa 100644
>> > --- a/src/mesa/main/extensions.c
>> > +++ b/src/mesa/main/extensions.c
>> > @@ -111,6 +111,7 @@ static const struct extension extension_table[] = {
>> > { "GL_ARB_draw_elements_base_vertex",
>> >  o(ARB_draw_elements_base_vertex),   GL, 2009 },
>> > { "GL_ARB_draw_indirect",   o(ARB_draw_indirect),
>> >   GLC,2010 },
>> > { "GL_ARB_draw_instanced",  o(ARB_draw_instanced),
>> >   GL, 2008 },
>> > +   { "GL_ARB_enhanced_layouts",o(dummy_false),
>> >   GLC,2013 },
>> > { "GL_ARB_explicit_attrib_location",
>> >  o(ARB_explicit_attrib_location),GL, 2009 },
>> > { "GL_ARB_explicit_uniform_location",
>> >  o(ARB_explicit_uniform_location),   GL, 2012 },
>> > { "GL_ARB_fragment_coord_conventions",
>> >  o(ARB_fragment_coord_conventions),  GL, 2009 },
>> Please add gl_extensions::ARB_enhanced_layouts and use it in the above
>> two tables. Otherwise the extension override won't work.
>
> Are you sure? MESA_EXTENSION_OVERRIDE=GL_ARB_enhanced_layouts has been working
> for my testing without it.
>
From what I recall (I was trying to use this for explicit offsets),
_mesa_glsl_extension::compatible_with_state() was failing because it
dives directly into gl_extensions, and with the above it was
constantly reading dummy_false.

Fwiw I would follow the approach set by other extensions -  add it for
now and nuke it as/if needed.
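
A simplified model of the concern, for illustration only. The struct,
table type and helper below are stand-ins invented for this sketch (only
the names dummy_false, ARB_enhanced_layouts and the o() shorthand echo
the patch); Mesa's real tables and lookup code are more involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for gl_extensions with just the two fields that matter. */
struct demo_extensions {
   bool dummy_false;            /* never set, always reads false */
   bool ARB_enhanced_layouts;   /* dedicated flag that can be turned on */
};

/* A table entry records where its enable flag lives in the struct. */
struct demo_entry {
   const char *name;
   size_t offset;
};

#define o(x) offsetof(struct demo_extensions, x)

/* Reads the flag straight out of the struct, which is the behaviour
 * attributed to compatible_with_state() in the message above. */
bool
demo_entry_enabled(const struct demo_entry *e,
                   const struct demo_extensions *exts)
{
   return *(const bool *)((const char *)exts + e->offset);
}

/* The two ways the extension could be wired into a table. */
const struct demo_entry via_dummy = { "GL_ARB_enhanced_layouts", o(dummy_false) };
const struct demo_entry via_field = { "GL_ARB_enhanced_layouts", o(ARB_enhanced_layouts) };

/* State after something (a driver or an override) enabled the extension. */
const struct demo_extensions demo_state = {
   .dummy_false = false,
   .ARB_enhanced_layouts = true,
};
```

With demo_state, demo_entry_enabled(&via_field, &demo_state) is true
while demo_entry_enabled(&via_dummy, &demo_state) stays false, because
an entry pointing at dummy_false has no flag of its own to flip.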

-Emil


Re: [Mesa-dev] [PATCH] android: fix LOCAL_C_INCLUDES to find glsl_types.h

2015-11-08 Thread Emil Velikov
Hi Mauro,

On 8 November 2015 at 12:59, Mauro Rossi  wrote:
> Update2: I'm getting the building error in both x86 target and x86_64
> target.
>
> I'm relieved it's not arch dependend, I suspect that export will require a
> dependency to be declared,
>  because if i965_dri module is built before glsl ones we will have the
> error.
>
> The LOCAL_C_INCLUDES even if not elegant, avoided the problem in the first
> place,
> but I'd like to learn the by the best practice and apply it in the future.
>
Fwiw I'm all for LOCAL_C_INCLUDES (I even mentioned a few times a way
that we can share those and minimise these issues), but I believe
Chih-Wei was not really a fan of them. If he's ok with it I'll push
your original patch.

Regards,
Emil

P.S. Typos - for each one we fix, another we introduce another one or more  :-P


Re: [Mesa-dev] The i965 vec4 backend, exec_masks, and 64-bit types

2015-11-08 Thread Connor Abbott
On Tue, Nov 3, 2015 at 8:04 PM, Francisco Jerez  wrote:
> Francisco Jerez  writes:
>
>> Connor Abbott  writes:
>>
>>> Hi all,
>>>
>>> While working on FP64 for i965, there's an issue that I thought of
>>> with the vec4 backend that I'm not sure how to resolve. From what I
>>> understand, the execmask works the same way in Align16 mode as Align1
>>> mode, except that you only use the first 8 channels in practice for
>>> SIMD4x2, and the first four channels are always the same as well as
>>> the last 4 channels. But this doesn't work for 64-bit things, since
>>> there we only operate on 4 components at the same time, so it's more
>>> like SIMD2x2. For example, imagine that only the second vertex is
>>> currently enabled at the moment. Then the execmask looks like
>>> , and if we do something like:
>>>
>>> mul(4)  g24<1>DF g12<4,4,1>DF g13<4,4,1>DF { align16 };
>>>
>>> then all 4 channels will be disabled, which is not what we want.
>>>
>> AFAIUI this shouldn't be a problem.  In align16 mode each component of
>> an instruction with double-precision execution type maps to *two* bits
>> of the execmask instead of one (one for each 32-bit half), which is
>> compensated by each logical thread having two components instead of
>> four, so in your example [assuming  is little-endian notation
>> and you actually do 'mul(8)' ;)] the x and y components of the first
>> logical thread will be disabled while the x and y components of the
>> second logical thread will be enabled.

That certainly makes sense... I just couldn't find a doc reference to
confirm or deny it.
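
For reference, the reading Francisco gives above (two execmask bits per
64-bit component, two components per logical thread) can be modelled as
below. This is only a sketch of that interpretation: the function name is
made up, bit 0 is assumed to be channel 0, and the mask 0xF0 used in the
note after the code for "only the second vertex enabled" is an
assumption, since the mask value did not survive in the archived
message. The simulator results reported further down suggest real
hardware does not behave exactly like this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the quoted interpretation, not of verified hardware
 * behaviour: in Align16 mode with a 64-bit execution type, each
 * component is governed by two adjacent execmask bits (one per 32-bit
 * half), and each logical thread has two components instead of four.
 *
 * thread:    0 or 1 (the two vertices of SIMD4x2)
 * component: 0 for x, 1 for y */
bool
df_component_enabled(uint8_t execmask, unsigned thread, unsigned component)
{
   unsigned first_bit = thread * 4 + component * 2;
   uint8_t pair = (uint8_t)(0x3u << first_bit);

   /* Both 32-bit halves must be enabled for the double to be written. */
   return (execmask & pair) == pair;
}
```

Under these assumptions, a mask of 0xF0 leaves x and y of the first
logical thread disabled and x and y of the second enabled, which is the
outcome Francisco describes for an exec_size of 8.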

>>
>
> I've had a look into the simulator's behaviour, and in fact HSW+ seem to
> sort of support actual SIMD4x2 on DF types, so when you do stuff like
>
> | mul(8)   g24.xyzw:df  g12<4>.xyzw:df  g12<4>.xyzw:df { align16 };
>
> it will actually write 8 double floats to g24-25 (using a nibble from
> the execmask for each vec4), which contradicts the hardware spec:
>
> | IVB+
> |
> | In Align16 mode, all regioning parameters must use the syntax of a pair
> | of packed floats, including channel selects and channel enables.
> |
> | // Example:
> | mov (8) r10.0.xyzw:df r11.0.xyzw:df
> | // The above instruction moves four double floats. The .x picks the
> | // low 32 bits and the .y picks the high 32 bits of the double float.
>
> (I believe the quotation above may only apply to IVB even though it's
>  marked IVB+).

Thanks for looking into this. Indeed, at least on BDW the exec_size
does need to be divided by 2 (I have a patch on my branch that does
this, and it fixed a number of piglit tests). That's why the example I
wrote had an exec_size of 4.

>
> Now the really weird thing I've noticed: A DF Align16 instruction with
> writemask XY will actually write components XZ of each vec4, and
> writemask ZW actually writes components YW (!).  Other writemasks seem
> to behave normally (including all scalar ones).  I haven't found any
> mention of this in the docs, but a quick test on real hardware confirms
> the simulator's behaviour.

Ugh, really... that sucks :/
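
Written out as a lookup, the behaviour Francisco reports would be roughly
the following. The enum and function name are invented for this sketch,
and the mapping is only what the message above describes for the
simulator and one hardware test; it does not come from documentation.

```c
/* One bit per vec4 component, purely for this sketch. */
enum { WM_X = 1, WM_Y = 2, WM_Z = 4, WM_W = 8 };

/* Components reportedly written by a DF Align16 instruction for a
 * given writemask: XY lands in XZ, ZW lands in YW, and every other
 * writemask (including all scalar ones) is written as requested. */
unsigned
observed_df_writemask(unsigned writemask)
{
   if (writemask == (WM_X | WM_Y))
      return WM_X | WM_Z;
   if (writemask == (WM_Z | WM_W))
      return WM_Y | WM_W;
   return writemask;
}
```

A backend that wanted to keep using partial writemasks on doubles would
need something like the inverse of this table, which is part of why
writing to a temporary and applying the writemask with a 32-bit MOV
comes up below.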

>
> Swizzles OTOH still shuffle individual 32-bit fields and are extended
> cyclically into the ZW components of the instruction (how useful).
>
> I wonder if we would be better off scalarizing all FP64 code...

Yeah, maybe we could get away with putting each component into a
separate register, and always using XYZW writemasks... but we'd still
need to pack two things into a single dvec2 for e.g. SSBO's, so it
wouldn't work there. We don't support them today, although I'm still
not 100% sure we can always get rid of all the packing operations...
and relying on the optimizations to get rid of them seems kinda
fragile. We could make dvec2() work using normal 32-bit MOV's,
although at that point it might be easier not to scalarize and instead
have double operations output to temporaries and then use a 32-bit MOV
to apply the right writemask.

>
>>> I think the first thing to do is to write a piglit test that tests
>>> this case, since currently all the arb_gpu_shader_fp64 tests only use
>>> uniforms. We need a test that uses non-uniform control flow that
>>> triggers the case described above. Once we do that, and if we
>>> determine there's actually a problem, then we need to figure out how
>>> to solve it. The ideas I had were:
>>>
>>
>> I guess a piglit test would be nice, but you're unlikely to have to do
>> much about it. ;)
>>
> I think I take my word back, this isn't going to be fun. :P

hehe :)

>
>>> 1. make every FP64 thing use WE_all. This isn't actually too bad at
>>> the moment, since our notion of interference already assumes
>>> (more-or-less) that everything is WE_all, but it prevents us from
>>> improving it in the future with FP64 things. Unfortunately, it also
>>> means that we can't use writemasks since setting WE_all makes the EU
>>> ignore the writemask, so we'll have to do some trickery to 

[Mesa-dev] [PATCH 2/7] gallium/radeon: simplify disabling render condition for u_blitter

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

just disable it by not setting the predication bit
---
 src/gallium/drivers/r600/r600_blit.c  | 12 +---
 src/gallium/drivers/r600/r600_state_common.c  | 11 ++-
 src/gallium/drivers/radeon/r600_pipe_common.h |  3 ++-
 src/gallium/drivers/radeonsi/si_blit.c| 10 --
 src/gallium/drivers/radeonsi/si_state_draw.c  |  9 +
 5 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_blit.c 
b/src/gallium/drivers/r600/r600_blit.c
index fff841c..8a90489 100644
--- a/src/gallium/drivers/r600/r600_blit.c
+++ b/src/gallium/drivers/r600/r600_blit.c
@@ -87,18 +87,16 @@ static void r600_blitter_begin(struct pipe_context *ctx, 
enum r600_blitter_op op
(struct 
pipe_sampler_view**)rctx->samplers[PIPE_SHADER_FRAGMENT].views.views);
}
 
-   if ((op & R600_DISABLE_RENDER_COND) && rctx->b.current_render_cond) {
-   util_blitter_save_render_condition(rctx->blitter,
-  rctx->b.current_render_cond,
-  rctx->b.current_render_cond_cond,
-  
rctx->b.current_render_cond_mode);
-}
+   if (op & R600_DISABLE_RENDER_COND)
+   rctx->b.render_cond_force_off = true;
 }
 
 static void r600_blitter_end(struct pipe_context *ctx)
 {
struct r600_context *rctx = (struct r600_context *)ctx;
-    r600_resume_nontimer_queries(&rctx->b);
+
+   rctx->b.render_cond_force_off = false;
+   r600_resume_nontimer_queries(&rctx->b);
 }
 
 static unsigned u_max_sample(struct pipe_resource *r)
diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index eb54361..28aedff 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -1478,6 +1478,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const 
struct pipe_draw_info
struct pipe_draw_info info = *dinfo;
struct pipe_index_buffer ib = {};
struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
+   bool render_cond_bit = rctx->b.predicate_drawing && 
!rctx->b.render_cond_force_off;
uint64_t mask;
 
if (!info.indirect && !info.count && (info.indexed || 
!info.count_from_stream_output)) {
@@ -1696,7 +1697,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const 
struct pipe_draw_info
if (ib.user_buffer) {
unsigned size_bytes = info.count*ib.index_size;
unsigned size_dw = align(size_bytes, 4) / 4;
-   cs->buf[cs->cdw++] = PKT3(PKT3_DRAW_INDEX_IMMD, 1 + 
size_dw, rctx->b.predicate_drawing);
+   cs->buf[cs->cdw++] = PKT3(PKT3_DRAW_INDEX_IMMD, 1 + 
size_dw, render_cond_bit);
cs->buf[cs->cdw++] = info.count;
cs->buf[cs->cdw++] = V_0287F0_DI_SRC_SEL_IMMEDIATE;
memcpy(cs->buf+cs->cdw, ib.user_buffer, size_bytes);
@@ -1705,7 +1706,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const 
struct pipe_draw_info
uint64_t va = r600_resource(ib.buffer)->gpu_address + 
ib.offset;
 
if (likely(!info.indirect)) {
-   cs->buf[cs->cdw++] = PKT3(PKT3_DRAW_INDEX, 3, 
rctx->b.predicate_drawing);
+   cs->buf[cs->cdw++] = PKT3(PKT3_DRAW_INDEX, 3, 
render_cond_bit);
cs->buf[cs->cdw++] = va;
cs->buf[cs->cdw++] = (va >> 32UL) & 0xFF;
cs->buf[cs->cdw++] = info.count;
@@ -1732,7 +1733,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const 
struct pipe_draw_info
cs->buf[cs->cdw++] = 
PKT3(EG_PKT3_INDEX_BUFFER_SIZE, 0, 0);
cs->buf[cs->cdw++] = max_size;
 
-   cs->buf[cs->cdw++] = 
PKT3(EG_PKT3_DRAW_INDEX_INDIRECT, 1, rctx->b.predicate_drawing);
+   cs->buf[cs->cdw++] = 
PKT3(EG_PKT3_DRAW_INDEX_INDIRECT, 1, render_cond_bit);
cs->buf[cs->cdw++] = info.indirect_offset;
cs->buf[cs->cdw++] = V_0287F0_DI_SRC_SEL_DMA;
}
@@ -1758,11 +1759,11 @@ static void r600_draw_vbo(struct pipe_context *ctx, 
const struct pipe_draw_info
}
 
if (likely(!info.indirect)) {
-   cs->buf[cs->cdw++] = PKT3(PKT3_DRAW_INDEX_AUTO, 1, 
rctx->b.predicate_drawing);
+   cs->buf[cs->cdw++] = PKT3(PKT3_DRAW_INDEX_AUTO, 1, 
render_cond_bit);
cs->buf[cs->cdw++] = info.count;
}
else {
-   cs->buf[cs->cdw++] = PKT3(EG_PKT3_DRAW_INDIRECT, 1, 
rctx->b.predicate_drawing);
+

[Mesa-dev] [PATCH 7/7] gallium/radeon: shorten render_cond variable names

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

and ..._cond -> ..._invert
---
 src/gallium/drivers/r600/r600_hw_context.c|  2 +-
 src/gallium/drivers/r600/r600_state_common.c  |  2 +-
 src/gallium/drivers/radeon/r600_pipe_common.h |  6 +++---
 src/gallium/drivers/radeon/r600_query.c   | 14 +++---
 src/gallium/drivers/radeon/r600_texture.c |  2 +-
 src/gallium/drivers/radeonsi/si_state_draw.c  |  2 +-
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_hw_context.c 
b/src/gallium/drivers/r600/r600_hw_context.c
index 2383175..917808a 100644
--- a/src/gallium/drivers/r600/r600_hw_context.c
+++ b/src/gallium/drivers/r600/r600_hw_context.c
@@ -81,7 +81,7 @@ void r600_need_cs_space(struct r600_context *ctx, unsigned 
num_dw,
}
 
/* Count in render_condition(NULL) at the end of CS. */
-   if (ctx->b.current_render_cond) {
+   if (ctx->b.render_cond) {
num_dw += 3;
}
 
diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index 5cf5208..d629194 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -1478,7 +1478,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const 
struct pipe_draw_info
struct pipe_draw_info info = *dinfo;
struct pipe_index_buffer ib = {};
struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
-   bool render_cond_bit = rctx->b.current_render_cond && 
!rctx->b.render_cond_force_off;
+   bool render_cond_bit = rctx->b.render_cond && 
!rctx->b.render_cond_force_off;
uint64_t mask;
 
if (!info.indirect && !info.count && (info.indexed || 
!info.count_from_stream_output)) {
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index ba9000f..ebe633b 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -418,9 +418,9 @@ struct r600_common_context {
 
/* Render condition. */
    struct r600_atom    render_cond_atom;
-   struct pipe_query   *current_render_cond;
-   unsigned            current_render_cond_mode;
-   boolean             current_render_cond_cond;
+   struct pipe_query   *render_cond;
+   unsigned            render_cond_mode;
+   boolean             render_cond_invert;
    bool                render_cond_force_off; /* for u_blitter */
 
/* MSAA sample locations.
diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index 9f92587..8c2b601 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -307,7 +307,7 @@ static void r600_emit_query_predication(struct 
r600_common_context *ctx,
struct r600_atom *atom)
 {
struct radeon_winsys_cs *cs = ctx->gfx.cs;
-   struct r600_query *query = (struct r600_query*)ctx->current_render_cond;
+   struct r600_query *query = (struct r600_query*)ctx->render_cond;
struct r600_query_buffer *qbuf;
uint32_t op;
bool flag_wait;
@@ -315,8 +315,8 @@ static void r600_emit_query_predication(struct 
r600_common_context *ctx,
if (!query)
return;
 
-   flag_wait = ctx->current_render_cond_mode == PIPE_RENDER_COND_WAIT ||
-   ctx->current_render_cond_mode == 
PIPE_RENDER_COND_BY_REGION_WAIT;
+   flag_wait = ctx->render_cond_mode == PIPE_RENDER_COND_WAIT ||
+   ctx->render_cond_mode == PIPE_RENDER_COND_BY_REGION_WAIT;
 
switch (query->type) {
case PIPE_QUERY_OCCLUSION_COUNTER:
@@ -335,7 +335,7 @@ static void r600_emit_query_predication(struct 
r600_common_context *ctx,
}
 
/* if true then invert, see GL_ARB_conditional_render_inverted */
-   if (ctx->current_render_cond_cond)
+   if (ctx->render_cond_invert)
op |= PREDICATION_DRAW_NOT_VISIBLE; /* Draw if not 
visable/overflow */
else
op |= PREDICATION_DRAW_VISIBLE; /* Draw if visable/overflow */
@@ -831,9 +831,9 @@ static void r600_render_condition(struct pipe_context *ctx,
struct r600_query_buffer *qbuf;
struct r600_atom *atom = &rctx->render_cond_atom;
 
-   rctx->current_render_cond = query;
-   rctx->current_render_cond_cond = condition;
-   rctx->current_render_cond_mode = mode;
+   rctx->render_cond = query;
+   rctx->render_cond_invert = condition;
+   rctx->render_cond_mode = mode;
 
/* Compute the size of SET_PREDICATION packets. */
atom->num_dw = 0;
diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index edfdfe3..3126cce 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ 

[Mesa-dev] [PATCH 6/7] gallium/radeon: remove predicate_drawing flag

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/r600/r600_hw_context.c| 2 +-
 src/gallium/drivers/r600/r600_state_common.c  | 2 +-
 src/gallium/drivers/radeon/r600_pipe_common.h | 1 -
 src/gallium/drivers/radeon/r600_query.c   | 1 -
 src/gallium/drivers/radeonsi/si_state_draw.c  | 2 +-
 5 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_hw_context.c 
b/src/gallium/drivers/r600/r600_hw_context.c
index 44e7cf2..2383175 100644
--- a/src/gallium/drivers/r600/r600_hw_context.c
+++ b/src/gallium/drivers/r600/r600_hw_context.c
@@ -81,7 +81,7 @@ void r600_need_cs_space(struct r600_context *ctx, unsigned 
num_dw,
}
 
/* Count in render_condition(NULL) at the end of CS. */
-   if (ctx->b.predicate_drawing) {
+   if (ctx->b.current_render_cond) {
num_dw += 3;
}
 
diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index 28aedff..5cf5208 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -1478,7 +1478,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const 
struct pipe_draw_info
struct pipe_draw_info info = *dinfo;
struct pipe_index_buffer ib = {};
struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
-   bool render_cond_bit = rctx->b.predicate_drawing && 
!rctx->b.render_cond_force_off;
+   bool render_cond_bit = rctx->b.current_render_cond && 
!rctx->b.render_cond_force_off;
uint64_t mask;
 
if (!info.indirect && !info.count && (info.indexed || 
!info.count_from_stream_output)) {
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 09465ae..ba9000f 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -421,7 +421,6 @@ struct r600_common_context {
    struct pipe_query   *current_render_cond;
    unsigned            current_render_cond_mode;
    boolean             current_render_cond_cond;
-   bool                predicate_drawing;
    bool                render_cond_force_off; /* for u_blitter */
 
/* MSAA sample locations.
diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index 145b629..9f92587 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -834,7 +834,6 @@ static void r600_render_condition(struct pipe_context *ctx,
rctx->current_render_cond = query;
rctx->current_render_cond_cond = condition;
rctx->current_render_cond_mode = mode;
-   rctx->predicate_drawing = query != NULL;
 
/* Compute the size of SET_PREDICATION packets. */
atom->num_dw = 0;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index ebc01e8..79e8876 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -457,7 +457,7 @@ static void si_emit_draw_packets(struct si_context *sctx,
 {
struct radeon_winsys_cs *cs = sctx->b.gfx.cs;
unsigned sh_base_reg = 
sctx->shader_userdata.sh_base[PIPE_SHADER_VERTEX];
-   bool render_cond_bit = sctx->b.predicate_drawing && 
!sctx->b.render_cond_force_off;
+   bool render_cond_bit = sctx->b.current_render_cond && 
!sctx->b.render_cond_force_off;
 
if (info->count_from_stream_output) {
struct r600_so_target *t =
-- 
2.1.4



[Mesa-dev] [PATCH 4/7] gallium/radeon: simplify restoring render condition after flush

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_pipe_common.c | 22 +-
 src/gallium/drivers/radeon/r600_pipe_common.h |  4 
 2 files changed, 5 insertions(+), 21 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 8739914..224da11 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -134,17 +134,6 @@ static void r600_memory_barrier(struct pipe_context *ctx, 
unsigned flags)
 
 void r600_preflush_suspend_features(struct r600_common_context *ctx)
 {
-   /* Disable render condition. */
-   ctx->saved_render_cond = NULL;
-   ctx->saved_render_cond_cond = FALSE;
-   ctx->saved_render_cond_mode = 0;
-   if (ctx->current_render_cond) {
-   ctx->saved_render_cond = ctx->current_render_cond;
-   ctx->saved_render_cond_cond = ctx->current_render_cond_cond;
-   ctx->saved_render_cond_mode = ctx->current_render_cond_mode;
-   ctx->b.render_condition(&ctx->b, NULL, FALSE, 0);
-   }
-
/* suspend queries */
ctx->queries_suspended_for_flush = false;
if (ctx->num_cs_dw_nontimer_queries_suspend) {
@@ -173,12 +162,11 @@ void r600_postflush_resume_features(struct 
r600_common_context *ctx)
r600_resume_timer_queries(ctx);
}
 
-   /* Re-enable render condition. */
-   if (ctx->saved_render_cond) {
-   ctx->b.render_condition(&ctx->b, ctx->saved_render_cond,
- ctx->saved_render_cond_cond,
- ctx->saved_render_cond_mode);
-   }
+   /* Just re-emit PKT3_SET_PREDICATION. */
+   if (ctx->current_render_cond)
+   ctx->b.render_condition(&ctx->b, ctx->current_render_cond,
+   ctx->current_render_cond_cond,
+   ctx->current_render_cond_mode);
 }
 
 static void r600_flush_from_st(struct pipe_context *ctx,
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 139c377..2a3a3a7 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -422,10 +422,6 @@ struct r600_common_context {
    boolean             current_render_cond_cond;
    bool                predicate_drawing;
    bool                render_cond_force_off; /* for u_blitter */
-   /* For context flushing. */
-   struct pipe_query   *saved_render_cond;
-   boolean             saved_render_cond_cond;
-   unsigned            saved_render_cond_mode;
 
/* MSAA sample locations.
 * The first index is the sample index.
-- 
2.1.4



[Mesa-dev] [PATCH 5/7] gallium/radeon: atomize render condition (SET_PREDICATION)

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/r600/evergreen_state.c|  1 +
 src/gallium/drivers/r600/r600_hw_context.c|  1 +
 src/gallium/drivers/r600/r600_pipe.h  |  2 +-
 src/gallium/drivers/r600/r600_state.c |  1 +
 src/gallium/drivers/radeon/r600_pipe_common.c |  6 ---
 src/gallium/drivers/radeon/r600_pipe_common.h |  1 +
 src/gallium/drivers/radeon/r600_query.c   | 75 +--
 src/gallium/drivers/radeonsi/si_hw_context.c  |  1 +
 src/gallium/drivers/radeonsi/si_state.c   |  1 +
 src/gallium/drivers/radeonsi/si_state.h   |  1 +
 10 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index 88d50b3..9ab326f 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -3515,6 +3515,7 @@ void evergreen_init_state_functions(struct r600_context 
*rctx)
r600_init_atom(rctx, &rctx->viewport.atom, id++, 
r600_emit_viewport_state, 0);
r600_init_atom(rctx, &rctx->stencil_ref.atom, id++, 
r600_emit_stencil_ref, 4);
r600_init_atom(rctx, &rctx->vertex_fetch_shader.atom, id++, 
evergreen_emit_vertex_fetch_shader, 5);
+   r600_add_atom(rctx, &rctx->b.render_cond_atom, id++);
r600_add_atom(rctx, &rctx->b.streamout.begin_atom, id++);
r600_add_atom(rctx, &rctx->b.streamout.enable_atom, id++);
r600_init_atom(rctx, &rctx->vertex_shader.atom, id++, r600_emit_shader, 
23);
diff --git a/src/gallium/drivers/r600/r600_hw_context.c 
b/src/gallium/drivers/r600/r600_hw_context.c
index 0fc58df..44e7cf2 100644
--- a/src/gallium/drivers/r600/r600_hw_context.c
+++ b/src/gallium/drivers/r600/r600_hw_context.c
@@ -328,6 +328,7 @@ void r600_begin_new_cs(struct r600_context *ctx)
}
r600_mark_atom_dirty(ctx, &ctx->vertex_shader.atom);
r600_mark_atom_dirty(ctx, &ctx->b.streamout.enable_atom);
+   r600_mark_atom_dirty(ctx, &ctx->b.render_cond_atom);
 
if (ctx->blend_state.cso)
r600_mark_atom_dirty(ctx, &ctx->blend_state.atom);
diff --git a/src/gallium/drivers/r600/r600_pipe.h 
b/src/gallium/drivers/r600/r600_pipe.h
index 520b03f..92de0e1 100644
--- a/src/gallium/drivers/r600/r600_pipe.h
+++ b/src/gallium/drivers/r600/r600_pipe.h
@@ -38,7 +38,7 @@
 
 #include "tgsi/tgsi_scan.h"
 
-#define R600_NUM_ATOMS 42
+#define R600_NUM_ATOMS 43
 
 #define R600_MAX_VIEWPORTS 16
 
diff --git a/src/gallium/drivers/r600/r600_state.c 
b/src/gallium/drivers/r600/r600_state.c
index a44dca8..458e80b 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -3086,6 +3086,7 @@ void r600_init_state_functions(struct r600_context *rctx)
r600_init_atom(rctx, &rctx->config_state.atom, id++, 
r600_emit_config_state, 3);
r600_init_atom(rctx, &rctx->stencil_ref.atom, id++, 
r600_emit_stencil_ref, 4);
r600_init_atom(rctx, &rctx->vertex_fetch_shader.atom, id++, 
r600_emit_vertex_fetch_shader, 5);
+   r600_add_atom(rctx, &rctx->b.render_cond_atom, id++);
r600_add_atom(rctx, &rctx->b.streamout.begin_atom, id++);
r600_add_atom(rctx, &rctx->b.streamout.enable_atom, id++);
r600_init_atom(rctx, &rctx->vertex_shader.atom, id++, r600_emit_shader, 
23);
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 224da11..3599692 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -161,12 +161,6 @@ void r600_postflush_resume_features(struct 
r600_common_context *ctx)
r600_resume_nontimer_queries(ctx);
r600_resume_timer_queries(ctx);
}
-
-   /* Just re-emit PKT3_SET_PREDICATION. */
-   if (ctx->current_render_cond)
-   ctx->b.render_condition(&ctx->b, ctx->current_render_cond,
-   ctx->current_render_cond_cond,
-   ctx->current_render_cond_mode);
 }
 
 static void r600_flush_from_st(struct pipe_context *ctx,
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 2a3a3a7..09465ae 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -417,6 +417,7 @@ struct r600_common_context {
    unsigned            num_draw_calls;
 
    /* Render condition. */
+   struct r600_atom    render_cond_atom;
    struct pipe_query   *current_render_cond;
    unsigned            current_render_cond_mode;
    boolean             current_render_cond_cond;
diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index 1838314..145b629 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -303,13 +303,36 @@ static void r600_emit_query_end(struct 
r600_common_context *ctx, struct 

[Mesa-dev] [PATCH 1/7] r600g: don't set predication on non-draw packets

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

This has no effect.
---
 src/gallium/drivers/r600/r600_state_common.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index e160857..eb54361 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -1663,7 +1663,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const 
struct pipe_draw_info
 
/* Draw packets. */
if (!info.indirect) {
-   cs->buf[cs->cdw++] = PKT3(PKT3_NUM_INSTANCES, 0, 
rctx->b.predicate_drawing);
+   cs->buf[cs->cdw++] = PKT3(PKT3_NUM_INSTANCES, 0, 0);
cs->buf[cs->cdw++] = info.instance_count;
}
 
@@ -1675,12 +1675,12 @@ static void r600_draw_vbo(struct pipe_context *ctx, 
const struct pipe_draw_info
rctx->vgt_state.last_draw_was_indirect = true;
rctx->last_start_instance = -1;
 
-   cs->buf[cs->cdw++] = PKT3(EG_PKT3_SET_BASE, 2, 
rctx->b.predicate_drawing);
+   cs->buf[cs->cdw++] = PKT3(EG_PKT3_SET_BASE, 2, 0);
cs->buf[cs->cdw++] = EG_DRAW_INDEX_INDIRECT_PATCH_TABLE_BASE;
cs->buf[cs->cdw++] = va;
cs->buf[cs->cdw++] = (va >> 32UL) & 0xFF;
 
-   cs->buf[cs->cdw++] = PKT3(PKT3_NOP, 0, 
rctx->b.predicate_drawing);
+   cs->buf[cs->cdw++] = PKT3(PKT3_NOP, 0, 0);
cs->buf[cs->cdw++] = radeon_add_to_buffer_list(&rctx->b, 
&rctx->b.gfx,
   (struct 
r600_resource*)info.indirect,
   RADEON_USAGE_READ,
@@ -1688,7 +1688,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const 
struct pipe_draw_info
}
 
if (info.indexed) {
-   cs->buf[cs->cdw++] = PKT3(PKT3_INDEX_TYPE, 0, 
rctx->b.predicate_drawing);
+   cs->buf[cs->cdw++] = PKT3(PKT3_INDEX_TYPE, 0, 0);
cs->buf[cs->cdw++] = ib.index_size == 4 ?
(VGT_INDEX_32 | (R600_BIG_ENDIAN ? 
VGT_DMA_SWAP_32_BIT : 0)) :
(VGT_INDEX_16 | (R600_BIG_ENDIAN ? 
VGT_DMA_SWAP_16_BIT : 0));
@@ -1710,7 +1710,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const 
struct pipe_draw_info
cs->buf[cs->cdw++] = (va >> 32UL) & 0xFF;
cs->buf[cs->cdw++] = info.count;
cs->buf[cs->cdw++] = V_0287F0_DI_SRC_SEL_DMA;
-   cs->buf[cs->cdw++] = PKT3(PKT3_NOP, 0, 
rctx->b.predicate_drawing);
+   cs->buf[cs->cdw++] = PKT3(PKT3_NOP, 0, 0);
cs->buf[cs->cdw++] = 
radeon_add_to_buffer_list(&rctx->b, &rctx->b.gfx,
   
(struct r600_resource*)ib.buffer,
   
RADEON_USAGE_READ,
@@ -1719,17 +1719,17 @@ static void r600_draw_vbo(struct pipe_context *ctx, 
const struct pipe_draw_info
else {
uint32_t max_size = (ib.buffer->width0 - 
ib.offset) / ib.index_size;
 
-   cs->buf[cs->cdw++] = PKT3(EG_PKT3_INDEX_BASE, 
1, rctx->b.predicate_drawing);
+   cs->buf[cs->cdw++] = PKT3(EG_PKT3_INDEX_BASE, 
1, 0);
cs->buf[cs->cdw++] = va;
cs->buf[cs->cdw++] = (va >> 32UL) & 0xFF;
 
-   cs->buf[cs->cdw++] = PKT3(PKT3_NOP, 0, 
rctx->b.predicate_drawing);
+   cs->buf[cs->cdw++] = PKT3(PKT3_NOP, 0, 0);
cs->buf[cs->cdw++] = 
radeon_add_to_buffer_list(&rctx->b, &rctx->b.gfx,
   
(struct r600_resource*)ib.buffer,
   
RADEON_USAGE_READ,

RADEON_PRIO_INDEX_BUFFER);
 
-   cs->buf[cs->cdw++] = 
PKT3(EG_PKT3_INDEX_BUFFER_SIZE, 0, rctx->b.predicate_drawing);
+   cs->buf[cs->cdw++] = 
PKT3(EG_PKT3_INDEX_BUFFER_SIZE, 0, 0);
cs->buf[cs->cdw++] = max_size;
 
cs->buf[cs->cdw++] = 
PKT3(EG_PKT3_DRAW_INDEX_INDIRECT, 1, rctx->b.predicate_drawing);
-- 
2.1.4
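
As a reading aid for the patch above, here is a minimal sketch (not the driver's code) of what the third PKT3() argument amounts to: a single predicate bit in the packet header, which after this change only the draw packets still pass. The bit layout, the SKETCH_ names and the opcode values are assumptions made for illustration; the authoritative definitions are in the r600 headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed type-3 header layout, for illustration only:
 * bits 31:30 packet type, 29:16 count, 15:8 opcode, bit 0 predicate. */
#define SKETCH_PKT3(op, count, predicate)        \
	((3u << 30) |                                \
	 (((uint32_t)(count) & 0x3FFF) << 16) |      \
	 (((uint32_t)(op) & 0xFF) << 8) |            \
	 ((uint32_t)(predicate) & 0x1))

/* Hypothetical opcode values, used only to make the sketch concrete. */
enum {
	SKETCH_OP_NOP = 0x10,
	SKETCH_OP_DRAW_INDEX_AUTO = 0x2D,
};

/* Build a header; only draw packets carry the render-condition bit. */
static uint32_t sketch_header(unsigned op, unsigned count,
			      bool is_draw, bool render_cond_bit)
{
	return SKETCH_PKT3(op, count, is_draw && render_cond_bit);
}

/* True if the header has the predicate bit set. */
static bool sketch_is_predicated(uint32_t header)
{
	return (header & 0x1) != 0;
}
```

With this shape, state and relocation packets always get 0 in the predicate slot, which matches what the patch does for NUM_INSTANCES, SET_BASE, INDEX_TYPE, INDEX_BASE, INDEX_BUFFER_SIZE and NOP.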



[Mesa-dev] [PATCH 3/7] gallium/radeon: don't use PREDICATION_OP_CLEAR

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

Not setting the predication bit is sufficient.
---
 src/gallium/drivers/radeon/r600_query.c | 60 +
 1 file changed, 24 insertions(+), 36 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index ce0d7e7..1838314 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -307,6 +307,8 @@ static void r600_emit_query_predication(struct 
r600_common_context *ctx, struct
int operation, bool flag_wait)
 {
struct radeon_winsys_cs *cs = ctx->gfx.cs;
+   struct r600_query_buffer *qbuf;
+   unsigned count;
uint32_t op = PRED_OP(operation);
 
/* if true then invert, see GL_ARB_conditional_render_inverted */
@@ -315,41 +317,30 @@ static void r600_emit_query_predication(struct 
r600_common_context *ctx, struct
else
op |= PREDICATION_DRAW_VISIBLE; /* Draw if visable/overflow */
 
-   if (operation == PREDICATION_OP_CLEAR) {
-   ctx->need_gfx_cs_space(&ctx->b, 3, FALSE);
-
-   radeon_emit(cs, PKT3(PKT3_SET_PREDICATION, 1, 0));
-   radeon_emit(cs, 0);
-   radeon_emit(cs, PRED_OP(PREDICATION_OP_CLEAR));
-   } else {
-   struct r600_query_buffer *qbuf;
-   unsigned count;
-   /* Find how many results there are. */
-   count = 0;
-   for (qbuf = &query->buffer; qbuf; qbuf = qbuf->previous) {
-   count += qbuf->results_end / query->result_size;
-   }
-   
-   ctx->need_gfx_cs_space(&ctx->b, 5 * count, TRUE);
+   /* Find how many results there are. */
+   count = 0;
+   for (qbuf = &query->buffer; qbuf; qbuf = qbuf->previous)
+   count += qbuf->results_end / query->result_size;

-   op |= flag_wait ? PREDICATION_HINT_WAIT : 
PREDICATION_HINT_NOWAIT_DRAW;
+   ctx->need_gfx_cs_space(&ctx->b, 5 * count, TRUE);

-   /* emit predicate packets for all data blocks */
-   for (qbuf = &query->buffer; qbuf; qbuf = qbuf->previous) {
-   unsigned results_base = 0;
-   uint64_t va = qbuf->buf->gpu_address;
+   op |= flag_wait ? PREDICATION_HINT_WAIT : PREDICATION_HINT_NOWAIT_DRAW;

-   while (results_base < qbuf->results_end) {
-   radeon_emit(cs, PKT3(PKT3_SET_PREDICATION, 1, 
0));
-   radeon_emit(cs, va + results_base);
-   radeon_emit(cs, op | (((va + results_base) >> 
32) & 0xFF));
-   r600_emit_reloc(ctx, &ctx->gfx, qbuf->buf, 
RADEON_USAGE_READ,
-   RADEON_PRIO_QUERY);
-   results_base += query->result_size;
-   
-   /* set CONTINUE bit for all packets except the 
first */
-   op |= PREDICATION_CONTINUE;
-   }
+   /* emit predicate packets for all data blocks */
+   for (qbuf = &query->buffer; qbuf; qbuf = qbuf->previous) {
+   unsigned results_base = 0;
+   uint64_t va = qbuf->buf->gpu_address;
+
+   while (results_base < qbuf->results_end) {
+   radeon_emit(cs, PKT3(PKT3_SET_PREDICATION, 1, 0));
+   radeon_emit(cs, va + results_base);
+   radeon_emit(cs, op | (((va + results_base) >> 32) & 
0xFF));
+   r600_emit_reloc(ctx, &ctx->gfx, qbuf->buf, 
RADEON_USAGE_READ,
+   RADEON_PRIO_QUERY);
+   results_base += query->result_size;
+
+   /* set CONTINUE bit for all packets except the first */
+   op |= PREDICATION_CONTINUE;
}
}
 }
@@ -828,10 +819,7 @@ static void r600_render_condition(struct pipe_context *ctx,
rctx->current_render_cond_mode = mode;
 
if (query == NULL) {
-   if (rctx->predicate_drawing) {
-   rctx->predicate_drawing = false;
-   r600_emit_query_predication(rctx, NULL, 
PREDICATION_OP_CLEAR, false);
-   }
+   rctx->predicate_drawing = false;
return;
}
 
-- 
2.1.4
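
To make the remaining emit loop easier to follow, here is a minimal sketch of its shape, under stated assumptions: one SET_PREDICATION packet per stored result across the chain of query buffers, with the CONTINUE bit set on every packet except the first. The struct, the function names and the SKETCH_PRED_CONTINUE value are hypothetical stand-ins, and the sketch records only the op word of each packet instead of writing to a real command stream.

```c
#include <stdint.h>

/* Illustrative value only; the real encoding lives in the driver headers. */
#define SKETCH_PRED_CONTINUE (1u << 31)

/* Stand-in for a query buffer in a singly linked chain. */
struct sketch_qbuf {
	uint64_t gpu_address;          /* start of the result storage */
	unsigned results_end;          /* bytes of results written so far */
	struct sketch_qbuf *previous;  /* older buffer, or NULL */
};

/* Number of results, i.e. number of predication packets needed. */
static unsigned sketch_count_results(const struct sketch_qbuf *qbuf,
				     unsigned result_size)
{
	unsigned count = 0;

	for (; qbuf; qbuf = qbuf->previous)
		count += qbuf->results_end / result_size;
	return count;
}

/* Store the op word of each packet in out[] and return how many were
 * written (at most max). */
static unsigned sketch_emit_ops(const struct sketch_qbuf *qbuf,
				unsigned result_size, uint32_t op,
				uint32_t *out, unsigned max)
{
	unsigned n = 0;

	for (; qbuf; qbuf = qbuf->previous) {
		unsigned base;

		for (base = 0; base < qbuf->results_end && n < max;
		     base += result_size) {
			uint64_t va = qbuf->gpu_address + base;

			/* High address bits share the dword with the op. */
			out[n++] = op | (uint32_t)((va >> 32) & 0xFF);

			/* Set CONTINUE for all packets except the first. */
			op |= SKETCH_PRED_CONTINUE;
		}
	}
	return n;
}
```

The count is what the patch multiplies by 5 when reserving command stream space, before any packet is written.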



[Mesa-dev] [PATCH 0/7] RadeonSI: Render condition cleanup

2015-11-08 Thread Marek Olšák
I thought this would fix a bug I was hunting, but it didn't. Well, at least it 
simplifies render condition handling.

Please review.

Marek
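
For anyone skimming the series, the end state can be summarized in a few lines of C. This is only a sketch with stand-in types: `struct sketch_ctx` and the sketch_ functions are hypothetical names, while the fields `render_cond` and `render_cond_force_off` follow the names used in patches 2/7 and 7/7.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the render-condition part of the common context. */
struct sketch_ctx {
	const void *render_cond;           /* non-NULL while a condition is set */
	bool        render_cond_force_off; /* set while u_blitter runs */
};

/* Blitter begin: turn predication off without touching the condition. */
static void sketch_blitter_begin(struct sketch_ctx *ctx, bool disable_cond)
{
	if (disable_cond)
		ctx->render_cond_force_off = true;
}

/* Blitter end: predication follows the application's condition again. */
static void sketch_blitter_end(struct sketch_ctx *ctx)
{
	ctx->render_cond_force_off = false;
}

/* The value draw packets pass as their predicate bit. */
static bool sketch_render_cond_bit(const struct sketch_ctx *ctx)
{
	return ctx->render_cond != NULL && !ctx->render_cond_force_off;
}
```

Nothing is saved or restored around a blit in this scheme; the condition stays set and only the per-draw bit changes.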


[Mesa-dev] [PATCH 5/5] radeonsi: calculate optimal GS ring sizes to fix GS hangs on Tonga

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

I discovered that increasing the ESGS ring size fixes GS hangs on Tonga,
so let's do it properly.

There is now a separate init_config_gs_rings state that is not immutable,
because GS rings are resized when needed.

This also saves some memory. Most apps won't need more than 1MB
per ring per shader engine.
---
 src/gallium/drivers/radeonsi/si_hw_context.c|   2 +
 src/gallium/drivers/radeonsi/si_pipe.c  |   2 +
 src/gallium/drivers/radeonsi/si_pipe.h  |   1 +
 src/gallium/drivers/radeonsi/si_shader.h|   1 +
 src/gallium/drivers/radeonsi/si_state_shaders.c | 153 
 5 files changed, 112 insertions(+), 47 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_hw_context.c 
b/src/gallium/drivers/radeonsi/si_hw_context.c
index f28c11c..baa0229 100644
--- a/src/gallium/drivers/radeonsi/si_hw_context.c
+++ b/src/gallium/drivers/radeonsi/si_hw_context.c
@@ -165,6 +165,8 @@ void si_begin_new_cs(struct si_context *ctx)
 
/* The CS initialization should be emitted before everything else. */
si_pm4_emit(ctx, ctx->init_config);
+   if (ctx->init_config_gs_rings)
+   si_pm4_emit(ctx, ctx->init_config_gs_rings);
 
ctx->framebuffer.dirty_cbufs = (1 << 8) - 1;
ctx->framebuffer.dirty_zsbuf = true;
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 2b9784f..1cbee8d 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -50,6 +50,8 @@ static void si_destroy_context(struct pipe_context *context)
sctx->b.ws->fence_reference(&sctx->last_gfx_fence, NULL);
 
si_pm4_free_state(sctx, sctx->init_config, ~0);
+   if (sctx->init_config_gs_rings)
+   si_pm4_free_state(sctx, sctx->init_config_gs_rings, ~0);
for (i = 0; i < Elements(sctx->vgt_shader_config); i++)
si_pm4_delete_state(sctx, vgt_shader_config, 
sctx->vgt_shader_config[i]);
 
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 6e742fc..05d52fe 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -202,6 +202,7 @@ struct si_context {
 
/* Precomputed states. */
struct si_pm4_state *init_config;
+   struct si_pm4_state *init_config_gs_rings;
bool                init_config_has_vgt_flush;
struct si_pm4_state *vgt_shader_config[4];
 
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index 1f4f0de..3400a03 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -202,6 +202,7 @@ struct si_shader_selector {
    bool        forces_persample_interp_for_linear;
 
    unsigned    esgs_itemsize;
+   unsigned    gs_input_verts_per_prim;
    unsigned    gs_output_prim;
    unsigned    gs_max_out_vertices;
    unsigned    gs_num_invocations;
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index c402ce2..b543971 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -33,6 +33,7 @@
 #include "tgsi/tgsi_parse.h"
 #include "tgsi/tgsi_ureg.h"
 #include "util/u_memory.h"
+#include "util/u_prim.h"
 #include "util/u_simple_shaders.h"
 
 static void si_set_tesseval_regs(struct si_shader *shader,
@@ -703,6 +704,9 @@ static void *si_create_shader_selector(struct pipe_context 
*ctx,
for (i = 0; i < sel->so.num_outputs; i++)
sel->max_gs_stream = MAX2(sel->max_gs_stream,
  sel->so.output[i].stream);
+
+   sel->gs_input_verts_per_prim =
+   
u_vertices_per_prim(sel->info.properties[TGSI_PROPERTY_GS_INPUT_PRIM]);
break;
 
case PIPE_SHADER_VERTEX:
@@ -1054,6 +1058,7 @@ static void si_init_config_add_vgt_flush(struct 
si_context *sctx)
if (sctx->init_config_has_vgt_flush)
return;
 
+   /* VGT_FLUSH is required even if VGT is idle. It resets VGT pointers. */
si_pm4_cmd_begin(sctx->init_config, PKT3_EVENT_WRITE);
si_pm4_cmd_add(sctx->init_config, EVENT_TYPE(V_028A90_VGT_FLUSH) | 
EVENT_INDEX(0));
si_pm4_cmd_end(sctx->init_config, false);
@@ -1061,62 +1066,119 @@ static void si_init_config_add_vgt_flush(struct 
si_context *sctx)
 }
 
 /* Initialize state related to ESGS / GSVS ring buffers */
-static void si_init_gs_rings(struct si_context *sctx)
+static bool si_update_gs_ring_buffers(struct si_context *sctx)
 {
-   unsigned esgs_ring_size = 128 * 1024;
-   unsigned gsvs_ring_size = 60 * 1024 * 1024;
+   struct si_shader_selector *es =
+   sctx->tes_shader.cso ? 

Re: [Mesa-dev] [PATCH 3/6] glsl: move layout qualifier validation out of the parser

2015-11-08 Thread Timothy Arceri
On Sun, 2015-11-08 at 12:07 +, Emil Velikov wrote:
> On 6 November 2015 at 21:13, Timothy Arceri  wrote:
> > On Fri, 2015-11-06 at 13:16 +, Emil Velikov wrote:
> > > On 5 November 2015 at 11:17, Timothy Arceri 
> > > wrote:
> > > > From: Timothy Arceri 
> > > > 
> > > > This is in preperation for compile-time constant support.
> > > typo "preparation"
> > > 
> > > > 
> > > > Also fix up the locations for some of the extension checking
> > > > error messages in the parser. We now correctly give the location
> > > > of the layout qualifier identifier rather than the integer constant.
> > > > 
> > > > The validation is moved to two locations, for validation on variables
> > > > the
> > > > checks are moved to the ast to hir pass and for qualifiers that apply
> > > > to
> > > > the
> > > > shader the validation is moved into glsl_parser_extras.cpp.
> > > > 
> > > > In order to do validation at the later stage in glsl_parser_extras.cpp
> > > > we
> > > > need to temporarily add a field in ast_type_qualifier to keep track of
> > > > the
> > > > parser location, this will be removed in a following patch when we
> > > > introduce a new type for storing the compile-time qualifiers.
> > > > 
> > > > Also as the set_shader_inout_layout() function in glsl parser extras
> > > > is
> > > > normally called after all validation is done we need to move the code
> > > > that
> > > > sets CompileStatus and InfoLog otherwise the newly moved error
> > > > messages
> > > > will
> > > > be ignored.
> > > Personally I would split the validate_layout_qualifiers() introduction
> > > and the CompileStatus/InfoLog movement into separate patches.
> > 
> > The reason for not doing this in a new patch is that this is existing
> > functionality not new functionality, doing so would regress a bunch of
> > piglit
> > tests.
> > 
> > I can do it if it makes things easier to review but it should all be
> > pushed as
> > one.
> > 
> Fair enough - I'd just keep it as is then. I'll take a closer look
> some time today/tomorrow.

On second thoughts, I should be able to break this up into shader-level layouts
and per-variable layouts. I think I misunderstood what you were getting at
when reading your first reply.

I have a version 2 in progress based on your other feedback, so you might want
to hold off reviewing until I send that.

> 
> 
> Imho if one needs to do a few different things at once, this is a
> clear indication that things are more convoluted than they should be.
> 
> 
> Cheers,
> Emil


[Mesa-dev] [PATCH] gallium/radeon: fix PIPE_QUERY_GPU_FINISHED

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

Broken by the addition of r600_multi_fence
in 3b37155a68acc351cba86a1fa142bd0de2192d4c
---
 src/gallium/drivers/radeon/r600_query.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index 9a54025..2bb5732 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -532,7 +532,7 @@ static void r600_end_query(struct pipe_context *ctx, struct 
pipe_query *query)
case PIPE_QUERY_TIMESTAMP_DISJOINT:
return;
case PIPE_QUERY_GPU_FINISHED:
-   rctx->rings.gfx.flush(rctx, RADEON_FLUSH_ASYNC, &rquery->fence);
+   ctx->flush(ctx, &rquery->fence, 0);
return;
case R600_QUERY_DRAW_CALLS:
rquery->end_result = rctx->num_draw_calls;
-- 
2.1.4



[Mesa-dev] ARB_enhanced_layout compile-time-constants V2

2015-11-08 Thread Timothy Arceri
This series adds support for compile time constants and also adds
subroutine index qualifier support which was missing for 
ARB_explicit_uniform_location.

V2: Validation of minimum qualifier value moved to the helper functions, all
qualifiers will now always have the minimum value checked. Split the patches
that move validation out of the parser and add the compile time constant
support into smaller patches where possible. 

Piglit tests have been reviewed and pushed to master, there is one outstanding
that tests querying of the subroutine index [1].

The extension is disabled by default until the remaining features are added.

MESA_EXTENSION_OVERRIDE=GL_ARB_enhanced_layouts can be used for testing.

You can get the series from my arb_enhanced_layouts4 branch [2]

[1] https://patchwork.freedesktop.org/patch/63795/
[2] https://github.com/tarceri/Mesa_arrays_of_arrays.git



[Mesa-dev] [RFCv2 01/13] gallium: refactor pipe_shader_state to support multiple IR's

2015-11-08 Thread Rob Clark
The goal is to allow the pipe driver to request something other than
TGSI, but detect whether what it is getting is TGSI or the IR it requested.
The pipe drivers will always have to support TGSI (and convert that into
whatever it is that they prefer), but in some cases we should be able to
skip the TGSI intermediate step (such as glsl->nir vs glsl->tgsi->nir).

I think pipe_compute_state should get similar treatment.  Currently,
afaict, it has one user and one consumer, which has allowed it to be
sloppy wrt. supporting alternative IR's.
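
The tagged-state idea described above can be sketched roughly as follows. This is only an illustration, not the actual gallium definition: `sketch_shader_state`, `sketch_shader_ir` and both helpers are hypothetical names, and the real `pipe_shader_state` carries more fields (stream output info, for one).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the real gallium types. */
enum sketch_shader_ir {
   SKETCH_IR_TGSI = 0,
   SKETCH_IR_NIR,
};

struct sketch_shader_state {
   enum sketch_shader_ir ir;   /* says which member below is valid */
   const void *tokens;         /* valid when ir == SKETCH_IR_TGSI */
   void *nir;                  /* valid when ir == SKETCH_IR_NIR */
};

/* Zero the whole struct first so fields the caller does not set have a
 * defined value, then tag the state as TGSI.
 */
static void
sketch_init_tgsi(struct sketch_shader_state *state, const void *tokens)
{
   memset(state, 0, sizeof(*state));
   state->ir = SKETCH_IR_TGSI;
   state->tokens = tokens;
}

/* A driver that asked for NIR still has to check what it was handed. */
static int
sketch_needs_tgsi_conversion(const struct sketch_shader_state *state)
{
   return state->ir == SKETCH_IR_TGSI;
}
```

The point of the tag is that a driver preferring another IR can branch on `ir` once and fall back to its TGSI conversion path only when it has to.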
---
 src/gallium/auxiliary/hud/hud_context.c   | 14 +++--
 src/gallium/auxiliary/postprocess/pp_run.c|  4 ++-
 src/gallium/auxiliary/tgsi/tgsi_ureg.c|  6 ++--
 src/gallium/auxiliary/util/u_simple_shaders.c | 42 +++
 src/gallium/auxiliary/util/u_tests.c  |  7 -
 src/gallium/include/pipe/p_defines.h  | 12 ++--
 src/gallium/include/pipe/p_state.h| 20 +++--
 7 files changed, 89 insertions(+), 16 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index ffe30b8..2344a48 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -1182,7 +1182,12 @@ hud_create(struct pipe_context *pipe, struct cso_context 
*cso)
   };
 
   struct tgsi_token tokens[1000];
-  struct pipe_shader_state state = {tokens};
+  struct pipe_shader_state state;
+
+  memset(&state, 0, sizeof(state));
+
+  state.ir = PIPE_SHADER_IR_TGSI;
+  state.tokens = tokens;
 
   if (!tgsi_text_translate(fragment_shader_text, tokens, 
Elements(tokens))) {
  assert(0);
@@ -1229,7 +1234,12 @@ hud_create(struct pipe_context *pipe, struct cso_context 
*cso)
   };
 
   struct tgsi_token tokens[1000];
-  struct pipe_shader_state state = {tokens};
+  struct pipe_shader_state state;
+
+  memset(&state, 0, sizeof(state));
+
+  state.ir = PIPE_SHADER_IR_TGSI;
+  state.tokens = tokens;
 
   if (!tgsi_text_translate(vertex_shader_text, tokens, Elements(tokens))) {
  assert(0);
diff --git a/src/gallium/auxiliary/postprocess/pp_run.c 
b/src/gallium/auxiliary/postprocess/pp_run.c
index caa2062..6cd2b70 100644
--- a/src/gallium/auxiliary/postprocess/pp_run.c
+++ b/src/gallium/auxiliary/postprocess/pp_run.c
@@ -272,8 +272,10 @@ pp_tgsi_to_state(struct pipe_context *pipe, const char 
*text, bool isvs,
   return NULL;
}
 
+   memset(&state, 0, sizeof(state));
+
+   state.ir = PIPE_SHADER_IR_TGSI;
state.tokens = tokens;
-   memset(&state.stream_output, 0, sizeof(state.stream_output));
 
if (isvs) {
   ret_state = pipe->create_vs_state(pipe, &state);
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index f2f5181..6c40bc1 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -1778,14 +1778,16 @@ void *ureg_create_shader( struct ureg_program *ureg,
 {
struct pipe_shader_state state;
 
+   memset(&state, 0, sizeof(state));
+
+   state.ir = PIPE_SHADER_IR_TGSI;
+
state.tokens = ureg_finalize(ureg);
if(!state.tokens)
   return NULL;
 
if (so)
   state.stream_output = *so;
-   else
-  memset(&state.stream_output, 0, sizeof(state.stream_output));
 
switch (ureg->processor) {
case TGSI_PROCESSOR_VERTEX:
diff --git a/src/gallium/auxiliary/util/u_simple_shaders.c 
b/src/gallium/auxiliary/util/u_simple_shaders.c
index 6eed337..8be0da9 100644
--- a/src/gallium/auxiliary/util/u_simple_shaders.c
+++ b/src/gallium/auxiliary/util/u_simple_shaders.c
@@ -121,7 +121,12 @@ void *util_make_layered_clear_vertex_shader(struct 
pipe_context *pipe)
  "MOV OUT[2], SV[0]\n"
  "END\n";
struct tgsi_token tokens[1000];
-   struct pipe_shader_state state = {tokens};
+   struct pipe_shader_state state;
+
+   memset(&state, 0, sizeof(state));
+
+   state.ir = PIPE_SHADER_IR_TGSI;
+   state.tokens = tokens;
 
if (!tgsi_text_translate(text, tokens, Elements(tokens))) {
   assert(0);
@@ -149,7 +154,12 @@ void *util_make_layered_clear_helper_vertex_shader(struct 
pipe_context *pipe)
  "MOV OUT[2].x, SV[0].\n"
  "END\n";
struct tgsi_token tokens[1000];
-   struct pipe_shader_state state = {tokens};
+   struct pipe_shader_state state;
+
+   memset(&state, 0, sizeof(state));
+
+   state.ir = PIPE_SHADER_IR_TGSI;
+   state.tokens = tokens;
 
if (!tgsi_text_translate(text, tokens, Elements(tokens))) {
   assert(0);
@@ -192,7 +202,12 @@ void *util_make_layered_clear_geometry_shader(struct 
pipe_context *pipe)
   "EMIT IMM[0].\n"
   "END\n";
struct tgsi_token tokens[1000];
-   struct pipe_shader_state state = {tokens};
+   struct pipe_shader_state state;
+
+   memset(&state, 0, sizeof(state));
+
+   state.ir = PIPE_SHADER_IR_TGSI;
+   state.tokens = tokens;
 
if (!tgsi_text_translate(text, tokens, Elements(tokens))) {
   assert(0);
@@ -471,7 +486,12 @@ 

[Mesa-dev] [RFCv2 03/13] nir: allow pre-resolved sampler uniform locations

2015-11-08 Thread Rob Clark
From: Rob Clark 

With TGSI, the ir_variable::data.location gets fixed up to be a stage
local location (rather than program global).  In this case we need to
skip the UniformStorage[location] lookup.
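
A rough sketch of the two paths, for illustration only. The structs here are hypothetical and heavily simplified (the real lookup is per stage, via `UniformStorage[location].opaque[stage]`), and the bounds check is written as `>=` to stay safe when the count is zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the gl_shader_program data. */
struct sketch_uniform {
   int active;        /* is this sampler used by the stage? */
   unsigned index;    /* stage-local sampler index */
};

struct sketch_program {
   unsigned num_uniforms;
   const struct sketch_uniform *uniforms;
};

/* Returns 0 and writes the sampler index on success, -1 on failure.
 * A NULL 'prog' means 'location' is already stage local, so it is
 * used directly and the table lookup is skipped.
 */
static int
sketch_resolve_sampler(const struct sketch_program *prog, unsigned location,
                       unsigned *sampler_index)
{
   if (!prog) {
      *sampler_index = location;
      return 0;
   }

   if (location >= prog->num_uniforms || !prog->uniforms[location].active)
      return -1;

   *sampler_index = prog->uniforms[location].index;
   return 0;
}
```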
---
 src/glsl/nir/nir_lower_samplers.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/src/glsl/nir/nir_lower_samplers.c 
b/src/glsl/nir/nir_lower_samplers.c
index 5df79a6..d99ba4c 100644
--- a/src/glsl/nir/nir_lower_samplers.c
+++ b/src/glsl/nir/nir_lower_samplers.c
@@ -130,14 +130,18 @@ lower_sampler(nir_tex_instr *instr, const struct 
gl_shader_program *shader_progr
   instr->sampler_array_size = array_elements;
}
 
-   if (location > shader_program->NumUniformStorage - 1 ||
-   !shader_program->UniformStorage[location].opaque[stage].active) {
-  assert(!"cannot return a sampler");
-  return;
-   }
+   if (!shader_program) {
+  instr->sampler_index = location;
+   } else {
+  if (location > shader_program->NumUniformStorage - 1 ||
+  !shader_program->UniformStorage[location].opaque[stage].active) {
+ assert(!"cannot return a sampler");
+ return;
+  }
 
-   instr->sampler_index +=
-  shader_program->UniformStorage[location].opaque[stage].index;
+  instr->sampler_index =
+ shader_program->UniformStorage[location].opaque[stage].index;
+   }
 
instr->sampler = NULL;
 }
@@ -177,6 +181,11 @@ lower_impl(nir_function_impl *impl, const struct 
gl_shader_program *shader_progr
nir_foreach_block(impl, lower_block_cb, );
 }
 
+/* Call with a null 'shader_program' if uniform locations are
+ * already local to the shader, ie. skipping the
+ * shader_program->UniformStorage[location].opaque[stage].index
+ * lookup
+ */
 void
 nir_lower_samplers(nir_shader *shader,
const struct gl_shader_program *shader_program)
-- 
2.5.0



[Mesa-dev] [RFCv2 05/13] gallium/auxiliary: introduce nir_emulate

2015-11-08 Thread Rob Clark
From: Rob Clark 

NIR equivalent of tgsi_emulate
---
 src/gallium/auxiliary/Makefile.sources  |   2 +
 src/gallium/auxiliary/nir/nir_emulate.c | 139 
 src/gallium/auxiliary/nir/nir_emulate.h |  34 
 3 files changed, 175 insertions(+)
 create mode 100644 src/gallium/auxiliary/nir/nir_emulate.c
 create mode 100644 src/gallium/auxiliary/nir/nir_emulate.h

diff --git a/src/gallium/auxiliary/Makefile.sources 
b/src/gallium/auxiliary/Makefile.sources
index 6e22ced..bc99ee9 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -311,6 +311,8 @@ C_SOURCES := \
util/u_video.h
 
 NIR_SOURCES := \
+   nir/nir_emulate.c \
+   nir/nir_emulate.h \
nir/tgsi_to_nir.c \
nir/tgsi_to_nir.h
 
diff --git a/src/gallium/auxiliary/nir/nir_emulate.c 
b/src/gallium/auxiliary/nir/nir_emulate.c
new file mode 100644
index 000..105744c
--- /dev/null
+++ b/src/gallium/auxiliary/nir/nir_emulate.c
@@ -0,0 +1,139 @@
+/*
+ * Copyright © 2015 Red Hat
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "nir/nir_emulate.h"
+
+#include "glsl/nir/nir_builder.h"
+
+typedef struct {
+   nir_shader *shader;
+   nir_builder b;
+   unsigned flags;
+} emu_state;
+
+static nir_variable *
+find_output(emu_state *state, unsigned drvloc)
+{
+   foreach_list_typed(nir_variable, var, node, &state->shader->outputs) {
+  if (var->data.driver_location == drvloc)
+ return var;
+   }
+   return NULL;
+}
+
+static bool
+is_color_output(emu_state *state, nir_variable *out)
+{
+   switch (state->shader->stage) {
+   case MESA_SHADER_VERTEX:
+   case MESA_SHADER_GEOMETRY:
+  switch (out->data.location) {
+  case VARYING_SLOT_COL0:
+  case VARYING_SLOT_COL1:
+  case VARYING_SLOT_BFC0:
+  case VARYING_SLOT_BFC1:
+ return true;
+  default:
+ return false;
+  }
+  break;
+   case MESA_SHADER_FRAGMENT:
+  switch (out->data.location) {
+  case FRAG_RESULT_COLOR:
+ return true;
+  default:
+ return false;
+  }
+  break;
+   default:
+  return false;
+   }
+}
+
+static void
+emu_intrinsic(emu_state *state, nir_intrinsic_instr *intr)
+{
+   nir_variable *out;
+   nir_builder *b = &state->b;
+   nir_ssa_def *s;
+
+   assert(state->flags & TGSI_EMU_CLAMP_COLOR_OUTPUTS);
+
+   if (intr->intrinsic != nir_intrinsic_store_output)
+  return;
+
+   out = find_output(state, intr->const_index[0]);
+
+   /* NOTE: 'out' can be null for types larger than vec4,
+* but these will never be color out's so we can ignore
+*/
+
+   if (out && is_color_output(state, out)) {
+  b->cursor = nir_before_instr(&intr->instr);
+  s = nir_ssa_for_src(b, intr->src[0], intr->num_components);
+  s = nir_fsat(b, s);
+  nir_instr_rewrite_src(&intr->instr, &intr->src[0], nir_src_for_ssa(s));
+   }
+}
+
+static bool
+emu_block(nir_block *block, void *_state)
+{
+   emu_state *state = _state;
+
+   /* early return if we don't need per-instruction lowering: */
+   if (!(state->flags & TGSI_EMU_CLAMP_COLOR_OUTPUTS))
+  return false;
+
+   nir_foreach_instr_safe(block, instr) {
+  if (instr->type == nir_instr_type_intrinsic)
+ emu_intrinsic(state, nir_instr_as_intrinsic(instr));
+   }
+
+   return true;
+}
+static void
+emu_impl(emu_state *state, nir_function_impl *impl)
+{
+   nir_builder_init(&state->b, impl);
+
+   nir_foreach_block(impl, emu_block, state);
+   nir_metadata_preserve(impl, nir_metadata_block_index |
+   nir_metadata_dominance);
+}
+
+void nir_emulate(nir_shader *shader, unsigned flags)
+{
+   emu_state state = {
+  .shader = shader,
+  .flags = flags,
+   };
+
+   assert(flags != 0);
+   assert(flags == TGSI_EMU_CLAMP_COLOR_OUTPUTS);  // todo others..
+
+   nir_foreach_overload(shader, overload) {
+  if (overload->impl)
+ 

[Mesa-dev] [RFCv2 09/13] freedreno/ir3: handle large inputs/outputs

2015-11-08 Thread Rob Clark
From: Rob Clark 

Internally split them into vec4's..  although perhaps it makes sense to
have a generic nir pass which could do this after lower_io?
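
The splitting can be sketched on its own like this. It is a standalone illustration with hypothetical names (`sketch_split_vec4`, `sketch_record`), not the driver code, and it clamps each chunk with a minimum on the assumption that a chunk should never exceed four components.

```c
#include <assert.h>

#define SKETCH_MIN2(a, b) ((a) < (b) ? (a) : (b))

typedef void (*sketch_setup_fn)(void *data, unsigned ncomp,
                                unsigned n, unsigned slot);

/* Walk a variable with 'total' components and call 'setup' once per
 * vec4-sized chunk (the last chunk may be smaller).  Both the driver
 * location 'n' and the 'slot' advance by one per chunk.  Returns the
 * number of chunks emitted.
 */
static unsigned
sketch_split_vec4(int total, unsigned n, unsigned slot,
                  sketch_setup_fn setup, void *data)
{
   unsigned chunks = 0;

   while (total > 0) {
      setup(data, (unsigned)SKETCH_MIN2(total, 4), n, slot);
      total -= 4;
      n++;
      slot++;
      chunks++;
   }
   return chunks;
}

/* Example callback: records the component count of each chunk. */
struct sketch_record_buf {
   unsigned count;
   unsigned ncomp[8];
};

static void
sketch_record(void *data, unsigned ncomp, unsigned n, unsigned slot)
{
   struct sketch_record_buf *rec = data;

   (void)n;
   (void)slot;
   if (rec->count < 8)
      rec->ncomp[rec->count++] = ncomp;
}
```

With this, a 16-component variable (a mat4, say) yields four full chunks, and a 6-component one yields a full chunk followed by a 2-component tail.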
---
 .../drivers/freedreno/ir3/ir3_compiler_nir.c   | 67 +++---
 1 file changed, 47 insertions(+), 20 deletions(-)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index 51a6acc..6e77244 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -2069,13 +2069,11 @@ emit_function(struct ir3_compile *ctx, 
nir_function_impl *impl)
 }
 
 static void
-setup_input(struct ir3_compile *ctx, nir_variable *in)
+setup_input(struct ir3_compile *ctx, nir_variable *in,
+   unsigned ncomp, unsigned n, unsigned slot)
 {
struct ir3_shader_variant *so = ctx->so;
unsigned array_len = MAX2(glsl_get_length(in->type), 1);
-   unsigned ncomp = glsl_get_components(in->type);
-   unsigned n = in->data.driver_location;
-   unsigned slot = in->data.location;
 
DBG("; in: slot=%u, len=%ux%u, drvloc=%u",
slot, array_len, ncomp, n);
@@ -2083,7 +2081,6 @@ setup_input(struct ir3_compile *ctx, nir_variable *in)
so->inputs[n].slot = slot;
so->inputs[n].compmask = (1 << ncomp) - 1;
so->inputs[n].inloc = ctx->next_inloc;
-   so->inputs[n].interpolate = INTERP_QUALIFIER_NONE;
so->inputs_count = MAX2(so->inputs_count, n + 1);
so->inputs[n].interpolate = in->data.interpolation;
 
@@ -2150,13 +2147,11 @@ setup_input(struct ir3_compile *ctx, nir_variable *in)
 }
 
 static void
-setup_output(struct ir3_compile *ctx, nir_variable *out)
+setup_output(struct ir3_compile *ctx, nir_variable *out,
+   unsigned ncomp, unsigned n, unsigned slot)
 {
struct ir3_shader_variant *so = ctx->so;
unsigned array_len = MAX2(glsl_get_length(out->type), 1);
-   unsigned ncomp = glsl_get_components(out->type);
-   unsigned n = out->data.driver_location;
-   unsigned slot = out->data.location;
unsigned comp = 0;
 
DBG("; out: slot=%u, len=%ux%u, drvloc=%u",
@@ -2218,6 +2213,45 @@ setup_output(struct ir3_compile *ctx, nir_variable *out)
}
 }
 
+static unsigned
+get_components(const struct glsl_type *type)
+{
+   unsigned ncomp;
+
+   /* TODO how should this work.. how do arrays of float/vec2/etc get packed? */
+   if (glsl_get_base_type(type) == GLSL_TYPE_ARRAY)
+   ncomp = glsl_get_length(type) * 4;
+   else
+   ncomp = glsl_get_components(type);
+
+   debug_assert(ncomp > 0);
+
+   return ncomp;
+}
+
+/* split larger var's into vec4's since that is what the hw understands..
+ * maybe we should have a generic nir pass to do this rather than doing
+ * it in-place?
+ */
+static void
+setup_vars(struct ir3_compile *ctx, struct exec_list *vars,
+   void (*setup)(struct ir3_compile *ctx, nir_variable *,
+   unsigned ncomp, unsigned n, unsigned slot))
+{
+   nir_foreach_variable(var, vars) {
+   int ncomp = get_components(var->type);
+   unsigned n = var->data.driver_location;
+   unsigned slot = var->data.location;
+
+   while (ncomp > 0) {
+   setup(ctx, var, MIN2(ncomp, 4), n, slot);
+   ncomp -= 4;
+   n++;
+   slot++;
+   }
+   }
+}
+
 static void
 emit_instructions(struct ir3_compile *ctx)
 {
@@ -2232,8 +2266,8 @@ emit_instructions(struct ir3_compile *ctx)
break;
}
 
-   ninputs  = exec_list_length(&ctx->s->inputs) * 4;
-   noutputs = exec_list_length(&ctx->s->outputs) * 4;
+   ninputs  = ctx->s->num_inputs * 4;
+   noutputs = ctx->s->num_outputs * 4;
 
/* or vtx shaders, we need to leave room for sysvals:
 */
@@ -2265,15 +2299,8 @@ emit_instructions(struct ir3_compile *ctx)
ctx->frag_pos = instr;
}
 
-   /* Setup inputs: */
-   nir_foreach_variable(var, &ctx->s->inputs) {
-   setup_input(ctx, var);
-   }
-
-   /* Setup outputs: */
-   nir_foreach_variable(var, &ctx->s->outputs) {
-   setup_output(ctx, var);
-   }
+   setup_vars(ctx, &ctx->s->inputs, setup_input);
+   setup_vars(ctx, &ctx->s->outputs, setup_output);
 
/* Setup variables (which should only be arrays): */
nir_foreach_variable(var, &ctx->s->globals) {
-- 
2.5.0



[Mesa-dev] [RFCv2 13/13] HACK: freedreno/a4xx: workaround glsl_to_nir hang..

2015-11-08 Thread Rob Clark
From: Rob Clark 

Was getting a hang with a smaller compmask, which happens now with
glsl_to_nir since not everything is a vec4 anymore.
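
For reference, the component mask in ir3 is derived from the component count as `(1 << ncomp) - 1`, so anything narrower than a vec4 gives a mask smaller than 0xf. A tiny sketch (the helper name is hypothetical):

```c
#include <assert.h>

/* Mask with the low 'ncomp' bits set, e.g. ncomp=2 selects .xy (0x3)
 * and ncomp=4 selects .xyzw (0xf).
 */
static unsigned
sketch_compmask(unsigned ncomp)
{
   assert(ncomp >= 1 && ncomp <= 4);
   return (1u << ncomp) - 1;
}
```

The workaround simply forces the full 0xf mask regardless of what this would compute.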
---
 src/gallium/drivers/freedreno/a4xx/fd4_program.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/freedreno/a4xx/fd4_program.c 
b/src/gallium/drivers/freedreno/a4xx/fd4_program.c
index e3d5dab..67b8fb2 100644
--- a/src/gallium/drivers/freedreno/a4xx/fd4_program.c
+++ b/src/gallium/drivers/freedreno/a4xx/fd4_program.c
@@ -344,14 +344,14 @@ fd4_program_emit(struct fd_ringbuffer *ring, struct 
fd4_emit *emit,
if (j < s[FS].v->inputs_count) {
k = ir3_find_output(s[VS].v, s[FS].v->inputs[j].slot);
reg |= A4XX_SP_VS_OUT_REG_A_REGID(s[VS].v->outputs[k].regid);
-   reg |= A4XX_SP_VS_OUT_REG_A_COMPMASK(s[FS].v->inputs[j].compmask);
+   reg |= A4XX_SP_VS_OUT_REG_A_COMPMASK(0xf); //s[FS].v->inputs[j].compmask);
}
 
j = ir3_next_varying(s[FS].v, j);
if (j < s[FS].v->inputs_count) {
k = ir3_find_output(s[VS].v, s[FS].v->inputs[j].slot);
reg |= A4XX_SP_VS_OUT_REG_B_REGID(s[VS].v->outputs[k].regid);
-   reg |= A4XX_SP_VS_OUT_REG_B_COMPMASK(s[FS].v->inputs[j].compmask);
+   reg |= A4XX_SP_VS_OUT_REG_B_COMPMASK(0xf); //s[FS].v->inputs[j].compmask);
}
 
OUT_RING(ring, reg);
-- 
2.5.0



[Mesa-dev] [RFCv2 12/13] freedreno/ir3: don't ignore local vars

2015-11-08 Thread Rob Clark
From: Rob Clark 

With glsl_to_nir we end up with local variables, instead of global, for
arrays.

Note that we'll eventually have to do something more clever, I think,
when we support multiple functions, but that will probably take some
work in a few places.
---
 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index e77afcc..cd664bc 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -2330,11 +2330,17 @@ emit_instructions(struct ir3_compile *ctx)
setup_vars(ctx, &ctx->s->inputs, setup_input);
setup_vars(ctx, &ctx->s->outputs, setup_output);
 
-   /* Setup variables (which should only be arrays): */
+   /* Setup global variables (which should only be arrays): */
nir_foreach_variable(var, &ctx->s->globals) {
declare_var(ctx, var);
}
 
+   /* Setup local variables (which should only be arrays): */
+   /* NOTE: need to do something more clever when we support >1 fxn */
+   nir_foreach_variable(var, &fxn->locals) {
+   declare_var(ctx, var);
+   }
+
/* And emit the body: */
ctx->impl = fxn;
emit_function(ctx, fxn);
-- 
2.5.0



[Mesa-dev] [RFCv2 08/13] freedreno/ir3: fix const_index handling for uniforms

2015-11-08 Thread Rob Clark
When coming directly from glsl_to_nir (rather than via TGSI where
information about, for example, mat4's is lost), both const_index
fields will be used (vs. tgsi_to_nir where the 2nd is always zero).

For example:

  decl_var uniform INTERP_QUALIFIER_NONE mat4 ModelViewProjectionMatrix (0, 0)
  decl_var uniform INTERP_QUALIFIER_NONE mat4 NormalMatrix (4, 4)
  ...
vec4 ssa_54 = intrinsic load_uniform () () (4, 0)   /* NormalMatrix */
vec4 ssa_56 = intrinsic load_uniform () () (4, 1)   /* NormalMatrix */

vs:

  decl_var uniform INTERP_QUALIFIER_NONE vec4[8] uniform_0 (0, 0)
  ...
vec4 ssa_54 = intrinsic load_uniform () () (4, 0)
vec4 ssa_56 = intrinsic load_uniform () () (5, 0)
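
Summing the index fields makes both forms resolve to the same uniform slot. A minimal standalone sketch of that arithmetic (the helper name is hypothetical; the driver does this inline):

```c
#include <assert.h>

/* Sum the first 'num_indices' constant indices of an intrinsic. */
static unsigned
sketch_uniform_index(const int *const_index, unsigned num_indices)
{
   unsigned idx = 0;

   for (unsigned i = 0; i < num_indices; i++)
      idx += (unsigned)const_index[i];
   return idx;
}
```

So the glsl_to_nir load `(4, 1)` and the tgsi_to_nir load `(5, 0)` from the examples above both end up at index 5.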
---
 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index 2001872..51a6acc 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -1302,7 +1302,10 @@ emit_intrinisic(struct ir3_compile *ctx, 
nir_intrinsic_instr *intr)
const nir_intrinsic_info *info = &nir_intrinsic_infos[intr->intrinsic];
struct ir3_instruction **dst, **src;
struct ir3_block *b = ctx->block;
-   unsigned idx = intr->const_index[0];
+   unsigned idx = 0;
+
+   for (unsigned i = 0; i < 
nir_intrinsic_infos[intr->intrinsic].num_indices; i++)
+   idx += intr->const_index[i];
 
if (info->has_dest) {
dst = get_dst(ctx, &intr->dest, intr->num_components);
-- 
2.5.0



[Mesa-dev] [RFCv2 04/13] nir: add lowering pass for y-transform

2015-11-08 Thread Rob Clark
From: Rob Clark 

---
 src/glsl/Makefile.sources|   1 +
 src/glsl/nir/nir.h   |  12 ++
 src/glsl/nir/nir_lower_wpos_ytransform.c | 320 +++
 3 files changed, 333 insertions(+)
 create mode 100644 src/glsl/nir/nir_lower_wpos_ytransform.c

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index 9089918..045d3d2 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -58,6 +58,7 @@ NIR_FILES = \
nir/nir_lower_vars_to_ssa.c \
nir/nir_lower_var_copies.c \
nir/nir_lower_vec_to_movs.c \
+   nir/nir_lower_wpos_ytransform.c \
nir/nir_metadata.c \
nir/nir_move_vec_src_uses_to_dest.c \
nir/nir_normalize_cubemap_coords.c \
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index cad6f8a..4617322 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -2057,6 +2057,18 @@ void nir_lower_clip_fs(nir_shader *shader, unsigned 
ucp_enables);
 
 void nir_lower_two_sided_color(nir_shader *shader);
 
+
+typedef struct nir_lower_wpos_ytransform_options {
+   int state_tokens[5];
+   bool fs_coord_origin_upper_left :1;
+   bool fs_coord_origin_lower_left :1;
+   bool fs_coord_pixel_center_integer :1;
+   bool fs_coord_pixel_center_half_integer :1;
+} nir_lower_wpos_ytransform_options;
+
+bool nir_lower_wpos_ytransform(nir_shader *shader,
+   const nir_lower_wpos_ytransform_options 
*options);
+
 void nir_lower_atomics(nir_shader *shader,
const struct gl_shader_program *shader_program);
 void nir_lower_to_source_mods(nir_shader *shader);
diff --git a/src/glsl/nir/nir_lower_wpos_ytransform.c 
b/src/glsl/nir/nir_lower_wpos_ytransform.c
new file mode 100644
index 000..9009926
--- /dev/null
+++ b/src/glsl/nir/nir_lower_wpos_ytransform.c
@@ -0,0 +1,320 @@
+/*
+ * Copyright © 2015 Red Hat
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "nir.h"
+#include "nir_builder.h"
+
+/* Lower gl_FragCoord (and fddy) to account for driver's requested coordinate-
+ * origin and pixel-center vs. shader.  If transformation is required, a
+ * gl_FbWposYTransform uniform is inserted (with the specified state-slots)
+ * and additional instructions are inserted to transform gl_FragCoord (and
+ * fddy src arg).
+ *
+ * This is based on the logic in emit_wpos()/emit_wpos_adjustment() in TGSI
+ * compiler.
+ *
+ * Run before nir_lower_io.
+ */
+
+typedef struct {
+   const nir_lower_wpos_ytransform_options *options;
+   nir_shader   *shader;
+   nir_builder   b;
+   nir_variable *transform;
+} lower_wpos_ytransform_state;
+
+static nir_ssa_def *
+get_transform(lower_wpos_ytransform_state *state)
+{
+   if (state->transform == NULL) {
+  nir_variable *var = rzalloc(state->shader, nir_variable);
+
+  var->type = glsl_vec4_type();
+  var->data.mode = nir_var_uniform;
+  /* NOTE: name must be prefixed w/ "gl_" to trigger slot based
+   * special handling in uniform setup:
+   */
+  var->name = "gl_FbWposYTransform";
+
+  var->num_state_slots = 1;
+  var->state_slots = ralloc_array(var, nir_state_slot, 1);
+  memcpy(var->state_slots[0].tokens, state->options->state_tokens,
+ sizeof(var->state_slots[0].tokens));
+
+  exec_list_push_tail(&state->shader->uniforms, &var->node);
+
+  state->transform = var;
+   }
+   return nir_load_var(&state->b, state->transform);
+}
+
+/* NIR equiv of TGSI CMP instruction: */
+static nir_ssa_def *
+nir_cmp(nir_builder *b, nir_ssa_def *src0, nir_ssa_def *src1, nir_ssa_def 
*src2)
+{
+   return nir_bcsel(b, nir_flt(b, src0, nir_imm_float(b, 0.0)), src1, src2);
+}
+
+static nir_ssa_def *
+nir_scalar(nir_builder *b, nir_ssa_def *def, int c)
+{
+   unsigned swizzle[4] = {c, c, c, c};
+   return nir_swizzle(b, def, swizzle, 4, false);
+}
+
+/* see emit_wpos_adjustment() in 

[Mesa-dev] [RFCv2 07/13] freedreno/ir3: add support for NIR as preferred IR

2015-11-08 Thread Rob Clark
For now this is under a debug flag, since it is only suitable for debugging/testing.
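
The selection logic amounts to a single bit test. A standalone sketch, with hypothetical names that mirror the patch (`SKETCH_DBG_NIR` uses the same value as `FD_DBG_NIR`):

```c
#include <assert.h>

#define SKETCH_DBG_NIR 0x2000   /* same bit value as FD_DBG_NIR in the patch */

enum sketch_ir { SKETCH_IR_TGSI, SKETCH_IR_NIR };

/* NIR is only reported as the preferred IR when the debug bit is set
 * and the screen uses the ir3 compiler; everything else stays on TGSI.
 */
static enum sketch_ir
sketch_preferred_ir(unsigned debug_flags, int is_ir3)
{
   if ((debug_flags & SKETCH_DBG_NIR) && is_ir3)
      return SKETCH_IR_NIR;
   return SKETCH_IR_TGSI;
}
```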
---
 src/gallium/drivers/freedreno/freedreno_screen.c |  5 -
 src/gallium/drivers/freedreno/freedreno_util.h   |  1 +
 src/gallium/drivers/freedreno/ir3/ir3_shader.c   | 16 
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index 7ee1a3f..fad2e7d 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -71,6 +71,7 @@ static const struct debug_named_value debug_options[] = {
{"glsl120",   FD_DBG_GLSL120,"Temporary flag to force GLSL 1.20 
(rather than 1.30) on a3xx+"},
{"shaderdb",  FD_DBG_SHADERDB, "Enable shaderdb output"},
{"flush", FD_DBG_FLUSH,  "Force flush after every draw"},
+   {"nir",   FD_DBG_NIR,"Prefer NIR as native IR"},
DEBUG_NAMED_VALUE_END
 };
 
@@ -400,7 +401,7 @@ fd_screen_get_shader_param(struct pipe_screen *pscreen, 
unsigned shader,
case PIPE_SHADER_CAP_TGSI_DROUND_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
-case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
+   case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
return 0;
case PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED:
return 1;
@@ -412,6 +413,8 @@ fd_screen_get_shader_param(struct pipe_screen *pscreen, 
unsigned shader,
case PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS:
return 16;
case PIPE_SHADER_CAP_PREFERRED_IR:
+   if ((fd_mesa_debug & FD_DBG_NIR) && is_ir3(screen))
+   return PIPE_SHADER_IR_NIR;
return PIPE_SHADER_IR_TGSI;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
diff --git a/src/gallium/drivers/freedreno/freedreno_util.h 
b/src/gallium/drivers/freedreno/freedreno_util.h
index 0d2418e..56d3235 100644
--- a/src/gallium/drivers/freedreno/freedreno_util.h
+++ b/src/gallium/drivers/freedreno/freedreno_util.h
@@ -73,6 +73,7 @@ enum adreno_stencil_op fd_stencil_op(unsigned op);
 #define FD_DBG_GLSL120  0x0400
 #define FD_DBG_SHADERDB 0x0800
 #define FD_DBG_FLUSH0x1000
+#define FD_DBG_NIR  0x2000
 
 extern int fd_mesa_debug;
 extern bool fd_binning_enabled;
diff --git a/src/gallium/drivers/freedreno/ir3/ir3_shader.c 
b/src/gallium/drivers/freedreno/ir3/ir3_shader.c
index f800278..857fd4c 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_shader.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_shader.c
@@ -275,17 +275,25 @@ ir3_shader_create(struct pipe_context *pctx,
shader->id = ++shader->compiler->shader_count;
shader->pctx = pctx;
shader->type = type;
-   if (fd_mesa_debug & FD_DBG_DISASM) {
-   DBG("dump tgsi: type=%d", shader->type);
-   tgsi_dump(cso->tokens, 0);
+
+   nir_shader *nir;
+   if (cso->ir == PIPE_SHADER_IR_NIR) {
+   /* we take ownership of the reference: */
+   nir = cso->nir;
+   } else {
+   if (fd_mesa_debug & FD_DBG_DISASM) {
+   DBG("dump tgsi: type=%d", shader->type);
+   tgsi_dump(cso->tokens, 0);
+   }
+   nir = ir3_tgsi_to_nir(cso->tokens);
}
-   nir_shader *nir = ir3_tgsi_to_nir(cso->tokens);
/* do first pass optimization, ignoring the key: */
shader->nir = ir3_optimize_nir(shader, nir, NULL);
if (fd_mesa_debug & FD_DBG_DISASM) {
DBG("dump nir%d: type=%d", shader->id, shader->type);
nir_print_shader(shader->nir, stdout);
}
+
shader->stream_output = cso->stream_output;
if (fd_mesa_debug & FD_DBG_SHADERDB) {
/* if shader-db run, create a standard variant immediately
-- 
2.5.0



[Mesa-dev] [RFCv2 06/13] mesa/st: add support for NIR as possible driver IR

2015-11-08 Thread Rob Clark
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 410 -
 src/mesa/state_tracker/st_glsl_to_tgsi.h   |   5 +
 src/mesa/state_tracker/st_program.c| 118 +++--
 src/mesa/state_tracker/st_program.h|   6 +
 4 files changed, 520 insertions(+), 19 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index f481e89..fbc598e 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -35,6 +35,9 @@
 #include "glsl_parser_extras.h"
 #include "ir_optimization.h"
 
+#include "nir.h"
+#include "glsl_to_nir.h"
+
 #include "main/errors.h"
 #include "main/shaderobj.h"
 #include "main/uniforms.h"
@@ -5486,9 +5489,9 @@ out:
  * generating Mesa IR.
  */
 static struct gl_program *
-get_mesa_program(struct gl_context *ctx,
- struct gl_shader_program *shader_program,
- struct gl_shader *shader)
+get_mesa_program_tgsi(struct gl_context *ctx,
+  struct gl_shader_program *shader_program,
+  struct gl_shader *shader)
 {
glsl_to_tgsi_visitor* v;
struct gl_program *prog;
@@ -5680,6 +5683,396 @@ get_mesa_program(struct gl_context *ctx,
return prog;
 }
 
+/* TODO dup'd from brw_vec4_vistor.cpp..  what should we do? */
+static int
+type_size_vec4(const struct glsl_type *type)
+{
+   unsigned int i;
+   int size;
+
+   switch (type->base_type) {
+   case GLSL_TYPE_UINT:
+   case GLSL_TYPE_INT:
+   case GLSL_TYPE_FLOAT:
+   case GLSL_TYPE_BOOL:
+  if (type->is_matrix()) {
+return type->matrix_columns;
+  } else {
+/* Regardless of size of vector, it gets a vec4. This is bad
+ * packing for things like floats, but otherwise arrays become a
+ * mess.  Hopefully a later pass over the code can pack scalars
+ * down if appropriate.
+ */
+return 1;
+  }
+   case GLSL_TYPE_ARRAY:
+  assert(type->length > 0);
+  return type_size_vec4(type->fields.array) * type->length;
+   case GLSL_TYPE_STRUCT:
+  size = 0;
+  for (i = 0; i < type->length; i++) {
+size += type_size_vec4(type->fields.structure[i].type);
+  }
+  return size;
+   case GLSL_TYPE_SUBROUTINE:
+  return 1;
+
+   case GLSL_TYPE_SAMPLER:
+  /* Samplers take up no register space, since they're baked in at
+   * link time.
+   */
+  return 0;
+   case GLSL_TYPE_ATOMIC_UINT:
+  return 0;
+   case GLSL_TYPE_IMAGE:
+//  return DIV_ROUND_UP(BRW_IMAGE_PARAM_SIZE, 4);
+   case GLSL_TYPE_VOID:
+   case GLSL_TYPE_DOUBLE:
+   case GLSL_TYPE_ERROR:
+   case GLSL_TYPE_INTERFACE:
+  unreachable("not reached");
+   }
+
+   return 0;
+}
+
+/* Depending on PIPE_CAP_TGSI_TEXCOORD (st->needs_texcoord_semantic) we
+ * may need to fix up varying slots so the glsl->nir path is aligned
+ * with the anything->tgsi->nir path.
+ */
+static void
+st_nir_fixup_varying_slots(struct st_context *st, struct exec_list *var_list)
+{
+   if (st->needs_texcoord_semantic)
+  return;
+
+   nir_foreach_variable(var, var_list) {
+  if (var->data.location >= VARYING_SLOT_VAR0) {
+ var->data.location += 9;
+  } else if ((var->data.location >= VARYING_SLOT_TEX0) &&
+   (var->data.location <= VARYING_SLOT_TEX7)) {
+ var->data.location += VARYING_SLOT_VAR0 - VARYING_SLOT_TEX0;
+  }
+   }
+}
+
+/* input location assignment for VS inputs must be handled specially, so
+ * that it is aligned w/ st's vbo state.
+ * (This isn't the case with, for ex, FS inputs, which only need to agree
+ * on varying-slot w/ the VS outputs)
+ */
+static void
+st_nir_assign_vs_in_locations(struct gl_program *prog,
+  struct exec_list *var_list, unsigned *size)
+{
+   unsigned attr, num_inputs = 0;
+   unsigned input_to_index[VERT_ATTRIB_MAX] = {0};
+
+   /* TODO de-duplicate w/ similar code in st_translate_vertex_program()? */
+   for (attr = 0; attr < VERT_ATTRIB_MAX; attr++) {
+  if ((prog->InputsRead & BITFIELD64_BIT(attr)) != 0) {
+ input_to_index[attr] = num_inputs;
+ num_inputs++;
+ if ((prog->DoubleInputsRead & BITFIELD64_BIT(attr)) != 0) {
+/* add placeholder for second part of a double attribute */
+num_inputs++;
+ }
+  }
+   }
+
+   *size = 0;
+   nir_foreach_variable(var, var_list) {
+  attr = var->data.location;
+  assert(attr < ARRAY_SIZE(input_to_index));
+  var->data.driver_location = input_to_index[attr];
+  (*size)++;
+   }
+}
+
+static void
+st_nir_assign_uniform_locations(struct gl_program *prog,
+struct exec_list *uniform_list, unsigned *size)
+{
+   int max = 0;
+   int shaderidx = 0;
+
+   nir_foreach_variable(uniform, uniform_list) {
+  int loc;
+
+  if (uniform->type->is_sampler()) {
+ loc = shaderidx++;
+ uniform->data.location = loc; /* this should match 

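The attribute loop in st_nir_assign_vs_in_locations() compacts the set of read vertex attributes into consecutive driver locations, reserving an extra slot after each double attribute. A standalone sketch of just that compaction, assuming a hypothetical `TOY_ATTRIB_MAX` and plain 64-bit masks in place of the gl_program fields:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ATTRIB_MAX 32   /* hypothetical limit, not Mesa's VERT_ATTRIB_MAX */

/* Maps each attribute set in inputs_read to a compact index and returns
 * the number of slots consumed.  Attributes also set in
 * double_inputs_read take two slots.  Unread attributes map to 0. */
static unsigned
toy_compact_inputs(uint64_t inputs_read, uint64_t double_inputs_read,
                   unsigned input_to_index[TOY_ATTRIB_MAX])
{
    unsigned attr, num_inputs = 0;

    for (attr = 0; attr < TOY_ATTRIB_MAX; attr++) {
        input_to_index[attr] = 0;
        if (inputs_read & (UINT64_C(1) << attr)) {
            input_to_index[attr] = num_inputs++;
            if (double_inputs_read & (UINT64_C(1) << attr))
                num_inputs++;   /* placeholder for second half of a double */
        }
    }
    return num_inputs;
}
```

With attributes 0, 3 and 5 read and attribute 3 being a double, the indices come out as 0, 1 and 3, and four slots are used in total.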
[Mesa-dev] [RFCv2 10/13] freedreno/ir3: support load_front_face intrinsic

2015-11-08 Thread Rob Clark
From: Rob Clark 

With tgsi_to_nir we get this as a normal input with VARYING_SLOT_FACE.
But with glsl_to_nir plus nir_lower_system_values this becomes an intrinsic.
---
 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index 6e77244..2320464 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -713,6 +713,10 @@ create_frag_coord(struct ir3_compile *ctx, unsigned comp)
}
 }
 
+/* NOTE: this creates the "TGSI" style fragface (ie. input slot
+ * VARYING_SLOT_FACE).  For NIR style nir_intrinsic_load_front_face
+ * we can just use the value from hw directly (since it is boolean)
+ */
 static struct ir3_instruction *
 create_frag_face(struct ir3_compile *ctx, unsigned comp)
 {
@@ -1377,7 +1381,7 @@ emit_intrinisic(struct ir3_compile *ctx, 
nir_intrinsic_instr *intr)
break;
case nir_intrinsic_load_vertex_id_zero_base:
if (!ctx->vertex_id) {
-   ctx->vertex_id = create_input(ctx->block, 0);
+   ctx->vertex_id = create_input(b, 0);
add_sysval_input(ctx, SYSTEM_VALUE_VERTEX_ID_ZERO_BASE,
ctx->vertex_id);
}
@@ -1385,7 +1389,7 @@ emit_intrinisic(struct ir3_compile *ctx, 
nir_intrinsic_instr *intr)
break;
case nir_intrinsic_load_instance_id:
if (!ctx->instance_id) {
-   ctx->instance_id = create_input(ctx->block, 0);
+   ctx->instance_id = create_input(b, 0);
add_sysval_input(ctx, SYSTEM_VALUE_INSTANCE_ID,
ctx->instance_id);
}
@@ -1397,6 +1401,14 @@ emit_intrinisic(struct ir3_compile *ctx, 
nir_intrinsic_instr *intr)
dst[i] = create_driver_param(ctx, IR3_DP_UCP0_X + n);
}
break;
+   case nir_intrinsic_load_front_face:
+   if (!ctx->frag_face) {
+   ctx->so->frag_face = true;
+   ctx->frag_face = create_input(b, 0);
+   ctx->frag_face->regs[0]->flags |= IR3_REG_HALF;
+   }
+   dst[0] = ir3_ADD_S(b, ctx->frag_face, 0, create_immed(b, 1), 0);
+   break;
case nir_intrinsic_discard_if:
case nir_intrinsic_discard: {
struct ir3_instruction *cond, *kill;
-- 
2.5.0


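The new load_front_face case adds an immediate 1 to the raw face input. A small sketch of that arithmetic, under the assumption (not stated in this patch) that the hardware value is 0 for front-facing and -1 for back-facing fragments; `toy_front_face` is a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: hw_face is 0 for front-facing, -1 for back-facing.
 * Adding 1 then yields 1 (true) for front and 0 (false) for back,
 * mirroring the add.s with immediate 1 in the hunk above. */
static int32_t toy_front_face(int32_t hw_face)
{
    assert(hw_face == 0 || hw_face == -1);
    return hw_face + 1;
}
```

If the hardware encoding differs from this assumption, the fixup would need to change accordingly.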

[Mesa-dev] [RFCv2 02/13] gallium: add NIR as a possible IR

2015-11-08 Thread Rob Clark
---
 src/gallium/include/pipe/p_defines.h | 1 +
 src/gallium/include/pipe/p_state.h   | 7 +++
 2 files changed, 8 insertions(+)

diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index 0a9d98d..572461f 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -718,6 +718,7 @@ enum pipe_shader_ir
PIPE_SHADER_IR_TGSI = 0,
PIPE_SHADER_IR_LLVM,
PIPE_SHADER_IR_NATIVE,
+   PIPE_SHADER_IR_NIR,
 };
 
 /**
diff --git a/src/gallium/include/pipe/p_state.h 
b/src/gallium/include/pipe/p_state.h
index f1c4b49..7eee709 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -221,6 +221,12 @@ struct pipe_stream_output_info
  *
  * TODO pipe_compute_state should probably get similar treatment to handle
  * multiple IR's in a cleaner way..
+ *
+ * NOTE: since the nir_shader is reference counted, the semantics are a bit
+ * different from create_xyz_state(ir=TGSI).  The driver takes ownership of
+ * the nir_shader (and must nir_shader_unref()) at some point.  If state
+ * trackers need to hang on to the IR (for example, variant management), it
+ * should increment the refcnt before calling create_xyz_shader(ir=NIR).
  */
 struct pipe_shader_state
 {
@@ -230,6 +236,7 @@ struct pipe_shader_state
   const struct tgsi_token *tokens;
   void *llvm;
   void *native;
+  void *nir;
};
struct pipe_stream_output_info stream_output;
 };
-- 
2.5.0


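The ownership note in the p_state.h comment can be illustrated with a toy reference-counted object. This is only a sketch of the described contract, with hypothetical names (`toy_shader`, `toy_driver_create_state`); it does not use the real nir_shader API:

```c
#include <assert.h>
#include <stdlib.h>

struct toy_shader { int refcnt; };

static struct toy_shader *toy_shader_create(void)
{
    struct toy_shader *s = calloc(1, sizeof(*s));
    if (s)
        s->refcnt = 1;
    return s;
}

static void toy_shader_ref(struct toy_shader *s)
{
    s->refcnt++;
}

/* Drops one reference; returns 1 if the object was freed. */
static int toy_shader_unref(struct toy_shader *s)
{
    assert(s->refcnt > 0);
    if (--s->refcnt == 0) {
        free(s);
        return 1;
    }
    return 0;
}

/* "Driver" side: takes ownership of one reference and drops it when it
 * no longer needs the IR.  Returns 1 if that freed the shader. */
static int toy_driver_create_state(struct toy_shader *s)
{
    /* ... compile from s here ... */
    return toy_shader_unref(s);
}
```

A state tracker that wants to keep the IR (for variant management, say) calls toy_shader_ref() before handing the shader over; otherwise the driver's unref would free it.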

[Mesa-dev] [RFCv2 00/13] gallium: add support for NIR as alternate IR

2015-11-08 Thread Rob Clark
From: Rob Clark 

Things have progressed somewhat since the initial RFC, to the point
that all sorts of common things are working (glmark2, xonotic, stk,
etc), and piglit is *mostly* working (~330 regressions or so)..

(This is with both VS and FS converted, fwiw, compared to initial RFC
which was only using glsl_to_nir for VS.)

Some of the remaining piglit regressions might be bugs in ir3.. still
tracking down some assumptions about everything being vec4's (which is
true in the glsl->tgsi->nir path but not in the glsl->nir path).

Still some cleanup needed, now that I think I have a reasonable grasp
on how things should work.  But I think not too early to start getting
some comments.

I still need to ditch the anon union in pipe_shader_state, but was
planning to leave flag-day rename-everything changes until closer to
being ready to merge to avoid getting bogged down in rebase conflicts.

This is based on top of nir_clone, nir_shader refcnt'ing and a few
other in-flight patches which are not part of this patchset.  For the
complete branch see:

https://github.com/freedreno/mesa/commits/wip-gallium-skip-tgsi

Rob Clark (13):
  gallium: refactor pipe_shader_state to support multiple IR's
  gallium: add NIR as a possible IR
  nir: allow pre-resolved sampler uniform locations
  nir: add lowering pass for y-transform
  gallium/auxiliary: introduce nir_emulate
  mesa/st: add support for NIR as possible driver IR
  freedreno/ir3: add support for NIR as preferred IR
  freedreno/ir3: fix const_index handling for uniforms
  freedreno/ir3: handle large inputs/outputs
  freedreno/ir3: support load_front_face intrinsic
  freedreno/ir3: handle tex instrs w/ const offset
  freedreno/ir3: don't ignore local vars
  HACK: freedreno/a4xx: workaround glsl_to_nir hang..

 src/gallium/auxiliary/Makefile.sources |   2 +
 src/gallium/auxiliary/hud/hud_context.c|  14 +-
 src/gallium/auxiliary/nir/nir_emulate.c| 139 +++
 src/gallium/auxiliary/nir/nir_emulate.h|  34 ++
 src/gallium/auxiliary/postprocess/pp_run.c |   4 +-
 src/gallium/auxiliary/tgsi/tgsi_ureg.c |   6 +-
 src/gallium/auxiliary/util/u_simple_shaders.c  |  42 ++-
 src/gallium/auxiliary/util/u_tests.c   |   7 +-
 src/gallium/drivers/freedreno/a4xx/fd4_program.c   |   4 +-
 src/gallium/drivers/freedreno/freedreno_screen.c   |   5 +-
 src/gallium/drivers/freedreno/freedreno_util.h |   1 +
 .../drivers/freedreno/ir3/ir3_compiler_nir.c   | 110 --
 src/gallium/drivers/freedreno/ir3/ir3_shader.c |  16 +-
 src/gallium/include/pipe/p_defines.h   |  13 +-
 src/gallium/include/pipe/p_state.h |  27 +-
 src/glsl/Makefile.sources  |   1 +
 src/glsl/nir/nir.h |  12 +
 src/glsl/nir/nir_lower_samplers.c  |  23 +-
 src/glsl/nir/nir_lower_wpos_ytransform.c   | 320 
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 410 -
 src/mesa/state_tracker/st_glsl_to_tgsi.h   |   5 +
 src/mesa/state_tracker/st_program.c| 118 +-
 src/mesa/state_tracker/st_program.h|   6 +
 23 files changed, 1247 insertions(+), 72 deletions(-)
 create mode 100644 src/gallium/auxiliary/nir/nir_emulate.c
 create mode 100644 src/gallium/auxiliary/nir/nir_emulate.h
 create mode 100644 src/glsl/nir/nir_lower_wpos_ytransform.c

-- 
2.5.0



[Mesa-dev] [RFCv2 11/13] freedreno/ir3: handle tex instrs w/ const offset

2015-11-08 Thread Rob Clark
From: Rob Clark 

---
 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index 2320464..e77afcc 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -1514,6 +1514,7 @@ emit_tex(struct ir3_compile *ctx, nir_tex_instr *tex)
struct ir3_block *b = ctx->block;
struct ir3_instruction **dst, *sam, *src0[12], *src1[4];
struct ir3_instruction **coord, *lod, *compare, *proj, **off, **ddx, 
**ddy;
+   struct ir3_instruction *const_off[4];
bool has_bias = false, has_lod = false, has_proj = false, has_off = 
false;
unsigned i, coords, flags;
unsigned nsrc0 = 0, nsrc1 = 0;
@@ -1581,6 +1582,21 @@ emit_tex(struct ir3_compile *ctx, nir_tex_instr *tex)
 
tex_info(tex, &flags, &coords);
 
+   if (!has_off) {
+   /* could still have a constant offset: */
+   if (tex->const_offset[0] || tex->const_offset[1] ||
+   tex->const_offset[2] || tex->const_offset[3]) {
+   off = const_off;
+
+   off[0] = create_immed(b, tex->const_offset[0]);
+   off[1] = create_immed(b, tex->const_offset[1]);
+   off[2] = create_immed(b, tex->const_offset[2]);
+   off[3] = create_immed(b, tex->const_offset[3]);
+
+   has_off = true;
+   }
+   }
+
/* scale up integer coords for TXF based on the LOD */
if (ctx->unminify_coords && (opc == OPC_ISAML)) {
assert(has_lod);
-- 
2.5.0


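The const-offset handling above amounts to: if no explicit offset source was supplied, but any component of the instruction's constant offset is non-zero, materialize the constants as the offset. A minimal sketch with plain integers standing in for IR instructions (`toy_resolve_offset` is a hypothetical helper):

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if the texture op ends up with an offset.  When has_off is
 * already set, off[] is assumed to have been filled by the caller. */
static bool toy_resolve_offset(bool has_off, const int const_offset[4],
                               int off[4])
{
    if (has_off)
        return true;

    /* could still have a constant offset: */
    if (const_offset[0] || const_offset[1] ||
        const_offset[2] || const_offset[3]) {
        for (int i = 0; i < 4; i++)
            off[i] = const_offset[i];
        return true;
    }
    return false;
}
```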

[Mesa-dev] [PATCH 6/7] radeonsi: prevent recursion in si_context_gfx_flush

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

The recursion can only occur if you modify need_cs_space to always flush.
---
 src/gallium/drivers/radeonsi/si_hw_context.c | 7 +++
 src/gallium/drivers/radeonsi/si_pipe.h   | 1 +
 2 files changed, 8 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_hw_context.c 
b/src/gallium/drivers/radeonsi/si_hw_context.c
index 8eade11..e5f1c84 100644
--- a/src/gallium/drivers/radeonsi/si_hw_context.c
+++ b/src/gallium/drivers/radeonsi/si_hw_context.c
@@ -64,12 +64,18 @@ void si_context_gfx_flush(void *context, unsigned flags,
struct radeon_winsys_cs *cs = ctx->b.rings.gfx.cs;
struct radeon_winsys *ws = ctx->b.ws;
 
+   if (ctx->gfx_flush_in_progress)
+   return;
+
+   ctx->gfx_flush_in_progress = true;
+
if (cs->cdw == ctx->b.initial_gfx_cs_size &&
(!fence || ctx->last_gfx_fence)) {
if (fence)
ws->fence_reference(fence, ctx->last_gfx_fence);
if (!(flags & RADEON_FLUSH_ASYNC))
ws->cs_sync_flush(cs);
+   ctx->gfx_flush_in_progress = false;
return;
}
 
@@ -123,6 +129,7 @@ void si_context_gfx_flush(void *context, unsigned flags,
si_check_vm_faults(ctx);
 
si_begin_new_cs(ctx);
+   ctx->gfx_flush_in_progress = false;
 }
 
 void si_begin_new_cs(struct si_context *ctx)
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 20fd695..6e742fc 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -173,6 +173,7 @@ struct si_context {
struct pipe_fence_handle *last_gfx_fence;
struct si_shader_ctx_state  fixed_func_tcs_shader;
LLVMTargetMachineRef tm;
+   bool gfx_flush_in_progress;
 
/* Atoms (direct states). */
union si_state_atoms atoms;
-- 
2.1.4


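The guard added to si_context_gfx_flush is a plain re-entrancy flag. A self-contained sketch of the pattern, using a hypothetical `toy_ctx` and a worst-case toy_need_space() that always flushes (the situation the commit message describes):

```c
#include <assert.h>
#include <stdbool.h>

struct toy_ctx {
    bool flush_in_progress;
    unsigned flushes;       /* number of flushes that actually ran */
};

static void toy_flush(struct toy_ctx *ctx);

/* Worst case: every request for command-stream space triggers a flush. */
static void toy_need_space(struct toy_ctx *ctx)
{
    toy_flush(ctx);
}

static void toy_flush(struct toy_ctx *ctx)
{
    if (ctx->flush_in_progress)
        return;             /* nested call: bail out instead of recursing */

    ctx->flush_in_progress = true;

    toy_need_space(ctx);    /* emitting flush packets may ask for space */
    ctx->flushes++;

    ctx->flush_in_progress = false;
}
```

Note that every exit path after the flag is set has to clear it again, which is why the patch also resets it on the early-return branch.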

[Mesa-dev] [PATCH 7/7] gallium/radeon: inline the r600_rings structure

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/r600/evergreen_compute.c| 14 ++---
 src/gallium/drivers/r600/evergreen_hw_context.c | 10 ++--
 src/gallium/drivers/r600/evergreen_state.c  | 66 
 src/gallium/drivers/r600/r600_blit.c|  2 +-
 src/gallium/drivers/r600/r600_hw_context.c  | 34 ++---
 src/gallium/drivers/r600/r600_pipe.c| 10 ++--
 src/gallium/drivers/r600/r600_state.c   | 68 -
 src/gallium/drivers/r600/r600_state_common.c| 36 ++---
 src/gallium/drivers/radeon/r600_buffer_common.c | 32 ++--
 src/gallium/drivers/radeon/r600_pipe_common.c   | 34 ++---
 src/gallium/drivers/radeon/r600_pipe_common.h   |  8 +--
 src/gallium/drivers/radeon/r600_query.c | 16 +++---
 src/gallium/drivers/radeon/r600_streamout.c | 18 +++
 src/gallium/drivers/radeonsi/cik_sdma.c | 14 ++---
 src/gallium/drivers/radeonsi/si_compute.c   | 12 ++---
 src/gallium/drivers/radeonsi/si_cp_dma.c| 10 ++--
 src/gallium/drivers/radeonsi/si_descriptors.c   | 38 +++---
 src/gallium/drivers/radeonsi/si_dma.c   | 14 ++---
 src/gallium/drivers/radeonsi/si_hw_context.c| 16 +++---
 src/gallium/drivers/radeonsi/si_pipe.c  |  8 +--
 src/gallium/drivers/radeonsi/si_pm4.c   |  6 +--
 src/gallium/drivers/radeonsi/si_state.c | 34 ++---
 src/gallium/drivers/radeonsi/si_state_draw.c| 24 -
 src/gallium/drivers/radeonsi/si_state_shaders.c |  4 +-
 24 files changed, 262 insertions(+), 266 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
b/src/gallium/drivers/r600/evergreen_compute.c
index 6f2b7ba..5743e3f 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -346,7 +346,7 @@ static void evergreen_emit_direct_dispatch(
const uint *block_layout, const uint *grid_layout)
 {
int i;
-   struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
+   struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
struct r600_pipe_compute *shader = rctx->cs_shader_state.shader;
unsigned num_waves;
unsigned num_pipes = rctx->screen->b.info.r600_max_pipes;
@@ -417,12 +417,12 @@ static void evergreen_emit_direct_dispatch(
 static void compute_emit_cs(struct r600_context *ctx, const uint *block_layout,
const uint *grid_layout)
 {
-   struct radeon_winsys_cs *cs = ctx->b.rings.gfx.cs;
+   struct radeon_winsys_cs *cs = ctx->b.gfx.cs;
unsigned i;
 
/* make sure that the gfx ring is only one active */
-   if (ctx->b.rings.dma.cs && ctx->b.rings.dma.cs->cdw) {
-   ctx->b.rings.dma.flush(ctx, RADEON_FLUSH_ASYNC, NULL);
+   if (ctx->b.dma.cs && ctx->b.dma.cs->cdw) {
+   ctx->b.dma.flush(ctx, RADEON_FLUSH_ASYNC, NULL);
}
 
/* Initialize all the compute-related registers.
@@ -439,7 +439,7 @@ static void compute_emit_cs(struct r600_context *ctx, const 
uint *block_layout,
/* XXX support more than 8 colorbuffers (the offsets are not a multiple 
of 0x3C for CB8-11) */
for (i = 0; i < 8 && i < ctx->framebuffer.state.nr_cbufs; i++) {
struct r600_surface *cb = (struct 
r600_surface*)ctx->framebuffer.state.cbufs[i];
-   unsigned reloc = radeon_add_to_buffer_list(&ctx->b, &ctx->b.rings.gfx,
+   unsigned reloc = radeon_add_to_buffer_list(&ctx->b, &ctx->b.gfx,
   (struct 
r600_resource*)cb->base.texture,
   RADEON_USAGE_READWRITE,
   
RADEON_PRIO_SHADER_RW_BUFFER);
@@ -538,7 +538,7 @@ void evergreen_emit_cs_shader(
struct r600_cs_shader_state *state =
(struct r600_cs_shader_state*)atom;
struct r600_pipe_compute *shader = state->shader;
-   struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
+   struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
uint64_t va;
struct r600_resource *code_bo;
unsigned ngpr, nstack;
@@ -564,7 +564,7 @@ void evergreen_emit_cs_shader(
radeon_emit(cs, 0); /* R_0288D8_SQ_PGM_RESOURCES_LS_2 */
 
radeon_emit(cs, PKT3C(PKT3_NOP, 0, 0));
-   radeon_emit(cs, radeon_add_to_buffer_list(&rctx->b, &rctx->b.rings.gfx,
+   radeon_emit(cs, radeon_add_to_buffer_list(&rctx->b, &rctx->b.gfx,
  code_bo, RADEON_USAGE_READ,
  RADEON_PRIO_USER_SHADER));
 }
diff --git a/src/gallium/drivers/r600/evergreen_hw_context.c 
b/src/gallium/drivers/r600/evergreen_hw_context.c
index 89abe92..a0f4680 100644
--- a/src/gallium/drivers/r600/evergreen_hw_context.c
+++ b/src/gallium/drivers/r600/evergreen_hw_context.c
@@ -35,7 +35,7 @@ void evergreen_dma_copy_buffer(struct r600_context *rctx,
  

[Mesa-dev] [PATCH 3/7] radeonsi: rename cache flushing flags once more

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

KCACHE, TC L1 and TC L2 are renamed to:
- SMEM L1
- VMEM L1
- GLOBAL L2

You can easily tell what they are used for now.
Shaders must deal with coherency issues between both L1s manually,
e.g. by setting GLC=1 or by using s_dcache_*.

BOTH_ICACHE_KCACHE was an unused definition.
---
 src/gallium/drivers/radeonsi/si_compute.c | 12 ++--
 src/gallium/drivers/radeonsi/si_cp_dma.c  |  6 +++---
 src/gallium/drivers/radeonsi/si_descriptors.c |  4 ++--
 src/gallium/drivers/radeonsi/si_hw_context.c  | 10 +-
 src/gallium/drivers/radeonsi/si_pipe.h| 15 ++-
 src/gallium/drivers/radeonsi/si_state.c   |  8 
 src/gallium/drivers/radeonsi/si_state_draw.c  | 10 --
 7 files changed, 30 insertions(+), 35 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 697e60a..c008f8b 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -253,10 +253,10 @@ static void si_launch_grid(
radeon_emit(cs, 0x8000);
radeon_emit(cs, 0x8000);
 
-   sctx->b.flags |= SI_CONTEXT_INV_TC_L1 |
-SI_CONTEXT_INV_TC_L2 |
+   sctx->b.flags |= SI_CONTEXT_INV_VMEM_L1 |
+SI_CONTEXT_INV_GLOBAL_L2 |
 SI_CONTEXT_INV_ICACHE |
-SI_CONTEXT_INV_KCACHE |
+SI_CONTEXT_INV_SMEM_L1 |
 SI_CONTEXT_FLUSH_WITH_INV_L2 |
 SI_CONTEXT_FLAG_COMPUTE;
si_emit_cache_flush(sctx, NULL);
@@ -449,10 +449,10 @@ static void si_launch_grid(
si_pm4_free_state(sctx, pm4, ~0);
 
sctx->b.flags |= SI_CONTEXT_CS_PARTIAL_FLUSH |
-SI_CONTEXT_INV_TC_L1 |
-SI_CONTEXT_INV_TC_L2 |
+SI_CONTEXT_INV_VMEM_L1 |
+SI_CONTEXT_INV_GLOBAL_L2 |
 SI_CONTEXT_INV_ICACHE |
-SI_CONTEXT_INV_KCACHE |
+SI_CONTEXT_INV_SMEM_L1 |
 SI_CONTEXT_FLAG_COMPUTE;
si_emit_cache_flush(sctx, NULL);
 }
diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c 
b/src/gallium/drivers/radeonsi/si_cp_dma.c
index 55d423a..b547b89 100644
--- a/src/gallium/drivers/radeonsi/si_cp_dma.c
+++ b/src/gallium/drivers/radeonsi/si_cp_dma.c
@@ -112,9 +112,9 @@ static unsigned get_flush_flags(struct si_context *sctx, 
bool is_framebuffer)
if (is_framebuffer)
return SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER;
 
-   return SI_CONTEXT_INV_TC_L1 |
-  (sctx->b.chip_class == SI ? SI_CONTEXT_INV_TC_L2 : 0) |
-  SI_CONTEXT_INV_KCACHE;
+   return SI_CONTEXT_INV_SMEM_L1 |
+  SI_CONTEXT_INV_VMEM_L1 |
+  (sctx->b.chip_class == SI ? SI_CONTEXT_INV_GLOBAL_L2 : 0);
 }
 
 static unsigned get_tc_l2_flag(struct si_context *sctx, bool is_framebuffer)
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index a8ff6f2..b4dc3cb 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -670,8 +670,8 @@ static void si_set_streamout_targets(struct pipe_context 
*ctx,
 * VS_PARTIAL_FLUSH is required if the buffers are going to be
 * used as an input immediately.
 */
-   sctx->b.flags |= SI_CONTEXT_INV_KCACHE |
-SI_CONTEXT_INV_TC_L1 |
+   sctx->b.flags |= SI_CONTEXT_INV_SMEM_L1 |
+SI_CONTEXT_INV_VMEM_L1 |
 SI_CONTEXT_VS_PARTIAL_FLUSH;
}
 
diff --git a/src/gallium/drivers/radeonsi/si_hw_context.c 
b/src/gallium/drivers/radeonsi/si_hw_context.c
index 7c147e2..9b8bdf5 100644
--- a/src/gallium/drivers/radeonsi/si_hw_context.c
+++ b/src/gallium/drivers/radeonsi/si_hw_context.c
@@ -73,8 +73,8 @@ void si_context_gfx_flush(void *context, unsigned flags,
r600_preflush_suspend_features(&ctx->b);
 
ctx->b.flags |= SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER |
-   SI_CONTEXT_INV_TC_L1 |
-   SI_CONTEXT_INV_TC_L2 |
+   SI_CONTEXT_INV_VMEM_L1 |
+   SI_CONTEXT_INV_GLOBAL_L2 |
/* this is probably not needed anymore */
SI_CONTEXT_PS_PARTIAL_FLUSH;
si_emit_cache_flush(ctx, NULL);
@@ -144,9 +144,9 @@ void si_begin_new_cs(struct si_context *ctx)
 
/* Flush read caches at the beginning of CS. */
ctx->b.flags |= SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER |
-   SI_CONTEXT_INV_TC_L1 |
-   SI_CONTEXT_INV_TC_L2 |
-   SI_CONTEXT_INV_KCACHE |
+   SI_CONTEXT_INV_VMEM_L1 |
+   

[Mesa-dev] [PATCH 5/7] gallium/radeon: remove the IB flushing flag

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

Not needed anymore. A similar flag will be introduced in the next commit,
which will be private in radeonsi.
---
 src/gallium/drivers/r600/r600_hw_context.c| 3 ---
 src/gallium/drivers/radeon/r600_pipe_common.c | 9 ++---
 src/gallium/drivers/radeon/r600_pipe_common.h | 1 -
 src/gallium/drivers/radeonsi/si_hw_context.c  | 3 ---
 4 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_hw_context.c 
b/src/gallium/drivers/r600/r600_hw_context.c
index cf8a07f..1cffc34 100644
--- a/src/gallium/drivers/r600/r600_hw_context.c
+++ b/src/gallium/drivers/r600/r600_hw_context.c
@@ -256,8 +256,6 @@ void r600_context_gfx_flush(void *context, unsigned flags,
if (cs->cdw == ctx->b.initial_gfx_cs_size && !fence)
return;
 
-   ctx->b.rings.gfx.flushing = true;
-
r600_preflush_suspend_features(&ctx->b);
 
/* flush the framebuffer cache */
@@ -283,7 +281,6 @@ void r600_context_gfx_flush(void *context, unsigned flags,
 
/* Flush the CS. */
ctx->b.ws->cs_flush(cs, flags, fence, ctx->screen->b.cs_count++);
-   ctx->b.rings.gfx.flushing = false;
 
r600_begin_new_cs(ctx);
 }
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index e7179dc..daa325d 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -221,13 +221,8 @@ static void r600_flush_dma_ring(void *ctx, unsigned flags,
struct r600_common_context *rctx = (struct r600_common_context *)ctx;
struct radeon_winsys_cs *cs = rctx->rings.dma.cs;
 
-   if (!cs->cdw)
-   goto done;
-
-   rctx->rings.dma.flushing = true;
-   rctx->ws->cs_flush(cs, flags, &rctx->last_sdma_fence, 0);
-   rctx->rings.dma.flushing = false;
-done:
+   if (cs->cdw)
+   rctx->ws->cs_flush(cs, flags, &rctx->last_sdma_fence, 0);
if (fence)
rctx->ws->fence_reference(fence, rctx->last_sdma_fence);
 }
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index b7f1a23..9fae5c8 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -365,7 +365,6 @@ struct r600_streamout {
 
 struct r600_ring {
struct radeon_winsys_cs *cs;
-   bool flushing;
void (*flush)(void *ctx, unsigned flags,
  struct pipe_fence_handle **fence);
 };
diff --git a/src/gallium/drivers/radeonsi/si_hw_context.c 
b/src/gallium/drivers/radeonsi/si_hw_context.c
index 7d0e6d4..8eade11 100644
--- a/src/gallium/drivers/radeonsi/si_hw_context.c
+++ b/src/gallium/drivers/radeonsi/si_hw_context.c
@@ -73,8 +73,6 @@ void si_context_gfx_flush(void *context, unsigned flags,
return;
}
 
-   ctx->b.rings.gfx.flushing = true;
-
r600_preflush_suspend_features(&ctx->b);
 
ctx->b.flags |= SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER |
@@ -116,7 +114,6 @@ void si_context_gfx_flush(void *context, unsigned flags,
/* Flush the CS. */
ws->cs_flush(cs, flags, &ctx->last_gfx_fence,
 ctx->screen->b.cs_count++);
-   ctx->b.rings.gfx.flushing = false;
 
if (fence)
ws->fence_reference(fence, ctx->last_gfx_fence);
-- 
2.1.4



[Mesa-dev] [PATCH 1/7] radeonsi: initialize SX_PS_DOWNCONVERT to 0 on Stoney

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

otherwise the SX or CB blocks can go bananas
---
 src/gallium/drivers/radeonsi/si_state.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index eba9c61..6d97049 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3444,6 +3444,9 @@ static void si_init_config(struct si_context *sctx)
si_pm4_set_reg(pm4, R_028C5C_VGT_OUT_DEALLOC_CNTL, 32);
}
 
+   if (sctx->b.family == CHIP_STONEY)
+   si_pm4_set_reg(pm4, R_028754_SX_PS_DOWNCONVERT, 0);
+
si_pm4_set_reg(pm4, R_028080_TA_BC_BASE_ADDR, border_color_va >> 8);
if (sctx->b.chip_class >= CIK)
si_pm4_set_reg(pm4, R_028084_TA_BC_BASE_ADDR_HI, 
border_color_va >> 40);
-- 
2.1.4



[Mesa-dev] [PATCH 3/6] radeonsi: fix a future crash in emit_cb_target_mask

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

This can't crash currently, but it would crash if clear_buffer
from u_blitter were used with a clean context.
---
 src/gallium/drivers/radeonsi/si_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 18b6405..eba9c61 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -265,7 +265,7 @@ static void si_emit_cb_target_mask(struct si_context *sctx, 
struct r600_atom *at
 *
 * Reproducible with Unigine Heaven 4.0 and drirc missing.
 */
-   if (blend->dual_src_blend &&
+   if (blend && blend->dual_src_blend &&
sctx->ps_shader.cso &&
(sctx->ps_shader.cso->ps_colors_written & 0x3) != 0x3)
mask = 0;
-- 
2.1.4



[Mesa-dev] [PATCH 1/6] r600g: fix clear_buffer fallback with offset != 0

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

Discovered by luck. This code path hasn't been exercised since transform
feedback was implemented.
---
 src/gallium/drivers/r600/r600_blit.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/r600/r600_blit.c 
b/src/gallium/drivers/r600/r600_blit.c
index aede840..90a1453 100644
--- a/src/gallium/drivers/r600/r600_blit.c
+++ b/src/gallium/drivers/r600/r600_blit.c
@@ -604,6 +604,7 @@ static void r600_clear_buffer(struct pipe_context *ctx, 
struct pipe_resource *ds
} else {
uint32_t *map = r600_buffer_map_sync_with_rings(&rctx->b, r600_resource(dst),
PIPE_TRANSFER_WRITE);
+   map += offset / 4;
size /= 4;
for (unsigned i = 0; i < size; i++)
*map++ = value;
-- 
2.1.4


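The one-line fix converts the byte offset into a dword index before the fill loop runs. A standalone sketch of the corrected fallback, assuming (as the CPU path appears to) that offset and size are multiples of 4; `toy_clear_dwords` is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fills size bytes starting at byte offset `offset` with a 32-bit value. */
static void toy_clear_dwords(uint32_t *map, size_t offset, size_t size,
                             uint32_t value)
{
    assert(offset % 4 == 0 && size % 4 == 0);

    map += offset / 4;      /* byte offset -> dword index (the added step) */
    size /= 4;
    for (size_t i = 0; i < size; i++)
        *map++ = value;
}
```

Without the pointer adjustment the fill always starts at the beginning of the mapping, so any clear with a non-zero offset overwrites the wrong range.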

[Mesa-dev] [PATCH 5/6] radeonsi: add SI_SAVE_FRAGMENT_STATE blitter flag

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

Buffer clears via transform feedback won't set this.
---
 src/gallium/drivers/radeonsi/si_blit.c | 44 +++---
 1 file changed, 25 insertions(+), 19 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index fce014a..d320ac4 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -29,20 +29,23 @@ enum si_blitter_op /* bitmask */
 {
SI_SAVE_TEXTURES  = 1,
SI_SAVE_FRAMEBUFFER   = 2,
-   SI_DISABLE_RENDER_COND = 4,
+   SI_SAVE_FRAGMENT_STATE = 4,
+   SI_DISABLE_RENDER_COND = 8,
 
-   SI_CLEAR = 0,
+   SI_CLEAR = SI_SAVE_FRAGMENT_STATE,
 
-   SI_CLEAR_SURFACE = SI_SAVE_FRAMEBUFFER,
+   SI_CLEAR_SURFACE = SI_SAVE_FRAMEBUFFER | SI_SAVE_FRAGMENT_STATE,
 
SI_COPY  = SI_SAVE_FRAMEBUFFER | SI_SAVE_TEXTURES |
-  SI_DISABLE_RENDER_COND,
+  SI_SAVE_FRAGMENT_STATE | SI_DISABLE_RENDER_COND,
 
-   SI_BLIT  = SI_SAVE_FRAMEBUFFER | SI_SAVE_TEXTURES,
+   SI_BLIT  = SI_SAVE_FRAMEBUFFER | SI_SAVE_TEXTURES |
+  SI_SAVE_FRAGMENT_STATE,
 
-   SI_DECOMPRESS= SI_SAVE_FRAMEBUFFER | SI_DISABLE_RENDER_COND,
+   SI_DECOMPRESS= SI_SAVE_FRAMEBUFFER | SI_SAVE_FRAGMENT_STATE |
+  SI_DISABLE_RENDER_COND,
 
-   SI_COLOR_RESOLVE = SI_SAVE_FRAMEBUFFER
+   SI_COLOR_RESOLVE = SI_SAVE_FRAMEBUFFER | SI_SAVE_FRAGMENT_STATE
 };
 
 static void si_blitter_begin(struct pipe_context *ctx, enum si_blitter_op op)
@@ -51,22 +54,25 @@ static void si_blitter_begin(struct pipe_context *ctx, enum 
si_blitter_op op)
 
r600_suspend_nontimer_queries(&sctx->b);
 
-   util_blitter_save_blend(sctx->blitter, sctx->queued.named.blend);
-   util_blitter_save_depth_stencil_alpha(sctx->blitter, 
sctx->queued.named.dsa);
-   util_blitter_save_stencil_ref(sctx->blitter, &sctx->stencil_ref.state);
-   util_blitter_save_rasterizer(sctx->blitter, 
sctx->queued.named.rasterizer);
-   util_blitter_save_fragment_shader(sctx->blitter, sctx->ps_shader.cso);
-   util_blitter_save_geometry_shader(sctx->blitter, sctx->gs_shader.cso);
+   util_blitter_save_vertex_buffer_slot(sctx->blitter, 
sctx->vertex_buffer);
+   util_blitter_save_vertex_elements(sctx->blitter, sctx->vertex_elements);
+   util_blitter_save_vertex_shader(sctx->blitter, sctx->vs_shader.cso);
util_blitter_save_tessctrl_shader(sctx->blitter, sctx->tcs_shader.cso);
util_blitter_save_tesseval_shader(sctx->blitter, sctx->tes_shader.cso);
-   util_blitter_save_vertex_shader(sctx->blitter, sctx->vs_shader.cso);
-   util_blitter_save_vertex_elements(sctx->blitter, sctx->vertex_elements);
-   util_blitter_save_sample_mask(sctx->blitter, 
sctx->sample_mask.sample_mask);
-   util_blitter_save_viewport(sctx->blitter, >viewports.states[0]);
-   util_blitter_save_scissor(sctx->blitter, >scissors.states[0]);
-   util_blitter_save_vertex_buffer_slot(sctx->blitter, 
sctx->vertex_buffer);
+   util_blitter_save_geometry_shader(sctx->blitter, sctx->gs_shader.cso);
util_blitter_save_so_targets(sctx->blitter, 
sctx->b.streamout.num_targets,
 (struct 
pipe_stream_output_target**)sctx->b.streamout.targets);
+   util_blitter_save_rasterizer(sctx->blitter, 
sctx->queued.named.rasterizer);
+
+   if (op & SI_SAVE_FRAGMENT_STATE) {
+   util_blitter_save_blend(sctx->blitter, 
sctx->queued.named.blend);
+   util_blitter_save_depth_stencil_alpha(sctx->blitter, 
sctx->queued.named.dsa);
+   util_blitter_save_stencil_ref(sctx->blitter, 
>stencil_ref.state);
+   util_blitter_save_fragment_shader(sctx->blitter, 
sctx->ps_shader.cso);
+   util_blitter_save_sample_mask(sctx->blitter, 
sctx->sample_mask.sample_mask);
+   util_blitter_save_viewport(sctx->blitter, 
>viewports.states[0]);
+   util_blitter_save_scissor(sctx->blitter, 
>scissors.states[0]);
+   }
 
if (op & SI_SAVE_FRAMEBUFFER)
util_blitter_save_framebuffer(sctx->blitter, 
>framebuffer.state);
-- 
2.1.4
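
For readers following the enum change: the composite operations are ORs of single-bit flags, and si_blitter_begin() tests one bit to decide whether to save fragment state. A minimal sketch of that pattern, using hypothetical names (enum blit_op, op_saves_fragment_state) that are not part of the driver:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the flag bits in the patch above. */
enum blit_op {
    SAVE_TEXTURES       = 1,
    SAVE_FRAMEBUFFER    = 2,
    SAVE_FRAGMENT_STATE = 4,
    DISABLE_RENDER_COND = 8,

    /* Composite operations are ORs of the single-bit flags. */
    OP_CLEAR = SAVE_FRAGMENT_STATE,
    OP_BLIT  = SAVE_FRAMEBUFFER | SAVE_TEXTURES | SAVE_FRAGMENT_STATE
};

/* Returns true when the caller asked for fragment state to be saved. */
static bool op_saves_fragment_state(unsigned op)
{
    assert((op & ~0xFu) == 0); /* only the four known bits may be set */
    return (op & SAVE_FRAGMENT_STATE) != 0;
}
```

An operation that passes only DISABLE_RENDER_COND therefore skips the fragment-state saves, which appears to be the point of splitting the flag out.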



[Mesa-dev] [PATCH 4/6] gallium/u_blitter: add support for multi-dword clear values in clear_buffer

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/util/u_blitter.c | 25 ++---
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_blitter.c 
b/src/gallium/auxiliary/util/u_blitter.c
index b7b1ece..fccc92c 100644
--- a/src/gallium/auxiliary/util/u_blitter.c
+++ b/src/gallium/auxiliary/util/u_blitter.c
@@ -70,7 +70,7 @@ struct blitter_context_priv
/* Constant state objects. */
/* Vertex shaders. */
void *vs; /**< Vertex shader which passes {pos, generic} to the output.*/
-   void *vs_pos_only; /**< Vertex shader which passes pos to the output.*/
+   void *vs_pos_only[4]; /**< Vertex shader which passes pos to the output.*/
void *vs_layered; /**< Vertex shader which sets LAYER = INSTANCEID. */
 
/* Fragment shaders. */
@@ -325,27 +325,29 @@ struct blitter_context *util_blitter_create(struct pipe_context *pipe)
return &ctx->base;
 }
 
-static void bind_vs_pos_only(struct blitter_context_priv *ctx)
+static void bind_vs_pos_only(struct blitter_context_priv *ctx,
+ unsigned num_so_channels)
 {
struct pipe_context *pipe = ctx->base.pipe;
+   int index = num_so_channels ? num_so_channels - 1 : 0;
 
-   if (!ctx->vs_pos_only) {
+   if (!ctx->vs_pos_only[index]) {
   struct pipe_stream_output_info so;
   const uint semantic_names[] = { TGSI_SEMANTIC_POSITION };
   const uint semantic_indices[] = { 0 };
 
   memset(&so, 0, sizeof(so));
   so.num_outputs = 1;
-  so.output[0].num_components = 1;
-  so.stride[0] = 1;
+  so.output[0].num_components = num_so_channels;
+  so.stride[0] = num_so_channels;
 
-  ctx->vs_pos_only =
+  ctx->vs_pos_only[index] =
  util_make_vertex_passthrough_shader_with_so(pipe, 1, semantic_names,
  semantic_indices, FALSE,
  &so);
}
 
-   pipe->bind_vs_state(pipe, ctx->vs_pos_only);
+   pipe->bind_vs_state(pipe, ctx->vs_pos_only[index]);
 }
 
 static void bind_vs_passthrough(struct blitter_context_priv *ctx)
@@ -441,8 +443,9 @@ void util_blitter_destroy(struct blitter_context *blitter)
   pipe->delete_rasterizer_state(pipe, ctx->rs_discard_state);
if (ctx->vs)
   pipe->delete_vs_state(pipe, ctx->vs);
-   if (ctx->vs_pos_only)
-  pipe->delete_vs_state(pipe, ctx->vs_pos_only);
+   for (i = 0; i < 4; i++)
+  if (ctx->vs_pos_only[i])
+ pipe->delete_vs_state(pipe, ctx->vs_pos_only[i]);
if (ctx->vs_layered)
   pipe->delete_vs_state(pipe, ctx->vs_layered);
pipe->delete_vertex_elements_state(pipe, ctx->velem_state);
@@ -2036,7 +2039,7 @@ void util_blitter_copy_buffer(struct blitter_context *blitter,
 
pipe->set_vertex_buffers(pipe, ctx->base.vb_slot, 1, &vb);
pipe->bind_vertex_elements_state(pipe, ctx->velem_state_readbuf[0]);
-   bind_vs_pos_only(ctx);
+   bind_vs_pos_only(ctx, 1);
if (ctx->has_geometry_shader)
   pipe->bind_gs_state(pipe, NULL);
if (ctx->has_tessellation) {
@@ -2103,7 +2106,7 @@ void util_blitter_clear_buffer(struct blitter_context *blitter,
pipe->set_vertex_buffers(pipe, ctx->base.vb_slot, 1, &vb);
pipe->bind_vertex_elements_state(pipe,
 ctx->velem_state_readbuf[num_channels-1]);
-   bind_vs_pos_only(ctx);
+   bind_vs_pos_only(ctx, num_channels);
if (ctx->has_geometry_shader)
   pipe->bind_gs_state(pipe, NULL);
if (ctx->has_tessellation) {
-- 
2.1.4
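
The patch turns the single cached vs_pos_only shader into a small array indexed by channel count and created on first use. A rough sketch of that lazy-cache pattern follows; struct shader_cache, get_vs_pos_only() and the shader_storage placeholders are assumptions for illustration, not u_blitter code:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SO_CHANNELS 4

/* Hypothetical cache: one slot per stream-output channel count (1..4). */
struct shader_cache {
    void *vs_pos_only[MAX_SO_CHANNELS];
    unsigned num_created;
};

/* Placeholders standing in for real shader objects. */
static int shader_storage[MAX_SO_CHANNELS];

/* Same mapping as the patch: n channels -> slot n-1, with 0 mapped to slot 0. */
static unsigned so_channels_to_index(unsigned num_so_channels)
{
    assert(num_so_channels <= MAX_SO_CHANNELS);
    return num_so_channels ? num_so_channels - 1 : 0;
}

/* Create the variant on first use, then keep returning the cached one. */
static void *get_vs_pos_only(struct shader_cache *c, unsigned num_so_channels)
{
    unsigned index = so_channels_to_index(num_so_channels);

    if (!c->vs_pos_only[index]) {
        c->vs_pos_only[index] = &shader_storage[index];
        c->num_created++;
    }
    return c->vs_pos_only[index];
}
```

Each distinct channel count costs one shader creation; repeated clears with the same count reuse the cached object.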



[Mesa-dev] [PATCH 6/6] radeonsi: add glClearBufferSubData acceleration

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

Unaligned 8-bit and 16-bit clears are done in software.
---
 src/gallium/drivers/radeonsi/si_blit.c | 60 ++
 1 file changed, 60 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index d320ac4..31f22c4 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -737,9 +737,69 @@ static void si_flush_resource(struct pipe_context *ctx,
}
 }
 
+static void si_pipe_clear_buffer(struct pipe_context *ctx,
+struct pipe_resource *dst,
+unsigned offset, unsigned size,
+const void *clear_value_ptr,
+int clear_value_size)
+{
+   struct si_context *sctx = (struct si_context*)ctx;
+   uint32_t dword_value;
+   unsigned i;
+
+   assert(offset % clear_value_size == 0);
+   assert(size % clear_value_size == 0);
+
+   if (clear_value_size > 4) {
+   const uint32_t *u32 = clear_value_ptr;
+   bool clear_dword_duplicated = true;
+
+   /* See if we can lower large fills to dword fills. */
+   for (i = 1; i < clear_value_size / 4; i++)
+   if (u32[0] != u32[i]) {
+   clear_dword_duplicated = false;
+   break;
+   }
+
+   if (!clear_dword_duplicated) {
+   /* Use transform feedback for 64-bit, 96-bit, and
+* 128-bit fills.
+*/
+   union pipe_color_union clear_value;
+
+   memcpy(&clear_value, clear_value_ptr, clear_value_size);
+   si_blitter_begin(ctx, SI_DISABLE_RENDER_COND);
+   util_blitter_clear_buffer(sctx->blitter, dst, offset,
+ size, clear_value_size / 4,
+ &clear_value);
+   si_blitter_end(ctx);
+   return;
+   }
+   }
+
+   /* Expand the clear value to a dword. */
+   switch (clear_value_size) {
+   case 1:
+   dword_value = *(uint8_t*)clear_value_ptr;
+   dword_value |= (dword_value << 8) |
+  (dword_value << 16) |
+  (dword_value << 24);
+   break;
+   case 2:
+   dword_value = *(uint16_t*)clear_value_ptr;
+   dword_value |= dword_value << 16;
+   break;
+   default:
+   dword_value = *(uint32_t*)clear_value_ptr;
+   }
+
+   sctx->b.clear_buffer(ctx, dst, offset, size, dword_value, false);
+}
+
 void si_init_blit_functions(struct si_context *sctx)
 {
sctx->b.b.clear = si_clear;
+   sctx->b.b.clear_buffer = si_pipe_clear_buffer;
sctx->b.b.clear_render_target = si_clear_render_target;
sctx->b.b.clear_depth_stencil = si_clear_depth_stencil;
sctx->b.b.resource_copy_region = si_resource_copy_region;
-- 
2.1.4
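
The two lowering steps in si_pipe_clear_buffer() (detecting a multi-dword value whose dwords are all equal, and replicating 8-bit and 16-bit values into a dword) can be sketched in isolation. The helper names below are hypothetical, and memcpy is used instead of pointer casts only to keep the sketch free of alignment assumptions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* True if a 64/96/128-bit clear value is just one dword repeated. */
static bool dwords_all_equal(const void *value, unsigned size_bytes)
{
    uint32_t first, cur;

    assert(size_bytes >= 4 && size_bytes % 4 == 0);
    memcpy(&first, value, 4);
    for (unsigned i = 1; i < size_bytes / 4; i++) {
        memcpy(&cur, (const uint8_t *)value + i * 4, 4);
        if (cur != first)
            return false;
    }
    return true;
}

/* Replicate a 1-, 2- or 4-byte clear value into a full dword. */
static uint32_t expand_to_dword(const void *value, unsigned size_bytes)
{
    uint32_t dword;

    switch (size_bytes) {
    case 1: {
        uint8_t b;
        memcpy(&b, value, 1);
        dword = b * 0x01010101u;           /* copy the byte into all 4 lanes */
        break;
    }
    case 2: {
        uint16_t h;
        memcpy(&h, value, 2);
        dword = h | ((uint32_t)h << 16);   /* copy the halfword into both halves */
        break;
    }
    default:
        memcpy(&dword, value, 4);
        break;
    }
    return dword;
}
```

Only values that fail the dwords_all_equal() test need the transform feedback path; everything else reduces to a plain dword fill.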



[Mesa-dev] [PATCH 4/7] gallium/radeon: move GFX/DMA flushing from add_to_buffer_list to need_cs_space

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

need_cs_space isn't invoked so often and is called before all commands too.
This is a lot cleaner. The code in radeon_add_to_buffer_list always seemed
dodgy to me.
---
 src/gallium/drivers/r600/r600_hw_context.c|  5 +
 src/gallium/drivers/radeon/r600_cs.h  | 15 ---
 src/gallium/drivers/radeon/r600_pipe_common.c |  4 
 src/gallium/drivers/radeonsi/si_hw_context.c  |  5 +
 4 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_hw_context.c 
b/src/gallium/drivers/r600/r600_hw_context.c
index 6f11366..cf8a07f 100644
--- a/src/gallium/drivers/r600/r600_hw_context.c
+++ b/src/gallium/drivers/r600/r600_hw_context.c
@@ -33,6 +33,11 @@
 void r600_need_cs_space(struct r600_context *ctx, unsigned num_dw,
boolean count_draw_in)
 {
+   struct radeon_winsys_cs *dma = ctx->b.rings.dma.cs;
+
+   /* Flush the DMA IB if it's not empty. */
+   if (dma && dma->cdw)
+   ctx->b.rings.dma.flush(ctx, RADEON_FLUSH_ASYNC, NULL);
 
if (!ctx->b.ws->cs_memory_below_limit(ctx->b.rings.gfx.cs, ctx->b.vram, ctx->b.gtt)) {
ctx->b.gtt = 0;
diff --git a/src/gallium/drivers/radeon/r600_cs.h 
b/src/gallium/drivers/radeon/r600_cs.h
index b5a1daf..ad067ce 100644
--- a/src/gallium/drivers/radeon/r600_cs.h
+++ b/src/gallium/drivers/radeon/r600_cs.h
@@ -50,21 +50,6 @@ static inline unsigned radeon_add_to_buffer_list(struct r600_common_context *rct
 enum radeon_bo_priority priority)
 {
assert(usage);
-
-   /* Make sure that all previous rings are flushed so that everything
-* looks serialized from the driver point of view.
-*/
-   if (!ring->flushing) {
-   if (ring == &rctx->rings.gfx) {
-   if (rctx->rings.dma.cs) {
-   /* flush dma ring */
-   rctx->rings.dma.flush(rctx, RADEON_FLUSH_ASYNC, NULL);
-   }
-   } else {
-   /* flush gfx ring */
-   rctx->rings.gfx.flush(rctx, RADEON_FLUSH_ASYNC, NULL);
-   }
-   }
return rctx->ws->cs_add_buffer(ring->cs, rbo->cs_buf, usage,
  rbo->domains, priority) * 4;
 }
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 79e624e..e7179dc 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -117,6 +117,10 @@ void r600_draw_rectangle(struct blitter_context *blitter,
 
 void r600_need_dma_space(struct r600_common_context *ctx, unsigned num_dw)
 {
+   /* Flush the GFX IB if it's not empty. */
+   if (ctx->rings.gfx.cs->cdw > ctx->initial_gfx_cs_size)
+   ctx->rings.gfx.flush(ctx, RADEON_FLUSH_ASYNC, NULL);
+
/* Flush if there's not enough space. */
if ((num_dw + ctx->rings.dma.cs->cdw) > ctx->rings.dma.cs->max_dw) {
ctx->rings.dma.flush(ctx, RADEON_FLUSH_ASYNC, NULL);
diff --git a/src/gallium/drivers/radeonsi/si_hw_context.c 
b/src/gallium/drivers/radeonsi/si_hw_context.c
index 9b8bdf5..7d0e6d4 100644
--- a/src/gallium/drivers/radeonsi/si_hw_context.c
+++ b/src/gallium/drivers/radeonsi/si_hw_context.c
@@ -30,6 +30,11 @@
 void si_need_cs_space(struct si_context *ctx)
 {
struct radeon_winsys_cs *cs = ctx->b.rings.gfx.cs;
+   struct radeon_winsys_cs *dma = ctx->b.rings.dma.cs;
+
+   /* Flush the DMA IB if it's not empty. */
+   if (dma && dma->cdw)
+   ctx->b.rings.dma.flush(ctx, RADEON_FLUSH_ASYNC, NULL);
 
/* There are two memory usage counters in the winsys for all buffers
 * that have been added (cs_add_buffer) and two counters in the pipe
-- 
2.1.4
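
The serialization rule this patch moves into the need_*_space helpers is: before queuing work on one ring, flush the other ring if it holds any commands. A simplified model of that rule, with a hypothetical struct ring and helper names (the real code goes through the winsys flush callbacks):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a command ring. */
struct ring {
    unsigned cdw;          /* dwords currently queued */
    unsigned initial_cdw;  /* size of an "empty" stream, e.g. a preamble */
    unsigned flushes;      /* number of flushes performed so far */
};

static void ring_flush(struct ring *r)
{
    r->cdw = r->initial_cdw;
    r->flushes++;
}

/* Before emitting GFX commands: drain pending DMA work, if a DMA ring exists. */
static void need_gfx_space(struct ring *gfx, struct ring *dma)
{
    assert(gfx != NULL);
    if (dma && dma->cdw > dma->initial_cdw)
        ring_flush(dma);
}

/* Before emitting DMA commands: drain pending GFX work. */
static void need_dma_space(struct ring *gfx, struct ring *dma)
{
    assert(gfx != NULL && dma != NULL);
    if (gfx->cdw > gfx->initial_cdw)
        ring_flush(gfx);
}
```

Doing the check once per space reservation, rather than once per buffer added, is what makes it cheaper than the old code in radeon_add_to_buffer_list().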



[Mesa-dev] [PATCH 2/7] radeonsi: set the DISABLE_WR_CONFIRM flag on CI-VI as well

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

I missed this in commit c3e527f93d4281ad6e2ca165eaf6ff588e4faefa
("radeonsi: only enable write confirmation on the last CP DMA packet").
---
 src/gallium/drivers/radeonsi/si_cp_dma.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c 
b/src/gallium/drivers/radeonsi/si_cp_dma.c
index 7b8c6d0..55d423a 100644
--- a/src/gallium/drivers/radeonsi/si_cp_dma.c
+++ b/src/gallium/drivers/radeonsi/si_cp_dma.c
@@ -64,7 +64,7 @@ static void si_emit_cp_dma_copy_buffer(struct si_context *sctx,
radeon_emit(cs, src_va >> 32);  /* SRC_ADDR_HI [31:0] */
radeon_emit(cs, dst_va);/* DST_ADDR_LO [31:0] */
radeon_emit(cs, dst_va >> 32);  /* DST_ADDR_HI [31:0] */
-   radeon_emit(cs, size | raw_wait);   /* COMMAND [29:22] | BYTE_COUNT [20:0] */
+   radeon_emit(cs, size | wr_confirm |raw_wait);   /* COMMAND [29:22] | BYTE_COUNT [20:0] */
} else {
radeon_emit(cs, PKT3(PKT3_CP_DMA, 4, 0));
radeon_emit(cs, src_va);/* SRC_ADDR_LO [31:0] */
@@ -96,7 +96,7 @@ static void si_emit_cp_dma_clear_buffer(struct si_context *sctx,
radeon_emit(cs, 0);
radeon_emit(cs, dst_va);/* DST_ADDR_LO [31:0] */
radeon_emit(cs, dst_va >> 32);  /* DST_ADDR_HI [15:0] */
-   radeon_emit(cs, size | raw_wait);   /* COMMAND [29:22] | BYTE_COUNT [20:0] */
+   radeon_emit(cs, size | wr_confirm | raw_wait);  /* COMMAND [29:22] | BYTE_COUNT [20:0] */
} else {
radeon_emit(cs, PKT3(PKT3_CP_DMA, 4, 0));
radeon_emit(cs, clear_value);   /* DATA [31:0] */
-- 
2.1.4



[Mesa-dev] [PATCH 2/6] radeonsi: fix unaligned clear_buffer fallback

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

This is unreachable currently, but it will be used by unaligned 8-bit and
16-bit fills.
---
 src/gallium/drivers/radeonsi/si_cp_dma.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c 
b/src/gallium/drivers/radeonsi/si_cp_dma.c
index 418b2cf..7b8c6d0 100644
--- a/src/gallium/drivers/radeonsi/si_cp_dma.c
+++ b/src/gallium/drivers/radeonsi/si_cp_dma.c
@@ -176,12 +176,14 @@ static void si_clear_buffer(struct pipe_context *ctx, struct pipe_resource *dst,
 
/* Fallback for unaligned clears. */
if (offset % 4 != 0 || size % 4 != 0) {
-   uint32_t *map = sctx->b.ws->buffer_map(r600_resource(dst)->cs_buf,
-  sctx->b.rings.gfx.cs,
-  PIPE_TRANSFER_WRITE);
-   size /= 4;
-   for (unsigned i = 0; i < size; i++)
-   *map++ = value;
+   uint8_t *map = sctx->b.ws->buffer_map(r600_resource(dst)->cs_buf,
+ sctx->b.rings.gfx.cs,
+ PIPE_TRANSFER_WRITE);
+   map += offset;
+   for (unsigned i = 0; i < size; i++) {
+   unsigned byte_within_dword = (offset + i) % 4;
+   *map++ = (value >> (byte_within_dword * 8)) & 0xff;
+   }
return;
}
 
-- 
2.1.4
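
The fixed fallback writes one byte at a time and picks each byte out of the dword pattern according to its absolute position in the buffer, so the pattern stays in phase at any offset. A standalone sketch of that loop, with a hypothetical helper name and a plain byte array standing in for the mapped buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill buf[offset .. offset+size) with the dword pattern `value`,
 * byte by byte. Byte 0 of the pattern is the least significant byte,
 * matching the shift used in the patch. */
static void fill_bytes_from_dword(uint8_t *buf, size_t offset, size_t size,
                                  uint32_t value)
{
    uint8_t *p;

    assert(buf != NULL);
    p = buf + offset;
    for (size_t i = 0; i < size; i++) {
        unsigned byte_within_dword = (unsigned)((offset + i) % 4);
        *p++ = (uint8_t)((value >> (byte_within_dword * 8)) & 0xff);
    }
}
```

Compared with the old code, this honours the offset, does not round the size down to whole dwords, and never writes past the requested range.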



[Mesa-dev] [PATCH V2 11/12] glsl: add subroutine index qualifier support

2015-11-08 Thread Timothy Arceri
From: Timothy Arceri 

ARB_explicit_uniform_location allows the index for subroutine functions
to be explicitly set in the shader.

This patch relaxes the restriction on the index qualifier in
validate_layout_qualifiers() to allow it to be applied to subroutines
and adds the new subroutine qualifier validation to ast_function::hir().

ast_fully_specified_type::has_qualifiers() is updated to allow the
index qualifier on subroutine functions when explicit uniform locations
are available.

A new check is added to ast_type_qualifier::merge_qualifier() to stop
multiple function qualifiers from being defined; before this patch this
would cause a segfault.

Finally, a new variable is added to ir_function_signature to store the
index. This value is validated and the non-explicit values are assigned in
link_assign_subroutine_types().
---
 src/glsl/ast.h |  2 +-
 src/glsl/ast_to_hir.cpp| 68 ++
 src/glsl/ast_type.cpp  | 14 -
 src/glsl/ir.cpp|  1 +
 src/glsl/ir.h  |  2 ++
 src/glsl/ir_clone.cpp  |  1 +
 src/glsl/linker.cpp| 33 
 src/mesa/main/mtypes.h |  1 +
 src/mesa/main/shader_query.cpp |  7 +
 9 files changed, 108 insertions(+), 21 deletions(-)

diff --git a/src/glsl/ast.h b/src/glsl/ast.h
index 4fd049c..e88a213 100644
--- a/src/glsl/ast.h
+++ b/src/glsl/ast.h
@@ -771,7 +771,7 @@ public:
 class ast_fully_specified_type : public ast_node {
 public:
virtual void print(void) const;
-   bool has_qualifiers() const;
+   bool has_qualifiers(_mesa_glsl_parse_state *state) const;
 
ast_fully_specified_type() : qualifier(), specifier(NULL)
{
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index d55a99e..ea22f08 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2607,28 +2607,37 @@ validate_layout_qualifiers(const struct ast_type_qualifier *qual,
   validate_explicit_location(qual, var, state, loc);
 
   if (qual->flags.q.explicit_index) {
- /* From the GLSL 4.30 specification, section 4.4.2 (Output
-  * Layout Qualifiers):
-  *
-  * "It is also a compile-time error if a fragment shader
-  *  sets a layout index to less than 0 or greater than 1."
-  *
-  * Older specifications don't mandate a behavior; we take
-  * this as a clarification and always generate the error.
-  */
- unsigned qual_index;
- if (process_qualifier_constant(state, loc, "index",
-qual->index, &qual_index, 0) &&
- qual_index > 1) {
-_mesa_glsl_error(loc, state,
- "explicit index may only be 0 or 1");
+ /* Check if index was set for the uniform instead of the function */
+ if (qual->flags.q.subroutine) {
+_mesa_glsl_error(loc, state, "an index qualifier can only be "
+ "used with subroutine functions");
  } else {
-var->data.explicit_index = true;
-var->data.index = qual_index;
+
+/* From the GLSL 4.30 specification, section 4.4.2 (Output
+ * Layout Qualifiers):
+ *
+ * "It is also a compile-time error if a fragment shader
+ *  sets a layout index to less than 0 or greater than 1."
+ *
+ * Older specifications don't mandate a behavior; we take
+ * this as a clarification and always generate the error.
+ */
+unsigned qual_index;
+if (process_qualifier_constant(state, loc, "index",
+   qual->index, &qual_index, 0) &&
+qual_index > 1) {
+   _mesa_glsl_error(loc, state,
+"explicit index may only be 0 or 1");
+} else {
+   var->data.explicit_index = true;
+   var->data.index = qual_index;
+}
  }
   }
} else if (qual->flags.q.explicit_index) {
-  _mesa_glsl_error(loc, state, "explicit index requires explicit location");
+  if (!qual->flags.q.subroutine_def)
+ _mesa_glsl_error(loc, state,
+  "explicit index requires explicit location");
}
 
if (qual->flags.q.explicit_binding) {
@@ -4851,7 +4860,7 @@ ast_function::hir(exec_list *instructions,
/* From page 56 (page 62 of the PDF) of the GLSL 1.30 spec:
 * "No qualifier is allowed on the return type of a function."
 */
-   if (this->return_type->has_qualifiers()) {
+   if (this->return_type->has_qualifiers(state)) {
   YYLTYPE loc = this->get_location();
   _mesa_glsl_error(& loc, state,
"function `%s' return type has qualifiers", name);
@@ -4983,6 +4992,27 @@ ast_function::hir(exec_list *instructions,
if 

[Mesa-dev] [PATCH V2 06/12] glsl: remove layout qualifier validation from the parser

2015-11-08 Thread Timothy Arceri
From: Timothy Arceri 

Now that we have added validation elsewhere we can remove it from
the parser.
---
 src/glsl/glsl_parser.yy | 100 +++-
 1 file changed, 23 insertions(+), 77 deletions(-)

diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
index 4636435..44853b0 100644
--- a/src/glsl/glsl_parser.yy
+++ b/src/glsl/glsl_parser.yy
@@ -1446,6 +1446,7 @@ layout_qualifier_id:
 
   if (match_layout_qualifier("location", $1, state) == 0) {
  $$.flags.q.explicit_location = 1;
+ $$.location = $3;
 
  if ($$.flags.q.attribute == 1 &&
  state->ARB_explicit_attrib_location_warn) {
@@ -1453,24 +1454,11 @@ layout_qualifier_id:
"GL_ARB_explicit_attrib_location layout "
"identifier `%s' used", $1);
  }
-
- if ($3 >= 0) {
-$$.location = $3;
- } else {
- _mesa_glsl_error(& @3, state, "invalid location %d specified", $3);
- YYERROR;
- }
   }
 
   if (match_layout_qualifier("index", $1, state) == 0) {
  $$.flags.q.explicit_index = 1;
-
- if ($3 >= 0) {
-$$.index = $3;
- } else {
-_mesa_glsl_error(& @3, state, "invalid index %d specified", $3);
-YYERROR;
- }
+ $$.index = $3;
   }
 
   if ((state->has_420pack() ||
@@ -1489,18 +1477,12 @@ layout_qualifier_id:
 
   if (match_layout_qualifier("max_vertices", $1, state) == 0) {
  $$.flags.q.max_vertices = 1;
+ $$.max_vertices = $3;
 
- if ($3 < 0) {
-_mesa_glsl_error(& @3, state,
- "invalid max_vertices %d specified", $3);
-YYERROR;
- } else {
-$$.max_vertices = $3;
-if (!state->is_version(150, 0)) {
-   _mesa_glsl_error(& @3, state,
-"#version 150 max_vertices qualifier "
-"specified", $3);
-}
+ if (!state->is_version(150, 0)) {
+_mesa_glsl_error(& @1, state,
+ "#version 150 max_vertices qualifier "
+ "specified", $3);
  }
   }
 
@@ -1508,15 +1490,8 @@ layout_qualifier_id:
  if (match_layout_qualifier("stream", $1, state) == 0 &&
  state->check_explicit_attrib_stream_allowed(& @3)) {
 $$.flags.q.stream = 1;
-
-if ($3 < 0) {
-   _mesa_glsl_error(& @3, state,
-"invalid stream %d specified", $3);
-   YYERROR;
-} else {
-   $$.flags.q.explicit_stream = 1;
-   $$.stream = $3;
-}
+$$.flags.q.explicit_stream = 1;
+$$.stream = $3;
  }
   }
 
@@ -1528,13 +1503,8 @@ layout_qualifier_id:
   for (int i = 0; i < 3; i++) {
  if (match_layout_qualifier(local_size_qualifiers[i], $1,
 state) == 0) {
-if ($3 <= 0) {
-   _mesa_glsl_error(& @3, state,
-"invalid %s of %d specified",
-local_size_qualifiers[i], $3);
-   YYERROR;
-} else if (!state->has_compute_shader()) {
-   _mesa_glsl_error(& @3, state,
+if (!state->has_compute_shader()) {
+   _mesa_glsl_error(& @1, state,
 "%s qualifier requires GLSL 4.30 or "
 "GLSL ES 3.10 or ARB_compute_shader",
 local_size_qualifiers[i]);
@@ -1549,48 +1519,24 @@ layout_qualifier_id:
 
   if (match_layout_qualifier("invocations", $1, state) == 0) {
  $$.flags.q.invocations = 1;
-
- if ($3 <= 0) {
-_mesa_glsl_error(& @3, state,
- "invalid invocations %d specified", $3);
-YYERROR;
- } else if ($3 > MAX_GEOMETRY_SHADER_INVOCATIONS) {
-_mesa_glsl_error(& @3, state,
- "invocations (%d) exceeds "
- "GL_MAX_GEOMETRY_SHADER_INVOCATIONS", $3);
-YYERROR;
- } else {
-$$.invocations = $3;
-if (!state->is_version(400, 0) &&
-!state->ARB_gpu_shader5_enable) {
-   _mesa_glsl_error(& @3, state,
-"GL_ARB_gpu_shader5 invocations "
-"qualifier specified", $3);
-}
+ $$.invocations = $3;
+ if (!state->is_version(400, 0) &&
+ !state->ARB_gpu_shader5_enable) {
+_mesa_glsl_error(& @1, state,
+ "GL_ARB_gpu_shader5 invocations "
+ "qualifier specified", $3);
  }
   }
 
   /* Layout qualifiers 

[Mesa-dev] [PATCH V2 04/12] glsl: add layout qualifier validation to the ast for vars

2015-11-08 Thread Timothy Arceri
From: Timothy Arceri 

This is in preparation for compile-time constant support;
a later patch will remove validation from the parser.
---
 src/glsl/ast.h  |  2 ++
 src/glsl/ast_to_hir.cpp | 93 ++---
 2 files changed, 60 insertions(+), 35 deletions(-)

diff --git a/src/glsl/ast.h b/src/glsl/ast.h
index e803e6d..afd2d41 100644
--- a/src/glsl/ast.h
+++ b/src/glsl/ast.h
@@ -553,6 +553,8 @@ struct ast_type_qualifier {
   uint64_t i;
} flags;
 
+   struct YYLTYPE *loc;
+
/** Precision of the type (highp/medium/lowp). */
unsigned precision:2;
 
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 5a22820..0cea607 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2414,6 +2414,12 @@ validate_explicit_location(const struct ast_type_qualifier *qual,
 {
bool fail = false;
 
+   if (qual->location < 0) {
+   _mesa_glsl_error(loc, state, "invalid location %d specified",
+qual->location);
+   return;
+   }
+
/* Checks for GL_ARB_explicit_uniform_location. */
if (qual->flags.q.uniform) {
   if (!state->check_explicit_uniform_location_allowed(loc, var))
@@ -2537,6 +2543,18 @@ validate_explicit_location(const struct ast_type_qualifier *qual,
  assert(!"Unexpected shader type");
  break;
   }
+   }
+}
+
+static void
+validate_layout_qualifiers(const struct ast_type_qualifier *qual,
+   ir_variable *var,
+   struct _mesa_glsl_parse_state *state,
+   YYLTYPE *loc)
+{
+   if (qual->flags.q.explicit_location) {
+
+  validate_explicit_location(qual, var, state, loc);
 
   if (qual->flags.q.explicit_index) {
  /* From the GLSL 4.30 specification, section 4.4.2 (Output
@@ -2556,6 +2574,38 @@ validate_explicit_location(const struct ast_type_qualifier *qual,
 var->data.index = qual->index;
  }
   }
+   } else if (qual->flags.q.explicit_index) {
+  _mesa_glsl_error(loc, state, "explicit index requires explicit location");
+   }
+
+   if (qual->flags.q.explicit_binding &&
+   validate_binding_qualifier(state, loc, var->type, qual)) {
+  var->data.explicit_binding = true;
+  var->data.binding = qual->binding;
+   }
+
+   if (var->type->contains_atomic()) {
+  if (var->data.mode == ir_var_uniform) {
+ if (var->data.explicit_binding) {
+unsigned *offset =
+   &state->atomic_counter_offsets[var->data.binding];
+
+if (*offset % ATOMIC_COUNTER_SIZE)
+   _mesa_glsl_error(loc, state,
+"misaligned atomic counter offset");
+
+var->data.atomic.offset = *offset;
+*offset += var->type->atomic_size();
+
+ } else {
+_mesa_glsl_error(loc, state,
+ "atomic counters require explicit binding point");
+ }
+  } else if (var->data.mode != ir_var_function_in) {
+ _mesa_glsl_error(loc, state, "atomic counters may only be declared as "
+  "function parameters or uniform-qualified "
+  "global variables");
+  }
}
 }
 
@@ -2935,41 +2985,7 @@ apply_type_qualifier_to_variable(const struct ast_type_qualifier *qual,
  state->fs_redeclares_gl_fragcoord_with_no_layout_qualifiers;
}
 
-   if (qual->flags.q.explicit_location) {
-  validate_explicit_location(qual, var, state, loc);
-   } else if (qual->flags.q.explicit_index) {
-  _mesa_glsl_error(loc, state, "explicit index requires explicit location");
-   }
-
-   if (qual->flags.q.explicit_binding &&
-   validate_binding_qualifier(state, loc, var->type, qual)) {
-  var->data.explicit_binding = true;
-  var->data.binding = qual->binding;
-   }
-
-   if (var->type->contains_atomic()) {
-  if (var->data.mode == ir_var_uniform) {
- if (var->data.explicit_binding) {
-unsigned *offset =
-   &state->atomic_counter_offsets[var->data.binding];
-
-if (*offset % ATOMIC_COUNTER_SIZE)
-   _mesa_glsl_error(loc, state,
-"misaligned atomic counter offset");
-
-var->data.atomic.offset = *offset;
-*offset += var->type->atomic_size();
-
- } else {
-_mesa_glsl_error(loc, state,
- "atomic counters require explicit binding point");
- }
-  } else if (var->data.mode != ir_var_function_in) {
- _mesa_glsl_error(loc, state, "atomic counters may only be declared as "
-  "function parameters or uniform-qualified "
-  "global variables");
-  }
-   }
+   validate_layout_qualifiers(qual, var, state, loc);
 
/* Does the declaration use the deprecated 'attribute' or 'varying'
 * keywords?
@@ -6921,6 +6937,13 @@ 

[Mesa-dev] [PATCH V2 12/12] docs: mark compile-time constant expressions as done

2015-11-08 Thread Timothy Arceri
From: Timothy Arceri 

---
 docs/GL3.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 7abdcd8..416b734 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -179,7 +179,7 @@ GL 4.4, GLSL 4.40:
   GL_ARB_buffer_storage DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_clear_texture DONE (i965) (gallium - in progress, VMware)
   GL_ARB_enhanced_layouts  in progress (Timothy)
-  - compile-time constant expressions  in progress
+  - compile-time constant expressions  DONE
   - explicit byte offsets for blocks   in progress
   - forced alignment within blocks in progress
   - specified vec4-slot component numbers  in progress
-- 
2.4.3



[Mesa-dev] [PATCH V2 01/12] glsl: simplify interface block stream qualifier validation

2015-11-08 Thread Timothy Arceri
From: Timothy Arceri 

Qualifiers on member variables are redundant; all we need to do
is check that each one matches the stream associated with the block and
throw an error if it does not.

Cc: Samuel Iglesias Gonsalvez 
Cc: Emil Velikov 
---
 src/glsl/ast_to_hir.cpp   | 27 +--
 src/glsl/nir/glsl_types.h | 10 +-
 2 files changed, 14 insertions(+), 23 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 0306530..5a22820 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -5964,8 +5964,19 @@ ast_process_structure_or_interface_block(exec_list *instructions,
  fields[i].sample = qual->flags.q.sample ? 1 : 0;
  fields[i].patch = qual->flags.q.patch ? 1 : 0;
 
- /* Only save explicitly defined streams in block's field */
- fields[i].stream = qual->flags.q.explicit_stream ? qual->stream : -1;
+ /* From Section 4.4.2.3 (Geometry Outputs) of the GLSL 4.50 spec:
+  *
+  *   "A block member may be declared with a stream identifier, but
+  *   the specified stream must match the stream associated with the
+  *   containing block."
+  */
+ if (qual->flags.q.explicit_stream &&
+ qual->stream != layout->stream) {
+_mesa_glsl_error(&loc, state, "stream layout qualifier on "
+ "interface block member `%s' does not match "
+ "the interface block (%d vs %d)",
+ fields[i].name, qual->stream, layout->stream);
+ }
 
  if (qual->flags.q.row_major || qual->flags.q.column_major) {
 if (!qual->flags.q.uniform && !qual->flags.q.buffer) {
@@ -6267,18 +6278,6 @@ ast_interface_block::hir(exec_list *instructions,
 
state->struct_specifier_depth--;
 
-   for (unsigned i = 0; i < num_variables; i++) {
-  if (fields[i].stream != -1 &&
-  (unsigned) fields[i].stream != this->layout.stream) {
- _mesa_glsl_error(&loc, state,
-  "stream layout qualifier on "
-  "interface block member `%s' does not match "
-  "the interface block (%d vs %d)",
-  fields[i].name, fields[i].stream,
-  this->layout.stream);
-  }
-   }
-
if (!redeclaring_per_vertex) {
   validate_identifier(this->block_name, loc, state);
 
diff --git a/src/glsl/nir/glsl_types.h b/src/glsl/nir/glsl_types.h
index 52ca826..1f17ad5 100644
--- a/src/glsl/nir/glsl_types.h
+++ b/src/glsl/nir/glsl_types.h
@@ -829,13 +829,6 @@ struct glsl_struct_field {
unsigned patch:1;
 
/**
-* For interface blocks, it has a value if this variable uses multiple vertex
-* streams (as in ir_variable::stream). -1 otherwise.
-*/
-   int stream;
-
-
-   /**
 * Image qualifiers, applicable to buffer variables defined in shader
 * storage buffer objects (SSBOs)
 */
@@ -847,8 +840,7 @@ struct glsl_struct_field {
 
glsl_struct_field(const struct glsl_type *_type, const char *_name)
   : type(_type), name(_name), location(-1), interpolation(0), centroid(0),
-sample(0), matrix_layout(GLSL_MATRIX_LAYOUT_INHERITED), patch(0),
-stream(-1)
+sample(0), matrix_layout(GLSL_MATRIX_LAYOUT_INHERITED), patch(0)
{
   /* empty */
}
-- 
2.4.3
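
The rule enforced by the new check is small enough to state as code: a member may omit the stream qualifier, but if it has one, it must equal the block's stream. A sketch with a hypothetical struct member_qual and helper (the real code works on ast_type_qualifier and reports errors through _mesa_glsl_error()):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduction of a member's layout qualifier. */
struct member_qual {
    bool explicit_stream;  /* was layout(stream = N) written on the member? */
    unsigned stream;       /* N, meaningful only when explicit_stream is set */
};

/* Count members whose explicit stream disagrees with the block's stream. */
static size_t count_stream_mismatches(const struct member_qual *members,
                                      size_t n, unsigned block_stream)
{
    size_t errors = 0;

    assert(members != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (members[i].explicit_stream && members[i].stream != block_stream)
            errors++;
    }
    return errors;
}
```

Because the comparison is done while each member is processed, nothing has to be stored per field afterwards, which is why the stream member of glsl_struct_field can go away.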



[Mesa-dev] [PATCH V2 07/12] glsl: add new type from compile time constants

2015-11-08 Thread Timothy Arceri
From: Timothy Arceri 

In this patch we introduce a new ast type for holding the new
compile-time constant expressions. The main reason for this is that
we can no longer merge layout qualifiers before they have been
converted into GLSL IR, so we need to store them to be processed later.

The new type has two helper functions:

- process_qualifier_constant()

 Used to merge and then evaluate qualifier expressions

- merge_qualifier()

 Simply appends a qualifier to a list to be merged later by
 process_qualifier_constant()

In order to avoid cascading error messages, the process_qualifier_constant()
helpers return a bool.
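
The merge-then-evaluate behaviour can be pictured with already-evaluated integers in place of AST expressions. This is only a sketch under that assumption; merge_qualifier_values() is a hypothetical name, and the real helper also has to evaluate each expression and report errors:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Every declared value must be >= min_value and all declarations must
 * agree. Returns false on the first violation so that the caller can
 * stop and avoid cascading errors. */
static bool merge_qualifier_values(const int *values, size_t count,
                                   int min_value, unsigned *result)
{
    bool first_pass = true;

    assert(min_value >= 0);
    *result = 0;

    for (size_t i = 0; i < count; i++) {
        if (values[i] < min_value)
            return false;                 /* out of range */
        if (!first_pass && *result != (unsigned)values[i])
            return false;                 /* conflicts with earlier value */
        first_pass = false;
        *result = (unsigned)values[i];
    }
    return true;
}
```

For example, three declarations of the same value merge to that value, while a differing or negative value makes the whole qualifier invalid.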
---
 src/glsl/ast.h| 20 ++
 src/glsl/ast_type.cpp | 57 +++
 2 files changed, 77 insertions(+)

diff --git a/src/glsl/ast.h b/src/glsl/ast.h
index afd2d41..ef94cff 100644
--- a/src/glsl/ast.h
+++ b/src/glsl/ast.h
@@ -350,6 +350,26 @@ public:
exec_list array_dimensions;
 };
 
+class ast_layout_expression : public ast_node {
+public:
+   ast_layout_expression(const struct YYLTYPE &locp, ast_expression *expr)
+   {
+  set_location(locp);
+  layout_const_expressions.push_tail(&expr->link);
+   }
+
+   bool process_qualifier_constant(struct _mesa_glsl_parse_state *state,
+   const char *qual_indentifier,
+   unsigned *value, int min_value);
+
+   void merge_qualifier(ast_layout_expression *l_expr)
+   {
+  layout_const_expressions.append_list(&l_expr->layout_const_expressions);
+   }
+
+   exec_list layout_const_expressions;
+};
+
 /**
  * C-style aggregate initialization class
  *
diff --git a/src/glsl/ast_type.cpp b/src/glsl/ast_type.cpp
index 53d1023..8ceb3b1 100644
--- a/src/glsl/ast_type.cpp
+++ b/src/glsl/ast_type.cpp
@@ -482,3 +482,60 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,
 
return true;
 }
+
+bool
+ast_layout_expression::process_qualifier_constant(struct _mesa_glsl_parse_state *state,
+  const char *qual_indentifier,
+  unsigned *value,
+  int min_value)
+{
+   bool first_pass = true;
+   *value = 0;
+
+   for (exec_node *node = layout_const_expressions.head;
+   !node->is_tail_sentinel(); node = node->next) {
+
+  exec_list dummy_instructions;
+  ast_node *const_expression = exec_node_data(ast_node, node, link);
+
+  ir_rvalue *const ir = const_expression->hir(&dummy_instructions, state);
+
+  ir_constant *const const_int = ir->constant_expression_value();
+  if (const_int == NULL || !const_int->type->is_integer()) {
+ YYLTYPE loc = const_expression->get_location();
+ _mesa_glsl_error(&loc, state, "%s must be an integral constant "
+  "expression", qual_indentifier);
+ return false;
+  }
+
+  assert(min_value >= 0);
+  if (const_int->value.i[0] < min_value) {
+ YYLTYPE loc = const_expression->get_location();
+ _mesa_glsl_error(&loc, state, "%s layout qualifier is invalid "
+  "(%d < %d)", qual_indentifier,
+  const_int->value.i[0], min_value);
+ return false;
+  }
+
+  if (!first_pass && *value != const_int->value.u[0]) {
+ YYLTYPE loc = const_expression->get_location();
+ _mesa_glsl_error(&loc, state, "%s layout qualifier does not "
+ "match previous declaration (%d vs %d)",
+  qual_indentifier, *value, const_int->value.i[0]);
+ return false;
+  } else {
+ first_pass = false;
+ *value = const_int->value.u[0];
+  }
+
+  /* If the location is const (and we've verified that
+   * it is) then no instructions should have been emitted
+   * when we converted it to HIR. If they were emitted,
+   * then either the location isn't const after all, or
+   * we are emitting unnecessary instructions.
+   */
+  assert(dummy_instructions.is_empty());
+   }
+
+   return true;
+}
-- 
2.4.3



[Mesa-dev] [PATCH] nir: fix typo in idiv lowering, causing large-udiv-udiv failures

2015-11-08 Thread Ilia Mirkin
In nv50, and in the python script that Rob circulated, we do:

   bld.mkCmp(OP_SET, CC_GE, TYPE_U32, (s = bld.getSSA()), TYPE_U32, m, b);

Do the same in the nir div lowering pass. This fixes the large-udiv-udiv
piglit tests on freedreno.

Signed-off-by: Ilia Mirkin 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/glsl/nir/nir_lower_idiv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
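
Why the comparison has to be unsigned can be seen in a small standalone
sketch of the final correction step. The function names and the idea of
passing in a quotient estimate are assumptions made for the illustration; the
real pass builds NIR instructions rather than calling C++ functions.

```cpp
#include <cassert>
#include <cstdint>

// Final correction step of a lowered unsigned division. `q` is a quotient
// estimate that may be one too small (a hypothetical input for this sketch).
// The comparison is unsigned, analogous to nir_uge.
uint32_t correct_quotient_uge(uint32_t a, uint32_t b, uint32_t q)
{
    assert(b != 0);
    uint32_t r = a - q * b;          // remainder left by the current estimate
    return q + (r >= b ? 1u : 0u);   // bump q if a full divisor still fits
}

// Same step with a signed comparison, analogous to nir_ige.
uint32_t correct_quotient_ige(uint32_t a, uint32_t b, uint32_t q)
{
    assert(b != 0);
    uint32_t r = a - q * b;
    // Reinterpret both operands as signed (two's complement assumed).
    bool ge = static_cast<int32_t>(r) >= static_cast<int32_t>(b);
    return q + (ge ? 1u : 0u);
}
```

With a = 0xFFFFFFFF, b = 0x7FFFFFFF and an estimate of 1, the remainder is
0x80000000. Compared as unsigned it is >= b and the quotient is corrected to
2; compared as signed it looks negative, so the quotient stays at 1.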

diff --git a/src/glsl/nir/nir_lower_idiv.c b/src/glsl/nir/nir_lower_idiv.c
index c961178..3580ced 100644
--- a/src/glsl/nir/nir_lower_idiv.c
+++ b/src/glsl/nir/nir_lower_idiv.c
@@ -96,7 +96,7 @@ convert_instr(nir_builder *bld, nir_alu_instr *alu)
r = nir_imul(bld, q, b);
r = nir_isub(bld, a, r);
 
-   r = nir_ige(bld, r, b);
+   r = nir_uge(bld, r, b);
r = nir_b2i(bld, r);
 
q = nir_iadd(bld, q, r);
-- 
2.4.10



[Mesa-dev] [PATCH 2/5] radeonsi: move maximum gs stream calculation into create_shader

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.h|  1 +
 src/gallium/drivers/radeonsi/si_state_shaders.c | 22 ++
 2 files changed, 7 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index 6d41aff..ec2d8c5 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -204,6 +204,7 @@ struct si_shader_selector {
unsignedgs_output_prim;
unsignedgs_max_out_vertices;
unsignedgs_num_invocations;
+   unsignedmax_gs_stream; /* count - 1 */
unsignedgsvs_vertex_size;
unsignedmax_gsvs_emit_size;
 
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 9282297..b565923 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -211,21 +211,6 @@ static void si_shader_es(struct si_shader *shader)
si_set_tesseval_regs(shader, pm4);
 }
 
-static unsigned si_gs_get_max_stream(struct si_shader *shader)
-{
-   struct pipe_stream_output_info *so = &shader->selector->so;
-   unsigned max_stream = 0, i;
-
-   if (so->num_outputs == 0)
-   return 0;
-
-   for (i = 0; i < so->num_outputs; i++) {
-   if (so->output[i].stream > max_stream)
-   max_stream = so->output[i].stream;
-   }
-   return max_stream;
-}
-
 static void si_shader_gs(struct si_shader *shader)
 {
unsigned gs_vert_itemsize = shader->selector->gsvs_vertex_size;
@@ -236,7 +221,7 @@ static void si_shader_gs(struct si_shader *shader)
struct si_pm4_state *pm4;
unsigned num_sgprs, num_user_sgprs;
uint64_t va;
-   unsigned max_stream = si_gs_get_max_stream(shader);
+   unsigned max_stream = shader->selector->max_gs_stream;
 
/* The GSVS_RING_ITEMSIZE register takes 15 bits */
assert(gsvs_itemsize < (1 << 15));
@@ -713,6 +698,11 @@ static void *si_create_shader_selector(struct pipe_context *ctx,
sel->gsvs_vertex_size = sel->info.num_outputs * 16;
sel->max_gsvs_emit_size = sel->gsvs_vertex_size *
  sel->gs_max_out_vertices;
+
+   sel->max_gs_stream = 0;
+   for (i = 0; i < sel->so.num_outputs; i++)
+   sel->max_gs_stream = MAX2(sel->max_gs_stream,
+ sel->so.output[i].stream);
break;
 
case PIPE_SHADER_VERTEX:
-- 
2.1.4



[Mesa-dev] [PATCH 1/5] radeonsi: clean up small duplication in si_shader_gs

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.h|  3 ++-
 src/gallium/drivers/radeonsi/si_state_shaders.c | 11 ++-
 2 files changed, 8 insertions(+), 6 deletions(-)
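
The size bookkeeping this patch moves into the shader selector can be
sketched as follows. GsRingSizes and both function names are hypothetical,
and reading the factor 16 as "bytes per output" is an assumption based on the
later >> 2 conversion to dwords.

```cpp
#include <cassert>

// Hypothetical container for the two values the patch stores per selector.
struct GsRingSizes {
    unsigned gsvs_vertex_size;    // size of one emitted vertex
    unsigned max_gsvs_emit_size;  // size of everything one GS thread may emit
};

// Computed once at selector creation, from the same inputs as in the patch.
GsRingSizes compute_gs_ring_sizes(unsigned num_outputs,
                                  unsigned gs_max_out_vertices)
{
    GsRingSizes s;
    s.gsvs_vertex_size = num_outputs * 16;
    s.max_gsvs_emit_size = s.gsvs_vertex_size * gs_max_out_vertices;
    return s;
}

// Derived per shader: the item size in dwords, as si_shader_gs computes it.
unsigned gsvs_itemsize_dwords(const GsRingSizes &s)
{
    unsigned dwords = s.max_gsvs_emit_size >> 2;
    assert(dwords < (1u << 15));  // mirrors the 15-bit register assert
    return dwords;
}
```

For 4 outputs and 3 vertices this gives a vertex size of 64, an emit size of
192, and an item size of 48 dwords.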

diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index b72cb1a..6d41aff 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -204,7 +204,8 @@ struct si_shader_selector {
unsignedgs_output_prim;
unsignedgs_max_out_vertices;
unsignedgs_num_invocations;
-   unsignedgsvs_itemsize;
+   unsignedgsvs_vertex_size;
+   unsignedmax_gsvs_emit_size;
 
/* masks of "get_unique_index" bits */
uint64_toutputs_written;
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 996004a..9282297 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -228,9 +228,9 @@ static unsigned si_gs_get_max_stream(struct si_shader *shader)
 
 static void si_shader_gs(struct si_shader *shader)
 {
-   unsigned gs_vert_itemsize = shader->selector->info.num_outputs * 16;
+   unsigned gs_vert_itemsize = shader->selector->gsvs_vertex_size;
unsigned gs_max_vert_out = shader->selector->gs_max_out_vertices;
-   unsigned gsvs_itemsize = (gs_vert_itemsize * gs_max_vert_out) >> 2;
+   unsigned gsvs_itemsize = shader->selector->max_gsvs_emit_size >> 2;
unsigned gs_num_invocations = shader->selector->gs_num_invocations;
unsigned cut_mode;
struct si_pm4_state *pm4;
@@ -710,8 +710,9 @@ static void *si_create_shader_selector(struct pipe_context *ctx,

sel->info.properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES];
sel->gs_num_invocations =
sel->info.properties[TGSI_PROPERTY_GS_INVOCATIONS];
-   sel->gsvs_itemsize = sel->info.num_outputs * 16 *
-sel->gs_max_out_vertices;
+   sel->gsvs_vertex_size = sel->info.num_outputs * 16;
+   sel->max_gsvs_emit_size = sel->gsvs_vertex_size *
+ sel->gs_max_out_vertices;
break;
 
case PIPE_SHADER_VERTEX:
@@ -1129,7 +1130,7 @@ static void si_init_gs_rings(struct si_context *sctx)
 
 static void si_update_gs_rings(struct si_context *sctx)
 {
-   unsigned gsvs_itemsize = sctx->gs_shader.cso->gsvs_itemsize;
+   unsigned gsvs_itemsize = sctx->gs_shader.cso->max_gsvs_emit_size;
uint64_t offset;
 
if (gsvs_itemsize == sctx->last_gsvs_itemsize)
-- 
2.1.4



[Mesa-dev] [PATCH 4/5] radeonsi: rename si_update_gs_rings

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 89d365b..c402ce2 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -1119,7 +1119,7 @@ static void si_init_gs_rings(struct si_context *sctx)
   false, false, 0, 0, 0);
 }
 
-static void si_update_gs_rings(struct si_context *sctx)
+static void si_update_gsvs_ring_bindings(struct si_context *sctx)
 {
unsigned gsvs_itemsize = sctx->gs_shader.cso->max_gsvs_emit_size;
uint64_t offset;
@@ -1491,7 +1491,7 @@ bool si_update_shaders(struct si_context *sctx)
return false;
}
 
-   si_update_gs_rings(sctx);
+   si_update_gsvs_ring_bindings(sctx);
} else {
si_pm4_bind_state(sctx, gs, NULL);
si_pm4_bind_state(sctx, es, NULL);
-- 
2.1.4



[Mesa-dev] [PATCH 3/5] radeonsi: calculate ESGS_RING_ITEMSIZE in create_shader

2015-11-08 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.h| 1 +
 src/gallium/drivers/radeonsi/si_state_shaders.c | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)
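
The register value is unchanged by this patch; only the place where it is
computed moves. A small sketch of the arithmetic, with last_bit64() written
here as a stand-in for what util_last_bit64 is assumed to return (index of
the highest set bit plus one, 0 for an empty mask) and hypothetical function
names throughout:

```cpp
#include <cassert>
#include <cstdint>

// Stand-in written for this sketch: highest set bit position plus one,
// or 0 when no bit is set.
unsigned last_bit64(uint64_t mask)
{
    unsigned n = 0;
    while (mask) {
        ++n;
        mask >>= 1;
    }
    return n;
}

// Value stored in the selector by the patch: 16 units per output slot.
unsigned esgs_itemsize(uint64_t outputs_written)
{
    return last_bit64(outputs_written) * 16;
}

// Value written to VGT_ESGS_RING_ITEMSIZE after the patch.
unsigned esgs_ring_itemsize_reg(uint64_t outputs_written)
{
    unsigned size = esgs_itemsize(outputs_written);
    assert(size % 4 == 0);  // always holds, size is a multiple of 16
    return size / 4;
}
```

For an outputs_written mask of 0xB the highest used slot is 4, so the stored
size is 64 and the register receives 16, the same as the old
util_last_bit64(...) * 4 expression.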

diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index ec2d8c5..1f4f0de 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -201,6 +201,7 @@ struct si_shader_selector {
boolforces_persample_interp_for_persp;
boolforces_persample_interp_for_linear;
 
+   unsignedesgs_itemsize;
unsignedgs_output_prim;
unsignedgs_max_out_vertices;
unsignedgs_num_invocations;
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c b/src/gallium/drivers/radeonsi/si_state_shaders.c
index b565923..89d365b 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -195,7 +195,7 @@ static void si_shader_es(struct si_shader *shader)
assert(num_sgprs <= 104);
 
si_pm4_set_reg(pm4, R_028AAC_VGT_ESGS_RING_ITEMSIZE,
-  util_last_bit64(shader->selector->outputs_written) * 4);
+  shader->selector->esgs_itemsize / 4);
si_pm4_set_reg(pm4, R_00B320_SPI_SHADER_PGM_LO_ES, va >> 8);
si_pm4_set_reg(pm4, R_00B324_SPI_SHADER_PGM_HI_ES, va >> 40);
si_pm4_set_reg(pm4, R_00B328_SPI_SHADER_PGM_RSRC1_ES,
@@ -724,6 +724,7 @@ static void *si_create_shader_selector(struct pipe_context *ctx,
1llu << 
si_shader_io_get_unique_index(name, index);
}
}
+   sel->esgs_itemsize = util_last_bit64(sel->outputs_written) * 16;
break;
case PIPE_SHADER_FRAGMENT:
for (i = 0; i < sel->info.num_outputs; i++) {
-- 
2.1.4



[Mesa-dev] [PATCH 0/5] RadeonSI: Optimal GS ring sizes, fixing Tonga hangs

2015-11-08 Thread Marek Olšák
This fixes hangs on Tonga when the ESGS ring isn't large enough. It's also a 
requirement for this not-yet-committed patch:

   "radeonsi: link ES-GS just like LS-HS"

which makes GS hangs easier to reproduce. The ring size equations are based on 
VGT docs and my discussion with VGT guys.

Please review.

Marek


[Mesa-dev] [PATCH V2 03/12] glsl: add helper to check for enhanced layouts support

2015-11-08 Thread Timothy Arceri
From: Timothy Arceri 

---
 src/glsl/glsl_parser_extras.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 684b917..1d8c1b8 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -209,6 +209,11 @@ struct _mesa_glsl_parse_state {
   return ARB_shader_atomic_counters_enable || is_version(420, 310);
}
 
+   bool has_enhanced_layouts() const
+   {
+  return ARB_enhanced_layouts_enable || is_version(440, 0);
+   }
+
bool has_explicit_attrib_stream() const
{
   return ARB_gpu_shader5_enable || is_version(400, 0);
-- 
2.4.3



[Mesa-dev] [PATCH V2 10/12] glsl: add support for compile-time constant expressions

2015-11-08 Thread Timothy Arceri
From: Timothy Arceri 

This patch replaces the old integer constant qualifiers with either
the new ast_layout_expression type if the qualifier requires merging
or ast_expression if the qualifier can't have multiple declarations
or if all but the newest qualifier are simply ignored.

This also removes the location field that was temporarily added to
ast_type_qualifier to keep track of the parser location.
---
 src/glsl/ast.h  |  33 +++---
 src/glsl/ast_to_hir.cpp | 253 +---
 src/glsl/ast_type.cpp   |  81 +
 src/glsl/glsl_parser.yy |  25 ++--
 src/glsl/glsl_parser_extras.cpp |  43 ---
 5 files changed, 236 insertions(+), 199 deletions(-)

diff --git a/src/glsl/ast.h b/src/glsl/ast.h
index ef94cff..4fd049c 100644
--- a/src/glsl/ast.h
+++ b/src/glsl/ast.h
@@ -573,13 +573,11 @@ struct ast_type_qualifier {
   uint64_t i;
} flags;
 
-   struct YYLTYPE *loc;
-
/** Precision of the type (highp/medium/lowp). */
unsigned precision:2;
 
/** Geometry shader invocations for GL_ARB_gpu_shader5. */
-   int invocations;
+   ast_layout_expression *invocations;
 
/**
 * Location specified via GL_ARB_explicit_attrib_location layout
@@ -587,20 +585,20 @@ struct ast_type_qualifier {
 * \note
 * This field is only valid if \c explicit_location is set.
 */
-   int location;
+   ast_expression *location;
/**
 * Index specified via GL_ARB_explicit_attrib_location layout
 *
 * \note
 * This field is only valid if \c explicit_index is set.
 */
-   int index;
+   ast_expression *index;
 
/** Maximum output vertices in GLSL 1.50 geometry shaders. */
-   int max_vertices;
+   ast_layout_expression *max_vertices;
 
/** Stream in GLSL 1.50 geometry shaders. */
-   unsigned stream;
+   ast_expression *stream;
 
/**
 * Input or output primitive type in GLSL 1.50 geometry shaders
@@ -614,7 +612,7 @@ struct ast_type_qualifier {
 * \note
 * This field is only valid if \c explicit_binding is set.
 */
-   int binding;
+   ast_expression *binding;
 
/**
 * Offset specified via GL_ARB_shader_atomic_counter's "offset"
@@ -623,14 +621,14 @@ struct ast_type_qualifier {
 * \note
 * This field is only valid if \c explicit_offset is set.
 */
-   int offset;
+   ast_expression *offset;
 
/**
 * Local size specified via GL_ARB_compute_shader's "local_size_{x,y,z}"
 * layout qualifier.  Element i of this array is only valid if
 * flags.q.local_size & (1 << i) is set.
 */
-   int local_size[3];
+   ast_layout_expression *local_size[3];
 
/** Tessellation evaluation shader: vertex spacing (equal, fractional 
even/odd) */
GLenum vertex_spacing;
@@ -642,7 +640,7 @@ struct ast_type_qualifier {
bool point_mode;
 
/** Tessellation control shader: number of output vertices */
-   int vertices;
+   ast_layout_expression *vertices;
 
/**
 * Image format specified with an ARB_shader_image_load_store
@@ -1114,17 +1112,13 @@ public:
 class ast_tcs_output_layout : public ast_node
 {
 public:
-   ast_tcs_output_layout(const struct YYLTYPE &locp, int vertices)
-  : vertices(vertices)
+   ast_tcs_output_layout(const struct YYLTYPE &locp)
{
   set_location(locp);
}
 
virtual ir_rvalue *hir(exec_list *instructions,
   struct _mesa_glsl_parse_state *state);
-
-private:
-   const int vertices;
 };
 
 
@@ -1156,9 +1150,10 @@ private:
 class ast_cs_input_layout : public ast_node
 {
 public:
-   ast_cs_input_layout(const struct YYLTYPE &locp, const unsigned *local_size)
+   ast_cs_input_layout(const struct YYLTYPE &locp,
+   ast_layout_expression **local_size)
{
-  memcpy(this->local_size, local_size, sizeof(this->local_size));
+  memcpy(this->local_size, *local_size, sizeof(this->local_size));
   set_location(locp);
}
 
@@ -1166,7 +1161,7 @@ public:
   struct _mesa_glsl_parse_state *state);
 
 private:
-   unsigned local_size[3];
+   ast_layout_expression *local_size[3];
 };
 
 /*@}*/
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index fcf7566..d55a99e 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2319,7 +2319,8 @@ static bool
 validate_binding_qualifier(struct _mesa_glsl_parse_state *state,
YYLTYPE *loc,
const glsl_type *type,
-   const ast_type_qualifier *qual)
+   const ast_type_qualifier *qual,
+   unsigned qual_binding)
 {
if (!qual->flags.q.uniform && !qual->flags.q.buffer) {
   _mesa_glsl_error(loc, state,
@@ -2328,14 +2329,9 @@ validate_binding_qualifier(struct _mesa_glsl_parse_state *state,
   return false;
}
 
-   if (qual->binding < 0) {
-  _mesa_glsl_error(loc, state, "binding values must be >= 0");
-  return false;
- 

[Mesa-dev] [PATCH V2 08/12] glsl: add process_qualifier_constant() helper

2015-11-08 Thread Timothy Arceri
From: Timothy Arceri 

This helper is similar to the function added as part of the
ast_layout_expression class but will be used when only the
ast_expression type is required for the qualifier.

ast_expression is used if the qualifier can't have multiple declarations
or if all but the newest qualifier are simply ignored.
---
 src/glsl/ast_to_hir.cpp | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 5643c86..21a956d 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2261,6 +2261,48 @@ validate_matrix_layout_for_type(struct _mesa_glsl_parse_state *state,
}
 }
 
+bool
+process_qualifier_constant(struct _mesa_glsl_parse_state *state,
+   YYLTYPE *loc,
+   const char *qual_indentifier,
+   ast_expression *const_expression,
+   unsigned *value, int minimum_value)
+{
+   exec_list dummy_instructions;
+
+   if (const_expression == NULL) {
+  *value = 0;
+  return true;
+   }
+
+   ir_rvalue *const ir = const_expression->hir(&dummy_instructions, state);
+
+   ir_constant *const const_int = ir->constant_expression_value();
+   if (const_int == NULL || !const_int->type->is_integer()) {
+  _mesa_glsl_error(loc, state, "%s must be an integral constant "
+   "expression", qual_indentifier);
+  return false;
+   }
+
+   assert(minimum_value >= 0);
+   if (const_int->value.i[0] < minimum_value) {
+  _mesa_glsl_error(loc, state, "%s layout qualifier is invalid (%d < %d)",
+   qual_indentifier, const_int->value.i[0], minimum_value);
+  return false;
+   }
+
+   /* If the location is const (and we've verified that
+* it is) then no instructions should have been emitted
+* when we converted it to HIR. If they were emitted,
+* then either the location isn't const after all, or
+* we are emitting unnecessary instructions.
+*/
+   assert(dummy_instructions.is_empty());
+
+   *value = const_int->value.u[0];
+   return true;
+}
+
 static bool
 validate_binding_qualifier(struct _mesa_glsl_parse_state *state,
YYLTYPE *loc,
-- 
2.4.3



[Mesa-dev] [PATCH V2 05/12] glsl: add layout qualifier validation for the shader outside the parser

2015-11-08 Thread Timothy Arceri
From: Timothy Arceri 

This is in preparation for compile-time constant support, a later patch
will remove the validation from the shader.

The global shader layout qualifiers will now mostly be validated in
glsl_parser_extras.cpp.

In order to do validation at the later stage in glsl_parser_extras.cpp, we
need to temporarily add a field in ast_type_qualifier to keep track of the
parser location. This will be removed in a following patch when we
introduce a new type for storing the compile-time qualifiers.

Also, as the set_shader_inout_layout() function in glsl_parser_extras.cpp is
normally called after all validation is done, we need to move the code that
sets CompileStatus and InfoLog; otherwise the newly added error messages would
be ignored.
---
 src/glsl/ast_to_hir.cpp | 14 --
 src/glsl/ast_type.cpp   |  2 ++
 src/glsl/glsl_parser_extras.cpp | 37 -
 3 files changed, 46 insertions(+), 7 deletions(-)
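
The reason num_vertices becomes a signed int in this patch is that a zero or
negative value has to be caught before any cast to unsigned. A rough sketch
of that check, assuming a hypothetical helper check_tcs_vertices() that
returns the error text instead of calling _mesa_glsl_error():

```cpp
#include <cassert>
#include <string>

// Returns an empty string when the value is acceptable, otherwise an error
// message modeled on the ones added by the patch.
std::string check_tcs_vertices(int num_vertices, unsigned max_patch_vertices)
{
    assert(max_patch_vertices > 0);  // limit is expected to come from the context

    // Test the sign first; casting a negative value to unsigned would turn it
    // into a huge number and produce a misleading "exceeds" error.
    if (num_vertices <= 0)
        return "invalid vertices (" + std::to_string(num_vertices) +
               ") specified";

    if (static_cast<unsigned>(num_vertices) > max_patch_vertices)
        return "vertices (" + std::to_string(num_vertices) +
               ") exceeds GL_MAX_PATCH_VERTICES";

    return std::string();
}
```

With a limit of 32, the values 0 and -1 are reported as invalid, 33 is
reported as exceeding the limit, and 3 passes.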

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 0cea607..5643c86 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -3544,10 +3544,19 @@ static void
 handle_tess_ctrl_shader_output_decl(struct _mesa_glsl_parse_state *state,
 YYLTYPE loc, ir_variable *var)
 {
-   unsigned num_vertices = 0;
+   int num_vertices = 0;
 
if (state->tcs_output_vertices_specified) {
   num_vertices = state->out_qualifier->vertices;
+  if (num_vertices <= 0) {
+ _mesa_glsl_error(&loc, state, "invalid vertices (%d) specified",
+  num_vertices);
+ return;
+  } else if ((unsigned) num_vertices > state->Const.MaxPatchVertices) {
+ _mesa_glsl_error(&loc, state, "vertices (%d) exceeds "
+  "GL_MAX_PATCH_VERTICES", num_vertices);
+ return;
+  }
}
 
if (!var->type->is_array() && !var->data.patch) {
@@ -3561,7 +3570,8 @@ handle_tess_ctrl_shader_output_decl(struct _mesa_glsl_parse_state *state,
if (var->data.patch)
   return;
 
-   validate_layout_qualifier_vertex_count(state, loc, var, num_vertices,
+   validate_layout_qualifier_vertex_count(state, loc, var,
+  (unsigned) num_vertices,
   &state->tcs_output_size,
   "tessellation control shader output");
 }
diff --git a/src/glsl/ast_type.cpp b/src/glsl/ast_type.cpp
index 08a4504..53d1023 100644
--- a/src/glsl/ast_type.cpp
+++ b/src/glsl/ast_type.cpp
@@ -310,6 +310,7 @@ ast_type_qualifier::merge_out_qualifier(YYLTYPE *loc,
 {
void *mem_ctx = state;
const bool r = this->merge_qualifier(loc, state, q);
+   this->loc = loc;
 
if (state->stage == MESA_SHADER_TESS_CTRL) {
   node = new(mem_ctx) ast_tcs_output_layout(*loc, q.vertices);
@@ -329,6 +330,7 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,
bool create_cs_ast = false;
ast_type_qualifier valid_in_mask;
valid_in_mask.flags.i = 0;
+   this->loc = loc;
 
switch (state->stage) {
case MESA_SHADER_TESS_EVAL:
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 2dba7d9..7d7f45c 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -947,6 +947,14 @@ _mesa_ast_process_interface_block(YYLTYPE *locp,
 
if (state->stage == MESA_SHADER_GEOMETRY &&
state->has_explicit_attrib_stream()) {
+
+  if (state->out_qualifier->flags.q.explicit_stream) {
+ if (state->out_qualifier->stream < 0) {
+_mesa_glsl_error(locp, state, "invalid stream %d specified",
+ state->out_qualifier->stream);
+ }
+  }
+
   /* Assign global layout's stream value. */
   block->layout.flags.q.stream = 1;
   block->layout.flags.q.explicit_stream = 0;
@@ -1615,7 +1623,7 @@ void ast_subroutine_list::print(void) const
 
 static void
 set_shader_inout_layout(struct gl_shader *shader,
-struct _mesa_glsl_parse_state *state)
+struct _mesa_glsl_parse_state *state)
 {
/* Should have been prevented by the parser. */
if (shader->Stage == MESA_SHADER_TESS_CTRL) {
@@ -1666,8 +1674,14 @@ set_shader_inout_layout(struct gl_shader *shader,
   break;
case MESA_SHADER_GEOMETRY:
   shader->Geom.VerticesOut = 0;
-  if (state->out_qualifier->flags.q.max_vertices)
+  if (state->out_qualifier->flags.q.max_vertices) {
+ if (state->out_qualifier->max_vertices < 0) {
+_mesa_glsl_error(state->out_qualifier->loc, state,
+ "invalid max_vertices %d specified",
+ state->out_qualifier->max_vertices);
+ }
  shader->Geom.VerticesOut = state->out_qualifier->max_vertices;
+  }
 
   if (state->gs_input_prim_type_specified) {
  shader->Geom.InputType = 

[Mesa-dev] [PATCH V2 02/12] mesa: add ARB_enhanced_layouts

2015-11-08 Thread Timothy Arceri
From: Timothy Arceri 

---
 src/glsl/glcpp/glcpp-parse.y| 1 +
 src/glsl/glsl_parser_extras.cpp | 1 +
 src/glsl/glsl_parser_extras.h   | 2 ++
 src/mesa/main/extensions.c  | 1 +
 src/mesa/main/mtypes.h  | 1 +
 5 files changed, 6 insertions(+)

diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index 4acccf7..6aa7abe 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -2387,6 +2387,7 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
   }
} else {
   add_builtin_define(parser, "GL_ARB_draw_buffers", 1);
+   add_builtin_define(parser, "GL_ARB_enhanced_layouts", 1);
add_builtin_define(parser, "GL_ARB_separate_shader_objects", 1);
   add_builtin_define(parser, "GL_ARB_texture_rectangle", 1);
add_builtin_define(parser, "GL_AMD_shader_trinary_minmax", 1);
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 14cb9fc..2dba7d9 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -594,6 +594,7 @@ static const _mesa_glsl_extension _mesa_glsl_supported_extensions[] = {
EXT(ARB_derivative_control,   true,  false, ARB_derivative_control),
EXT(ARB_draw_buffers, true,  false, dummy_true),
EXT(ARB_draw_instanced,   true,  false, ARB_draw_instanced),
+   EXT(ARB_enhanced_layouts, true,  false, ARB_enhanced_layouts),
EXT(ARB_explicit_attrib_location, true,  false, ARB_explicit_attrib_location),
EXT(ARB_explicit_uniform_location,true,  false, ARB_explicit_uniform_location),
EXT(ARB_fragment_coord_conventions,   true,  false, ARB_fragment_coord_conventions),
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index b54c535..684b917 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -499,6 +499,8 @@ struct _mesa_glsl_parse_state {
bool ARB_draw_buffers_warn;
bool ARB_draw_instanced_enable;
bool ARB_draw_instanced_warn;
+   bool ARB_enhanced_layouts_enable;
+   bool ARB_enhanced_layouts_warn;
bool ARB_explicit_attrib_location_enable;
bool ARB_explicit_attrib_location_warn;
bool ARB_explicit_uniform_location_enable;
diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index bdc6817..1facad1 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -111,6 +111,7 @@ static const struct extension extension_table[] = {
{ "GL_ARB_draw_elements_base_vertex",   o(ARB_draw_elements_base_vertex),   GL, 2009 },
{ "GL_ARB_draw_indirect",   o(ARB_draw_indirect),   GLC,2010 },
{ "GL_ARB_draw_instanced",  o(ARB_draw_instanced),  GL, 2008 },
+   { "GL_ARB_enhanced_layouts",o(ARB_enhanced_layouts), GLC,2013 },
{ "GL_ARB_explicit_attrib_location", o(ARB_explicit_attrib_location),GL, 2009 },
{ "GL_ARB_explicit_uniform_location",   o(ARB_explicit_uniform_location),   GL, 2012 },
{ "GL_ARB_fragment_coord_conventions",  o(ARB_fragment_coord_conventions),  GL, 2009 },
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index fdb3b3d..9b9fd4e 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3663,6 +3663,7 @@ struct gl_extensions
GLboolean ARB_fragment_shader;
GLboolean ARB_framebuffer_no_attachments;
GLboolean ARB_framebuffer_object;
+   GLboolean ARB_enhanced_layouts;
GLboolean ARB_explicit_attrib_location;
GLboolean ARB_explicit_uniform_location;
GLboolean ARB_geometry_shader4;
-- 
2.4.3



[Mesa-dev] [PATCH V2 09/12] glsl: add validate_stream_qualifier() helper

2015-11-08 Thread Timothy Arceri
From: Timothy Arceri 

---
 src/glsl/ast_to_hir.cpp | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 21a956d..fcf7566 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2303,6 +2303,18 @@ process_qualifier_constant(struct _mesa_glsl_parse_state *state,
return true;
 }
 
+static void
+validate_stream_qualifier(YYLTYPE *loc, struct _mesa_glsl_parse_state *state,
+  unsigned stream)
+{
+   if (stream >= state->ctx->Const.MaxVertexStreams) {
+  _mesa_glsl_error(loc, state,
+   "invalid stream specified %d is larger than "
+   "MAX_VERTEX_STREAMS - 1 (%d).",
+   stream, state->ctx->Const.MaxVertexStreams - 1);
+   }
+}
+
 static bool
 validate_binding_qualifier(struct _mesa_glsl_parse_state *state,
YYLTYPE *loc,
-- 
2.4.3



[Mesa-dev] [Bug 92860] [radeonsi][bisected] st/mesa: implement ARB_copy_image - Corruption in ARK Survival Evolved

2015-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92860

Michel Dänzer  changed:

           What    |Removed                         |Added
----------------------------------------------------------------------------
          Component|Drivers/Gallium/radeonsi        |Mesa core
           Assignee|dri-devel@lists.freedesktop.org |mesa-dev@lists.freedesktop.org
         QA Contact|dri-devel@lists.freedesktop.org |mesa-dev@lists.freedesktop.org

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 1/3] glsl: Parse shared keyword for compute shader variables

2015-11-08 Thread Timothy Arceri
On Sat, 2015-11-07 at 23:16 -0800, Jordan Justen wrote:
> On 2015-11-06 19:33:38, Timothy Arceri wrote:
> > On Fri, 2015-11-06 at 17:56 -0800, Jordan Justen wrote:
> > > Signed-off-by: Jordan Justen 
> > > ---
> > > 
> > > Notes:
> > > git://people.freedesktop.org/~jljusten/mesa cs-parse-shared-vars-v1
> > > 
> > > http://patchwork.freedesktop.org/bundle/jljusten/cs-parse-shared-vars-v1
> > > 
> > > With these environment overrides:
> > > 
> > >   export MESA_GL_VERSION_OVERRIDE=4.3
> > >   export MESA_GLSL_VERSION_OVERRIDE=430
> > >   export MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader
> > > 
> > > This fixes my recently posted piglit test:
> > > 
> > >   tests/spec/arb_compute_shader/compiler/shared-variables.comp
> > >   http://patchwork.freedesktop.org/patch/63944/
> > > 
> > >  src/glsl/ast_to_hir.cpp | 2 +-
> > >  src/glsl/glsl_lexer.ll  | 2 ++
> > >  src/glsl/glsl_parser.yy | 6 ++
> > >  3 files changed, 9 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> > > index 0306530..dd5ba4e 100644
> > > --- a/src/glsl/ast_to_hir.cpp
> > > +++ b/src/glsl/ast_to_hir.cpp
> > > @@ -3081,7 +3081,7 @@ apply_type_qualifier_to_variable(const struct
> > > ast_type_qualifier *qual,
> > > if (qual->flags.q.std140 ||
> > > qual->flags.q.std430 ||
> > > qual->flags.q.packed ||
> > > -   qual->flags.q.shared) {
> > > +   (qual->flags.q.shared && (state->stage != MESA_SHADER_COMPUTE)))
> > > {
> > >_mesa_glsl_error(loc, state,
> > > "uniform and shader storage block layout
> > > qualifiers
> > > "
> > > "std140, std430, packed, and shared can only be
> > > "
> > > diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll
> > > index 2142817..e59f93e 100644
> > > --- a/src/glsl/glsl_lexer.ll
> > > +++ b/src/glsl/glsl_lexer.ll
> > > @@ -414,6 +414,8 @@ writeonly  KEYWORD_WITH_ALT(420, 300, 420, 310,
> > > yyextra->ARB_shader_image_lo
> > >  
> > >  atomic_uint KEYWORD_WITH_ALT(420, 300, 420, 310, yyextra
> > > ->ARB_shader_atomic_counters_enable, ATOMIC_UINT);
> > >  
> > > +shared  KEYWORD_WITH_ALT(430, 310, 430, 310, yyextra
> > > ->ARB_compute_shader_enable, SHARED);
> > > +
> > >  struct   return STRUCT;
> > >  void return VOID_TOK;
> > >  
> > > diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
> > > index 4636435..2598356 100644
> > > --- a/src/glsl/glsl_parser.yy
> > > +++ b/src/glsl/glsl_parser.yy
> > > @@ -165,6 +165,7 @@ static bool match_layout_qualifier(const char *s1,
> > > const
> > > char *s2,
> > >  %token IMAGE1DSHADOW IMAGE2DSHADOW IMAGE1DARRAYSHADOW
> > > IMAGE2DARRAYSHADOW
> > >  %token COHERENT VOLATILE RESTRICT READONLY WRITEONLY
> > >  %token ATOMIC_UINT
> > > +%token SHARED
> > >  %token STRUCT VOID_TOK WHILE
> > >  %token  IDENTIFIER TYPE_IDENTIFIER NEW_IDENTIFIER
> > >  %type  any_identifier
> > > @@ -1958,6 +1959,11 @@ memory_qualifier:
> > >memset(& $$, 0, sizeof($$));
> > >$$.flags.q.write_only = 1;
> > > }
> > > +   | SHARED
> > > +   {
> > > +  memset(& $$, 0, sizeof($$));
> > > +  $$.flags.q.shared = 1;
> > > +   }
> > 
> > Hi Jordan,
> > 
> > This should be in storage_qualifier: rather than memory_qualifier:
> > 
> > > Also it should be restricted to the compute shader stage, e.g.
> > 
> > if (state->stage == MESA_SHADER_COMPUTE) {
> >   memset(& $$, 0, sizeof($$));
> >   $$.flags.q.shared = 1;
> > } else {
> > >   _mesa_glsl_error(&@1, state, "the shared storage qualifier can only "
> > >                    "be used with compute shaders");
> > }
> > 
> > Maybe add a piglit test to make sure it fails in another stage?
> 
> I tested nvidia, and they don't fail to compile when a variable is
> declared as shared in the render stages.
> 
> The spec says:
> 
>"Variables declared as shared may only be used in compute shaders"
> 
> Unfortunately, that doesn't specifically say that it should fail at
> the compile step.

Yeah, you're right. In that case feel free not to add this check.

> 
> I think it is fine for us to fail at the compile phase, but I'm not
> sure about tests for this.
> 
> -Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFCv2 03/13] nir: allow pre-resolved sampler uniform locations

2015-11-08 Thread Timothy Arceri
On Sun, 2015-11-08 at 15:12 -0500, Rob Clark wrote:
> From: Rob Clark 
> 
> With TGSI, the ir_variable::data.location gets fixed up to be a stage
> local location (rather than program global).  In this case we need to
> skip the UniformStorage[location] lookup.
> ---
>  src/glsl/nir/nir_lower_samplers.c | 23 ---
>  1 file changed, 16 insertions(+), 7 deletions(-)
> 
> diff --git a/src/glsl/nir/nir_lower_samplers.c
> b/src/glsl/nir/nir_lower_samplers.c
> index 5df79a6..d99ba4c 100644
> --- a/src/glsl/nir/nir_lower_samplers.c
> +++ b/src/glsl/nir/nir_lower_samplers.c
> @@ -130,14 +130,18 @@ lower_sampler(nir_tex_instr *instr, const struct
> gl_shader_program *shader_progr
>instr->sampler_array_size = array_elements;
> }
>  
> -   if (location > shader_program->NumUniformStorage - 1 ||
> -   !shader_program->UniformStorage[location].opaque[stage].active) {
> -  assert(!"cannot return a sampler");
> -  return;
> -   }
> +   if (!shader_program) {
> +  instr->sampler_index = location;
> +   } else {
> +  if (location > shader_program->NumUniformStorage - 1 ||
> +  !shader_program->UniformStorage[location].opaque[stage].active) {
> + assert(!"cannot return a sampler");
> + return;
> +  }
>  
> -   instr->sampler_index +=
> -  shader_program->UniformStorage[location].opaque[stage].index;
> +  instr->sampler_index =
> + shader_program->UniformStorage[location].opaque[stage].index;

Hi Rob,

This will break arrays, as instr->sampler_index is incremented inside
calc_sampler_offsets().

calc_sampler_offsets() also modifies the value of location. Is this what you
want? I would assume not, as we are counting uniforms, not just samplers, here.

The other thing to note is that glsl to tgsi doesn't handle indirects on
structs or arrays of arrays correctly (Ilia was trying to fix this).
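
To illustrate the array concern above, here is a minimal sketch. It is not
the Mesa code: struct uniform_slot, struct program and
resolve_sampler_index() are hypothetical stand-ins, and it assumes that any
array offset has already been accumulated into *sampler_index before the
base index is resolved. Under that assumption the base has to be added to
the existing value; a plain assignment would drop the offset.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real Mesa structures. */
struct uniform_slot {
   int active;       /* non-zero if the slot is an active sampler */
   unsigned index;   /* stage-local sampler index of the slot */
};

struct program {
   unsigned num_slots;
   const struct uniform_slot *slots;
};

/* Adds the base sampler index for 'location' to *sampler_index.
 * *sampler_index is assumed to already hold the array offset, so the
 * function accumulates rather than assigns.
 * Returns 0 on success, -1 if 'location' is not an active sampler. */
static int
resolve_sampler_index(const struct program *prog, unsigned location,
                      unsigned *sampler_index)
{
   if (prog == NULL) {
      /* Location is already stage-local: no table lookup needed. */
      *sampler_index += location;
      return 0;
   }

   if (location >= prog->num_slots || !prog->slots[location].active)
      return -1;

   *sampler_index += prog->slots[location].index;
   return 0;
}
```

With an offset of 2 already stored and a slot whose index is 5, the result
is 7; an assignment would have produced 5 and lost the array element.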

Tim
 


> +   }
>  
> instr->sampler = NULL;
>  }
> @@ -177,6 +181,11 @@ lower_impl(nir_function_impl *impl, const struct
> gl_shader_program *shader_progr
> nir_foreach_block(impl, lower_block_cb, &state);
>  }
>  
> +/* Call with a null 'shader_program' if uniform locations are

uniform locations -> sampler indices?

> + * already local to the shader, ie. skipping the
> + * shader_program->UniformStorage[location].opaque[stage].index
> + * lookup
> + */
>  void
>  nir_lower_samplers(nir_shader *shader,
> const struct gl_shader_program *shader_program)


[Mesa-dev] [PATCH] RFC: llvmpipe map scene buffers outside thread.

2015-11-08 Thread Dave Airlie
From: Dave Airlie 

There might be a reason we do this inside the thread, but I'm not aware of it
yet. Move stuff around and see if this jogs anyone's memory.

Doing this outside the thread, at least with front buffer rendering, avoids
problems with XGetImage failing in the thread and deadlocking. Now things
crash instead, which is a lot nicer from a piglit point of view.
---
 src/gallium/drivers/llvmpipe/lp_scene.c | 21 -
 src/gallium/drivers/llvmpipe/lp_scene.h |  5 +++--
 src/gallium/drivers/llvmpipe/lp_setup.c |  1 +
 3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_scene.c 
b/src/gallium/drivers/llvmpipe/lp_scene.c
index 2441b3c..1a6fe5c 100644
--- a/src/gallium/drivers/llvmpipe/lp_scene.c
+++ b/src/gallium/drivers/llvmpipe/lp_scene.c
@@ -147,7 +147,7 @@ lp_scene_bin_reset(struct lp_scene *scene, unsigned x, 
unsigned y)
 
 
 void
-lp_scene_begin_rasterization(struct lp_scene *scene)
+lp_scene_map_buffers(struct lp_scene *scene)
 {
const struct pipe_framebuffer_state *fb = &scene->fb;
int i;
@@ -200,16 +200,20 @@ lp_scene_begin_rasterization(struct lp_scene *scene)
}
 }
 
-
+void
+lp_scene_begin_rasterization(struct lp_scene *scene)
+{
+   scene->started = true;
+}
 
 
 /**
  * Free all the temporary data in a scene.
  */
-void
-lp_scene_end_rasterization(struct lp_scene *scene )
+static void
+lp_scene_unmap_buffers(struct lp_scene *scene )
 {
-   int i, j;
+   int i;
 
/* Unmap color buffers */
for (i = 0; i < scene->fb.nr_cbufs; i++) {
@@ -232,7 +236,14 @@ lp_scene_end_rasterization(struct lp_scene *scene )
   zsbuf->u.tex.first_layer);
   scene->zsbuf.map = NULL;
}
+}
 
+void
+lp_scene_end_rasterization(struct lp_scene *scene )
+{
+   int i, j;
+   lp_scene_unmap_buffers(scene);
+   scene->started = false;
/* Reset all command lists:
 */
for (i = 0; i < scene->tiles_x; i++) {
diff --git a/src/gallium/drivers/llvmpipe/lp_scene.h 
b/src/gallium/drivers/llvmpipe/lp_scene.h
index b1464bb..7ed38c9 100644
--- a/src/gallium/drivers/llvmpipe/lp_scene.h
+++ b/src/gallium/drivers/llvmpipe/lp_scene.h
@@ -178,6 +178,7 @@ struct lp_scene {
 
struct cmd_bin tile[TILES_X][TILES_Y];
struct data_block_list data;
+   boolean started;
 };
 
 
@@ -405,8 +406,8 @@ lp_scene_begin_rasterization(struct lp_scene *scene);
 void
 lp_scene_end_rasterization(struct lp_scene *scene );
 
-
-
+void
+lp_scene_map_buffers(struct lp_scene *scene);
 
 
 #endif /* LP_BIN_H */
diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c 
b/src/gallium/drivers/llvmpipe/lp_setup.c
index 1778b13..df2c323 100644
--- a/src/gallium/drivers/llvmpipe/lp_setup.c
+++ b/src/gallium/drivers/llvmpipe/lp_setup.c
@@ -176,6 +176,7 @@ lp_setup_rasterize_scene( struct lp_setup_context *setup )
 * Certainly, lp_scene_end_rasterization() would need to be deferred too
 * and there's probably other bits why this doesn't actually work.
 */
+   lp_scene_map_buffers(scene);
lp_rast_queue_scene(screen->rast, scene);
lp_rast_finish(screen->rast);
pipe_mutex_unlock(screen->rast_mutex);
-- 
2.5.0



[Mesa-dev] soft/llvmpipe front buffer access and piglit regressions

2015-11-08 Thread Dave Airlie
So it appears my patch to enable front buffer access on soft/llvmpipe
causes some piglit regressions. However these are due to piglit having
undefined behaviour where it doesn't create a window but has tests
requiring a front buffer. The new code does an XGetImage on the front
buffer and when it fails all sorts of bad things tend to happen. I
don't think there is a way to check if we have a window mapped inside
Mesa to avoid this path.

swrast suffers from the same failure pattern in a number of tests when
run with -auto.

I'm not sure what to do here. The patch makes the driver conformant and
fixes a missing feature that is used by OpenGL apps (gtk).

I can probably make it fail more gracefully (llvmpipe deadlocks on the
Xlib error path inside its rasteriser threads), but I'm not sure I
want to go back to the old behaviour just to satisfy piglit's
requirement to do undefined things.

Dave.


Re: [Mesa-dev] [Mesa-stable] New stable-branch 11.0 candidate pushed

2015-11-08 Thread Oded Gabbay
On Sat, Nov 7, 2015 at 6:37 PM, Emil Velikov  wrote:
> Hello list,
>
> The candidate for the Mesa 11.0.5 is now available. Currently we have:
>  - 31 queued
>  - 13 nominated (outstanding)
>  - and 6 rejected/obsolete patches
>
> Currently we have mostly driver related fixes - some in i965 and
> nouveau, a few llvm related, the odd bugfix in the VA state-tracker.
> Additionally we have a few new PCI ids for i965 and radeonsi.
>
> Take a look at section "Mesa stable queue" for more information.
>
>
> Jason, Kenneth,
>
> I had to pick a few extra commits as prerequisites for the "nir:
> Properly invalidate metadata in nir_foo" patches. Can you let me know
> if any of these should not be in 11.0? Thanks!
>
> commit 800217a1654ab7932870b1510981f5e38712d58b
> Author: Kenneth Graunke 
>
> nir: Report progress from nir_split_var_copies().
>
> (cherry picked from commit dc18b9357b553a972ea439facfbc55e376f1179f)
>
>
> commit 2cc4e973962c1d5ea0357685036879c7bf9575ce
> Author: Jason Ekstrand 
>
> nir/lower_vec_to_movs: Pass the shader around directly
>
> (cherry picked from commit b7eeced3c724bf5de05290551ced8621ce2c7c52)
>
>
> commit ef4e862396ae81b0d59f172d0d5273a4e6b5992d
> Author: Jason Ekstrand 
>
> nir: Report progress from lower_vec_to_movs().
>
> (cherry picked from commit 9f5e7ae9d83ce6de761936b95cd0b7ba4c1219c4)
>
>
>
>
> Testing
> ---
> The following results are against piglit 4b6848c131c.
>
>
> Changes - classic i965(snb)
> ---
> None.
>
>
> Changes - swrast classic
> 
> None.
>
>
> Changes - gallium softpipe
> --
> None.
>
>
> Changes - gallium llvmpipe (LLVM 3.7)
> -
> None.
>
>
> Testing reports/general approval
> 
> Any testing reports (or general approval of the state of the branch)
> will be greatly appreciated.
>
>
> Trivial merge conflicts
> ---
> commit ef4e862396ae81b0d59f172d0d5273a4e6b5992d
> Author: Jason Ekstrand 
>
> nir: Report progress from lower_vec_to_movs().
>
> (cherry picked from commit 9f5e7ae9d83ce6de761936b95cd0b7ba4c1219c4)
>
>
> The plan is to have 11.0.5 this Monday (9th of November) or shortly after.
>
> If you have any questions or comments that you would like to share
> before the release, please go ahead.
>
>
> Cheers,
> Emil
>
>
> Mesa stable queue
> -
>
> Nominated (13)
> ==
>
> Ben Widawsky (1):
>   i965/skl/gt4: Fix URB programming restriction.
>
> Boyan Ding (1):
>   i915: Add XRGB format to intel_screen_make_configs
>
> Brian Paul (1):
>   configure: don't try to build gallium DRI drivers if --disable-dri is 
> set
>
> Dave Airlie (1):
>   gallium/swrast: fix front buffer blitting. (v2)
>
> Emil Velikov (3):
>   i965: store reference to the context within struct brw_fence
>   egl/dri2: expose srgb configs when KHR_gl_colorspace is available
>   mesa; add get-extra-pick-list.sh script into bin/
>
> Jean-Sébastien Pédron (1):
>   ralloc: Use __attribute__((destructor)) instead of atexit(3)
>
> Tapani Pälli (1):
>   mesa: fix error type for GetFramebufferAttachmentParameteriv
>
> Tom Stellard (4):
>   clover: Call clBuildProgram() notification function when build
> completes v2
>   gallium/drivers: Add threadsafe wrappers for pipe_context v2
>   clover: Use threadsafe wrappers for pipe_context v2
>   clover: Properly initialize LLVM targets when linking with component 
> libs
>
>
>
> Queued (31)
> ===
>
> Alex Deucher (1):
>   radeon/uvd: don't expose HEVC on old UVD hw (v3)
>
> Ben Widawsky (1):
>   i965/skl: Add GT4 PCI IDs
>
> Emil Velikov (2):
>   docs: add sha256 checksums for 11.0.4
>   cherry-ignore: ignore a possible wrong nomination
>
> Emmanuel Gil Peyrot (1):
>   gbm.h: Add a missing stddef.h include for size_t.
>
> Eric Anholt (1):
>   vc4: When the create ioctl fails, free our cache and try again.
>
> Ian Romanick (1):
>   i965: Fix is-renderable check in intel_image_target_renderbuffer_storage
>
> Ilia Mirkin (3):
>   nvc0: respect edgeflag attribute width
>   nouveau: set MaxDrawBuffers to the same value as MaxColorAttachments
>   nouveau: relax fence emit space assert
>
> Ivan Kalvachev (1):
>   r600g: Fix special negative immediate constants when using ABS modifier.
>
> Jason Ekstrand (2):
>   nir/lower_vec_to_movs: Pass the shader around directly
>   nir: Report progress from lower_vec_to_movs().
>
> Jose Fonseca (2):
>   gallivm: Translate all util_cpu_caps bits to LLVM attributes.
>   gallivm: Explicitly disable unsupported CPU features.
>
> Julien Isorce (4):
>   st/va: pass picture desc to begin and decode
>   nvc0: fix crash when nv50_miptree_from_handle fails
>   st/va: do 

[Mesa-dev] [PATCH] mesa/copyimage: allow width/height to not be multiples of block

2015-11-08 Thread Ilia Mirkin
For compressed textures, the image size is not necessarily a multiple of
the block size (e.g. the last mip levels). For example the RGTC spec
calls out this condition as legal, saying to error only when:

* <width> is not a multiple of four, and <width> plus <xoffset>
  is not equal to TEXTURE_WIDTH;

While the GL_ARB_copy_image spec does not call this out explicitly, it
appears that some games rely on this. Also it seems like it should be
possible to copy in the last miplevels of a compressed texture's
miptree.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92860
Signed-off-by: Ilia Mirkin 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/main/copyimage.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/copyimage.c b/src/mesa/main/copyimage.c
index f02e842..974de7f 100644
--- a/src/mesa/main/copyimage.c
+++ b/src/mesa/main/copyimage.c
@@ -62,6 +62,8 @@ prepare_target(struct gl_context *ctx, GLuint name, GLenum 
target,
struct gl_renderbuffer **renderbuffer,
mesa_format *format,
GLenum *internalFormat,
+   GLuint *width,
+   GLuint *height,
const char *dbg_prefix)
 {
if (name == 0) {
@@ -126,6 +128,8 @@ prepare_target(struct gl_context *ctx, GLuint name, GLenum 
target,
   *renderbuffer = rb;
   *format = rb->Format;
   *internalFormat = rb->InternalFormat;
+  *width = rb->Width;
+  *height = rb->Height;
   *tex_image = NULL;
} else {
   struct gl_texture_object *texObj = _mesa_lookup_texture(ctx, name);
@@ -194,6 +198,8 @@ prepare_target(struct gl_context *ctx, GLuint name, GLenum 
target,
   *renderbuffer = NULL;
   *format = (*tex_image)->TexFormat;
   *internalFormat = (*tex_image)->InternalFormat;
+  *width = (*tex_image)->Width;
+  *height = (*tex_image)->Height;
}
 
return true;
@@ -423,6 +429,7 @@ _mesa_CopyImageSubData(GLuint srcName, GLenum srcTarget, 
GLint srcLevel,
struct gl_renderbuffer *srcRenderbuffer, *dstRenderbuffer;
mesa_format srcFormat, dstFormat;
GLenum srcIntFormat, dstIntFormat;
+   GLuint src_w, src_h, dst_w, dst_h;
GLuint src_bw, src_bh, dst_bw, dst_bh;
int dstWidth, dstHeight, dstDepth;
int i;
@@ -445,17 +452,18 @@ _mesa_CopyImageSubData(GLuint srcName, GLenum srcTarget, 
GLint srcLevel,
 
if (!prepare_target(ctx, srcName, srcTarget, srcLevel, srcZ, srcDepth,
&srcTexImage, &srcRenderbuffer, &srcFormat,
-   &srcIntFormat, "src"))
+   &srcIntFormat, &src_w, &src_h, "src"))
   return;
 
if (!prepare_target(ctx, dstName, dstTarget, dstLevel, dstZ, srcDepth,
&dstTexImage, &dstRenderbuffer, &dstFormat,
-   &dstIntFormat, "dst"))
+   &dstIntFormat, &dst_w, &dst_h, "dst"))
   return;
 
_mesa_get_format_block_size(srcFormat, &src_bw, &src_bh);
if ((srcX % src_bw != 0) || (srcY % src_bh != 0) ||
-   (srcWidth % src_bw != 0) || (srcHeight % src_bh != 0)) {
+   (srcWidth % src_bw != 0 && (srcX + srcWidth) != src_w) ||
+   (srcHeight % src_bh != 0 && (srcY + srcHeight) != src_h)) {
   _mesa_error(ctx, GL_INVALID_VALUE,
   "glCopyImageSubData(unaligned src rectangle)");
   return;
-- 
2.4.10
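
The relaxed source-rectangle check in the patch above can be sketched in
isolation, one axis at a time. This is only an illustration under stated
assumptions, not the Mesa implementation: region_is_block_aligned() and its
parameter names are hypothetical. The rule it encodes is the one from the
commit message: the offset must always be a multiple of the block size, while
the extent may be unaligned only if the region ends exactly at the image edge.

```c
#include <stdbool.h>

/* Hypothetical helper, not a Mesa function.
 * Checks one axis of a copy region against a compressed format's block
 * size. 'offset' and 'extent' describe the region, 'block' is the block
 * size along this axis, and 'image_extent' is the image size along it. */
static bool
region_is_block_aligned(unsigned offset, unsigned extent,
                        unsigned block, unsigned image_extent)
{
   /* The start of the region must sit on a block boundary. */
   if (offset % block != 0)
      return false;

   /* A partial block is only acceptable at the image edge. */
   if (extent % block != 0 && offset + extent != image_extent)
      return false;

   return true;
}
```

For example, a 2x2 mip level of a format with 4x4 blocks passes when the
whole level is copied (offset 0, extent 2, image extent 2), but the same
2-texel extent is rejected inside an 8-texel-wide level.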



Re: [Mesa-dev] [PATCH] st/va: add mpeg4 startcode workaround

2015-11-08 Thread Christian König

On 06.11.2015 20:28, Ilia Mirkin wrote:

On Fri, Nov 6, 2015 at 2:15 PM, Christian König  wrote:

On 06.11.2015 20:10, Ilia Mirkin wrote:

On Fri, Nov 6, 2015 at 1:48 PM, Zhang, Boyuan 
wrote:

Hi Emil,

Please see the following information about this patch.

- Issue: For Mpeg4, the VOP and GOV headers were truncated. With the
existing workaround in st/va, playback shows massive corruptions.
- This Patch: Provide another way to get the truncated headers back.
Massive corruptions are gone with this patch. At the same time, add an
environment variable to let the user decide whether to use this patch.

Why would the user not want to use this? Sounds like a correctness
fix, no? Or is it some thing that a hypothetical gallium driver might
not need but the radeon uvd-based ones do? In that case it should be
behind a PIPE_VIDEO_CAP_bla (sorry, I'm still not too clear on what
"bla" is here...)


The problem is that this is a rather extreme hack.

As you probably know, VA-API doesn't correctly specify which start codes should
be included and which shouldn't for MPEG-4. This is an issue for AMD as well
as NVidia hardware and pretty much everybody who sticks close to an
elementary stream.

What we do in this hack is just search the bytes *before* the pointer and
size we got from the application for the stuff that's missing. I.e. we
access memory the application didn't tell us to access.

This is rather speculative, but works surprisingly well with a lot of
applications.
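
As a rough illustration of the backwards search described above, here is a
minimal sketch. It is not the st/va code: find_preceding_start_code() is a
hypothetical name, and unlike the hack under discussion it assumes the caller
knows a lower bound ('base') that is safe to read, so it never looks before
that. The byte values are the MPEG-4 part 2 start codes: the 00 00 01 prefix
followed by 0xB6 for a VOP header or 0xB3 for a GOV header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not the st/va implementation.
 * Scans backwards from 'slice' for the nearest start code
 * (00 00 01 <code>) without reading before 'base'. Returns a pointer to
 * the first 0x00 of the start code, or NULL if none is found. */
static const uint8_t *
find_preceding_start_code(const uint8_t *base, const uint8_t *slice,
                          uint8_t code)
{
   const uint8_t *p;

   /* A start code is four bytes long, so we need at least that much. */
   if (slice - base < 4)
      return NULL;

   for (p = slice - 4; ; p--) {
      if (p[0] == 0x00 && p[1] == 0x00 && p[2] == 0x01 && p[3] == code)
         return p;
      if (p == base)
         break;
   }

   return NULL;
}
```

The real workaround has no such lower bound, which is exactly why it is
speculative: it reads memory the application never handed over.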

Hm, that is a little dodgy indeed. But making user-selectable options
(provided via env var) for correct decoding... doesn't seem ideal
either. Is there some "correct" way to resolve this without changing
the va api?


Unfortunately no. I wasn't involved in everything, but we had a couple of
people working on this who have more knowledge about MPEG-4 part 2
than me.


A couple of months back somebody from a different team at AMD even tried
to convince Intel to fix this, but as far as I know without success.


The overall conclusion is that the interface definition of VA-API for
MPEG-4 part 2 is just a bloody mess.


Regards,
Christian.



   -ilia




Re: [Mesa-dev] [PATCH] gallium/swrast: fix front buffer blitting. (v2)

2015-11-08 Thread Emil Velikov
On 7 November 2015 at 21:57, Dave Airlie  wrote:
> On 8 November 2015 at 02:47, Emil Velikov  wrote:
>> Hi Dave,
>>
>> On 9 October 2015 at 01:38, Dave Airlie  wrote:
>>> From: Dave Airlie 
>>>
>>> So I've known this was broken before, cogl has a workaround
>>> for it from what I know, but with the gallium based swrast
>>> drivers BlitFramebuffer from back to front or vice-versa
>>> was pretty broken.
>>>
>>> The legacy swrast driver tracks when a front buffer is used
>>> and does the get/put images when it is mapped/unmapped,
>>> so this patch attempts to add the same functionality to the
>>> gallium drivers.
>>>
>>> It creates a new context interface to denote when a front
>>> buffer is being created, and passes a private pointer to it,
>>> this pointer is then used to decide on map/unmap if the
>>> contents should be updated from the real frontbuffer using
>>> get/put image.
>>>
>>> This is primarily to make gtk's gl code work, the only
>>> thing I've tested so far is the glarea test from
>>> https://github.com/ebassi/glarea-example.git
>>>
>>> v2: bump extension version,
>>> check extension version before calling get image. (Ian)
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91930
>>>
>>> Signed-off-by: Dave Airlie 
>> Seems that you've added the mesa-stable tag just prior to pushing.
>>
>> Thus when I picked it up I was welcomed by some 200 regressions (and three
>> front buffer rendering fixes) on softpipe, and for llvmpipe
>> hiz-depth-test-window-stencil1 was consistently going into a dead loop,
>> and killing it was dragging down the whole piglit run :(
>> Are you seeing a similar thing, or is there something funny with my setup?
>>
>> I've removed the patch from the queue for now, and if we cannot
>> resolve the regressions I will have to drop it from 11.0.
>
> Drop it for now. threads and xlib fun by the looks of it. I'll have to
> revisit asap.
> I've disabled it upstream for llvmpipe for now.
>
Ack. Glad I could find it before it hit a "critical mass" of people.

Regards,
Emil


Re: [Mesa-dev] [PATCH] android: fix LOCAL_C_INCLUDES to find glsl_types.h

2015-11-08 Thread Mauro Rossi
Update 2: I'm getting the build error on both the x86 and x86_64
targets.

I'm relieved it's not arch dependent. I suspect that the export will require a
dependency to be declared,
because if the i965_dri module is built before the glsl ones we will get the
error.

The LOCAL_C_INCLUDES change, even if not elegant, avoided the problem in the
first place,
but I'd like to learn the best practice and apply it in the future.

Emil, Chih-Wei what is your thought on this?
Added also other android-x86 developers in CC

Mauro


In file included from
external/mesa/src/mesa/drivers/dri/i965/brw_cubemap_normalize.cpp:34:0:
external/mesa/src/glsl/ir.h:33:24: fatal error: glsl_types.h: No such file
or directory
 #include "glsl_types.h"
^
compilation terminated.
build/core/binary.mk:620: recipe for target
'out/target/product/x86/obj/SHARED_LIBRARIES/i965_dri_intermediates/brw_cubemap_normalize.o'
failed
make: ***
[out/target/product/x86/obj/SHARED_LIBRARIES/i965_dri_intermediates/brw_cubemap_normalize.o]
Error 1
make: *** Waiting for unfinished jobs
target  C++: i965_dri <=
external/mesa/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
In file included from
external/mesa/src/mesa/drivers/dri/i965/brw_shader.h:29:0,
 from
external/mesa/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp:29:
external/mesa/src/glsl/ir.h:33:24: fatal error: glsl_types.h: No such file
or directory
 #include "glsl_types.h"
^
compilation terminated.
In file included from
external/mesa/src/mesa/drivers/dri/i965/brw_shader.h:29:0,
 from external/mesa/src/mesa/drivers/dri/i965/brw_cfg.h:32,
 from
external/mesa/src/mesa/drivers/dri/i965/brw_cfg.cpp:28:
external/mesa/src/glsl/ir.h:33:24: fatal error: glsl_types.h: No such file
or directory
 #include "glsl_types.h"
^
compilation terminated.
build/core/binary.mk:620: recipe for target
'out/target/product/x86/obj/SHARED_LIBRARIES/i965_dri_intermediates/brw_dead_control_flow.o'
failed
make: ***
[out/target/product/x86/obj/SHARED_LIBRARIES/i965_dri_intermediates/brw_dead_control_flow.o]
Error 1
build/core/binary.mk:620: recipe for target
'out/target/product/x86/obj/SHARED_LIBRARIES/i965_dri_intermediates/brw_cfg.o'
failed
make: ***
[out/target/product/x86/obj/SHARED_LIBRARIES/i965_dri_intermediates/brw_cfg.o]
Error 1
In file included from
external/mesa/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp:46:0:
external/mesa/src/glsl/ir.h:33:24: fatal error: glsl_types.h: No such file
or directory
 #include "glsl_types.h"
^
compilation terminated.
build/core/binary.mk:620: recipe for target
'out/target/product/x86/obj/SHARED_LIBRARIES/i965_dri_intermediates/brw_fs_channel_expressions.o'
failed
make: ***
[out/target/product/x86/obj/SHARED_LIBRARIES/i965_dri_intermediates/brw_fs_channel_expressions.o]
Error 1

2015-11-08 13:36 GMT+01:00 Mauro Rossi :

> Hi,
>
> Sending an update because with the export the android_x86 target builds OK,
> but I'm again getting the "glsl_types.h not found" build error with the
> android_x86_64 target (specifically for 64-bit modules).
>
> I'll report back as soon as I understand what's going on,
> added other android-x86 developers in CC.
>
> Mauro
>
> 2015-11-07 1:29 GMT+01:00 Mauro Rossi :
>
>> Hi Emil,
>>
>> by exporting the path of glsl nir headers, mesa builds without problems.
>>
>> You can find in the attachment the formatted patch.
>> Thanks
>>
>> Mauro
>>
>>
>>
>> 2015-11-06 18:26 GMT+01:00 Emil Velikov :
>>
>>> Hi Mauro
>>>
>>> On 6 November 2015 at 03:31, Mauro Rossi  wrote:
>>> > These changes are necessary to avoid building errors in glsl and i965
>>> > ---
>>> >  src/glsl/Android.mk  | 6 --
>>> >  src/mesa/drivers/dri/i965/Android.mk | 3 ++-
>>> >  2 files changed, 6 insertions(+), 3 deletions(-)
>>> >
>>> > diff --git a/src/glsl/Android.mk b/src/glsl/Android.mk
>>> > index f63b7da..6902ea4 100644
>>> > --- a/src/glsl/Android.mk
>>> > +++ b/src/glsl/Android.mk
>>> > @@ -42,7 +42,8 @@ LOCAL_C_INCLUDES := \
>>> > $(MESA_TOP)/src/mapi \
>>> > $(MESA_TOP)/src/mesa \
>>> > $(MESA_TOP)/src/gallium/include \
>>> > -   $(MESA_TOP)/src/gallium/auxiliary
>>> > +   $(MESA_TOP)/src/gallium/auxiliary \
>>> > +   $(MESA_TOP)/src/glsl/nir
>>> >
>>> >  LOCAL_MODULE := libmesa_glsl
>>> >
>>> > @@ -63,7 +64,8 @@ LOCAL_C_INCLUDES := \
>>> > $(MESA_TOP)/src/mapi \
>>> > $(MESA_TOP)/src/mesa \
>>> > $(MESA_TOP)/src/gallium/include \
>>> > -   $(MESA_TOP)/src/gallium/auxiliary
>>> > +   $(MESA_TOP)/src/gallium/auxiliary \
>>> > +   $(MESA_TOP)/src/glsl/nir
>>> >
>>> >  LOCAL_STATIC_LIBRARIES := libmesa_glsl libmesa_glsl_utils libmesa_util
>>> >
>>> > diff --git a/src/mesa/drivers/dri/i965/Android.mk
>>> b/src/mesa/drivers/dri/i965/Android.mk
>>> > 

[Mesa-dev] [RFC PATCH shader-db 2/2] run: request a debug context

2015-11-08 Thread Ilia Mirkin
st/mesa only prints messages in a debug context. Without always enabling
the message generation, I don't see a way to hook into the glEnable() to
turn it on/off.
---
 run.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/run.c b/run.c
index 73e468d..1d8d3b1 100644
--- a/run.c
+++ b/run.c
@@ -417,6 +417,7 @@ main(int argc, char **argv)
 EGL_CONTEXT_OPENGL_CORE_PROFILE_BIT_KHR,
 EGL_CONTEXT_MAJOR_VERSION_KHR, 3,
 EGL_CONTEXT_MINOR_VERSION_KHR, 2,
+EGL_CONTEXT_FLAGS_KHR, EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR,
 EGL_NONE
 };
 EGLContext core_ctx = eglCreateContext(egl_dpy, cfg, EGL_NO_CONTEXT,
-- 
2.4.10



[Mesa-dev] [RFC PATCH shader-db 1/2] run: don't expect incoming message to contain a newline

2015-11-08 Thread Ilia Mirkin
It seems a bit odd to expect a debug message to contain a newline. What
if you wanted to include something *after* the message, for example?
It makes more sense for the code actually printing to add the
newline rather than the string being passed around.
---
 run.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/run.c b/run.c
index 2c2a810..73e468d 100644
--- a/run.c
+++ b/run.c
@@ -209,7 +209,7 @@ callback(GLenum source, GLenum type, GLuint id, GLenum 
severity, GLsizei length,
 assert(severity == GL_DEBUG_SEVERITY_NOTIFICATION);
 
 const char *const *shader_name = userParam;
-printf("%s - %s", *shader_name, message);
+printf("%s - %s\n", *shader_name, message);
 }
 
 static unsigned shader_test_size = 1 << 15; /* next-pow-2(num shaders in db) */
-- 
2.4.10
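
The convention argued for above, where the printing code owns the newline,
can be shown with a small sketch. format_debug_line() is a hypothetical
helper, not part of shader-db; it writes into a buffer instead of stdout only
so that the result is easy to inspect.

```c
#include <stdio.h>

/* Hypothetical helper: formats one debug line. The message is expected to
 * arrive without a trailing newline; the formatter appends it, so callers
 * remain free to put extra text after the message if they need to.
 * Returns the number of characters written, as snprintf() does. */
static int
format_debug_line(char *out, size_t size, const char *shader_name,
                  const char *message)
{
   return snprintf(out, size, "%s - %s\n", shader_name, message);
}
```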



Re: [Mesa-dev] [PATCH] gallium/radeon: fix PIPE_QUERY_GPU_FINISHED

2015-11-08 Thread Michel Dänzer
On 09.11.2015 06:43, Marek Olšák wrote:
> From: Marek Olšák 
> 
> Broken by the addition of r600_multi_fence
> in 3b37155a68acc351cba86a1fa142bd0de2192d4c
> ---
>  src/gallium/drivers/radeon/r600_query.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/radeon/r600_query.c 
> b/src/gallium/drivers/radeon/r600_query.c
> index 9a54025..2bb5732 100644
> --- a/src/gallium/drivers/radeon/r600_query.c
> +++ b/src/gallium/drivers/radeon/r600_query.c
> @@ -532,7 +532,7 @@ static void r600_end_query(struct pipe_context *ctx, 
> struct pipe_query *query)
>   case PIPE_QUERY_TIMESTAMP_DISJOINT:
>   return;
>   case PIPE_QUERY_GPU_FINISHED:
> - rctx->rings.gfx.flush(rctx, RADEON_FLUSH_ASYNC, &rquery->fence);
> + ctx->flush(ctx, &rquery->fence, 0);
>   return;
>   case R600_QUERY_DRAW_CALLS:
>   rquery->end_result = rctx->num_draw_calls;
> 

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [PATCH] RFC: llvmpipe map scene buffers outside thread. (v2)

2015-11-08 Thread Dave Airlie
From: Dave Airlie 

There might be a reason we do this inside the thread, but I'm not aware of it
yet. Move stuff around and see if this jogs anyone's memory.

Doing this outside the thread, at least with front buffer rendering, avoids
problems with XGetImage failing in the thread and deadlocking. Now things
crash instead, which is a lot nicer from a piglit point of view.

v2: map outside rast mutex, so if we fail XGetImage the fail paths
don't deadlock
---
 src/gallium/drivers/llvmpipe/lp_scene.c | 21 -
 src/gallium/drivers/llvmpipe/lp_scene.h |  5 +++--
 src/gallium/drivers/llvmpipe/lp_setup.c |  1 +
 3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_scene.c 
b/src/gallium/drivers/llvmpipe/lp_scene.c
index 2441b3c..1a6fe5c 100644
--- a/src/gallium/drivers/llvmpipe/lp_scene.c
+++ b/src/gallium/drivers/llvmpipe/lp_scene.c
@@ -147,7 +147,7 @@ lp_scene_bin_reset(struct lp_scene *scene, unsigned x, 
unsigned y)
 
 
 void
-lp_scene_begin_rasterization(struct lp_scene *scene)
+lp_scene_map_buffers(struct lp_scene *scene)
 {
const struct pipe_framebuffer_state *fb = &scene->fb;
int i;
@@ -200,16 +200,20 @@ lp_scene_begin_rasterization(struct lp_scene *scene)
}
 }
 
-
+void
+lp_scene_begin_rasterization(struct lp_scene *scene)
+{
+   scene->started = true;
+}
 
 
 /**
  * Free all the temporary data in a scene.
  */
-void
-lp_scene_end_rasterization(struct lp_scene *scene )
+static void
+lp_scene_unmap_buffers(struct lp_scene *scene )
 {
-   int i, j;
+   int i;
 
/* Unmap color buffers */
for (i = 0; i < scene->fb.nr_cbufs; i++) {
@@ -232,7 +236,14 @@ lp_scene_end_rasterization(struct lp_scene *scene )
   zsbuf->u.tex.first_layer);
   scene->zsbuf.map = NULL;
}
+}
 
+void
+lp_scene_end_rasterization(struct lp_scene *scene )
+{
+   int i, j;
+   lp_scene_unmap_buffers(scene);
+   scene->started = false;
/* Reset all command lists:
 */
for (i = 0; i < scene->tiles_x; i++) {
diff --git a/src/gallium/drivers/llvmpipe/lp_scene.h 
b/src/gallium/drivers/llvmpipe/lp_scene.h
index b1464bb..7ed38c9 100644
--- a/src/gallium/drivers/llvmpipe/lp_scene.h
+++ b/src/gallium/drivers/llvmpipe/lp_scene.h
@@ -178,6 +178,7 @@ struct lp_scene {
 
struct cmd_bin tile[TILES_X][TILES_Y];
struct data_block_list data;
+   boolean started;
 };
 
 
@@ -405,8 +406,8 @@ lp_scene_begin_rasterization(struct lp_scene *scene);
 void
 lp_scene_end_rasterization(struct lp_scene *scene );
 
-
-
+void
+lp_scene_map_buffers(struct lp_scene *scene);
 
 
 #endif /* LP_BIN_H */
diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c 
b/src/gallium/drivers/llvmpipe/lp_setup.c
index 1778b13..481dfb1 100644
--- a/src/gallium/drivers/llvmpipe/lp_setup.c
+++ b/src/gallium/drivers/llvmpipe/lp_setup.c
@@ -163,6 +163,7 @@ lp_setup_rasterize_scene( struct lp_setup_context *setup )
 
if (setup->last_fence)
   setup->last_fence->issued = TRUE;
+   lp_scene_map_buffers(scene);
 
pipe_mutex_lock(screen->rast_mutex);
 
-- 
2.5.0
