date:20151116

Re: [Mesa-dev] [PATCH 5/5] i965/nir: use vectorization for non-scalar stages

2015-11-16 Thread Jason Ekstrand

On Sat, Nov 14, 2015 at 6:59 PM, Connor Abbott  wrote:
> Shader-db results on bdw with INTEL_DEBUG=vec4:
>
> total instructions in shared programs: 1634044 -> 1612936 (-1.29%)
> instructions in affected programs: 802502 -> 781394 (-2.63%)
> helped: 5036
> HURT: 1442
>
> total cycles in shared programs: 9397790 -> 9355382 (-0.45%)
> cycles in affected programs: 5078600 -> 5036192 (-0.84%)
> helped: 3875
> HURT: 2554
>
> LOST:   0
> GAINED: 0
>
> Most of the hurt programs seem to be because we generate extra MOV's due
> to vectorizing things. For example, in
> shaders/non-free/steam/anomaly-2/158.shader_test, this:
>
> add(8)  g116<1>.xyF g12<4,4,1>.xyyyF g1.4<0,4,1>.xyyyF { align16 NoDDClr 1Q };
> add(8)  g117<1>.xyF g12<4,4,1>.xyyyF g1.4<0,4,1>.zwwwF { align16 NoDDClr 1Q };
> add(8)  g116<1>.zwF g12<4,4,1>.xxxyF -g1.4<0,4,1>.xxxyF { align16 NoDDChk 1Q };
> add(8)  g117<1>.zwF g12<4,4,1>.xxxyF -g1.4<0,4,1>.zzzwF { align16 NoDDChk 1Q };
>
> Turns into this:
>
> add(8)  g13<1>F g12<4,4,1>.xyxyF g1.4<0,4,1>F { align16 1Q };
> add(8)  g14<1>F g12<4,4,1>.xyxyF -g1.4<0,4,1>F { align16 1Q };
> mov(8)  g116<1>.xyD g13<4,4,1>.xyyyD { align16 NoDDClr 1Q };
> mov(8)  g117<1>.xyD g13<4,4,1>.zwwwD { align16 NoDDClr 1Q };
> mov(8)  g116<1>.zwD g14<4,4,1>.xxxyD { align16 NoDDChk 1Q };
> mov(8)  g117<1>.zwD g14<4,4,1>.zzzwD { align16 NoDDChk 1Q };
>
> So we eliminated two add's, but then had to introduce four mov's to
> transpose the result. I don't think there's much we can do about this at
> the NIR level, unfortunately.

Given the shader-db numbers above, I think we can probably eat the
hurt programs.  Would you mind cherry-picking back onto a time when we
had GLSL IR and doing a GLSL IR vs. NIR comparison with this series?
This is one of the places we were still hurting so it would be good to
know how it changes the picture.  Not that it *really* matters at this
point...
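
As a quick sanity check on the shader-db figures quoted above, the percentages can be recomputed from the before/after counts. This is only a minimal sketch; `percent_change` and `report` are hypothetical helper names, not part of shader-db or Mesa.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change from `before` to `after`, in percent.
 * Negative values mean the count went down (an improvement here). */
double percent_change(long before, long after)
{
    assert(before > 0);
    return 100.0 * (double)(after - before) / (double)before;
}

/* Hypothetical helper: prints one line in the style of the report above. */
void report(const char *label, long before, long after)
{
    printf("%s: %ld -> %ld (%+.2f%%)\n",
           label, before, after, percent_change(before, after));
}
```

Calling `report("total instructions in shared programs", 1634044, 1612936)` should reproduce the first line of the quoted results, and the same call with the cycle counts reproduces the cycle lines.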

> Signed-off-by: Connor Abbott 
> ---
>  src/mesa/drivers/dri/i965/brw_nir.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
> index fe5cad4..29cafe6 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> @@ -198,6 +198,14 @@ nir_optimize(nir_shader *nir, bool is_scalar)
>        nir_validate_shader(nir);
>        progress |= nir_opt_cse(nir);
>        nir_validate_shader(nir);
> +
> +      if (!is_scalar) {
> +         progress |= nir_opt_vectorize(nir);
> +         nir_validate_shader(nir);
> +         progress |= nir_copy_prop(nir);
> +         nir_validate_shader(nir);
> +      }
> +
>        progress |= nir_opt_peephole_select(nir);
>        nir_validate_shader(nir);
>        progress |= nir_opt_algebraic(nir);
> --
> 2.4.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v5 5/7] glsl: Add precision information to ir_variable

2015-11-16 Thread Samuel Iglesias Gonsálvez



On 16/11/15 17:34, Ilia Mirkin wrote:
> On Mon, Nov 16, 2015 at 11:29 AM, Samuel Iglesias Gonsálvez
>  wrote:
>>
>>
>> On 16/11/15 13:07, Tapani Pälli wrote:
>>>
>>> On 11/16/2015 01:35 PM, Tapani Pälli wrote:


>>>> On 11/16/2015 01:29 PM, Samuel Iglesias Gonsálvez wrote:
>>>>> Hello Ilia, Tapani:
>>>>>
>>>>> I have reproduced the issue with a piglit test but not with the trace
>>>>> uploaded in the bug report :-(
>>>>>
>>>>> The piglit test was: bin/arb_shader_storage_buffer_object-maxblocks
>>>>>
>>>>> I have upload a branch with some fixes at Igalia's mesa repo:
>>>>>
>>>>> Git repo: https://github.com/Igalia/mesa.git
>>>>> Branch: wip/siglesias/precision-fixes
>>>>>
>>>>> But as this error might come from other initializations that I might
>>>>> overlook:
>>>>> * Ilia: Could you test if this issue is still happening to you? As I
>>>>> cannot reproduce it locally, I might be forgetting something.
>>>>> * Tapani: Could you do a quick run on CTS to check I have not broken
>>>>> anything?
>>>>
>>>> Sure thing, I'll run testing. FWIW one of the patches was identical to
>>>> my fix sent for fixing tessellation shader problems:
>>>>
>>>> http://lists.freedesktop.org/archives/mesa-dev/2015-November/100396.html
>>>
>>> No CTS regressions with these patches, I've gone through these and
>>> changes look good to me!
>>>
>>>
>>
>> OK, once Ilia replies that the issue is fixed with those patches, I will
>> send them for review to the mailing list :-)
> 
> I won't have time to look until tonight. However the repro steps were
> pretty simple... download the trace and run through valgrind. Probably
> tons of other ways to trigger it too, of course... I'd esp look for
> piglits that have uniform structs.
>

The problem is that I could not reproduce it with the trace. That's why
I am asking.

I reproduced it with a piglit test, but maybe precision is uninitialized
in other cases. Tomorrow I will do some more testing, just in case.

Thanks,

Sam

Re: [Mesa-dev] [PATCH 11/11] i965: Use nir_lower_tex for texture coordinate lowering

2015-11-16 Thread Jason Ekstrand

On Mon, Nov 16, 2015 at 6:27 AM, Iago Toral  wrote:
> On Mon, 2015-11-16 at 11:33 +0100, Iago Toral wrote:
>> On Wed, 2015-11-11 at 17:26 -0800, Jason Ekstrand wrote:
>> > Previously, we had a rescale_texcoords helper in the FS backend for
>> > handling rescaling of texture coordinates.  Now that we can do variants in
>> > NIR, we can use nir_lower_tex to do the rescaling for us.  This allows us
>> > to delete the i965-specific code and gives us proper TEXTURE_RECTANGLE and
>> > GL_CLAMP handling in vertex and geometry shaders.
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_fs.cpp  |   2 +
>> >  src/mesa/drivers/dri/i965/brw_fs.h|   3 -
>> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp  |   4 +-
>> >  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  | 125 
>> > --
>> >  src/mesa/drivers/dri/i965/brw_nir.c   |  23 
>> >  src/mesa/drivers/dri/i965/brw_nir.h   |   6 ++
>> >  src/mesa/drivers/dri/i965/brw_vec4.cpp|   2 +
>> >  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |   2 +
>> >  8 files changed, 36 insertions(+), 131 deletions(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
>> > b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> > index b8713ab..c56cafe 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> > @@ -5468,6 +5468,7 @@ brw_compile_fs(const struct brw_compiler *compiler, 
>> > void *log_data,
>> > char **error_str)
>> >  {
>> > nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
>> > +   brw_nir_apply_sampler_key(shader, compiler->devinfo, &key->tex, true);
>>
>> This looks like it is part of the post-processing process. In fact, you
>> call this right before brw_postprocess_nir() for every stage. Why not
>> just add a key parameter to brw_postprocess_nir() and call
>> brw_nir_apply_sampler_key from there instead?

Well, right now, brw_nir_apply_sampler_key is used for all stages.
However, if we do more variant stuff in NIR, we'll need to have
brw_nir_apply_vs_key, brw_nir_apply_fs_key, etc. and we can't put it
in postprocess_nir.  I didn't want to join it prematurely.

>> Either way,
>> Reviewed-by: Iago Toral Quiroga 

Thanks!

>> > brw_postprocess_nir(shader, compiler->devinfo, true);
>> >
>> > /* key->alpha_test_func means simulating alpha testing via discards,
>> > @@ -5628,6 +5629,7 @@ brw_compile_cs(const struct brw_compiler *compiler, 
>> > void *log_data,
>> > char **error_str)
>> >  {
>> > nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
>> > +   brw_nir_apply_sampler_key(shader, compiler->devinfo, &key->tex, true);
>> > brw_postprocess_nir(shader, compiler->devinfo, true);
>> >
>> > prog_data->local_size[0] = shader->info.cs.local_size[0];
>> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
>> > b/src/mesa/drivers/dri/i965/brw_fs.h
>> > index 2dfcab1..8a181d7 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_fs.h
>> > +++ b/src/mesa/drivers/dri/i965/brw_fs.h
>> > @@ -217,8 +217,6 @@ public:
>> > void emit_interpolation_setup_gen4();
>> > void emit_interpolation_setup_gen6();
>> > void compute_sample_position(fs_reg dst, fs_reg int_sample_pos);
>> > -   fs_reg rescale_texcoord(fs_reg coordinate, int coord_components,
>> > -   bool is_rect, uint32_t sampler);
>> > void emit_texture(ir_texture_opcode op,
>> >   const glsl_type *dest_type,
>> >   fs_reg coordinate, int components,
>> > @@ -229,7 +227,6 @@ public:
>> >   fs_reg mcs,
>> >   int gather_component,
>> >   bool is_cube_array,
>> > - bool is_rect,
>> >   uint32_t sampler,
>> >   fs_reg sampler_reg);
>> > fs_reg emit_mcs_fetch(const fs_reg &coordinate, unsigned components,
>> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
>> > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> > index 02b9f5b..3d83d7c 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> > @@ -2411,8 +2411,6 @@ fs_visitor::nir_emit_texture(const fs_builder , 
>> > nir_tex_instr *instr)
>> >
>> > int gather_component = instr->component;
>> >
>> > -   bool is_rect = instr->sampler_dim == GLSL_SAMPLER_DIM_RECT;
>> > -
>> > bool is_cube_array = instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE &&
>> >  instr->is_array;
>> >
>> > @@ -2549,7 +2547,7 @@ fs_visitor::nir_emit_texture(const fs_builder , 
>> > nir_tex_instr *instr)
>> > emit_texture(op, dest_type, coordinate, instr->coord_components,
>> >  shadow_comparitor, lod, lod2, lod_components, 
>> > sample_index,
>> >  tex_offset, mcs, gather_component,
>> > -is_cube_array, is_rect, sampler, sampler_reg);
>> > +

Re: [Mesa-dev] [PATCH] r600g: Support TGSI_SEMANTIC_HELPER_INVOCATION

2015-11-16 Thread Ilia Mirkin

On Mon, Nov 16, 2015 at 8:31 AM, Nicolai Hähnle  wrote:
> Hi Glenn,
>
> On 14.11.2015 00:11, Glenn Kennard wrote:
>>
>> On Fri, 13 Nov 2015 18:57:28 +0100, Nicolai Hähnle 
>> wrote:
>>
>>> On 13.11.2015 00:14, Glenn Kennard wrote:

>>>> Signed-off-by: Glenn Kennard 
>>>> ---
>>>> Maybe there is a better way to check if a thread is a helper invocation?
>>>
>>>
>>> Is ctx->face_gpr guaranteed to be initialized when
>>> load_helper_invocation is called?
>>>
>>
>> allocate_system_value_inputs() sets that if needed, and is called before
>> parsing any opcodes.
>
>
> Sorry, you're right, I missed the second change to the inputs array there.
>
>
>>> Aside, I'm not sure I understand correctly what this is supposed to
>>> do. The values you're querying are related to multi-sampling, but my
>>> understanding has always been that helper invocations can also happen
>>> without multi-sampling: you always want to process 2x2 quads of pixels
>>> at a time to be able to compute derivatives for texture sampling. When
>>> the boundary of primitive intersects such a quad, you get helper
>>> invocations outside the primitive.
>>>
>>
>> Non-MSAA buffers act just like 1 sample buffers with regards to the
>> coverage mask supplied by the hardware, so helper invocations which have
>> no coverage get a 0 for the mask value, and normal fragments get 1.
>> Works with the piglit test case posted at least...
>
>
> Here's why I'm still skeptical: According to the GLSL spec, the fragment
> shader is only run once per pixel by default, even when MSAA is enabled.
> _However_, if a shader statically accesses the SampleID, _then_ it must be
> run once per fragment. The way I understand it, your change forces the
> fragment shader to access SampleID, even when people ostensibly use
> HelperInvocation in the hope of optimizing something.

GPUs don't operate based on GLSL specs. Per-sample shading is enabled
separately.

>
> In the usual MSAA operation of only running the fragment shader once per
> pixel, HelperInvocation should be the same as SampleMask != 0, right? It
> seems like the right thing to do is to _not_ allocate the
> TGSI_SEMANTIC_SAMPLEID when TGSI_SEMANTIC_HELPER_INVOCATION is used, and
> then use different code paths in load_helper_invocation based on which of
> the source registers are actually there.

This, however, is a good point -- for MSAA presumably the sample id
will always work out to 0, but the bottom bit of the sample mask may
not be set. In the non-SSAA case, should probably check if the whole
mask != 0.

  -ilia
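
To make the distinction discussed above concrete, here is a minimal sketch of the two checks. It is not the r600g code (the patch itself is not quoted in this message); the function names and the 32-bit mask width are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shader runs once per pixel: the invocation is a helper only when
 * no sample of the pixel is covered, i.e. the whole mask is zero. */
bool is_helper_per_pixel(uint32_t coverage_mask)
{
    return coverage_mask == 0;
}

/* Shader runs once per sample: only the bit belonging to this
 * invocation's sample id is relevant. */
bool is_helper_per_sample(uint32_t coverage_mask, unsigned sample_id)
{
    assert(sample_id < 32);
    return (coverage_mask & (UINT32_C(1) << sample_id)) == 0;
}
```

With a mask of 0x2 and sample id 0, the per-sample check reports a helper invocation while the per-pixel check does not, which is the mismatch described above when the bottom bit of the mask is not set.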

Re: [Mesa-dev] [PATCH v5 5/7] glsl: Add precision information to ir_variable

2015-11-16 Thread Tapani Pälli



On 11/16/2015 01:35 PM, Tapani Pälli wrote:
>
> On 11/16/2015 01:29 PM, Samuel Iglesias Gonsálvez wrote:
>> Hello Ilia, Tapani:
>>
>> I have reproduced the issue with a piglit test but not with the trace
>> uploaded in the bug report :-(
>>
>> The piglit test was: bin/arb_shader_storage_buffer_object-maxblocks
>>
>> I have upload a branch with some fixes at Igalia's mesa repo:
>>
>> Git repo: https://github.com/Igalia/mesa.git
>> Branch: wip/siglesias/precision-fixes
>>
>> But as this error might come from other initializations that I might
>> overlook:
>> * Ilia: Could you test if this issue is still happening to you? As I
>> cannot reproduce it locally, I might be forgetting something.
>> * Tapani: Could you do a quick run on CTS to check I have not broken
>> anything?
>
> Sure thing, I'll run testing. FWIW one of the patches was identical to
> my fix sent for fixing tessellation shader problems:
>
> http://lists.freedesktop.org/archives/mesa-dev/2015-November/100396.html

No CTS regressions with these patches, I've gone through these and
changes look good to me!

>> Thanks!
>>
>> Sam



Re: [Mesa-dev] llvm TGSI backend (WIP) questions

2015-11-16 Thread Hans de Goede


Hi,

On 13-11-15 19:51, Tom Stellard wrote:
> On Fri, Nov 13, 2015 at 02:46:52PM +0100, Hans de Goede wrote:
>> Hi All,
>>
>> So as discussed I've started working on a TGSI backend for
>> llvm to use as a way to get compute going on nouveau (and other gpu-s).
>>
>> I'm still learning all the ins and outs of llvm so I do not have
>> much to show yet.
>>
>> I've rebased Francisco's (curro's) latest version on top of llvm
>> trunk, and added a commit on top to actually get it to build with the
>> latest trunk. So currently I'm at the point where I've just
>> taken Francisco's code, and made it compile, no more and no less.
>>
>> I have a git repo with this work available here:
>>
>> http://cgit.freedesktop.org/~jwrdegoede/llvm/
>>
>> So the next step would be to test this and see if it actually
>> does anything, questions:
>>
>> 1) Does anyone have a simple test case / command where I can
>> invoke just llvm and get TGSI asm output to check ?
>
> The easiest way to do this is with the llc tool which ships with llvm.
> It compiles LLVM IR to target code, which in this case is tgsi.
> I would recommend taking one of the simple examples from
> test/CodeGen/AMDGPU (you may need to get these from llvm trunk, not sure
> what llvm version you are using).
>
> To use llc:
>
> llc -march=tgsi input.ll -o -
>
> This will output TGSI.
>
> If you want to use clang to compile OpenCL C kernels to TGSI you will
> need to teach clang about the TGSI target by implementing a
> sub-class of TargetInfo in lib/Basic/Targets.cpp.  Look at the
> AMDGPU target for examples, but I recommend starting with llc.
>
>> 2) Assuming I get the above to (somewhat) work, is there a
>> way to make llvm show the output of the various intermediate
>> passes in a human readable form ?
>
> You can pass -print-before-all or -print-after-all to dump the
> intermediate forms.


Thanks this is exactly what I was looking for. I'll send another
status update when I've something worthwhile to report :)

Regards,

Hans

Re: [Mesa-dev] [PATCH 00/11] i965/nir: Do texture rectangle lowering in NIR

2015-11-16 Thread Rob Clark

On Sat, Nov 14, 2015 at 1:00 PM, Jason Ekstrand  wrote:
> On Sat, Nov 14, 2015 at 9:44 AM, Rob Clark  wrote:
>> On Sat, Nov 14, 2015 at 12:30 PM, Jason Ekstrand  
>> wrote:
>>> On Sat, Nov 14, 2015 at 8:58 AM, Rob Clark  wrote:
>>>> On Sat, Nov 14, 2015 at 11:01 AM, Jason Ekstrand  
>>>> wrote:
>>>>> On Thu, Nov 12, 2015 at 7:30 AM, Iago Toral  wrote:
>>>>>> On Thu, 2015-11-12 at 16:23 +0100, Iago Toral wrote:
>>>>>>> Patches 1-4 are,
>>>>>>> Reviewed-by: Iago Toral Quiroga 
>>>>>>>
>>>>>>> Patch 5 seems to be missing.
>>>>>
>>>>> If it helps to calm reviewer's minds, I ran patches 1-5 with this patch 
>>>>> on top:
>>>>>
>>>>> http://cgit.freedesktop.org/~jekstrand/mesa/commit/?h=wip/nir-clone
>>>>>
>>>>> Zero regressions in piglit, dEQP, and the CTS.
>>>>
>>>> imho, please push something like this to master, w/ perhaps an env-var
>>>> switch (ofc just for debug builds)..  this way we can work nir_clone
>>>> testing into normal CI test cycle, and protect against future
>>>> difficult-to-track-down breakage
>>>
>>> I thought about doing that but it didn't really work very well with
>>> patch 6.  Also, by the time we get to patch 7, it's getting tested
>>> pretty well. About the only thing that doesn't get tested there is
>>> registers.  I'm not opposed to adding support for testing it in CI,
>>> but I don't want to dirty up an API to do so if it can be avoided.
>>> Would you be ok with cloning in a few key places?
>>
>> Well, I prefer testing between each stage.. it's a little brute-force,
>> but it ensures we don't miss something that only appears between
>> certain stages, now or in the future.  The few-key-places approach is
>> certainly better than nothing.
>>
>> I guess the 'dirty up an API' bit referred to returning nir_shader's?
>> I don't think that is *that* horrible a price to pay..
>
> I'm more concerned about the fact that you get out a pointer that may
> or may not be the one you passed in and we may or may not have deleted
> the one you passed in and, to make things better, if you accidentally
> ignore the return value it will work fine unless INTEL_NIR_CLONE is
> enabled.  I think we're going to find ourselves breaking the nir_clone
> testing code more often than breaking nir_clone.

hmm, I don't think it is *that* hard, plus if NIR_TEST_CLONE is
enabled on a regular basis in CI tests, it shouldn't be very hard to
keep it all working properly.

(we should put the pass-runner macros plus env-var for running clone
test somewhere in nir, rather than i965, so all the drivers are
following the same pattern, but that is a different topic)

BR,
-R
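
The concern quoted above (a pass runner that may hand back a different pointer and free the one passed in) can be shown with a small sketch. Everything here is a stand-in: `struct shader` is not `nir_shader`, and `run_pass_maybe_clone` is a hypothetical runner, not the macro from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for a shader object; not the real nir_shader. */
struct shader {
    int generation;
};

/* Hypothetical pass runner: applies a "pass" and, when clone testing is
 * enabled, returns a fresh copy and frees the shader that was passed in. */
struct shader *run_pass_maybe_clone(struct shader *s, bool test_clone)
{
    assert(s != NULL);
    s->generation++;               /* the "pass" itself */

    if (!test_clone)
        return s;                  /* same pointer: ignoring the result still works */

    struct shader *copy = malloc(sizeof(*copy));
    assert(copy != NULL);
    *copy = *s;
    free(s);                       /* the caller's old pointer is now dangling */
    return copy;
}
```

A caller that writes `run_pass_maybe_clone(s, flag);` without reassigning `s` behaves correctly only while `flag` is false, so the bug stays hidden until clone testing is switched on.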

> --Jason
>
>> BR,
>> -R
>>
>>> --Jason
>>>
>>>> (and you can even pre-emptively slap my r-b on that, since I'm happy
>>>> however that is accomplished..)
>>>>
>>>> BR,
>>>> -R
>>>>
>>>>>> Oh never mind, I've just seen your reply to the thread pointing to the
>>>>>> repository.
>>>>>>
>>>>>> Iago
>>>>>>
>>>>>>> Iago
>>>>>>>
>>>>>>> On Wed, 2015-11-11 at 17:23 -0800, Jason Ekstrand wrote:
>>>>>>> > On older hardware (Iron Lake and below), we can't support texture
>>>>>>> > rectangle natively.  Sandy Bridge through Haswell can support it but
>>>>>>> > don't support the GL_CLAMP wrap mode natively.  It isn't until
>>>>>>> > Broadwell that GL_CLAMP is supported together with
>>>>>>> > GL_TEXTURE_RECTANGLE in hardware.  In the cases where it isn't
>>>>>>> > supported, we have to fake it by dividing by the texture size.
>>>>>>> >
>>>>>>> > Previously, we had a rescale_texcoord function that added a uniform
>>>>>>> > to hold the texture coordinate and used that to rescale/clamp the
>>>>>>> > texture coordinates.  For a while now, nir_lower_tex has been able to
>>>>>>> > lower texture rectangle to a textureSize and a regular texture2D
>>>>>>> > operation.  This series makes i965 use the nir_lower_tex path
>>>>>>> > instead.  Incidentally, this fixes texture rectangle support in
>>>>>>> > vertex and geometry shaders on Haswell and below.
>>>>>>> > (The backend lowering was only ever done in the FS backend.)
>>>>>>> >
>>>>>>> > Since this is the first time we're doing any sort of shader variants
>>>>>>> > in NIR, the first several passes add the infrastructure to do so.
>>>>>>> > Two of these patches are from Ken, two are from Rob, and one
>>>>>>> > (nir_clone itself) is my rendition but heavily based on what Rob did
>>>>>>> > only with less hashing.
>>>>>>> >
>>>>>>> > Jason Ekstrand (7):
>>>>>>> >   nir: support to clone shaders
>>>>>>> >   i965/nir: Split shader optimization and lowering into three stages
>>>>>>> >   i965: Move postprocess_nir to codegen time
>>>>>>> >   nir/lower_tex: Report progress
>>>>>>> >   nir/lower_tex: Set the dest_type for txs instructions
>>>>>>> >   i965/fs: Don't allow SINT32 as a

Re: [Mesa-dev] [PATCH 32/36] glsl: Translate atomic intrinsic functions on shared variables

2015-11-16 Thread Iago Toral

On Sat, 2015-11-14 at 13:44 -0800, Jordan Justen wrote:
> When an intrinsic atomic operation is used on a shared variable, we
> translate it to a new 'share variable' specific intrinsic function
> call.
> 
> For example, add call to __intrinsic_atomic_add when used on a shared
> variable will be translated to a call to
> __intrinsic_atomic_add_shared.
> 
> Signed-off-by: Jordan Justen 
> ---
>  src/glsl/lower_shared_reference.cpp | 151 
> 
>  1 file changed, 151 insertions(+)
> 
> diff --git a/src/glsl/lower_shared_reference.cpp 
> b/src/glsl/lower_shared_reference.cpp
> index 810c6b6..7ff2c0c 100644
> --- a/src/glsl/lower_shared_reference.cpp
> +++ b/src/glsl/lower_shared_reference.cpp
> @@ -79,6 +79,10 @@ public:
> ir_visitor_status visit_enter(ir_assignment *ir);
> void handle_assignment(ir_assignment *ir);
>  
> +   ir_call *lower_shared_atomic_intrinsic(ir_call *ir);
> +   ir_call *check_for_shared_atomic_intrinsic(ir_call *ir);
> +   ir_visitor_status visit_enter(ir_call *ir);
> +
> unsigned get_shared_offset(const ir_variable *);
>  
> ir_call *shared_load(const struct glsl_type *type, ir_rvalue *offset);
> @@ -337,6 +341,153 @@ lower_shared_reference_visitor::shared_load(const 
> struct glsl_type *type,
> return new(mem_ctx) ir_call(sig, deref_result, _params);
>  }
>  
> +/* Lowers the intrinsic call to a new internal intrinsic that swaps the
> + * access to the buffer variable in the first parameter by an offset
> + * and block index. This involves creating the new internal intrinsic
> + * (i.e. the new function signature).
> + */
> +ir_call *
> +lower_shared_reference_visitor::lower_shared_atomic_intrinsic(ir_call *ir)
> +{
> +   /* Shared atomics usually have 2 parameters, the shared variable and an
> +* integer argument. The exception is CompSwap, that has an additional
> +* integer parameter.
> +*/
> +   int param_count = ir->actual_parameters.length();
> +   assert(param_count == 2 || param_count == 3);
> +
> +   /* First argument must be a scalar integer buffer variable */
> +   exec_node *param = ir->actual_parameters.get_head();
> +   ir_instruction *inst = (ir_instruction *) param;
> +   assert(inst->ir_type == ir_type_dereference_variable ||
> +  inst->ir_type == ir_type_dereference_array ||
> +  inst->ir_type == ir_type_dereference_record ||
> +  inst->ir_type == ir_type_swizzle);
> +
> +   ir_rvalue *deref = (ir_rvalue *) inst;
> +   assert(deref->type->is_scalar() && deref->type->is_integer());
> +
> +   ir_variable *var = deref->variable_referenced();
> +   assert(var);
> +
> +   /* Compute the offset to the start if the dereference and the
> +* block index
> +*/
> +   mem_ctx = ralloc_parent(shader->ir);
> +
> +   ir_rvalue *offset = NULL;
> +   unsigned const_offset = get_shared_offset(var);
> +   bool row_major;
> +   int matrix_columns;
> +   const glsl_type *iface = var->get_interface_type();
> +   unsigned packing =
> +  iface ? iface->interface_packing : GLSL_INTERFACE_PACKING_STD430;
> +   buffer_access_type = shared_atomic_access;
> +
> +   setup_buffer_access(var, deref,
> +                       &offset, &const_offset,
> +                       &row_major, &matrix_columns, packing);
> +
> +   assert(offset);
> +   assert(!row_major);
> +   assert(matrix_columns == 1);
> +
> +   ir_rvalue *deref_offset =
> +  add(offset, new(mem_ctx) ir_constant(const_offset));
> +
> +   /* Create the new internal function signature that will take a block
> +* index and offset instead of a buffer variable
> +*/
> +   exec_list sig_params;
> +   ir_variable *sig_param = new(mem_ctx)
> +  ir_variable(glsl_type::uint_type, "offset" , ir_var_function_in);
> +   sig_params.push_tail(sig_param);
> +
> +   const glsl_type *type = deref->type->base_type == GLSL_TYPE_INT ?
> +  glsl_type::int_type : glsl_type::uint_type;
> +   sig_param = new(mem_ctx)
> + ir_variable(type, "data1", ir_var_function_in);
> +   sig_params.push_tail(sig_param);
> +
> +   if (param_count == 3) {
> +  sig_param = new(mem_ctx)
> +ir_variable(type, "data2", ir_var_function_in);
> +  sig_params.push_tail(sig_param);
> +   }
> +
> +   ir_function_signature *sig =
> +  new(mem_ctx) ir_function_signature(deref->type,
> + compute_shader_enabled);
> +   assert(sig);
> +   sig->replace_parameters(&sig_params);
> +   sig->is_intrinsic = true;
> +
> +   char func_name[64];
> +   sprintf(func_name, "%s_shared", ir->callee_name());
> +   ir_function *f = new(mem_ctx) ir_function(func_name);
> +   f->add_signature(sig);
> +
> +   /* Now, create the call to the internal intrinsic */
> +   exec_list call_params;
> +   call_params.push_tail(deref_offset);
> +   param = ir->actual_parameters.get_head()->get_next();
> +   ir_rvalue *param_as_rvalue = ((ir_instruction *) param)->as_rvalue();
> +
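
The naming step in the quoted patch (appending "_shared" to the callee name with sprintf into a 64-byte buffer) can be sketched in a bounds-checked form. This is only an illustration; `shared_intrinsic_name` is a hypothetical helper and is not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Builds "<callee>_shared" in `out`. Returns false, without writing,
 * when the result (including the terminating NUL) would not fit. */
bool shared_intrinsic_name(char *out, size_t out_size, const char *callee)
{
    assert(out != NULL && callee != NULL);

    if (strlen(callee) + strlen("_shared") + 1 > out_size)
        return false;

    snprintf(out, out_size, "%s_shared", callee);
    return true;
}
```

For the example in the commit message, passing "__intrinsic_atomic_add" yields "__intrinsic_atomic_add_shared".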

Re: [Mesa-dev] [PATCH v5 5/7] glsl: Add precision information to ir_variable

2015-11-16 Thread Ilia Mirkin

On Mon, Nov 16, 2015 at 11:29 AM, Samuel Iglesias Gonsálvez
 wrote:
>
>
> On 16/11/15 13:07, Tapani Pälli wrote:
>>
>> On 11/16/2015 01:35 PM, Tapani Pälli wrote:
>>>
>>>
>>> On 11/16/2015 01:29 PM, Samuel Iglesias Gonsálvez wrote:
>>>> Hello Ilia, Tapani:
>>>>
>>>> I have reproduced the issue with a piglit test but not with the trace
>>>> uploaded in the bug report :-(
>>>>
>>>> The piglit test was: bin/arb_shader_storage_buffer_object-maxblocks
>>>>
>>>> I have upload a branch with some fixes at Igalia's mesa repo:
>>>>
>>>> Git repo: https://github.com/Igalia/mesa.git
>>>> Branch: wip/siglesias/precision-fixes
>>>>
>>>> But as this error might come from other initializations that I might
>>>> overlook:
>>>> * Ilia: Could you test if this issue is still happening to you? As I
>>>> cannot reproduce it locally, I might be forgetting something.
>>>> * Tapani: Could you do a quick run on CTS to check I have not broken
>>>> anything?
>>>
>>> Sure thing, I'll run testing. FWIW one of the patches was identical to
>>> my fix sent for fixing tessellation shader problems:
>>>
>>> http://lists.freedesktop.org/archives/mesa-dev/2015-November/100396.html
>>
>> No CTS regressions with these patches, I've gone through these and
>> changes look good to me!
>>
>>
>
> OK, once Ilia replies that the issue is fixed with those patches, I will
> send them for review to the mailing list :-)

I won't have time to look until tonight. However the repro steps were
pretty simple... download the trace and run through valgrind. Probably
tons of other ways to trigger it too, of course... I'd esp look for
piglits that have uniform structs.

  -ilia

Re: [Mesa-dev] [PATCH v3 07/14] glsl: move stream layout max validation

2015-11-16 Thread Samuel Iglesias Gonsálvez



On 14/11/15 14:58, Timothy Arceri wrote:
> On Sun, 2015-11-15 at 00:42 +1100, Timothy Arceri wrote:
>> From: Timothy Arceri 
>>
>> This validation is moved later so we can validate the
>> max value when compile time constant support is added in a
>> later patch.
>> ---
>>  src/glsl/ast_to_hir.cpp | 22 --
>>  src/glsl/ast_type.cpp   | 14 --
>>  2 files changed, 20 insertions(+), 16 deletions(-)
>>
>> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
>> index 53faacf..dedc39f 100644
>> --- a/src/glsl/ast_to_hir.cpp
>> +++ b/src/glsl/ast_to_hir.cpp
>> @@ -2522,8 +2522,24 @@ process_qualifier_constant(struct
>> _mesa_glsl_parse_state *state,
>>  }
>>  
>>  static bool
>> +validate_stream_qualifier(YYLTYPE *loc, struct _mesa_glsl_parse_state
>> *state,
>> +  unsigned stream)
>> +{
>> +   if (stream >= state->ctx->Const.MaxVertexStreams) {
>> +  _mesa_glsl_error(loc, state,
>> +   "invalid stream specified %d is larger than "
>> +   "MAX_VERTEX_STREAMS - 1 (%d).",
>> +   stream, state->ctx->Const.MaxVertexStreams - 1);
>> +  return false;
>> +   }
>> +
>> +   return true;
>> +}
>> +
>> +static void
>>  validate_binding_qualifier(struct _mesa_glsl_parse_state *state,
>> YYLTYPE *loc,
>> +   ir_variable *var,
> 
> This and changing the function to not return bool are meant to be in the next
> patch. I've fixed this up locally.
> 

OK, with those changes and assuming no piglit regressions, patches 1-7 are:

Reviewed-by: Samuel Iglesias Gonsálvez 

Sam
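
For readers following the hunk above, the range check performed by validate_stream_qualifier can be sketched standalone. `stream_in_range` is a hypothetical name, the limit is passed in directly rather than read from the GL context, and the message goes to stderr with %u instead of going through _mesa_glsl_error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Valid stream indices are 0 .. max_vertex_streams - 1. */
bool stream_in_range(unsigned stream, unsigned max_vertex_streams)
{
    assert(max_vertex_streams > 0);

    if (stream >= max_vertex_streams) {
        fprintf(stderr,
                "invalid stream specified %u is larger than "
                "MAX_VERTEX_STREAMS - 1 (%u).\n",
                stream, max_vertex_streams - 1);
        return false;
    }

    return true;
}
```

With a limit of 4, streams 0 through 3 pass and stream 4 is rejected.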


Re: [Mesa-dev] [PATCH v5 5/7] glsl: Add precision information to ir_variable

2015-11-16 Thread Samuel Iglesias Gonsálvez



On 16/11/15 13:07, Tapani Pälli wrote:
> 
> On 11/16/2015 01:35 PM, Tapani Pälli wrote:
>>
>>
>> On 11/16/2015 01:29 PM, Samuel Iglesias Gonsálvez wrote:
>>> Hello Ilia, Tapani:
>>>
>>> I have reproduced the issue with a piglit test but not with the trace
>>> uploaded in the bug report :-(
>>>
>>> The piglit test was: bin/arb_shader_storage_buffer_object-maxblocks
>>>
>>> I have upload a branch with some fixes at Igalia's mesa repo:
>>>
>>> Git repo: https://github.com/Igalia/mesa.git
>>> Branch: wip/siglesias/precision-fixes
>>>
>>> But as this error might come from other initializations that I might
>>> overlook:
>>> * Ilia: Could you test if this issue is still happening to you? As I
>>> cannot reproduce it locally, I might be forgetting something.
>>> * Tapani: Could you do a quick run on CTS to check I have not broken
>>> anything?
>>
>> Sure thing, I'll run testing. FWIW one of the patches was identical to
>> my fix sent for fixing tessellation shader problems:
>>
>> http://lists.freedesktop.org/archives/mesa-dev/2015-November/100396.html
> 
> No CTS regressions with these patches, I've gone through these and
> changes look good to me!
> 
> 

OK, once Ilia replies that the issue is fixed with those patches, I will
send them for review to the mailing list :-)

Thanks!

Sam

>>> Thanks!
>>>
>>> Sam
>>>

Re: [Mesa-dev] [PATCH v5 5/7] glsl: Add precision information to ir_variable

2015-11-16 Thread Ilia Mirkin

On Mon, Nov 16, 2015 at 11:42 AM, Samuel Iglesias Gonsálvez
 wrote:
>
>
> On 16/11/15 17:34, Ilia Mirkin wrote:
>> On Mon, Nov 16, 2015 at 11:29 AM, Samuel Iglesias Gonsálvez
>>  wrote:
>>>
>>>
>>> On 16/11/15 13:07, Tapani Pälli wrote:

>>>> On 11/16/2015 01:35 PM, Tapani Pälli wrote:
>>>>>
>>>>>
>>>>> On 11/16/2015 01:29 PM, Samuel Iglesias Gonsálvez wrote:
>>>>>> Hello Ilia, Tapani:
>>>>>>
>>>>>> I have reproduced the issue with a piglit test but not with the trace
>>>>>> uploaded in the bug report :-(
>>>>>>
>>>>>> The piglit test was: bin/arb_shader_storage_buffer_object-maxblocks
>>>>>>
>>>>>> I have upload a branch with some fixes at Igalia's mesa repo:
>>>>>>
>>>>>> Git repo: https://github.com/Igalia/mesa.git
>>>>>> Branch: wip/siglesias/precision-fixes
>>>>>>
>>>>>> But as this error might come from other initializations that I might
>>>>>> overlook:
>>>>>> * Ilia: Could you test if this issue is still happening to you? As I
>>>>>> cannot reproduce it locally, I might be forgetting something.
>>>>>> * Tapani: Could you do a quick run on CTS to check I have not broken
>>>>>> anything?
>>>>>
>>>>> Sure thing, I'll run testing. FWIW one of the patches was identical to
>>>>> my fix sent for fixing tessellation shader problems:
>>>>>
>>>>> http://lists.freedesktop.org/archives/mesa-dev/2015-November/100396.html
>>>>
>>>> No CTS regressions with these patches, I've gone through these and
>>>> changes look good to me!


>>>
>>> OK, once Ilia replies that the issue is fixed with those patches, I will
>>> send them for review to the mailing list :-)
>>
>> I won't have time to look until tonight. However the repro steps were
>> pretty simple... download the trace and run through valgrind. Probably
>> tons of other ways to trigger it too, of course... I'd esp look for
>> piglits that have uniform structs.
>>
>
> The problem is that I could not reproduce it with the trace. That's why
> I am asking.
>
> I reproduced it with a piglit test, but maybe precision is uninitialized
> in other cases. Tomorrow I will do some more testing, just in case.

Well, irrespective of other cases, if the things you're fixing are
real fixes, no need to wait on me. I'll be sure to complain again if I
still see problems. FWIW I did see them with nouveau, not i965. I
suspect llvmpipe would take the same paths.

  -ilia

Re: [Mesa-dev] [PATCH v5 5/7] glsl: Add precision information to ir_variable

2015-11-16 Thread Samuel Iglesias Gonsálvez



On 16/11/15 12:35, Tapani Pälli wrote:
> 
> 
> On 11/16/2015 01:29 PM, Samuel Iglesias Gonsálvez wrote:
>> Hello Ilia, Tapani:
>>
>> I have reproduced the issue with a piglit test but not with the trace
>> uploaded in the bug report :-(
>>
>> The piglit test was: bin/arb_shader_storage_buffer_object-maxblocks
>>
>> I have uploaded a branch with some fixes at Igalia's mesa repo:
>>
>> Git repo: https://github.com/Igalia/mesa.git
>> Branch: wip/siglesias/precision-fixes
>>
>> But as this error might come from other initializations that I might
>> overlook:
>> * Ilia: Could you test if this issue is still happening to you? As I
>> cannot reproduce it locally, I might be forgetting something.
>> * Tapani: Could you do a quick run on CTS to check I have not broken
>> anything?
> 
> Sure thing, I'll run testing. FWIW one of the patches was identical to
> my fix sent for fixing tessellation shader problems:
> 
> http://lists.freedesktop.org/archives/mesa-dev/2015-November/100396.html
> 

OK, thanks. I have reviewed your patch.

Sam

Re: [Mesa-dev] [PATCH 31/36] glsl: Check for SSBO variable in SSBO atomic lowering

2015-11-16 Thread Iago Toral

On Sat, 2015-11-14 at 13:44 -0800, Jordan Justen wrote:
> When an atomic function is called, we need to check to see if it is
> for an SSBO variable before lowering it to the SSBO specific intrinsic
> function.
> 
> Signed-off-by: Jordan Justen 
> Cc: Samuel Iglesias Gonsalvez 
> Cc: Iago Toral Quiroga 
> ---
>  src/glsl/lower_ubo_reference.cpp | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/src/glsl/lower_ubo_reference.cpp 
> b/src/glsl/lower_ubo_reference.cpp
> index a64e9d7..d083936 100644
> --- a/src/glsl/lower_ubo_reference.cpp
> +++ b/src/glsl/lower_ubo_reference.cpp
> @@ -855,6 +855,20 @@ 
> lower_ubo_reference_visitor::lower_ssbo_atomic_intrinsic(ir_call *ir)
>  ir_call *
>  lower_ubo_reference_visitor::check_for_ssbo_atomic_intrinsic(ir_call *ir)
>  {
> +   exec_list& params = ir->actual_parameters;
> +
> +   if (params.length() < 2)
> +  return ir;
> +
> +   ir_rvalue *rvalue =
> +  ((ir_instruction *) params.get_head())->as_rvalue();
> +   if (!rvalue)
> +  return ir;
> +
> +   ir_variable *var = rvalue->variable_referenced();
> +   if (!var || !var->is_in_buffer_block())

The above should be:

if (!var || !var->is_in_shader_storage_block())

With that change,
Reviewed-by: Iago Toral Quiroga 

> +  return ir;
> +
> const char *callee = ir->callee_name();
> if (!strcmp("__intrinsic_atomic_add", callee) ||
> !strcmp("__intrinsic_atomic_min", callee) ||
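
To illustrate the review comment above, here is a minimal sketch in plain C
of why the narrower predicate matters for the guard. Everything in it
(`sketch_var`, `sketch_mode`, `in_buffer_block`, `in_shader_storage_block`,
`should_lower_ssbo_atomic`) is a hypothetical stand-in, not the Mesa API. It
assumes that "in a buffer block" covers members of both uniform and shader
storage blocks, while "in a shader storage block" covers only the latter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical storage classes; not Mesa's ir_variable_mode enum. */
enum sketch_mode {
   SKETCH_MODE_LOCAL,
   SKETCH_MODE_UNIFORM,
   SKETCH_MODE_SHADER_STORAGE
};

struct sketch_var {
   enum sketch_mode mode;
   bool has_interface_block;
};

/* Assumed meaning of the broader check: UBO or SSBO member. */
static bool in_buffer_block(const struct sketch_var *v)
{
   return v->has_interface_block &&
          (v->mode == SKETCH_MODE_UNIFORM ||
           v->mode == SKETCH_MODE_SHADER_STORAGE);
}

/* Assumed meaning of the narrower check: SSBO member only. */
static bool in_shader_storage_block(const struct sketch_var *v)
{
   return v->has_interface_block && v->mode == SKETCH_MODE_SHADER_STORAGE;
}

/* Guard in the spirit of the patch: leave the call alone unless the first
 * parameter refers to an SSBO variable.
 */
static bool should_lower_ssbo_atomic(const struct sketch_var *var)
{
   return var != NULL && in_shader_storage_block(var);
}

static void sketch_example(void)
{
   const struct sketch_var ubo_member = { SKETCH_MODE_UNIFORM, true };
   const struct sketch_var ssbo_member = { SKETCH_MODE_SHADER_STORAGE, true };

   /* The broader predicate accepts the UBO member, the guard does not. */
   assert(in_buffer_block(&ubo_member));
   assert(!should_lower_ssbo_atomic(&ubo_member));
   assert(should_lower_ssbo_atomic(&ssbo_member));
}
```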



[Mesa-dev] [Bug 92706] glBlitFramebuffer refuses to blit RGBA to RGB with MSAA

2015-11-16 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=92706

--- Comment #6 from Neil Roberts  ---
(In reply to EoD from comment #5)

I think we would need some of the nouveau and radeon people to test the Piglit
test on their drivers with the patch in order to ensure that it also works
there before landing it. If you are able to test both patches on your Radeon
card I think that would be a big help.

We'd also need to merge the fix for i965 but that already has some review
thanks to Ben Widawsky so I'm hoping to land it later today.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH 32/36] glsl: Translate atomic intrinsic functions on shared variables

2015-11-16 Thread Iago Toral

On Sat, 2015-11-14 at 13:44 -0800, Jordan Justen wrote:
> When an intrinsic atomic operation is used on a shared variable, we
> translate it to a new 'shared variable' specific intrinsic function
> call.
> 
> For example, a call to __intrinsic_atomic_add used on a shared
> variable will be translated to a call to
> __intrinsic_atomic_add_shared.

I suppose we should name these __intrinsic_atomic_<op>_shared_internal
for consistency with the ssbo atomic intrinsics... or just remove the
'internal' suffix from the ssbo atomics. I think I'd prefer the latter,
we can do this as a separate patch after this lands though.

Iago

> Signed-off-by: Jordan Justen 
> ---
>  src/glsl/lower_shared_reference.cpp | 151 
> 
>  1 file changed, 151 insertions(+)
> 
> diff --git a/src/glsl/lower_shared_reference.cpp 
> b/src/glsl/lower_shared_reference.cpp
> index 810c6b6..7ff2c0c 100644
> --- a/src/glsl/lower_shared_reference.cpp
> +++ b/src/glsl/lower_shared_reference.cpp
> @@ -79,6 +79,10 @@ public:
> ir_visitor_status visit_enter(ir_assignment *ir);
> void handle_assignment(ir_assignment *ir);
>  
> +   ir_call *lower_shared_atomic_intrinsic(ir_call *ir);
> +   ir_call *check_for_shared_atomic_intrinsic(ir_call *ir);
> +   ir_visitor_status visit_enter(ir_call *ir);
> +
> unsigned get_shared_offset(const ir_variable *);
>  
> ir_call *shared_load(const struct glsl_type *type, ir_rvalue *offset);
> @@ -337,6 +341,153 @@ lower_shared_reference_visitor::shared_load(const 
> struct glsl_type *type,
> return new(mem_ctx) ir_call(sig, deref_result, _params);
>  }
>  
> +/* Lowers the intrinsic call to a new internal intrinsic that swaps the
> + * access to the buffer variable in the first parameter by an offset
> + * and block index. This involves creating the new internal intrinsic
> + * (i.e. the new function signature).
> + */
> +ir_call *
> +lower_shared_reference_visitor::lower_shared_atomic_intrinsic(ir_call *ir)
> +{
> +   /* Shared atomics usually have 2 parameters, the shared variable and an
> +* integer argument. The exception is CompSwap, that has an additional
> +* integer parameter.
> +*/
> +   int param_count = ir->actual_parameters.length();
> +   assert(param_count == 2 || param_count == 3);
> +
> +   /* First argument must be a scalar integer buffer variable */
> +   exec_node *param = ir->actual_parameters.get_head();
> +   ir_instruction *inst = (ir_instruction *) param;
> +   assert(inst->ir_type == ir_type_dereference_variable ||
> +  inst->ir_type == ir_type_dereference_array ||
> +  inst->ir_type == ir_type_dereference_record ||
> +  inst->ir_type == ir_type_swizzle);
> +
> +   ir_rvalue *deref = (ir_rvalue *) inst;
> +   assert(deref->type->is_scalar() && deref->type->is_integer());
> +
> +   ir_variable *var = deref->variable_referenced();
> +   assert(var);
> +
> +   /* Compute the offset to the start of the dereference and the
> +* block index
> +*/
> +   mem_ctx = ralloc_parent(shader->ir);
> +
> +   ir_rvalue *offset = NULL;
> +   unsigned const_offset = get_shared_offset(var);
> +   bool row_major;
> +   int matrix_columns;
> +   const glsl_type *iface = var->get_interface_type();
> +   unsigned packing =
> +  iface ? iface->interface_packing : GLSL_INTERFACE_PACKING_STD430;
> +   buffer_access_type = shared_atomic_access;
> +
> +   setup_buffer_access(var, deref,
> +   &offset, &const_offset,
> +   &row_major, &matrix_columns, packing);
> +
> +   assert(offset);
> +   assert(!row_major);
> +   assert(matrix_columns == 1);
> +
> +   ir_rvalue *deref_offset =
> +  add(offset, new(mem_ctx) ir_constant(const_offset));
> +
> +   /* Create the new internal function signature that will take a block
> +* index and offset instead of a buffer variable
> +*/
> +   exec_list sig_params;
> +   ir_variable *sig_param = new(mem_ctx)
> +  ir_variable(glsl_type::uint_type, "offset" , ir_var_function_in);
> +   sig_params.push_tail(sig_param);
> +
> +   const glsl_type *type = deref->type->base_type == GLSL_TYPE_INT ?
> +  glsl_type::int_type : glsl_type::uint_type;
> +   sig_param = new(mem_ctx)
> + ir_variable(type, "data1", ir_var_function_in);
> +   sig_params.push_tail(sig_param);
> +
> +   if (param_count == 3) {
> +  sig_param = new(mem_ctx)
> +ir_variable(type, "data2", ir_var_function_in);
> +  sig_params.push_tail(sig_param);
> +   }
> +
> +   ir_function_signature *sig =
> +  new(mem_ctx) ir_function_signature(deref->type,
> + compute_shader_enabled);
> +   assert(sig);
> +   sig->replace_parameters(&sig_params);
> +   sig->is_intrinsic = true;
> +
> +   char func_name[64];
> +   sprintf(func_name, "%s_shared", ir->callee_name());
> +   ir_function *f = new(mem_ctx) ir_function(func_name);
> +   f->add_signature(sig);
> +
> +   /* Now, create the call to the internal
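
As a small illustration of the renaming described in the commit message, the
sketch below builds the "_shared" name and checks the parameter count (two for
most atomics, three for CompSwap). It is plain C with hypothetical helper
names (`shared_intrinsic_name`, `is_valid_atomic_param_count`). It also uses
snprintf where the quoted patch uses sprintf into a 64-byte buffer; that swap
is a choice of this sketch, made so that truncation can be detected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes "<callee>_shared" into out.
 * Returns 0 on success, -1 if the result would not fit in out_size bytes.
 */
static int shared_intrinsic_name(const char *callee, char *out,
                                 size_t out_size)
{
   int n = snprintf(out, out_size, "%s_shared", callee);
   return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}

/* Shared atomics take the variable plus one integer argument; CompSwap
 * takes one more.
 */
static int is_valid_atomic_param_count(int param_count)
{
   return param_count == 2 || param_count == 3;
}

static void sketch_example(void)
{
   char name[64];

   assert(shared_intrinsic_name("__intrinsic_atomic_add", name,
                                sizeof(name)) == 0);
   assert(strcmp(name, "__intrinsic_atomic_add_shared") == 0);
}
```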

Re: [Mesa-dev] [PATCH 2/2] mesa: do runtime validation of precision varyings only on ES

2015-11-16 Thread Samuel Iglesias Gonsálvez

Please add the spec quote in the commit log. For example:

From OpenGL 4.4, section 4.7 "Precision and Precision Qualifiers":

"For the purposes of determining if an output from one shader stage
matches an input of the next stage, the precision qualifier need not match."

Other than that,

Reviewed-by: Samuel Iglesias Gonsálvez 

I wonder if this check would be better placed inside validate_io(), so we
don't forget about it if validate_io() later does more than check
precision qualifiers. But I don't have a strong opinion about it.

Sam


On 16/11/15 07:44, Tapani Pälli wrote:
> Precision qualifier should be ignored on desktop OpenGL.
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/shader_query.cpp | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
> index 58ba041..dc39f24 100644
> --- a/src/mesa/main/shader_query.cpp
> +++ b/src/mesa/main/shader_query.cpp
> @@ -1413,9 +1413,15 @@ _mesa_validate_pipeline_io(struct gl_pipeline_object 
> *pipeline)
>  
> for (idx = prev + 1; idx < ARRAY_SIZE(pipeline->CurrentProgram); idx++) {
>if (shProg[idx]) {
> - if (!validate_io(shProg[prev]->_LinkedShaders[prev],
> -  shProg[idx]->_LinkedShaders[idx]))
> -return false;
> +
> + /* Since we now only validate precision, we can skip this step for
> +  * desktop GLSL shaders, where the precision qualifier is ignored.
> +  */
> + if (shProg[prev]->IsES || shProg[idx]->IsES) {
> +if (!validate_io(shProg[prev]->_LinkedShaders[prev],
> + shProg[idx]->_LinkedShaders[idx]))
> +   return false;
> + }
>   prev = idx;
>}
> }
> 
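
The loop in the patch can be modelled roughly as follows. This is only a
sketch in plain C under stated assumptions: `sketch_program`, its fields, and
both functions are hypothetical, precision is reduced to a single integer per
interface, and the real validate_io() compares individual varyings rather
than one value per stage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NUM_STAGES 5

/* Hypothetical per-stage program; is_es stands in for the IsES flag. */
struct sketch_program {
   bool is_es;
   int in_precision;
   int out_precision;
};

/* Stand-in for validate_io(): outputs must match the next stage's inputs. */
static bool sketch_validate_io(const struct sketch_program *producer,
                               const struct sketch_program *consumer)
{
   return producer->out_precision == consumer->in_precision;
}

/* Walks the bound stages in order, pairing each with the previous bound
 * stage, and only checks precision when at least one of the two is ES.
 */
static bool
sketch_validate_pipeline_io(const struct sketch_program *const *stages)
{
   int prev = -1;

   for (int idx = 0; idx < SKETCH_NUM_STAGES; idx++) {
      if (!stages[idx])
         continue;

      if (prev >= 0 && (stages[prev]->is_es || stages[idx]->is_es)) {
         if (!sketch_validate_io(stages[prev], stages[idx]))
            return false;
      }
      prev = idx;
   }
   return true;
}

static void sketch_example(void)
{
   /* Desktop programs with mismatched precision still validate. */
   const struct sketch_program vs = { false, 2, 2 };
   const struct sketch_program fs = { false, 1, 1 };
   const struct sketch_program *stages[SKETCH_NUM_STAGES] =
      { &vs, NULL, NULL, NULL, &fs };

   assert(sketch_validate_pipeline_io(stages));
}
```

Moving the ES test into the validation helper itself, as suggested in the
review, would leave the loop unchanged and only shift where the condition
lives.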

Re: [Mesa-dev] [PATCH 11/11] i965: Use nir_lower_tex for texture coordinate lowering

2015-11-16 Thread Iago Toral

On Mon, 2015-11-16 at 11:33 +0100, Iago Toral wrote:
> On Wed, 2015-11-11 at 17:26 -0800, Jason Ekstrand wrote:
> > Previously, we had a rescale_texcoords helper in the FS backend for
> > handling rescaling of texture coordinates.  Now that we can do variants in
> > NIR, we can use nir_lower_tex to do the rescaling for us.  This allows us
> > to delete the i965-specific code and gives us proper TEXTURE_RECTANGLE and
> > GL_CLAMP handling in vertex and geometry shaders.
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs.cpp  |   2 +
> >  src/mesa/drivers/dri/i965/brw_fs.h|   3 -
> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp  |   4 +-
> >  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  | 125 
> > --
> >  src/mesa/drivers/dri/i965/brw_nir.c   |  23 
> >  src/mesa/drivers/dri/i965/brw_nir.h   |   6 ++
> >  src/mesa/drivers/dri/i965/brw_vec4.cpp|   2 +
> >  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |   2 +
> >  8 files changed, 36 insertions(+), 131 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> > b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index b8713ab..c56cafe 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -5468,6 +5468,7 @@ brw_compile_fs(const struct brw_compiler *compiler, 
> > void *log_data,
> > char **error_str)
> >  {
> > nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
> > > +   brw_nir_apply_sampler_key(shader, compiler->devinfo, &key->tex, true);
> 
> This looks like it is part of the post-processing process. In fact, you
> call this right before brw_postprocess_nir() for every stage. Why not
> just add a key parameter to brw_postprocess_nir() and call
> brw_nir_apply_sampler_key from there instead?
> 
> Either way,
> Reviewed-by: Iago Toral Quiroga 
> 
> > brw_postprocess_nir(shader, compiler->devinfo, true);
> >  
> > /* key->alpha_test_func means simulating alpha testing via discards,
> > @@ -5628,6 +5629,7 @@ brw_compile_cs(const struct brw_compiler *compiler, 
> > void *log_data,
> > char **error_str)
> >  {
> > nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
> > > +   brw_nir_apply_sampler_key(shader, compiler->devinfo, &key->tex, true);
> > brw_postprocess_nir(shader, compiler->devinfo, true);
> >  
> > prog_data->local_size[0] = shader->info.cs.local_size[0];
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> > b/src/mesa/drivers/dri/i965/brw_fs.h
> > index 2dfcab1..8a181d7 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.h
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> > @@ -217,8 +217,6 @@ public:
> > void emit_interpolation_setup_gen4();
> > void emit_interpolation_setup_gen6();
> > void compute_sample_position(fs_reg dst, fs_reg int_sample_pos);
> > -   fs_reg rescale_texcoord(fs_reg coordinate, int coord_components,
> > -   bool is_rect, uint32_t sampler);
> > void emit_texture(ir_texture_opcode op,
> >   const glsl_type *dest_type,
> >   fs_reg coordinate, int components,
> > @@ -229,7 +227,6 @@ public:
> >   fs_reg mcs,
> >   int gather_component,
> >   bool is_cube_array,
> > - bool is_rect,
> >   uint32_t sampler,
> >   fs_reg sampler_reg);
> > fs_reg emit_mcs_fetch(const fs_reg , unsigned components,
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > index 02b9f5b..3d83d7c 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > @@ -2411,8 +2411,6 @@ fs_visitor::nir_emit_texture(const fs_builder , 
> > nir_tex_instr *instr)
> >  
> > int gather_component = instr->component;
> >  
> > -   bool is_rect = instr->sampler_dim == GLSL_SAMPLER_DIM_RECT;
> > -
> > bool is_cube_array = instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE &&
> >  instr->is_array;
> >  
> > @@ -2549,7 +2547,7 @@ fs_visitor::nir_emit_texture(const fs_builder , 
> > nir_tex_instr *instr)
> > emit_texture(op, dest_type, coordinate, instr->coord_components,
> >  shadow_comparitor, lod, lod2, lod_components, sample_index,
> >  tex_offset, mcs, gather_component,
> > -is_cube_array, is_rect, sampler, sampler_reg);
> > +is_cube_array, sampler, sampler_reg);
> >  
> > fs_reg dest = get_nir_dest(instr->dest);
> > dest.type = this->result.type;
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> > b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > index 213c912..faf304c 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> > @@ -79,122 +79,6 @@

Re: [Mesa-dev] [PATCH] mesa: error out in indirect draw when vertex bindings mismatch

2015-11-16 Thread Samuel Iglesias Gonsálvez



On 13/11/15 16:55, Tapani Pälli wrote:
> On 11/13/2015 03:40 PM, Samuel Iglesias Gonsálvez wrote:
>>
>> On 13/11/15 11:32, Tapani Pälli wrote:
>>> Patch adds additional mask for tracking which vertex buffer bindings
>>> are set. This array can be directly compared to which vertex arrays
>>> are enabled and should match when drawing.
>>>
>>> Fixes following CTS tests:
>>>
>>> ES31-CTS.draw_indirect.negative-noVBO-arrays
>>> ES31-CTS.draw_indirect.negative-noVBO-elements
>>>
>>> Signed-off-by: Tapani Pälli 
>>> ---
>>>   src/mesa/main/api_validate.c | 13 +
>>>   src/mesa/main/mtypes.h   |  3 +++
>>>   src/mesa/main/varray.c   |  5 +
>>>   3 files changed, 21 insertions(+)
>>>
>>> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
>>> index a490189..e82e89a 100644
>>> --- a/src/mesa/main/api_validate.c
>>> +++ b/src/mesa/main/api_validate.c
>>> @@ -710,6 +710,19 @@ valid_draw_indirect(struct gl_context *ctx,
>>> return GL_FALSE;
>>>  }
>>>   +   /* From OpenGL ES 3.1 spec. section 10.5:
>>> +* "An INVALID_OPERATION error is generated if zero is bound to
>>> +* VERTEX_ARRAY_BINDING, DRAW_INDIRECT_BUFFER or to any enabled
>>> +* vertex array."
>>> +*
>>> +* Here we check that vertex buffer bindings match with enabled
>>> +* vertex arrays.
>>> +*/
>>> +   if (ctx->Array.VAO->_Enabled != ctx->Array.VAO->VertexBindingMask) {
>>> +  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(No VBO bound)", name);
>>> +  return GL_FALSE;
>>> +   }
>>> +
>>>  if (!_mesa_valid_prim_mode(ctx, mode, name))
>>> return GL_FALSE;
>>>   diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>>> index 4efdf1e..6c6187f 100644
>>> --- a/src/mesa/main/mtypes.h
>>> +++ b/src/mesa/main/mtypes.h
>>> @@ -1419,6 +1419,9 @@ struct gl_vertex_array_object
>>>  /** Vertex buffer bindings */
>>>  struct gl_vertex_buffer_binding VertexBinding[VERT_ATTRIB_MAX];
>>>   +   /** Mask indicating which binding points are set. */
>>> +   GLbitfield64 VertexBindingMask;
>>> +
>>>  /** Mask of VERT_BIT_* values indicating which arrays are
>>> enabled */
>>>  GLbitfield64 _Enabled;
>>>   diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
>>> index 887d0c0..0a94c5a 100644
>>> --- a/src/mesa/main/varray.c
>>> +++ b/src/mesa/main/varray.c
>>> @@ -174,6 +174,11 @@ bind_vertex_buffer(struct gl_context *ctx,
>>> binding->Offset = offset;
>>> binding->Stride = stride;
>>>   +  if (vbo == ctx->Shared->NullBufferObj)
>>> + vao->VertexBindingMask &= ~VERT_BIT(index);
>>> +  else
>>> + vao->VertexBindingMask |= VERT_BIT(index);
>>> +
>> Shouldn't it be VERT_BIT_GENERIC()?
> 
> I used VERT_BIT because that is used when enabling vertex arrays and
> this mask should match that one.
> 

For that reason, I think it is VERT_BIT_GENERIC(). See:

http://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/main/varray.c#n759

Or am I missing something?

Sam

>> Sam
>>
>>> vao->NewArrays |= binding->_BoundArrays;
>>>  }
>>>   }
>>>
> 
> 
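
The point under discussion is that the two masks are only comparable if both
are built in the same index space. A minimal sketch in plain C, with
hypothetical names throughout (`sketch_vao`, `SKETCH_BIT`,
`SKETCH_BIT_GENERIC`) and an assumed offset `SKETCH_GENERIC0` for the first
generic attribute slot; the real values live in mtypes.h and are not taken
from it here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BIT(i) (UINT64_C(1) << (i))
/* Assumed slot of the first generic attribute; illustrative value only. */
#define SKETCH_GENERIC0 16
#define SKETCH_BIT_GENERIC(g) SKETCH_BIT(SKETCH_GENERIC0 + (g))

struct sketch_vao {
   uint64_t enabled;       /* which arrays are enabled */
   uint64_t binding_mask;  /* which slots have a non-null buffer bound */
};

static void sketch_enable(struct sketch_vao *vao, unsigned slot)
{
   assert(slot < 64);
   vao->enabled |= SKETCH_BIT(slot);
}

/* Mirrors the set/clear logic of the patch for one slot. */
static void sketch_bind(struct sketch_vao *vao, unsigned slot, bool has_buffer)
{
   assert(slot < 64);
   if (has_buffer)
      vao->binding_mask |= SKETCH_BIT(slot);
   else
      vao->binding_mask &= ~SKETCH_BIT(slot);
}

/* Same comparison as the patch: the two masks must be equal. */
static bool sketch_valid_draw_indirect(const struct sketch_vao *vao)
{
   return vao->enabled == vao->binding_mask;
}

static void sketch_example(void)
{
   struct sketch_vao vao = { 0, 0 };

   /* Enable generic attribute 0, but record the binding with the raw
    * index 0: the masks use different index spaces and never match.
    */
   sketch_enable(&vao, SKETCH_GENERIC0 + 0);
   sketch_bind(&vao, 0, true);
   assert(!sketch_valid_draw_indirect(&vao));
}
```

Which macro is right in bind_vertex_buffer() depends on whether `index` has
already been translated to the attribute slot at that point, which this
sketch does not try to settle.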

Re: [Mesa-dev] [PATCH 30/36] glsl: Replace atomic_ssbo and ssbo_atomic with atomic

2015-11-16 Thread Iago Toral

Reviewed-by: Iago Toral Quiroga 

On Sat, 2015-11-14 at 13:44 -0800, Jordan Justen wrote:
> The atomic functions can also be used with shared variables in compute
> shaders.
> 
> When lowering the intrinsic in lower_ubo_reference, we still create an
> SSBO specific intrinsic since SSBO accesses can be indirectly
> addressed, whereas all compute shader shared variables live in a single
> shared variable area.
> 
> Signed-off-by: Jordan Justen 
> Cc: Samuel Iglesias Gonsalvez 
> Cc: Iago Toral Quiroga 
> ---
>  src/glsl/builtin_functions.cpp   | 230 
> +++
>  src/glsl/lower_ubo_reference.cpp |  18 +--
>  src/glsl/nir/glsl_to_nir.cpp |  16 +--
>  3 files changed, 132 insertions(+), 132 deletions(-)
> 
> diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
> index 1349444..3e767e8 100644
> --- a/src/glsl/builtin_functions.cpp
> +++ b/src/glsl/builtin_functions.cpp
> @@ -759,16 +759,16 @@ private:
> ir_function_signature *_atomic_counter_op(const char *intrinsic,
>   builtin_available_predicate 
> avail);
>  
> -   ir_function_signature 
> *_atomic_ssbo_intrinsic2(builtin_available_predicate avail,
> -  const glsl_type *type);
> -   ir_function_signature *_atomic_ssbo_op2(const char *intrinsic,
> -   builtin_available_predicate avail,
> -   const glsl_type *type);
> -   ir_function_signature 
> *_atomic_ssbo_intrinsic3(builtin_available_predicate avail,
> -  const glsl_type *type);
> -   ir_function_signature *_atomic_ssbo_op3(const char *intrinsic,
> -   builtin_available_predicate avail,
> -   const glsl_type *type);
> +   ir_function_signature *_atomic_intrinsic2(builtin_available_predicate 
> avail,
> + const glsl_type *type);
> +   ir_function_signature *_atomic_op2(const char *intrinsic,
> +  builtin_available_predicate avail,
> +  const glsl_type *type);
> +   ir_function_signature *_atomic_intrinsic3(builtin_available_predicate 
> avail,
> + const glsl_type *type);
> +   ir_function_signature *_atomic_op3(const char *intrinsic,
> +  builtin_available_predicate avail,
> +  const glsl_type *type);
>  
> B1(min3)
> B1(max3)
> @@ -915,53 +915,53 @@ builtin_builder::create_intrinsics()
>  _atomic_counter_intrinsic(shader_atomic_counters),
>  NULL);
>  
> -   add_function("__intrinsic_ssbo_atomic_add",
> -_atomic_ssbo_intrinsic2(shader_storage_buffer_object,
> -glsl_type::uint_type),
> -_atomic_ssbo_intrinsic2(shader_storage_buffer_object,
> -glsl_type::int_type),
> -NULL);
> -   add_function("__intrinsic_ssbo_atomic_min",
> -_atomic_ssbo_intrinsic2(shader_storage_buffer_object,
> -glsl_type::uint_type),
> -_atomic_ssbo_intrinsic2(shader_storage_buffer_object,
> -glsl_type::int_type),
> -NULL);
> -   add_function("__intrinsic_ssbo_atomic_max",
> -_atomic_ssbo_intrinsic2(shader_storage_buffer_object,
> -glsl_type::uint_type),
> -_atomic_ssbo_intrinsic2(shader_storage_buffer_object,
> -glsl_type::int_type),
> -NULL);
> -   add_function("__intrinsic_ssbo_atomic_and",
> -_atomic_ssbo_intrinsic2(shader_storage_buffer_object,
> -glsl_type::uint_type),
> -_atomic_ssbo_intrinsic2(shader_storage_buffer_object,
> -glsl_type::int_type),
> -NULL);
> -   add_function("__intrinsic_ssbo_atomic_or",
> -_atomic_ssbo_intrinsic2(shader_storage_buffer_object,
> -glsl_type::uint_type),
> -_atomic_ssbo_intrinsic2(shader_storage_buffer_object,
> -glsl_type::int_type),
> -NULL);
> -   add_function("__intrinsic_ssbo_atomic_xor",
> -_atomic_ssbo_intrinsic2(shader_storage_buffer_object,
> -glsl_type::uint_type),
> -_atomic_ssbo_intrinsic2(shader_storage_buffer_object,
> -

Re: [Mesa-dev] [PATCH 18/36] glsl ubo/ssbo: Move common code into lower_buffer_access::setup_buffer_access

2015-11-16 Thread Iago Toral

On Sat, 2015-11-14 at 13:43 -0800, Jordan Justen wrote:
> This code will also be usable by the pass to lower shared variables.
> 
> Note that *const_offset is adjusted by setup_buffer_access, so it must
> be initialized before calling setup_buffer_access.
> 
> Signed-off-by: Jordan Justen 
> Cc: Samuel Iglesias Gonsalvez 
> Cc: Iago Toral Quiroga 
> ---
>  src/glsl/lower_buffer_access.cpp | 167 
> +++
>  src/glsl/lower_buffer_access.h   |   5 ++
>  src/glsl/lower_ubo_reference.cpp | 160 +
>  3 files changed, 175 insertions(+), 157 deletions(-)
> 
> diff --git a/src/glsl/lower_buffer_access.cpp 
> b/src/glsl/lower_buffer_access.cpp
> index b7fc107..87f64a9 100644
> --- a/src/glsl/lower_buffer_access.cpp
> +++ b/src/glsl/lower_buffer_access.cpp
> @@ -394,4 +394,171 @@ 
> lower_buffer_access::is_dereferenced_thing_row_major(const ir_rvalue *deref)
> return false;
>  }

Maybe add a comment here explaining that this expects to receive a
constant offset that has already been initialized to the starting offset
of the variable being dereferenced and that this function will only
adjust that to account for the offset of the particular component of the
variable that is accessed by the dereference.

Other than this,
Reviewed-by: Iago Toral Quiroga 
 
> +void
> +lower_buffer_access::setup_buffer_access(ir_variable *var,
> + ir_rvalue *deref,
> + ir_rvalue **offset,
> + unsigned *const_offset,
> + bool *row_major,
> + int *matrix_columns,
> + unsigned packing)
> +{
> +   *offset = new(mem_ctx) ir_constant(0u);
> +   *row_major = is_dereferenced_thing_row_major(deref);
> +   *matrix_columns = 1;
> +
> +   /* Calculate the offset to the start of the region of the UBO
> +* dereferenced by *rvalue.  This may be a variable offset if an
> +* array dereference has a variable index.
> +*/
> +   while (deref) {
> +  switch (deref->ir_type) {
> +  case ir_type_dereference_variable: {
> + deref = NULL;
> + break;
> +  }
> +
> +  case ir_type_dereference_array: {
> + ir_dereference_array *deref_array = (ir_dereference_array *) deref;
> + unsigned array_stride;
> + if (deref_array->array->type->is_vector()) {
> +/* We get this when storing or loading a component out of a 
> vector
> + * with a non-constant index. This happens for v[i] = f where v 
> is
> + * a vector (or m[i][j] = f where m is a matrix). If we don't
> + * lower that here, it gets turned into v = vector_insert(v, i,
> + * f), which loads the entire vector, modifies one component and
> + * then write the entire thing back.  That breaks if another
> + * thread or SIMD channel is modifying the same vector.
> + */
> +array_stride = 4;
> +if (deref_array->array->type->is_double())
> +   array_stride *= 2;
> + } else if (deref_array->array->type->is_matrix() && *row_major) {
> +/* When loading a vector out of a row major matrix, the
> + * step between the columns (vectors) is the size of a
> + * float, while the step between the rows (elements of a
> + * vector) is handled below in emit_ubo_loads.
> + */
> +array_stride = 4;
> +if (deref_array->array->type->is_double())
> +   array_stride *= 2;
> +*matrix_columns = deref_array->array->type->matrix_columns;
> + } else if (deref_array->type->without_array()->is_interface()) {
> +/* We're processing an array dereference of an interface instance
> + * array. The thing being dereferenced *must* be a variable
> + * dereference because interfaces cannot be embedded in other
> + * types. In terms of calculating the offsets for the lowering
> + * pass, we don't care about the array index. All elements of an
> + * interface instance array will have the same offsets relative 
> to
> + * the base of the block that backs them.
> + */
> +deref = deref_array->array->as_dereference();
> +break;
> + } else {
> +/* Whether or not the field is row-major (because it might be a
> + * bvec2 or something) does not affect the array itself. We need
> + * to know whether an array element in its entirety is row-major.
> + */
> +const bool array_row_major =
> +   is_dereferenced_thing_row_major(deref_array);
> +
> +/* The array
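
The calling convention asked about in the review can be sketched like this:
the helper only adds to *const_offset, so the caller has to store the
variable's starting offset in it first. This is plain C with hypothetical
types (`sketch_step`, `sketch_setup_access`); it handles constant offsets
only, whereas the real function also builds a variable offset expression for
non-constant array indices. The 4-byte component stride, doubled for doubles,
follows the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_step_kind { SKETCH_STEP_FIELD, SKETCH_STEP_ARRAY_CONST_INDEX };

/* One link in a dereference chain, constant offsets only. */
struct sketch_step {
   enum sketch_step_kind kind;
   unsigned field_offset;  /* FIELD: byte offset of the field */
   unsigned index;         /* ARRAY: constant element index */
   unsigned stride;        /* ARRAY: bytes between elements */
};

/* Stride between vector components: 4 bytes, 8 for doubles. */
static unsigned sketch_component_stride(bool is_double)
{
   unsigned stride = 4;
   if (is_double)
      stride *= 2;
   return stride;
}

/* Adjusts *const_offset in place. It never resets it, so the caller must
 * have initialized it to the starting offset of the variable.
 */
static void sketch_setup_access(const struct sketch_step *steps, size_t count,
                                unsigned *const_offset)
{
   assert(const_offset != NULL);

   for (size_t i = 0; i < count; i++) {
      if (steps[i].kind == SKETCH_STEP_FIELD)
         *const_offset += steps[i].field_offset;
      else
         *const_offset += steps[i].index * steps[i].stride;
   }
}

static void sketch_example(void)
{
   /* Variable starts at byte 32; access field at +16, then component 2. */
   const struct sketch_step steps[] = {
      { SKETCH_STEP_FIELD, 16, 0, 0 },
      { SKETCH_STEP_ARRAY_CONST_INDEX, 0, 2, sketch_component_stride(false) },
   };
   unsigned const_offset = 32;  /* initialized by the caller */

   sketch_setup_access(steps, 2, &const_offset);
   assert(const_offset == 56);
}
```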

[Mesa-dev] [PATCH v2] i965: Prevent fast clears for MSRTs on SKL

2015-11-16 Thread Neil Roberts

There are currently a bunch of formats that behave strangely when
sampling the cleared color from the MCS buffer on SKL. They seem to
mostly be formats that don't have an alpha component, although it's
not all of them, and we haven't yet found anything in the specs which
would explain this. For now to be on the safe side this patch just
prevents fast clears for MSRTs on SKL altogether so that when fast
clears are eventually enabled it will only be for single-sampled
surfaces. The assumption is that clears are probably more likely to be
used in single-sampled applications anyway so we can at least get them
working and we can enable MSRTs later once we understand the problem
better.

This patch should have no functional effect other than perhaps
receiving fewer perf_debug messages on SKL+.

v2: Improve the commit message to avoid saying the patch disables fast
clears because it will be merged before fast clears are enabled
for any surfaces so it doesn't actually disable anything.
Reviewed-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c 
b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
index dc085ba..85576a8 100644
--- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
+++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
@@ -524,6 +524,13 @@ brw_meta_fast_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
   if (brw->gen < 7)
  clear_type = REP_CLEAR;
 
+  /* Certain formats have unresolved issues with sampling from the MCS
+   * buffer on Gen9. This disables fast clears altogether for MSRTs until
+   * we can figure out what's going on.
+   */
+  if (brw->gen >= 9 && irb->mt->num_samples > 1)
+ clear_type = REP_CLEAR;
+
   if (irb->mt->fast_clear_state == INTEL_FAST_CLEAR_STATE_NO_MCS)
  clear_type = REP_CLEAR;
 
-- 
1.9.3
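
For readers skimming the hunk, the three checks visible in it amount to the
decision below. This is a sketch in plain C with hypothetical names
(`sketch_clear_type`, `sketch_choose_clear`). It assumes the starting point is
a fast clear, models the NO_MCS state as a boolean, and ignores every other
condition in the real brw_meta_fast_clear().

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_clear_type { SKETCH_FAST_CLEAR, SKETCH_REP_CLEAR };

static enum sketch_clear_type
sketch_choose_clear(int gen, unsigned num_samples, bool has_mcs)
{
   enum sketch_clear_type clear_type = SKETCH_FAST_CLEAR;

   if (gen < 7)
      clear_type = SKETCH_REP_CLEAR;

   /* The new check: no fast clears for multisampled targets on Gen9+. */
   if (gen >= 9 && num_samples > 1)
      clear_type = SKETCH_REP_CLEAR;

   if (!has_mcs)
      clear_type = SKETCH_REP_CLEAR;

   return clear_type;
}

static void sketch_example(void)
{
   /* Same multisampled surface, different generations. */
   assert(sketch_choose_clear(8, 4, true) == SKETCH_FAST_CLEAR);
   assert(sketch_choose_clear(9, 4, true) == SKETCH_REP_CLEAR);
}
```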


Re: [Mesa-dev] [PATCH] r600g: Support TGSI_SEMANTIC_HELPER_INVOCATION

2015-11-16 Thread Nicolai Hähnle


Hi Glenn,

On 14.11.2015 00:11, Glenn Kennard wrote:
> On Fri, 13 Nov 2015 18:57:28 +0100, Nicolai Hähnle wrote:
>
>> On 13.11.2015 00:14, Glenn Kennard wrote:
>>> Signed-off-by: Glenn Kennard
>>> ---
>>> Maybe there is a better way to check if a thread is a helper invocation?
>>
>> Is ctx->face_gpr guaranteed to be initialized when
>> load_helper_invocation is called?
>
> allocate_system_value_inputs() sets that if needed, and is called before
> parsing any opcodes.

Sorry, you're right, I missed the second change to the inputs array there.

>> Aside, I'm not sure I understand correctly what this is supposed to
>> do. The values you're querying are related to multi-sampling, but my
>> understanding has always been that helper invocations can also happen
>> without multi-sampling: you always want to process 2x2 quads of pixels
>> at a time to be able to compute derivatives for texture sampling. When
>> the boundary of a primitive intersects such a quad, you get helper
>> invocations outside the primitive.
>
> Non-MSAA buffers act just like 1 sample buffers with regards to the
> coverage mask supplied by the hardware, so helper invocations which have
> no coverage get a 0 for the mask value, and normal fragments get 1.
> Works with the piglit test case posted at least...

Here's why I'm still skeptical: According to the GLSL spec, the fragment
shader is only run once per pixel by default, even when MSAA is enabled.
_However_, if a shader statically accesses the SampleID, _then_ it must
be run once per fragment. The way I understand it, your change forces
the fragment shader to access SampleID, even when people ostensibly use
HelperInvocation in the hope of optimizing something.

In the usual MSAA operation of only running the fragment shader once per
pixel, HelperInvocation should be the same as SampleMask != 0, right? It
seems like the right thing to do is to _not_ allocate the
TGSI_SEMANTIC_SAMPLEID when TGSI_SEMANTIC_HELPER_INVOCATION is used, and
then use different code paths in load_helper_invocation based on which
of the source registers are actually there.

Cheers,
Nicolai
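
A sketch of the two code paths suggested above, in plain C with hypothetical
function names. It takes Glenn's description as its premise: helper
invocations see a coverage mask of 0 and covered fragments see a nonzero
mask. How the patch itself combines the sample ID with the mask is not
visible in the quoted diff, so the per-sample variant is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-pixel shading: only the coverage mask is available. */
static bool sketch_is_helper_per_pixel(uint32_t coverage_mask)
{
   return coverage_mask == 0;
}

/* Per-sample shading: the invocation's own sample bit decides.
 * Assumed behaviour, see the lead-in.
 */
static bool sketch_is_helper_per_sample(uint32_t coverage_mask,
                                        unsigned sample_id)
{
   assert(sample_id < 32);
   return (coverage_mask & (UINT32_C(1) << sample_id)) == 0;
}

static void sketch_example(void)
{
   /* Non-MSAA: the mask is 1 for a covered pixel and 0 for a helper. */
   assert(!sketch_is_helper_per_pixel(0x1));
   assert(sketch_is_helper_per_pixel(0x0));

   /* 4x MSAA, samples 0 and 2 covered. */
   assert(!sketch_is_helper_per_sample(0x5, 2));
   assert(sketch_is_helper_per_sample(0x5, 1));
}
```

With the per-pixel path, the shader never needs to read the sample ID, which
is the optimization concern raised in the reply.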




>> Cheers,
>> Nicolai


  src/gallium/drivers/r600/r600_shader.c | 83
+-
  1 file changed, 72 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c
b/src/gallium/drivers/r600/r600_shader.c
index 560197c..a227d78 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -530,7 +530,8 @@ static int r600_spi_sid(struct r600_shader_io * io)
  name == TGSI_SEMANTIC_PSIZE ||
  name == TGSI_SEMANTIC_EDGEFLAG ||
  name == TGSI_SEMANTIC_FACE ||
-name == TGSI_SEMANTIC_SAMPLEMASK)
+name == TGSI_SEMANTIC_SAMPLEMASK ||
+name == TGSI_SEMANTIC_HELPER_INVOCATION)
  index = 0;
  else {
  if (name == TGSI_SEMANTIC_GENERIC) {
@@ -734,7 +735,8 @@ static int tgsi_declaration(struct
r600_shader_ctx *ctx)
  case TGSI_FILE_SYSTEM_VALUE:
  if (d->Semantic.Name == TGSI_SEMANTIC_SAMPLEMASK ||
  d->Semantic.Name == TGSI_SEMANTIC_SAMPLEID ||
-d->Semantic.Name == TGSI_SEMANTIC_SAMPLEPOS) {
+d->Semantic.Name == TGSI_SEMANTIC_SAMPLEPOS ||
+d->Semantic.Name == TGSI_SEMANTIC_HELPER_INVOCATION) {
  break; /* Already handled from
allocate_system_value_inputs */
  } else if (d->Semantic.Name == TGSI_SEMANTIC_INSTANCEID) {
  if (!ctx->native_integers) {
@@ -776,13 +778,14 @@ static int allocate_system_value_inputs(struct
r600_shader_ctx *ctx, int gpr_off
  struct {
  boolean enabled;
  int *reg;
-unsigned name, alternate_name;
+unsigned associated_semantics[3];
  } inputs[2] = {
-{ false, >face_gpr, TGSI_SEMANTIC_SAMPLEMASK, ~0u }, /*
lives in Front Face GPR.z */
-
-{ false, >fixed_pt_position_gpr,
TGSI_SEMANTIC_SAMPLEID, TGSI_SEMANTIC_SAMPLEPOS } /* SAMPLEID is in
Fixed Point Position GPR.w */
+{ false, >face_gpr, { TGSI_SEMANTIC_SAMPLEMASK /* lives
in Front Face GPR.z */,
+TGSI_SEMANTIC_HELPER_INVOCATION, ~0u } },
+{ false, >fixed_pt_position_gpr, {
TGSI_SEMANTIC_SAMPLEID  /* in Fixed Point Position GPR.w */,
+TGSI_SEMANTIC_SAMPLEPOS, TGSI_SEMANTIC_HELPER_INVOCATION
} }
  };
-int i, k, num_regs = 0;
+int i, k, l, num_regs = 0;

  if (tgsi_parse_init(, ctx->tokens) != TGSI_PARSE_OK) {
  return 0;
@@ -818,9 +821,11 @@ static int allocate_system_value_inputs(struct
r600_shader_ctx *ctx, int gpr_off
  struct tgsi_full_declaration *d =

  if (d->Declaration.File == TGSI_FILE_SYSTEM_VALUE) {
  for (k = 0; k < Elements(inputs); k++) {
-if (d->Semantic.Name == inputs[k].name ||
-d->Semantic.Name == inputs[k].alternate_name) {
-inputs[k].enabled = true;
+for (l = 0;

Re: [Mesa-dev] [PATCH 16/36] glsl ubo/ssbo: Add lower_buffer_access class

2015-11-16 Thread Iago Toral

On Sat, 2015-11-14 at 13:43 -0800, Jordan Justen wrote:
> This class has code that will be shared by lower_ubo_reference and
> lower_shared_reference. (lower_shared_reference will be used to
> support compute shader shared variables.)
> 
> Signed-off-by: Jordan Justen 
> Cc: Samuel Iglesias Gonsalvez 
> Cc: Iago Toral Quiroga 
> ---
>  src/glsl/Makefile.sources|   1 +
>  src/glsl/lower_buffer_access.cpp | 307 
> +++
>  src/glsl/lower_buffer_access.h   |  56 +++
>  src/glsl/lower_ubo_reference.cpp | 180 +--
>  4 files changed, 367 insertions(+), 177 deletions(-)
>  create mode 100644 src/glsl/lower_buffer_access.cpp
>  create mode 100644 src/glsl/lower_buffer_access.h
> 
> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index d4b02c1..f2c95c0 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -155,6 +155,7 @@ LIBGLSL_FILES = \
>   loop_analysis.h \
>   loop_controls.cpp \
>   loop_unroll.cpp \
> + lower_buffer_access.cpp \
>   lower_clip_distance.cpp \
>   lower_const_arrays_to_uniforms.cpp \
>   lower_discard.cpp \
> diff --git a/src/glsl/lower_buffer_access.cpp 
> b/src/glsl/lower_buffer_access.cpp
> new file mode 100644
> index 000..e0b5a2f
> --- /dev/null
> +++ b/src/glsl/lower_buffer_access.cpp
> @@ -0,0 +1,307 @@
> +/*
> + * Copyright (c) 2015 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +/**
> + * \file lower_buffer_access.cpp
> + *
> + * Helper for IR lowering pass to replace dereferences of buffer object based
> + * shader variables with intrinsic function calls.
> + *
> + * This helper is used by lowering passes for UBOs, SSBOs and compute shader
> + * shared variables.
> + */
> +
> +#include "ir.h"
> +#include "ir_builder.h"
> +#include "ir_rvalue_visitor.h"
> +#include "main/macros.h"
> +#include "util/list.h"
> +#include "glsl_parser_extras.h"
> +#include "lower_buffer_access.h"
> +
> +using namespace ir_builder;
> +
> +namespace lower_buffer_access {
> +
> +static inline int
> +writemask_for_size(unsigned n)
> +{
> +   return ((1 << n) - 1);
> +}
> +
> +/**
> + * Takes LHS and emits a series of assignments into its components
> + * from the shared variable storage.

I find this part of the comment a bit confusing. This function breaks a
dereference access into one or multiple accesses to the underlying
buffer storage. Such a dereference could be in a RHS expression, and in
fact, that will always be the case for UBO and SSBO loads.

> + * Recursively calls itself to break the deref down to the point that
> + * the intrinsic calls are generated.
> + */
> +void
> +lower_buffer_access::emit_access(bool is_write,
> + ir_dereference *deref,
> + ir_variable *base_offset,
> + unsigned int deref_offset,
> + bool row_major,
> + int matrix_columns,
> + unsigned int packing,
> + unsigned int write_mask)
> +{

Why not pass mem_ctx as a parameter instead of having it be a class
member? I find it a bit odd that this class defines mem_ctx but never
takes care of initializing it, expecting subclasses to do that
for it. In that case, why not just have them pass the mem_ctx to use
instead?

If you'd rather keep mem_ctx defined here, I'd at least suggest adding an
assert to the functions that use it to check that it has indeed been
initialized by the subclass.
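
To make the two options concrete, here is a minimal sketch in plain C rather than the C++ class from the patch. The struct and function names are made up for illustration; only the mem_ctx member comes from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the lowering helper; only mem_ctx matters. */
struct lowering_helper {
   void *mem_ctx;   /* expected to be set by the user before emitting */
};

/* Option 1: the caller passes the allocation context explicitly. */
static void *
emit_with_explicit_ctx(void *mem_ctx)
{
   assert(mem_ctx != NULL);
   return mem_ctx;   /* a real pass would allocate IR nodes from it */
}

/* Option 2: keep the member, but check it before every use. */
static void *
emit_with_member_ctx(const struct lowering_helper *h)
{
   assert(h->mem_ctx != NULL && "mem_ctx must be initialized first");
   return h->mem_ctx;
}
```

Note that the assert in option 2 only catches a NULL pointer, so it relies on the member being zero-initialized somewhere; an indeterminate value would slip through.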

> +   if (deref->type->is_record()) {
> +  unsigned int field_offset = 0;
> +
> +  for (unsigned i = 0; i < deref->type->length; i++) {
> + const

Re: [Mesa-dev] [PATCH 17/36] glsl ubo/ssbo: Move is_dereferenced_thing_row_major into lower_buffer_access

2015-11-16 Thread Iago Toral

Reviewed-by: Iago Toral Quiroga 

On Sat, 2015-11-14 at 13:43 -0800, Jordan Justen wrote:
> Signed-off-by: Jordan Justen 
> Cc: Samuel Iglesias Gonsalvez 
> Cc: Iago Toral Quiroga 
> ---
>  src/glsl/lower_buffer_access.cpp | 90 
> 
>  src/glsl/lower_buffer_access.h   |  2 +
>  src/glsl/lower_ubo_reference.cpp | 90 
> 
>  3 files changed, 92 insertions(+), 90 deletions(-)
> 
> diff --git a/src/glsl/lower_buffer_access.cpp 
> b/src/glsl/lower_buffer_access.cpp
> index e0b5a2f..b7fc107 100644
> --- a/src/glsl/lower_buffer_access.cpp
> +++ b/src/glsl/lower_buffer_access.cpp
> @@ -304,4 +304,94 @@ is_dereferenced_thing_row_major(const ir_dereference 
> *deref)
> return false;
>  }
>  
> +/**
> + * Determine if a thing being dereferenced is row-major
> + *
> + * There is some trickery here.
> + *
> + * If the thing being dereferenced is a member of uniform block \b without an
> + * instance name, then the name of the \c ir_variable is the field name of an
> + * interface type.  If this field is row-major, then the thing referenced is
> + * row-major.
> + *
> + * If the thing being dereferenced is a member of uniform block \b with an
> + * instance name, then the last dereference in the tree will be an
> + * \c ir_dereference_record.  If that record field is row-major, then the
> + * thing referenced is row-major.
> + */
> +bool
> +lower_buffer_access::is_dereferenced_thing_row_major(const ir_rvalue *deref)
> +{
> +   bool matrix = false;
> +   const ir_rvalue *ir = deref;
> +
> +   while (true) {
> +  matrix = matrix || ir->type->without_array()->is_matrix();
> +
> +  switch (ir->ir_type) {
> +  case ir_type_dereference_array: {
> + const ir_dereference_array *const array_deref =
> +(const ir_dereference_array *) ir;
> +
> + ir = array_deref->array;
> + break;
> +  }
> +
> +  case ir_type_dereference_record: {
> + const ir_dereference_record *const record_deref =
> +(const ir_dereference_record *) ir;
> +
> + ir = record_deref->record;
> +
> + const int idx = ir->type->field_index(record_deref->field);
> + assert(idx >= 0);
> +
> + const enum glsl_matrix_layout matrix_layout =
> +
> glsl_matrix_layout(ir->type->fields.structure[idx].matrix_layout);
> +
> + switch (matrix_layout) {
> + case GLSL_MATRIX_LAYOUT_INHERITED:
> +break;
> + case GLSL_MATRIX_LAYOUT_COLUMN_MAJOR:
> +return false;
> + case GLSL_MATRIX_LAYOUT_ROW_MAJOR:
> +return matrix || deref->type->without_array()->is_record();
> + }
> +
> + break;
> +  }
> +
> +  case ir_type_dereference_variable: {
> + const ir_dereference_variable *const var_deref =
> +(const ir_dereference_variable *) ir;
> +
> + const enum glsl_matrix_layout matrix_layout =
> +glsl_matrix_layout(var_deref->var->data.matrix_layout);
> +
> + switch (matrix_layout) {
> + case GLSL_MATRIX_LAYOUT_INHERITED:
> +assert(!matrix);
> +return false;
> + case GLSL_MATRIX_LAYOUT_COLUMN_MAJOR:
> +return false;
> + case GLSL_MATRIX_LAYOUT_ROW_MAJOR:
> +return matrix || deref->type->without_array()->is_record();
> + }
> +
> + unreachable("invalid matrix layout");
> + break;
> +  }
> +
> +  default:
> + return false;
> +  }
> +   }
> +
> +   /* The tree must have ended with a dereference that wasn't an
> +* ir_dereference_variable.  That is invalid, and it should be impossible.
> +*/
> +   unreachable("invalid dereference tree");
> +   return false;
> +}
> +
>  } /* namespace lower_buffer_access */
> diff --git a/src/glsl/lower_buffer_access.h b/src/glsl/lower_buffer_access.h
> index 3138963..0698e22 100644
> --- a/src/glsl/lower_buffer_access.h
> +++ b/src/glsl/lower_buffer_access.h
> @@ -48,6 +48,8 @@ public:
>  bool row_major, int matrix_columns,
>  unsigned int packing, unsigned int write_mask);
>  
> +   bool is_dereferenced_thing_row_major(const ir_rvalue *deref);
> +
> void *mem_ctx;
>  };
>  
> diff --git a/src/glsl/lower_ubo_reference.cpp 
> b/src/glsl/lower_ubo_reference.cpp
> index 8de4f5e..7e1221b 100644
> --- a/src/glsl/lower_ubo_reference.cpp
> +++ b/src/glsl/lower_ubo_reference.cpp
> @@ -42,96 +42,6 @@
>  
>  using namespace ir_builder;
>  
> -/**
> - * Determine if a thing being dereferenced is row-major
> - *
> - * There is some trickery here.
> - *
> - * If the thing being dereferenced is a member of uniform block \b without an
> - * instance name, then the name of the \c ir_variable is the field name of an
> - * interface type.  If this field is row-major, then the thing referenced is
> - *

Re: [Mesa-dev] [PATCH 3/7] [v2] i965/skl: skip fast clears for certain surface formats

2015-11-16 Thread Matt Turner

On Wed, Nov 11, 2015 at 2:06 PM, Ben Widawsky
 wrote:
> Some of the information originally in this commit message is now in the patch
> before this.
>
> SKL adds compressible render targets and as a result mutates some of the
> programming for fast clears and resolves. There is a new internal surface type
> called the CCS. The old AUX_MCS bit becomes AUX_CCS_D. "The Auxiliary surface
> is a CCS (Color Control Surface) with compression disabled or an MCS with
> compression enabled, depending on number of multisamples. MCS (Multisample
> Control Surface) is a special type of CCS."
>
> The formats which are supported are defined in the table titled "Render Target
> Surface Types [SKL+]". There is no PRM yet to reference. The previously
> implemented helper function already does the right thing provided the table is
> correct.
>
> v2: Use better English in commit message (Matt)
> s/compressable/compressible/ (Matt)
> Don't compare bools to true (Matt)
> Use the helper function and don't increase the context size - this is mostly
> implemented in the patch just before this (Chad, Neil)
> Remove an "invalid" assert (Chad)
> Fix assertion to check num_samples > 1, instead of num_samples (Chad)
>
> Cc: Chad Versace 
> Cc: Neil Roberts 
> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/brw_surface_formats.c | 52 
> -
>  src/mesa/drivers/dri/i965/gen8_surface_state.c  |  7 +++-
>  2 files changed, 31 insertions(+), 28 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c 
> b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> index a7cdc13..a527f2f 100644
> --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
> +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> @@ -90,9 +90,9 @@ struct surface_format_info {
>   */
>  const struct surface_format_info surface_formats[] = {
>  /* smpl filt shad CK  RT  AB  VB  SO  color ccs */
> -   SF( Y, 50,  x,  x,  Y,  Y,  Y,  Y,  x,x,   R32G32B32A32_FLOAT)
> -   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,x,   R32G32B32A32_SINT)
> -   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,x,   R32G32B32A32_UINT)
> +   SF( Y, 50,  x,  x,  Y,  Y,  Y,  Y,  x,   90,   R32G32B32A32_FLOAT)
> +   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,   90,   R32G32B32A32_SINT)
> +   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,   90,   R32G32B32A32_UINT)
> SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R32G32B32A32_UNORM)
> SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R32G32B32A32_SNORM)
> SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R64G64_FLOAT)
> @@ -109,15 +109,15 @@ const struct surface_format_info surface_formats[] = {
> SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R32G32B32_SSCALED)
> SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R32G32B32_USCALED)
> SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   R32G32B32_SFIXED)
> -   SF( Y,  Y,  x,  x,  Y, 45,  Y,  x, 60,x,   R16G16B16A16_UNORM)
> -   SF( Y,  Y,  x,  x,  Y, 60,  Y,  x,  x,x,   R16G16B16A16_SNORM)
> -   SF( Y,  x,  x,  x,  Y,  x,  Y,  x,  x,x,   R16G16B16A16_SINT)
> -   SF( Y,  x,  x,  x,  Y,  x,  Y,  x,  x,x,   R16G16B16A16_UINT)
> -   SF( Y,  Y,  x,  x,  Y,  Y,  Y,  x,  x,x,   R16G16B16A16_FLOAT)
> -   SF( Y, 50,  x,  x,  Y,  Y,  Y,  Y,  x,x,   R32G32_FLOAT)
> +   SF( Y,  Y,  x,  x,  Y, 45,  Y,  x, 60,   90,   R16G16B16A16_UNORM)
> +   SF( Y,  Y,  x,  x,  Y, 60,  Y,  x,  x,   90,   R16G16B16A16_SNORM)
> +   SF( Y,  x,  x,  x,  Y,  x,  Y,  x,  x,   90,   R16G16B16A16_SINT)
> +   SF( Y,  x,  x,  x,  Y,  x,  Y,  x,  x,   90,   R16G16B16A16_UINT)
> +   SF( Y,  Y,  x,  x,  Y,  Y,  Y,  x,  x,   90,   R16G16B16A16_FLOAT)
> +   SF( Y, 50,  x,  x,  Y,  Y,  Y,  Y,  x,   90,   R32G32_FLOAT)
> SF( Y, 70,  x,  x,  Y,  Y,  Y,  Y,  x,x,   R32G32_FLOAT_LD)
> -   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,x,   R32G32_SINT)
> -   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,x,   R32G32_UINT)
> +   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,   90,   R32G32_SINT)
> +   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,   90,   R32G32_UINT)
> SF( Y, 50,  Y,  x,  x,  x,  x,  x,  x,x,   R32_FLOAT_X8X24_TYPELESS)
> SF( Y,  x,  x,  x,  x,  x,  x,  x,  x,x,   X32_TYPELESS_G8X24_UINT)
> SF( Y, 50,  x,  x,  x,  x,  x,  x,  x,x,   L32A32_FLOAT)
> @@ -125,7 +125,7 @@ const struct surface_format_info surface_formats[] = {
> SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R32G32_SNORM)
> SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R64_FLOAT)
> SF( Y,  Y,  x,  x,  x,  x,  x,  x,  x,x,   R16G16B16X16_UNORM)
> -   SF( Y,  Y,  x,  x,  x,  x,  x,  x,  x,x,   R16G16B16X16_FLOAT)
> +   SF( Y,  Y,  x,  x,  x,  x,  x,  x,  x,   90,   R16G16B16X16_FLOAT)
> SF( Y, 50,  x,  x,  x,  x,  x,  x,  x,x,   A32X32_FLOAT)
> SF( Y, 50,  x,  x,  x,  x,  x,  x,  x,x,   L32X32_FLOAT)
> SF( Y, 50,  x,  x,  x,  x,  x,  x,  x,x,

Re: [Mesa-dev] [PATCH 3/7] [v2] i965/skl: skip fast clears for certain surface formats

2015-11-16 Thread Ben Widawsky

On Fri, Nov 13, 2015 at 12:22:47PM -0800, Chad Versace wrote:
> On Wed 11 Nov 2015, Ben Widawsky wrote:
> > > Some of the information originally in this commit message is now in the
> > > patch before this.
> > >
> > > SKL adds compressible render targets and as a result mutates some of the
> > > programming for fast clears and resolves. There is a new internal surface
> > > type called the CCS. The old AUX_MCS bit becomes AUX_CCS_D. "The Auxiliary
> > > surface is a CCS (Color Control Surface) with compression disabled or an MCS
> > > with compression enabled, depending on number of multisamples. MCS
> > > (Multisample Control Surface) is a special type of CCS."
> > >
> > > The formats which are supported are defined in the table titled "Render
> > > Target Surface Types [SKL+]". There is no PRM yet to reference. The
> > > previously implemented helper function already does the right thing provided
> > > the table is correct.
> > 
> > v2: Use better English in commit message (Matt)
> > s/compressable/compressible/ (Matt)
> > Don't compare bools to true (Matt)
> > Use the helper function and don't increase the context size - this is mostly
> > implemented in the patch just before this (Chad, Neil)
> > Remove an "invalid" assert (Chad)
> > Fix assertion to check num_samples > 1, instead of num_samples (Chad)
> > 
> > Cc: Chad Versace 
> > Cc: Neil Roberts 
> > Signed-off-by: Ben Widawsky 
> > ---
> >  src/mesa/drivers/dri/i965/brw_surface_formats.c | 52 
> > -
> >  src/mesa/drivers/dri/i965/gen8_surface_state.c  |  7 +++-
> >  2 files changed, 31 insertions(+), 28 deletions(-)
> 
> 
> > diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
> > b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > index 6909858..8fe480c 100644
> > --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > @@ -222,6 +222,7 @@ gen8_emit_texture_surface_state(struct brw_context *brw,
> > > int surf_index = surf_offset - &brw->wm.base.surf_offset[0];
> > unsigned tiling_mode, pitch;
> > const unsigned tr_mode = surface_tiling_resource_mode(mt->tr_mode);
> > +   const uint32_t surf_type = translate_tex_target(target);
> >  
> > if (mt->format == MESA_FORMAT_S_UINT8) {
> >tiling_mode = GEN8_SURFACE_TILING_W;
> > @@ -243,11 +244,13 @@ gen8_emit_texture_surface_state(struct brw_context 
> > *brw,
> > * "When Auxiliary Surface Mode is set to AUX_CCS_D or AUX_CCS_E, 
> > HALIGN
> > *  16 must be used."
> > */
> > -  if (brw->gen >= 9 || mt->num_samples == 1)
> > +  if (brw->gen >= 9 || mt->num_samples == 1) {
> >   assert(mt->halign == 16);
> > + assert(mt->num_samples > 1 ||
> > +brw_losslessly_compressible_format(brw, surf_type));
> > +  }
> 
> Please expand this if-then-assert block to be more straightforward. It's
> very confusing.

That's fine. Can you tell me what you want, and maybe give me some indication
that you'd add an r-b?
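
For reference, one possible way to spell out the if-then-assert block step by step is sketched below. It is a standalone restatement of the two asserts from the hunk as a predicate, not a proposed replacement; the function name and parameters are hypothetical, and format_compressible stands in for the result of brw_losslessly_compressible_format().

```c
#include <stdbool.h>

/* Hypothetical restatement of the asserts in the hunk above. Returns
 * true when the surface satisfies the checks the patch asserts on. */
static bool
aux_constraints_hold(int gen, unsigned num_samples, unsigned halign,
                     bool format_compressible)
{
   /* The checks only apply on gen9+ or to single-sampled surfaces. */
   const bool constrained = gen >= 9 || num_samples == 1;
   if (!constrained)
      return true;

   /* HALIGN 16 is required whenever the checks apply. */
   if (halign != 16)
      return false;

   /* Multisampled surfaces are exempt from the format check. */
   if (num_samples > 1)
      return true;

   /* Single-sampled: the format must be losslessly compressible. */
   return format_compressible;
}
```

Written this way, each early return corresponds to one clause of the original condition, which may be closer to what the review is asking for.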
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] glsl: initialize precision when adding per vertex record fields

2015-11-16 Thread Kenneth Graunke

On Monday, November 16, 2015 08:44:18 AM Tapani Pälli wrote:
> Fixes issues with tessellation builtin variables since precision was
> introduced to IR with commit f84bc57d7dc02fceb805803131426c791eadeff9.
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/glsl/builtin_variables.cpp | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
> index b06c1bc..b927d50 100644
> --- a/src/glsl/builtin_variables.cpp
> +++ b/src/glsl/builtin_variables.cpp
> @@ -327,6 +327,7 @@ per_vertex_accumulator::add_field(int slot, const 
> glsl_type *type,
> this->fields[this->num_fields].centroid = 0;
> this->fields[this->num_fields].sample = 0;
> this->fields[this->num_fields].patch = 0;
> +   this->fields[this->num_fields].precision = GLSL_PRECISION_NONE;
> this->num_fields++;
>  }
>  
> 

Thanks, Tapani!

Patch 1 is:
Reviewed-by: Kenneth Graunke 

I verified that it fixes my problem as well.  I figured it was something
trivial like that :)



[Mesa-dev] [PATCH] meta/generate_mipmap: Don't leak the framebuffer object

2015-11-16 Thread Ian Romanick

From: Ian Romanick 

Signed-off-by: Ian Romanick 
Cc: "10.6 11.0" 
---
 src/mesa/drivers/common/meta_generate_mipmap.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/common/meta_generate_mipmap.c 
b/src/mesa/drivers/common/meta_generate_mipmap.c
index ffd71b6..bde170f 100644
--- a/src/mesa/drivers/common/meta_generate_mipmap.c
+++ b/src/mesa/drivers/common/meta_generate_mipmap.c
@@ -131,6 +131,11 @@ _mesa_meta_glsl_generate_mipmap_cleanup(struct 
gen_mipmap_state *mipmap)
_mesa_DeleteSamplers(1, &mipmap->Sampler);
mipmap->Sampler = 0;
 
+   if (mipmap->FBO != 0) {
+  _mesa_DeleteFramebuffers(1, &mipmap->FBO);
+  mipmap->FBO = 0;
+   }
+
_mesa_meta_blit_shader_table_cleanup(&mipmap->shaders);
 }
 
-- 
2.1.0


Re: [Mesa-dev] [PATCH 2/7] i965: Add lossless compression to surface format table

2015-11-16 Thread Ben Widawsky

On Fri, Nov 13, 2015 at 12:29:47PM -0800, Chad Versace wrote:
> On Wed 11 Nov 2015, Ben Widawsky wrote:
> > Background: Prior to Skylake and since Ivybridge, Intel hardware has had the
> > ability to use an MCS (Multisample Control Surface) as auxiliary data in
> > "compression" operations on the surface. This reduces memory bandwidth. This
> > hardware was used for both MSAA compression and fast clear operations. On
> > Gen8, a similar mechanism exists to allow the hiz buffer to be sampled from,
> > and therefore this feature is sometimes referred to more generally as "AUX
> > buffers".
> > 
> > Skylake adds the ability to have the display engine directly source
> > compressed surfaces on top of the ability to sample from them. Inference
> > dictates that enabling this display feature adds a restriction to the formats
> > which could actually be compressed. The current set of surfaces seems to be a
> > subset as compared to previous gens (see the next patch). Also, if I had to
> > guess, I would guess that future gens add support for more surface formats.
> > To make handling this a bit easier to read, and more future proof, the
> > support for this is moved into the surface formats table.
> > 
> > Along with the modifications to the table, a helper function is also provided
> > to determine if a surface is CCS compatible.  Because fast clears are
> > currently disabled on SKL, we can plumb the helper all the way through here,
> > and not actually have anything break.
> >
> > The logic in the table works a bit differently than the other columns in the
> > table and therefore deserves a small mention. For most other features, the
> > GEN which began implementing it is set, and it is assumed future gens also
> > support this. For this feature, GEN9 actually eliminates support for certain
> > formats. We could use this column to determine support for the similar
> > feature on older generation hardware. Aside from that being an error prone
> > task which is unrelated to enabling this on GEN9, it becomes somewhat tricky
> > to implement because of the fact that surface format support diminishes.
> > You'd probably want another column to cleanly implement it.
> > 
> > Requested-by: Chad Versace 
> > Requested-by: Neil Roberts 
> > Signed-off-by: Ben Widawsky 
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.h |   2 +
> >  src/mesa/drivers/dri/i965/brw_surface_formats.c | 527 
> > +---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c   |   7 +
> >  3 files changed, 285 insertions(+), 251 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> > b/src/mesa/drivers/dri/i965/brw_context.h
> > index 4b2db61..6284c18 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > @@ -1465,6 +1465,8 @@ void brw_upload_image_surfaces(struct brw_context 
> > *brw,
> >  /* brw_surface_formats.c */
> >  bool brw_render_target_supported(struct brw_context *brw,
> >   struct gl_renderbuffer *rb);
> > +bool brw_losslessly_compressible_format(struct brw_context *brw,
> > +uint32_t brw_format);
> >  uint32_t brw_depth_format(struct brw_context *brw, mesa_format format);
> >  mesa_format brw_lower_mesa_image_format(const struct brw_device_info 
> > *devinfo,
> >  mesa_format format);
> > diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c 
> > b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> > index 97fff60..a7cdc13 100644
> > --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
> > +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> > @@ -39,14 +39,15 @@ struct surface_format_info {
> > int input_vb;
> > int streamed_output_vb;
> > int color_processing;
> > +   int lossless_compression_support;
> 
> There's no need to place "support" in the name. Every struct member is
> a "support" member.
> 

Fine.
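
As a rough illustration of how a "first supporting GEN" column like the one described in the commit message can be queried, here is a small standalone sketch. It assumes the table stores the generation multiplied by 10 (so 90 means gen9), that 0 and 999 play the roles of the Y and x macros from the patch, and that later gens keep supporting a format once it is listed. The type, table and function names are hypothetical and are not the helper added by the patch.

```c
#include <stdbool.h>

#define SUPPORTED_ALWAYS 0     /* plays the role of Y in the patch */
#define SUPPORTED_NEVER  999   /* plays the role of x in the patch */

/* Hypothetical, cut-down version of the surface format table. */
struct format_info {
   int ccs;            /* first gen (times 10) with lossless compression */
   const char *name;
};

static const struct format_info formats[] = {
   { 90,              "R32G32B32A32_FLOAT" },
   { SUPPORTED_NEVER, "R32G32B32A32_UNORM" },
};

/* True if the given hardware generation supports compressing the format. */
static bool
format_supports_ccs(int gen, const struct format_info *info)
{
   return gen * 10 >= info->ccs;
}
```

With this encoding a gen8 part fails the check for every entry marked 90, which matches the "gen9+ only" note on the new column.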

> > const char *name;
> >  };
> >  
> >  /* This macro allows us to write the table almost as it appears in the PRM,
> >   * while restructuring it to turn it into the C code we want.
> >   */
> > -#define SF(sampl, filt, shad, ck, rt, ab, vb, so, color, sf) \
> > -   [BRW_SURFACEFORMAT_##sf] = { true, sampl, filt, shad, ck, rt, ab, vb, 
> > so, color, #sf},
> > +#define SF(sampl, filt, shad, ck, rt, ab, vb, so, color, ccs, sf) \
> > +   [BRW_SURFACEFORMAT_##sf] = { true, sampl, filt, shad, ck, rt, ab, vb, 
> > so, color, ccs, #sf},
> >  
> >  #define Y 0
> >  #define x 999
> > @@ -74,6 +75,7 @@ struct surface_format_info {
> >   * VB- Input Vertex Buffer
> >   * SO- Steamed Output Vertex Buffers (transform feedback)
> >   * color - Color Processing
> > + * ccs   - Lossless Compression Support (gen9+ only)
> 
> Please don't name the

Re: [Mesa-dev] [PATCH] i965: Set MaxCombinedUniformBlocks properly.

2015-11-16 Thread Jordan Justen

Reviewed-by: Jordan Justen 

On 2015-11-13 15:05:17, Kenneth Graunke wrote:
> Up until now, we've been letting core Mesa initialize it to 36 for us
> (which is presumably BRW_MAX_UBO (12) * (VS+GS+FS stages -> 3)).
> 
> With compute and tessellation, we need to increase this.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> Patch depends on 2/2: "i965: Clean up context constant initialization code."
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index e70ad98..2ea0a9e 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -391,6 +391,7 @@ brw_initialize_context_constants(struct brw_context *brw)
> ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits);
>  
> ctx->Const.MaxUniformBufferBindings = num_stages * BRW_MAX_UBO;
> +   ctx->Const.MaxCombinedUniformBlocks = num_stages * BRW_MAX_UBO;
> ctx->Const.MaxCombinedAtomicBuffers = num_stages * BRW_MAX_ABO;
> ctx->Const.MaxCombinedShaderStorageBlocks = num_stages * BRW_MAX_SSBO;
> ctx->Const.MaxShaderStorageBufferBindings = num_stages * BRW_MAX_SSBO;
> -- 
> 2.6.2
> 
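
The arithmetic behind the old and new limits can be sketched as follows. The value 12 for BRW_MAX_UBO is taken from the commit message (which itself says "presumably"); the stage count of 6 used in the note below is an assumption for a driver exposing VS, TCS, TES, GS, FS and CS, and the function name is made up.

```c
/* Value quoted in the commit message; treat it as an assumption. */
enum { BRW_MAX_UBO = 12 };

/* Hypothetical helper: combined uniform block limit for a driver that
 * exposes num_stages shader stages. */
static unsigned
max_combined_uniform_blocks(unsigned num_stages)
{
   return num_stages * BRW_MAX_UBO;
}
```

With three stages this yields the 36 that core Mesa used to provide; with six stages it would yield 72.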

Re: [Mesa-dev] [PATCH 1/2] i965: Convert scalar_* flags to a scalar_stage array.

2015-11-16 Thread Jordan Justen

On 2015-11-12 15:38:51, Kenneth Graunke wrote:
> I was going to add scalar_tcs and scalar_tes flags, and then thought
> better of it and decided to convert this to an array.  Simpler.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_compiler.h  |  3 +--
>  src/mesa/drivers/dri/i965/brw_context.c   |  2 +-
>  src/mesa/drivers/dri/i965/brw_gs.c|  3 ++-
>  src/mesa/drivers/dri/i965/brw_link.cpp| 11 +---
>  src/mesa/drivers/dri/i965/brw_program.c   |  3 ++-
>  src/mesa/drivers/dri/i965/brw_shader.cpp  | 31 
> ++-
>  src/mesa/drivers/dri/i965/brw_shader.h|  2 --
>  src/mesa/drivers/dri/i965/brw_vec4.cpp|  4 +--
>  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  2 +-
>  src/mesa/drivers/dri/i965/brw_vs.c|  7 ++---
>  10 files changed, 28 insertions(+), 40 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
> b/src/mesa/drivers/dri/i965/brw_compiler.h
> index e3a26d6..3f54616 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.h
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.h
> @@ -89,8 +89,7 @@ struct brw_compiler {
> void (*shader_debug_log)(void *, const char *str, ...) PRINTFLIKE(2, 3);
> void (*shader_perf_log)(void *, const char *str, ...) PRINTFLIKE(2, 3);
>  
> -   bool scalar_vs;
> -   bool scalar_gs;
> +   bool scalar_stage[MESA_SHADER_STAGES];
> struct gl_shader_compiler_options 
> glsl_compiler_options[MESA_SHADER_STAGES];
>  };
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index ac6045d..2db99c7 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -525,7 +525,7 @@ brw_initialize_context_constants(struct brw_context *brw)
>ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms =
>   BRW_MAX_IMAGES;
>ctx->Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms =
> - (brw->intelScreen->compiler->scalar_vs ? BRW_MAX_IMAGES : 0);
> + (brw->intelScreen->compiler->scalar_stage[MESA_SHADER_VERTEX] ? 
> BRW_MAX_IMAGES : 0);

Line > 80.

Reviewed-by: Jordan Justen 
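
One way to keep such lines short is to look the flag up through a small helper or a local compiler pointer, as the later hunks in this patch do. The sketch below is a standalone illustration of the per-stage array idea; the enum, struct and function names are hypothetical and only mirror the shape of scalar_stage[MESA_SHADER_STAGES].

```c
#include <stdbool.h>

/* Hypothetical stage enum mirroring the idea of MESA_SHADER_STAGES. */
enum shader_stage {
   STAGE_VERTEX,
   STAGE_TESS_CTRL,
   STAGE_TESS_EVAL,
   STAGE_GEOMETRY,
   STAGE_FRAGMENT,
   STAGE_COMPUTE,
   STAGE_COUNT
};

/* One flag per stage instead of separate scalar_vs / scalar_gs members. */
struct compiler {
   bool scalar_stage[STAGE_COUNT];
};

/* Image uniforms are only exposed for stages using the scalar backend. */
static int
max_image_uniforms(const struct compiler *c, enum shader_stage stage,
                   int max_images)
{
   return c->scalar_stage[stage] ? max_images : 0;
}
```

Adding a new stage then only needs a new enum value, not a new struct member.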

>ctx->Const.Program[MESA_SHADER_COMPUTE].MaxImageUniforms =
>   BRW_MAX_IMAGES;
>ctx->Const.MaxImageUnits = MAX_IMAGE_UNITS;
> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
> b/src/mesa/drivers/dri/i965/brw_gs.c
> index ed0890f..ad5b242 100644
> --- a/src/mesa/drivers/dri/i965/brw_gs.c
> +++ b/src/mesa/drivers/dri/i965/brw_gs.c
> @@ -87,7 +87,8 @@ brw_codegen_gs_prog(struct brw_context *brw,
> prog_data.base.base.nr_image_params = gs->NumImages;
>  
> brw_nir_setup_glsl_uniforms(gp->program.Base.nir, prog, &gp->program.Base,
> -   &prog_data.base.base, compiler->scalar_gs);
> +   &prog_data.base.base,
> +   compiler->scalar_stage[MESA_SHADER_GEOMETRY]);
>  
> GLbitfield64 outputs_written = gp->program.Base.OutputsWritten;
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp 
> b/src/mesa/drivers/dri/i965/brw_link.cpp
> index 2991173..14421d4 100644
> --- a/src/mesa/drivers/dri/i965/brw_link.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_link.cpp
> @@ -66,12 +66,14 @@ brw_lower_packing_builtins(struct brw_context *brw,
> gl_shader_stage shader_type,
> exec_list *ir)
>  {
> +   const struct brw_compiler *compiler = brw->intelScreen->compiler;
> +
> int ops = LOWER_PACK_SNORM_2x16
> | LOWER_UNPACK_SNORM_2x16
> | LOWER_PACK_UNORM_2x16
> | LOWER_UNPACK_UNORM_2x16;
>  
> -   if (is_scalar_shader_stage(brw->intelScreen->compiler, shader_type)) {
> +   if (compiler->scalar_stage[shader_type]) {
>ops |= LOWER_UNPACK_UNORM_4x8
> | LOWER_UNPACK_SNORM_4x8
> | LOWER_PACK_UNORM_4x8
> @@ -84,7 +86,7 @@ brw_lower_packing_builtins(struct brw_context *brw,
> * lowering is needed. For SOA code, the Half2x16 ops must be
> * scalarized.
> */
> -  if (is_scalar_shader_stage(brw->intelScreen->compiler, shader_type)) {
> +  if (compiler->scalar_stage[shader_type]) {
>   ops |= LOWER_PACK_HALF_2x16_TO_SPLIT
>   |  LOWER_UNPACK_HALF_2x16_TO_SPLIT;
>}
> @@ -103,6 +105,7 @@ process_glsl_ir(gl_shader_stage stage,
>  struct gl_shader *shader)
>  {
> struct gl_context *ctx = &brw->ctx;
> +   const struct brw_compiler *compiler = brw->intelScreen->compiler;
> const struct gl_shader_compiler_options *options =
>       &ctx->Const.ShaderCompilerOptions[shader->Stage];
>  
> @@ -161,7 +164,7 @@ process_glsl_ir(gl_shader_stage stage,
> do {
>progress = false;
>  
> -  if (is_scalar_shader_stage(brw->intelScreen->compiler, shader->Stage)) {
> +  if

[Mesa-dev] [PATCH 2/2] radeonsi/compute: Use the compiler's COMPUTE_PGM_RSRC* register values

2015-11-16 Thread Tom Stellard

The compiler has more information and is able to optimize the bits
it sets in these registers.

CC: 
---
 src/gallium/drivers/radeonsi/si_compute.c | 37 ++-
 src/gallium/drivers/radeonsi/si_shader.c  |  2 ++
 2 files changed, 9 insertions(+), 30 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
index 2d551dd..a461b2c 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -34,11 +34,6 @@
 
 #define MAX_GLOBAL_BUFFERS 20
 
-/* XXX: Even though we don't pass the scratch buffer via user sgprs any more
- * LLVM still expects that we specify 4 USER_SGPRS so it can remain compatible
- * with older mesa. */
-#define NUM_USER_SGPRS 4
-
 struct si_compute {
struct si_context *ctx;
 
@@ -238,7 +233,6 @@ static void si_launch_grid(
uint64_t kernel_args_va;
uint64_t scratch_buffer_va = 0;
uint64_t shader_va;
-   unsigned arg_user_sgpr_count = NUM_USER_SGPRS;
unsigned i;
struct si_shader *shader = >shader;
unsigned lds_blocks;
@@ -366,19 +360,7 @@ static void si_launch_grid(
si_pm4_set_reg(pm4, R_00B830_COMPUTE_PGM_LO, shader_va >> 8);
si_pm4_set_reg(pm4, R_00B834_COMPUTE_PGM_HI, shader_va >> 40);
 
-   si_pm4_set_reg(pm4, R_00B848_COMPUTE_PGM_RSRC1,
-   /* We always use at least 3 VGPRS, these come from
-* TIDIG_COMP_CNT.
-* XXX: The compiler should account for this.
-*/
-   S_00B848_VGPRS((MAX2(3, shader->num_vgprs) - 1) / 4)
-   /* We always use at least 4 + arg_user_sgpr_count.  The 4 extra
-* sgprs are from TGID_X_EN, TGID_Y_EN, TGID_Z_EN, TG_SIZE_EN
-* XXX: The compiler should account for this.
-*/
-   |  S_00B848_SGPRS(((MAX2(4 + arg_user_sgpr_count,
-   shader->num_sgprs)) - 1) / 8)
-   |  S_00B028_FLOAT_MODE(shader->float_mode))
+   si_pm4_set_reg(pm4, R_00B848_COMPUTE_PGM_RSRC1, shader->rsrc1);
;
 
lds_blocks = shader->lds_size;
@@ -395,17 +377,12 @@ static void si_launch_grid(
 
assert(lds_blocks <= 0xFF);
 
-   si_pm4_set_reg(pm4, R_00B84C_COMPUTE_PGM_RSRC2,
-   S_00B84C_SCRATCH_EN(shader->scratch_bytes_per_wave > 0)
-   | S_00B84C_USER_SGPR(arg_user_sgpr_count)
-   | S_00B84C_TGID_X_EN(1)
-   | S_00B84C_TGID_Y_EN(1)
-   | S_00B84C_TGID_Z_EN(1)
-   | S_00B84C_TG_SIZE_EN(1)
-   | S_00B84C_TIDIG_COMP_CNT(2)
-   | S_00B84C_LDS_SIZE(lds_blocks)
-   | S_00B84C_EXCP_EN(0))
-   ;
+   /*
+*/
+   shader->rsrc2 &= C_00B84C_LDS_SIZE;
+   shader->rsrc2 |=  S_00B84C_LDS_SIZE(lds_blocks);
+
+   si_pm4_set_reg(pm4, R_00B84C_COMPUTE_PGM_RSRC2, shader->rsrc2);
si_pm4_set_reg(pm4, R_00B854_COMPUTE_RESOURCE_LIMITS, 0);
 
si_pm4_set_reg(pm4, R_00B858_COMPUTE_STATIC_THREAD_MGMT_SE0,
diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 354d064..14f12df 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3745,12 +3745,14 @@ void si_shader_binary_read_config(const struct si_screen *sscreen,
shader->num_sgprs = MAX2(shader->num_sgprs, (G_00B028_SGPRS(value) + 1) * 8);
shader->num_vgprs = MAX2(shader->num_vgprs, (G_00B028_VGPRS(value) + 1) * 4);
shader->float_mode =  G_00B028_FLOAT_MODE(value);
+   shader->rsrc1 = value;
break;
case R_00B02C_SPI_SHADER_PGM_RSRC2_PS:
shader->lds_size = MAX2(shader->lds_size, G_00B02C_EXTRA_LDS_SIZE(value));
break;
case R_00B84C_COMPUTE_PGM_RSRC2:
shader->lds_size = MAX2(shader->lds_size, G_00B84C_LDS_SIZE(value));
+   shader->rsrc2 = value;
break;
case R_0286CC_SPI_PS_INPUT_ENA:
shader->spi_ps_input_ena = value;
-- 
2.0.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
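The RSRC2 update in the si_launch_grid hunk above is a read-modify-write of
a single bit field: the C_ mask clears LDS_SIZE and the S_ macro ORs the new
value back in, so every other bit supplied by the compiler is preserved. A
minimal sketch of that idiom follows. FIELD_SHIFT, FIELD_MASK and set_field
are made up for illustration and are not the real COMPUTE_PGM_RSRC2 layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field: 9 bits starting at bit 15. Illustration only. */
#define FIELD_SHIFT 15u
#define FIELD_MASK  0x1FFu

/* S_: shift a value into the field position. */
#define S_FIELD(x) (((uint32_t)(x) & FIELD_MASK) << FIELD_SHIFT)
/* G_: extract the field from a register value. */
#define G_FIELD(x) (((uint32_t)(x) >> FIELD_SHIFT) & FIELD_MASK)
/* C_: mask that clears the field and keeps every other bit. */
#define C_FIELD    (~(FIELD_MASK << FIELD_SHIFT))

/* Replace only the field, leaving all other register bits untouched. */
static uint32_t set_field(uint32_t reg, uint32_t value)
{
   reg &= C_FIELD;
   reg |= S_FIELD(value);
   return reg;
}
```

With this layout, set_field(reg, v) changes only bits 15..23 of reg, which is
the property the patch depends on when it patches the LDS size into a
register value that otherwise comes straight from the compiler.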

Re: [Mesa-dev] [PATCH 1/2] radeonsi: Rename si_shader::ls_rsrc{1, 2} to si_shader::rsrc{1, 2}

2015-11-16 Thread Marek Olšák

For the series:

Reviewed-by: Marek Olšák 

Marek

On Mon, Nov 16, 2015 at 9:03 PM, Tom Stellard  wrote:
> In the future, these will be used by other shader types.
>
> CC: 
> ---
>  src/gallium/drivers/radeonsi/si_shader.h| 4 ++--
>  src/gallium/drivers/radeonsi/si_state_draw.c| 4 ++--
>  src/gallium/drivers/radeonsi/si_state_shaders.c | 4 ++--
>  3 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
> index 3400a03..f089dc7 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.h
> +++ b/src/gallium/drivers/radeonsi/si_shader.h
> @@ -290,8 +290,8 @@ struct si_shader {
> boolis_gs_copy_shader;
> booldx10_clamp_mode; /* convert NaNs to 0 */
>
> -   unsignedls_rsrc1;
> -   unsignedls_rsrc2;
> +   unsignedrsrc1;
> +   unsignedrsrc2;
>  };
>
>  static inline struct tgsi_shader_info *si_get_vs_info(struct si_context *sctx)
> diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
> index 753abc8..771d206 100644
> --- a/src/gallium/drivers/radeonsi/si_state_draw.c
> +++ b/src/gallium/drivers/radeonsi/si_state_draw.c
> @@ -163,7 +163,7 @@ static void si_emit_derived_tess_state(struct si_context *sctx,
> perpatch_output_offset = output_patch0_offset + pervertex_output_patch_size;
>
> lds_size = output_patch0_offset + output_patch_size * *num_patches;
> -   ls_rsrc2 = ls->current->ls_rsrc2;
> +   ls_rsrc2 = ls->current->rsrc2;
>
> if (sctx->b.chip_class >= CIK) {
> assert(lds_size <= 65536);
> @@ -178,7 +178,7 @@ static void si_emit_derived_tess_state(struct si_context *sctx,
> if (sctx->b.chip_class == CIK && sctx->b.family != CHIP_HAWAII)
> radeon_set_sh_reg(cs, R_00B52C_SPI_SHADER_PGM_RSRC2_LS, ls_rsrc2);
> radeon_set_sh_reg_seq(cs, R_00B528_SPI_SHADER_PGM_RSRC1_LS, 2);
> -   radeon_emit(cs, ls->current->ls_rsrc1);
> +   radeon_emit(cs, ls->current->rsrc1);
> radeon_emit(cs, ls_rsrc2);
>
> /* Compute userdata SGPRs. */
> diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c b/src/gallium/drivers/radeonsi/si_state_shaders.c
> index 7f6511c..ca6b4be 100644
> --- a/src/gallium/drivers/radeonsi/si_state_shaders.c
> +++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
> @@ -121,11 +121,11 @@ static void si_shader_ls(struct si_shader *shader)
> si_pm4_set_reg(pm4, R_00B520_SPI_SHADER_PGM_LO_LS, va >> 8);
> si_pm4_set_reg(pm4, R_00B524_SPI_SHADER_PGM_HI_LS, va >> 40);
>
> -   shader->ls_rsrc1 = S_00B528_VGPRS((shader->num_vgprs - 1) / 4) |
> +   shader->rsrc1 = S_00B528_VGPRS((shader->num_vgprs - 1) / 4) |
>S_00B528_SGPRS((num_sgprs - 1) / 8) |
>S_00B528_VGPR_COMP_CNT(vgpr_comp_cnt) |
>S_00B528_DX10_CLAMP(shader->dx10_clamp_mode);
> -   shader->ls_rsrc2 = S_00B52C_USER_SGPR(num_user_sgprs) |
> +   shader->rsrc2 = S_00B52C_USER_SGPR(num_user_sgprs) |
>S_00B52C_SCRATCH_EN(shader->scratch_bytes_per_wave > 0);
>  }
>
> --
> 2.0.4
>

Re: [Mesa-dev] [PATCH 2/2] r200: fix bgrx8/xrgb8 blits

2015-11-16 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek

On Thu, Nov 12, 2015 at 8:00 PM,   wrote:
> From: Roland Scheidegger 
>
> Since 779cabfc7d022de8b7b9bc7fdac0caffa8646c51 the same txformat table entries
> are used for "normal" texturing as well as for blits. However, I forgot to put
> in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing
> path can't hit them because the radeon tex format chooser will never choose
> them, but we get that format from the dri buffers (at least I assume we got
> it from there).
> This is untested but essentially addressing the same bug as for radeon.
> (I don't think that the second entry per le/be table is actually necessary,
> but shouldn't hurt...)
>
> Cc: "11.0" 
> ---
>  src/mesa/drivers/dri/r200/r200_tex.h | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/r200/r200_tex.h b/src/mesa/drivers/dri/r200/r200_tex.h
> index a8c31b7..14f5e71 100644
> --- a/src/mesa/drivers/dri/r200/r200_tex.h
> +++ b/src/mesa/drivers/dri/r200/r200_tex.h
> @@ -63,7 +63,9 @@ static const struct tx_table tx_table_be[] =
> [ MESA_FORMAT_A8B8G8R8_UNORM ] = { R200_TXFORMAT_ABGR | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> [ MESA_FORMAT_R8G8B8A8_UNORM ] = { R200_TXFORMAT_RGBA | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> [ MESA_FORMAT_B8G8R8A8_UNORM ] = { R200_TXFORMAT_ARGB | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> +   [ MESA_FORMAT_B8G8R8X8_UNORM ] = { R200_TXFORMAT_ARGB, 0 },
> [ MESA_FORMAT_A8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> +   [ MESA_FORMAT_X8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB, 0 },
> [ MESA_FORMAT_BGR_UNORM8 ] = { 0x, 0 },
> [ MESA_FORMAT_B5G6R5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
> [ MESA_FORMAT_R5G6B5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
> @@ -91,7 +93,9 @@ static const struct tx_table tx_table_le[] =
> [ MESA_FORMAT_A8B8G8R8_UNORM ] = { R200_TXFORMAT_RGBA | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> [ MESA_FORMAT_R8G8B8A8_UNORM ] = { R200_TXFORMAT_ABGR | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> [ MESA_FORMAT_B8G8R8A8_UNORM ] = { R200_TXFORMAT_ARGB | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> +   [ MESA_FORMAT_B8G8R8X8_UNORM ] = { R200_TXFORMAT_ARGB, 0 },
> [ MESA_FORMAT_A8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> +   [ MESA_FORMAT_X8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB, 0 },
> [ MESA_FORMAT_BGR_UNORM8 ] = { R200_TXFORMAT_ARGB, 0 },
> [ MESA_FORMAT_B5G6R5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
> [ MESA_FORMAT_R5G6B5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
> --
> 2.1.4
>
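The missing-entry problem described in the commit message follows from how
designated initializers behave in C: any slot that is not named is
zero-filled, so looking up a format that was never listed yields an all-zero
entry. A small sketch, assuming a made-up format enum and made-up hardware
bit values (none of these names or numbers come from the r200 headers):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical format enum and hardware bits, for illustration only. */
enum toy_format { FMT_B8G8R8A8, FMT_B8G8R8X8, FMT_B5G6R5, FMT_COUNT };

#define TXFORMAT_ARGB         0x06u
#define TXFORMAT_RGB565       0x04u
#define TXFORMAT_ALPHA_IN_MAP 0x40u

struct tx_entry {
   uint32_t format;
   uint32_t filter;
};

/* Before the fix: no FMT_B8G8R8X8 entry, so that slot is zero-filled. */
static const struct tx_entry table_before[FMT_COUNT] = {
   [FMT_B8G8R8A8] = { TXFORMAT_ARGB | TXFORMAT_ALPHA_IN_MAP, 0 },
   [FMT_B5G6R5]   = { TXFORMAT_RGB565, 0 },
};

/* After the fix: the X8 variant reuses ARGB but leaves ALPHA_IN_MAP off. */
static const struct tx_entry table_after[FMT_COUNT] = {
   [FMT_B8G8R8A8] = { TXFORMAT_ARGB | TXFORMAT_ALPHA_IN_MAP, 0 },
   [FMT_B8G8R8X8] = { TXFORMAT_ARGB, 0 },
   [FMT_B5G6R5]   = { TXFORMAT_RGB565, 0 },
};
```

The texturing path never asks for the X8 slot, which is presumably why the
gap went unnoticed until blits started using the same table.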

[Mesa-dev] [PATCH] i965/nir: Add hooks for testing nir_shader_clone

2015-11-16 Thread Jason Ekstrand

This commit adds code for testing nir_shader_clone by running it after each
and every optimization pass and throwing away the old shader.  Testing
nir_shader_clone is hidden behind a new INTEL_CLONE_NIR environment
variable.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 10 +++--
 src/mesa/drivers/dri/i965/brw_nir.c   | 55 ---
 src/mesa/drivers/dri/i965/brw_nir.h   | 28 ++--
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  6 +--
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  6 +--
 5 files changed, 65 insertions(+), 40 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index e094131..9d5be95 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -5458,8 +5458,9 @@ brw_compile_fs(const struct brw_compiler *compiler, void *log_data,
char **error_str)
 {
nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
-   brw_nir_apply_sampler_key(shader, compiler->devinfo, &key->tex, true);
-   brw_postprocess_nir(shader, compiler->devinfo, true);
+   shader = brw_nir_apply_sampler_key(shader, compiler->devinfo,
+  &key->tex, true);
+   shader = brw_postprocess_nir(shader, compiler->devinfo, true);
 
/* key->alpha_test_func means simulating alpha testing via discards,
 * so the shader definitely kills pixels.
@@ -5619,8 +5620,9 @@ brw_compile_cs(const struct brw_compiler *compiler, void *log_data,
char **error_str)
 {
nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
-   brw_nir_apply_sampler_key(shader, compiler->devinfo, &key->tex, true);
-   brw_postprocess_nir(shader, compiler->devinfo, true);
+   shader = brw_nir_apply_sampler_key(shader, compiler->devinfo,
+  &key->tex, true);
+   shader = brw_postprocess_nir(shader, compiler->devinfo, true);
 
prog_data->local_size[0] = shader->info.cs.local_size[0];
prog_data->local_size[1] = shader->info.cs.local_size[1];
diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
index a897f27..452dbb7 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -171,11 +171,26 @@ brw_nir_lower_outputs(nir_shader *nir, bool is_scalar)
}
 }
 
-#define _OPT(do_pass) (({ \
-   bool this_progress = true; \
-   do_pass\
-   nir_validate_shader(nir);  \
-   this_progress; \
+static bool
+should_clone_nir()
+{
+   static int should_clone = -1;
+   if (should_clone < 1)
+  should_clone = brw_env_var_as_boolean("INTEL_CLONE_NIR", false);
+
+   return should_clone;
+}
+
+#define _OPT(do_pass) (({\
+   bool this_progress = true;\
+   do_pass   \
+   nir_validate_shader(nir); \
+   if (should_clone_nir()) { \
+  nir_shader *clone = nir_shader_clone(ralloc_parent(nir), nir); \
+  ralloc_free(nir);  \
+  nir = clone;   \
+   } \
+   this_progress;\
 }))
 
 #define OPT(pass, ...) _OPT(   \
@@ -191,7 +206,7 @@ brw_nir_lower_outputs(nir_shader *nir, bool is_scalar)
pass(nir, ##__VA_ARGS__);   \
 )
 
-static void
+static nir_shader *
 nir_optimize(nir_shader *nir, bool is_scalar)
 {
bool progress;
@@ -219,6 +234,8 @@ nir_optimize(nir_shader *nir, bool is_scalar)
   OPT(nir_opt_remove_phis);
   OPT(nir_opt_undef);
} while (progress);
+
+   return nir;
 }
 
 /* Does some simple lowering and runs the standard suite of optimizations
@@ -230,7 +247,7 @@ nir_optimize(nir_shader *nir, bool is_scalar)
  * intended for the FS backend as long as nir_optimize is called again with
  * is_scalar = true to scalarize everything prior to code gen.
  */
-void
+nir_shader *
 brw_preprocess_nir(nir_shader *nir, bool is_scalar)
 {
bool progress; /* Written by OPT and OPT_V */
@@ -250,15 +267,17 @@ brw_preprocess_nir(nir_shader *nir, bool is_scalar)
 
OPT(nir_split_var_copies);
 
-   nir_optimize(nir, is_scalar);
+   nir = nir_optimize(nir, is_scalar);
 
/* Lower a bunch of stuff */
OPT_V(nir_lower_var_copies);
 
/* Get rid of split copies */
-   nir_optimize(nir, is_scalar);
+   nir = nir_optimize(nir, is_scalar);
 
OPT(nir_remove_dead_variables);
+
+   return nir;
 }
 
 /* Lowers inputs, outputs, uniforms, and samplers for i965
@@ -268,7 +287,7 @@ brw_preprocess_nir(nir_shader *nir, bool is_scalar)
  * shader_prog parameter is optional and is used only for lowering sampler
  * derefs and atomics for GLSL shaders.
  */
-void
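
The testing scheme in this patch has two parts: a cached environment lookup,
and a "run the pass, clone, free the original" step that forces every caller
to use the returned pointer. A minimal sketch of both, using a toy shader
type rather than nir_shader; toy_shader, toy_clone, run_pass, toy_dce and the
TOY_CLONE variable are all hypothetical names. The sketch tests the cache
with "< 0"; the patch as posted tests "should_clone < 1", which looks like it
would re-read the variable on every call for as long as the result is false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a shader; not the real nir_shader. */
struct toy_shader {
   int num_instrs;
};

/* Cache the environment lookup: -1 means "not read yet". */
static bool should_clone(void)
{
   static int cached = -1;
   if (cached < 0) {
      const char *s = getenv("TOY_CLONE");   /* hypothetical variable name */
      cached = (s != NULL && strcmp(s, "1") == 0);
   }
   return cached;
}

/* Copy the shader; a clone bug would show up as changed behavior later. */
static struct toy_shader *toy_clone(const struct toy_shader *src)
{
   struct toy_shader *dst = malloc(sizeof(*dst));
   if (dst != NULL)
      *dst = *src;
   return dst;
}

/* Run one pass, then optionally swap the shader for a fresh clone.
 * The caller must use the returned pointer: the old one may be freed. */
static struct toy_shader *run_pass(struct toy_shader *s,
                                   void (*pass)(struct toy_shader *),
                                   bool clone_after)
{
   pass(s);
   if (clone_after) {
      struct toy_shader *copy = toy_clone(s);
      if (copy != NULL) {
         free(s);
         s = copy;
      }
   }
   return s;
}

/* Example pass: drops one instruction if any remain. */
static void toy_dce(struct toy_shader *s)
{
   if (s->num_instrs > 0)
      s->num_instrs--;
}
```

Because the old pointer can be freed inside run_pass, a call site has to be
written as s = run_pass(s, toy_dce, should_clone()), which mirrors why the
patch changes nir_optimize and brw_preprocess_nir to return nir_shader *.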

[Mesa-dev] [PATCH 1/2] radeonsi: Rename si_shader::ls_rsrc{1, 2} to si_shader::rsrc{1, 2}

2015-11-16 Thread Tom Stellard

In the future, these will be used by other shader types.

CC: 
---
 src/gallium/drivers/radeonsi/si_shader.h| 4 ++--
 src/gallium/drivers/radeonsi/si_state_draw.c| 4 ++--
 src/gallium/drivers/radeonsi/si_state_shaders.c | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index 3400a03..f089dc7 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -290,8 +290,8 @@ struct si_shader {
boolis_gs_copy_shader;
booldx10_clamp_mode; /* convert NaNs to 0 */
 
-   unsignedls_rsrc1;
-   unsignedls_rsrc2;
+   unsignedrsrc1;
+   unsignedrsrc2;
 };
 
 static inline struct tgsi_shader_info *si_get_vs_info(struct si_context *sctx)
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 753abc8..771d206 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -163,7 +163,7 @@ static void si_emit_derived_tess_state(struct si_context *sctx,
perpatch_output_offset = output_patch0_offset + pervertex_output_patch_size;
 
lds_size = output_patch0_offset + output_patch_size * *num_patches;
-   ls_rsrc2 = ls->current->ls_rsrc2;
+   ls_rsrc2 = ls->current->rsrc2;
 
if (sctx->b.chip_class >= CIK) {
assert(lds_size <= 65536);
@@ -178,7 +178,7 @@ static void si_emit_derived_tess_state(struct si_context *sctx,
if (sctx->b.chip_class == CIK && sctx->b.family != CHIP_HAWAII)
radeon_set_sh_reg(cs, R_00B52C_SPI_SHADER_PGM_RSRC2_LS, ls_rsrc2);
radeon_set_sh_reg_seq(cs, R_00B528_SPI_SHADER_PGM_RSRC1_LS, 2);
-   radeon_emit(cs, ls->current->ls_rsrc1);
+   radeon_emit(cs, ls->current->rsrc1);
radeon_emit(cs, ls_rsrc2);
 
/* Compute userdata SGPRs. */
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 7f6511c..ca6b4be 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -121,11 +121,11 @@ static void si_shader_ls(struct si_shader *shader)
si_pm4_set_reg(pm4, R_00B520_SPI_SHADER_PGM_LO_LS, va >> 8);
si_pm4_set_reg(pm4, R_00B524_SPI_SHADER_PGM_HI_LS, va >> 40);
 
-   shader->ls_rsrc1 = S_00B528_VGPRS((shader->num_vgprs - 1) / 4) |
+   shader->rsrc1 = S_00B528_VGPRS((shader->num_vgprs - 1) / 4) |
   S_00B528_SGPRS((num_sgprs - 1) / 8) |
   S_00B528_VGPR_COMP_CNT(vgpr_comp_cnt) |
   S_00B528_DX10_CLAMP(shader->dx10_clamp_mode);
-   shader->ls_rsrc2 = S_00B52C_USER_SGPR(num_user_sgprs) |
+   shader->rsrc2 = S_00B52C_USER_SGPR(num_user_sgprs) |
   S_00B52C_SCRATCH_EN(shader->scratch_bytes_per_wave > 
0);
 }
 
-- 
2.0.4
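
The rsrc1 expression in si_shader_ls stores register counts in granules,
(n - 1) / 4 for VGPRs and (n - 1) / 8 for SGPRs, and si_shader_binary_read_config
in patch 2/2 decodes the same kind of field with (field + 1) * granule. A
small sketch of that round trip; encode_regs and decode_regs are hypothetical
helper names, not functions from the driver.

```c
#include <assert.h>

/* Encode a register count (n >= 1) as a granule index: (n - 1) / granule. */
static unsigned encode_regs(unsigned n, unsigned granule)
{
   assert(n >= 1 && granule >= 1);
   return (n - 1) / granule;
}

/* Decode a granule index back to the allocated register count. */
static unsigned decode_regs(unsigned field, unsigned granule)
{
   return (field + 1) * granule;
}
```

For example, 5 VGPRs with a granule of 4 encode to 1 and decode to 8, so the
round trip rounds the count up to the next multiple of the granule.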


Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] r200: fix bgrx8/xrgb8 blits

2015-11-16 Thread Ian Romanick

Tested-by: Ian Romanick 

On 11/12/2015 11:00 AM, srol...@vmware.com wrote:
> From: Roland Scheidegger 
> 
> Since 779cabfc7d022de8b7b9bc7fdac0caffa8646c51 the same txformat table entries
> are used for "normal" texturing as well as for blits. However, I forgot to put
> in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing
> path can't hit them because the radeon tex format chooser will never choose
> them, but we get that format from the dri buffers (at least I assume we got
> it from there).
> This is untested but essentially addressing the same bug as for radeon.
> (I don't think that the second entry per le/be table is actually necessary,
> but shouldn't hurt...)
> 
> Cc: "11.0" 
> ---
>  src/mesa/drivers/dri/r200/r200_tex.h | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/r200/r200_tex.h b/src/mesa/drivers/dri/r200/r200_tex.h
> index a8c31b7..14f5e71 100644
> --- a/src/mesa/drivers/dri/r200/r200_tex.h
> +++ b/src/mesa/drivers/dri/r200/r200_tex.h
> @@ -63,7 +63,9 @@ static const struct tx_table tx_table_be[] =
> [ MESA_FORMAT_A8B8G8R8_UNORM ] = { R200_TXFORMAT_ABGR | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> [ MESA_FORMAT_R8G8B8A8_UNORM ] = { R200_TXFORMAT_RGBA | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> [ MESA_FORMAT_B8G8R8A8_UNORM ] = { R200_TXFORMAT_ARGB | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> +   [ MESA_FORMAT_B8G8R8X8_UNORM ] = { R200_TXFORMAT_ARGB, 0 },
> [ MESA_FORMAT_A8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> +   [ MESA_FORMAT_X8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB, 0 },
> [ MESA_FORMAT_BGR_UNORM8 ] = { 0x, 0 },
> [ MESA_FORMAT_B5G6R5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
> [ MESA_FORMAT_R5G6B5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
> @@ -91,7 +93,9 @@ static const struct tx_table tx_table_le[] =
> [ MESA_FORMAT_A8B8G8R8_UNORM ] = { R200_TXFORMAT_RGBA | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> [ MESA_FORMAT_R8G8B8A8_UNORM ] = { R200_TXFORMAT_ABGR | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> [ MESA_FORMAT_B8G8R8A8_UNORM ] = { R200_TXFORMAT_ARGB | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> +   [ MESA_FORMAT_B8G8R8X8_UNORM ] = { R200_TXFORMAT_ARGB, 0 },
> [ MESA_FORMAT_A8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB | R200_TXFORMAT_ALPHA_IN_MAP, 0 },
> +   [ MESA_FORMAT_X8R8G8B8_UNORM ] = { R200_TXFORMAT_ARGB, 0 },
> [ MESA_FORMAT_BGR_UNORM8 ] = { R200_TXFORMAT_ARGB, 0 },
> [ MESA_FORMAT_B5G6R5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
> [ MESA_FORMAT_R5G6B5_UNORM ] = { R200_TXFORMAT_RGB565, 0 },
> 


Re: [Mesa-dev] [Mesa-stable] [PATCH] meta/generate_mipmap: Don't leak the framebuffer object

2015-11-16 Thread Anuj Phogat

On Mon, Nov 16, 2015 at 10:32 AM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> Signed-off-by: Ian Romanick 
> Cc: "10.6 11.0" 
> ---
>  src/mesa/drivers/common/meta_generate_mipmap.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/mesa/drivers/common/meta_generate_mipmap.c b/src/mesa/drivers/common/meta_generate_mipmap.c
> index ffd71b6..bde170f 100644
> --- a/src/mesa/drivers/common/meta_generate_mipmap.c
> +++ b/src/mesa/drivers/common/meta_generate_mipmap.c
> @@ -131,6 +131,11 @@ _mesa_meta_glsl_generate_mipmap_cleanup(struct gen_mipmap_state *mipmap)
> _mesa_DeleteSamplers(1, &mipmap->Sampler);
> mipmap->Sampler = 0;
>
> +   if (mipmap->FBO != 0) {
> +  _mesa_DeleteFramebuffers(1, &mipmap->FBO);
> +  mipmap->FBO = 0;
> +   }
> +
> _mesa_meta_blit_shader_table_cleanup(&mipmap->shaders);
>  }
>
> --
> 2.1.0
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-stable

Reviewed-by: Anuj Phogat 
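
The fix uses the usual cleanup idiom for owned handles: delete the object
only if one exists, then zero the handle so that a repeated cleanup does
nothing. A minimal sketch, with a counting stand-in for the real delete call;
toy_state, toy_delete_framebuffer and toy_cleanup are hypothetical names.

```c
#include <assert.h>

/* Hypothetical handle-owning state; 0 means "no object". */
struct toy_state {
   unsigned fbo;
};

static int delete_calls;   /* counts deletions so the sketch can be checked */

/* Stand-in for the real delete call. */
static void toy_delete_framebuffer(unsigned *handle)
{
   (void)handle;
   delete_calls++;
}

/* Release the object if one exists, then clear the handle so a second
 * cleanup call does nothing. */
static void toy_cleanup(struct toy_state *st)
{
   if (st->fbo != 0) {
      toy_delete_framebuffer(&st->fbo);
      st->fbo = 0;
   }
}
```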

Re: [Mesa-dev] [PATCH 1/2] radeon: fix bgrx8/xrgb8 blits

2015-11-16 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek

On Thu, Nov 12, 2015 at 8:00 PM,   wrote:
> From: Roland Scheidegger 
>
> Since d21320f6258b2e1780a15c1ca718963d8a15ca18 the same txformat table entries
> are used for "normal" texturing as well as for blits. However, I forgot to put
> in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing
> path can't hit them because the radeon tex format chooser will never choose
> them, but we get that format from the dri buffers (at least I assume we got
> it from there). This caused lots of piglit regressions (and probably lots of
> trouble outside piglit too).
> This fixes bug https://bugs.freedesktop.org/show_bug.cgi?id=92900.
>
> Tested-by: Ian Romanick 
> Cc: "11.0" 
> ---
>  src/mesa/drivers/dri/radeon/radeon_tex.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/radeon/radeon_tex.h b/src/mesa/drivers/dri/radeon/radeon_tex.h
> index f8ec432..37c2fa0 100644
> --- a/src/mesa/drivers/dri/radeon/radeon_tex.h
> +++ b/src/mesa/drivers/dri/radeon/radeon_tex.h
> @@ -63,6 +63,8 @@ static const struct tx_table tx_table[] =
> [ MESA_FORMAT_R8G8B8A8_UNORM ] = { RADEON_TXFORMAT_RGBA | RADEON_TXFORMAT_ALPHA_IN_MAP, 0 },
> [ MESA_FORMAT_B8G8R8A8_UNORM ] = { RADEON_TXFORMAT_ARGB | RADEON_TXFORMAT_ALPHA_IN_MAP, 0 },
> [ MESA_FORMAT_A8R8G8B8_UNORM ] = { RADEON_TXFORMAT_ARGB | RADEON_TXFORMAT_ALPHA_IN_MAP, 0 },
> +   [ MESA_FORMAT_B8G8R8X8_UNORM ] = { RADEON_TXFORMAT_ARGB, 0 },
> +   [ MESA_FORMAT_X8R8G8B8_UNORM ] = { RADEON_TXFORMAT_ARGB, 0 },
> [ MESA_FORMAT_BGR_UNORM8 ] = { RADEON_TXFORMAT_ARGB, 0 },
> [ MESA_FORMAT_B5G6R5_UNORM ] = { RADEON_TXFORMAT_RGB565, 0 },
> [ MESA_FORMAT_R5G6B5_UNORM ] = { RADEON_TXFORMAT_RGB565, 0 },
> --
> 2.1.4
>

Re: [Mesa-dev] [PATCH 2/2] i965/fs: Add support for gl_HelperInvocation system value.

2015-11-16 Thread Matt Turner

On Fri, Nov 13, 2015 at 6:05 PM, Matt Turner  wrote:
> ---
> This code generates
>
> mov(1)  f0<1>UW g1.14<0,1,0>UW
> mov(8)  g2<1>UD 0xUD
> (+f0) sel(8)g3<1>D  g2<8,8,1>D  -1D
>
> which I don't love because it uses the flag register, and likely uses
> of gl_HelperInvocation will be in an if condition, in which case we
> could have just used f0 directly.
>
> Alternative implementation ideas:
>
>   - Shift dispatch mask with a vector-immediate, and then
> resolve it to true/false with -(x & 1):
>
> shr(8)  tmp<1>UW  g1.14<1,8,0>UB  0x76543210V

I got this working (needed to read g1.28, not g1.24) modulo one yet
unexplained issue. Once that is fixed, I'll send a replacement for
this patch.
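
The alternative quoted above can be modelled on the CPU as a per-channel
shift followed by -(x & 1), which maps bit 0 to 0 (false) or -1 (all ones,
true). The sketch below reads the 0x76543210V immediate as per-channel shift
counts 0..7, which is an assumption about the encoding, and expand_mask is a
hypothetical name. It only shows the bit-to-boolean step and does not decide
whether a set bit means a live channel or a helper invocation.

```c
#include <assert.h>
#include <stdint.h>

/* For each of 8 channels, shift the mask right by the channel index and
 * turn bit 0 into 0 (false) or -1 (all ones, true) via -(x & 1). */
static void expand_mask(uint8_t mask, int32_t out[8])
{
   for (unsigned ch = 0; ch < 8; ch++) {
      uint32_t x = (uint32_t)mask >> ch;
      out[ch] = -(int32_t)(x & 1u);
   }
}
```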

Re: [Mesa-dev] [PATCH 3/7] [v2] i965/skl: skip fast clears for certain surface formats

2015-11-16 Thread Chad Versace

On Mon 16 Nov 2015, Matt Turner wrote:
> On Wed, Nov 11, 2015 at 2:06 PM, Ben Widawsky
>  wrote:
> > > Some of the information originally in this commit message is now in the
> > > patch before this.
> > >
> > > SKL adds compressible render targets and as a result mutates some of the
> > > programming for fast clears and resolves. There is a new internal surface
> > > type called the CCS. The old AUX_MCS bit becomes AUX_CCS_D. "The Auxiliary
> > > surface is a CCS (Color Control Surface) with compression disabled or an
> > > MCS with compression enabled, depending on number of multisamples. MCS
> > > (Multisample Control Surface) is a special type of CCS."
> > >
> > > The formats which are supported are defined in the table titled "Render
> > > Target Surface Types [SKL+]". There is no PRM yet to reference. The
> > > previously implemented helper function already does the right thing
> > > provided the table is correct.
> >
> > v2: Use better English in commit message (Matt)
> > s/compressable/compressible/ (Matt)
> > Don't compare bools to true (Matt)
> > Use the helper function and don't increase the context size - this is mostly
> > implemented in the patch just before this (Chad, Neil)
> > Remove an "invalid" assert (Chad)
> > Fix assertion to check num_samples > 1, instead of num_samples (Chad)
> >
> > Cc: Chad Versace 
> > Cc: Neil Roberts 
> > Signed-off-by: Ben Widawsky 
> > ---
> >  src/mesa/drivers/dri/i965/brw_surface_formats.c | 52 
> > -
> >  src/mesa/drivers/dri/i965/gen8_surface_state.c  |  7 +++-
> >  2 files changed, 31 insertions(+), 28 deletions(-)
> >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> > index a7cdc13..a527f2f 100644
> > --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
> > +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> > @@ -90,9 +90,9 @@ struct surface_format_info {
> >   */
> >  const struct surface_format_info surface_formats[] = {
> >  /* smpl filt shad CK  RT  AB  VB  SO  color ccs */
> > -   SF( Y, 50,  x,  x,  Y,  Y,  Y,  Y,  x,x,   R32G32B32A32_FLOAT)
> > -   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,x,   R32G32B32A32_SINT)
> > -   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,x,   R32G32B32A32_UINT)
> > +   SF( Y, 50,  x,  x,  Y,  Y,  Y,  Y,  x,   90,   R32G32B32A32_FLOAT)
> > +   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,   90,   R32G32B32A32_SINT)
> > +   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,   90,   R32G32B32A32_UINT)
> > SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R32G32B32A32_UNORM)
> > SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R32G32B32A32_SNORM)
> > SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R64G64_FLOAT)
> > @@ -109,15 +109,15 @@ const struct surface_format_info surface_formats[] = {
> > SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R32G32B32_SSCALED)
> > SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R32G32B32_USCALED)
> > SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   R32G32B32_SFIXED)
> > -   SF( Y,  Y,  x,  x,  Y, 45,  Y,  x, 60,x,   R16G16B16A16_UNORM)
> > -   SF( Y,  Y,  x,  x,  Y, 60,  Y,  x,  x,x,   R16G16B16A16_SNORM)
> > -   SF( Y,  x,  x,  x,  Y,  x,  Y,  x,  x,x,   R16G16B16A16_SINT)
> > -   SF( Y,  x,  x,  x,  Y,  x,  Y,  x,  x,x,   R16G16B16A16_UINT)
> > -   SF( Y,  Y,  x,  x,  Y,  Y,  Y,  x,  x,x,   R16G16B16A16_FLOAT)
> > -   SF( Y, 50,  x,  x,  Y,  Y,  Y,  Y,  x,x,   R32G32_FLOAT)
> > +   SF( Y,  Y,  x,  x,  Y, 45,  Y,  x, 60,   90,   R16G16B16A16_UNORM)
> > +   SF( Y,  Y,  x,  x,  Y, 60,  Y,  x,  x,   90,   R16G16B16A16_SNORM)
> > +   SF( Y,  x,  x,  x,  Y,  x,  Y,  x,  x,   90,   R16G16B16A16_SINT)
> > +   SF( Y,  x,  x,  x,  Y,  x,  Y,  x,  x,   90,   R16G16B16A16_UINT)
> > +   SF( Y,  Y,  x,  x,  Y,  Y,  Y,  x,  x,   90,   R16G16B16A16_FLOAT)
> > +   SF( Y, 50,  x,  x,  Y,  Y,  Y,  Y,  x,   90,   R32G32_FLOAT)
> > SF( Y, 70,  x,  x,  Y,  Y,  Y,  Y,  x,x,   R32G32_FLOAT_LD)
> > -   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,x,   R32G32_SINT)
> > -   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,x,   R32G32_UINT)
> > +   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,   90,   R32G32_SINT)
> > +   SF( Y,  x,  x,  x,  Y,  x,  Y,  Y,  x,   90,   R32G32_UINT)
> > SF( Y, 50,  Y,  x,  x,  x,  x,  x,  x,x,   R32_FLOAT_X8X24_TYPELESS)
> > SF( Y,  x,  x,  x,  x,  x,  x,  x,  x,x,   X32_TYPELESS_G8X24_UINT)
> > SF( Y, 50,  x,  x,  x,  x,  x,  x,  x,x,   L32A32_FLOAT)
> > @@ -125,7 +125,7 @@ const struct surface_format_info surface_formats[] = {
> > SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R32G32_SNORM)
> > SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,x,   R64_FLOAT)
> > SF( Y,  Y,  x,  x,  x,  x,  x,  x,  x,x,   R16G16B16X16_UNORM)
> > -   SF( Y,  Y,  x,  x,  x,  x,  x,  x,  x,x,   R16G16B16X16_FLOAT)
> > +   SF( Y,  Y,  x,  x,  x,  x,  x,  x,
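
The quoted message is cut off here. As a reading aid for the table, the
sketch below shows one way the new "ccs" column can be interpreted. It
assumes that Y means "supported everywhere", x means "never", and any number
is the first hardware generation times 10, so 90 stands for Gen9. Those
conventions, and the names toy_format_info and supported, are assumptions for
illustration and not taken from the patch text.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed table conventions: Y = supported everywhere, x = never,
 * otherwise the first hardware generation times 10 (90 = Gen9). */
#define Y 0
#define x 999

struct toy_format_info {
   int render_target;
   int lossless_compression;   /* the "ccs" column */
};

/* Two rows modelled on the quoted table (RT and ccs columns only). */
static const struct toy_format_info toy_rgba32f = { Y, 90 };
static const struct toy_format_info toy_rgba32_unorm = { x, x };

#undef Y
#undef x

/* A feature is available once the device generation reaches the entry. */
static bool supported(int gen_x10, int entry)
{
   return gen_x10 >= entry;
}
```

Under these assumptions, R32G32B32A32_FLOAT gains lossless compression
starting with Gen9, while formats left at x never report it.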

Re: [Mesa-dev] [PATCH v2] i965: Prevent fast clears for MSRTs on SKL

2015-11-16 Thread Chad Versace

On Mon 16 Nov 2015, Neil Roberts wrote:
> There are currently a bunch of formats that behave strangely when
> sampling the cleared color from the MCS buffer on SKL. They seem to
> mostly be formats that don't have an alpha component, although it's
> not all of them, and we haven't yet found anything in the specs which
> would explain this. For now to be on the safe side this patch just
> prevents fast clears for MSRTs on SKL altogether so that when fast
> clears are eventually enabled it will only be for single-sampled
> surfaces. The assumption is that clears are probably more likely to be
> used in single-sampled applications anyway so we can at least get them
> working and we can enable MSRTs later once we understand the problem
> better.
> 
> This patch should have no functional effect other than perhaps
> receiving fewer perf_debug messages on SKL+.
> 
> v2: Improve the commit message to avoid saying the patch disables fast
> clears because it will be merged before fast clears are enabled
> for any surfaces so it doesn't actually disable anything.
> Reviewed-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> index dc085ba..85576a8 100644
> --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> @@ -524,6 +524,13 @@ brw_meta_fast_clear(struct brw_context *brw, struct gl_framebuffer *fb,
>if (brw->gen < 7)
>   clear_type = REP_CLEAR;
>  
> +  /* Certain formats have unresolved issues with sampling from the MCS
> +   * buffer on Gen9. This disables fast clears altogether for MSRTs until
> +   * we can figure out what's going on.
> +   */
> +  if (brw->gen >= 9 && irb->mt->num_samples > 1)
> + clear_type = REP_CLEAR;
> +
>if (irb->mt->fast_clear_state == INTEL_FAST_CLEAR_STATE_NO_MCS)
>   clear_type = REP_CLEAR;

Neil, do you have a bug open for this?

Reviewed-by: Chad Versace 
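
The hunk adds one more precondition to a chain of checks, each of which
downgrades a fast clear to a replicated clear. A minimal sketch of that
chain, assuming the clear type starts out as "fast"; toy_clear_type and
choose_clear are hypothetical names and the no_mcs flag stands in for the
INTEL_FAST_CLEAR_STATE_NO_MCS test.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_clear_type { TOY_FAST_CLEAR, TOY_REP_CLEAR };

/* Mirrors the order of the checks in the quoted hunk; every failed
 * precondition downgrades the fast clear to a replicated clear. */
static enum toy_clear_type choose_clear(int gen, unsigned num_samples,
                                        bool no_mcs)
{
   enum toy_clear_type type = TOY_FAST_CLEAR;

   if (gen < 7)
      type = TOY_REP_CLEAR;
   if (gen >= 9 && num_samples > 1)   /* the new MSRT check */
      type = TOY_REP_CLEAR;
   if (no_mcs)
      type = TOY_REP_CLEAR;

   return type;
}
```

In this model a 4x multisampled surface on Gen9 gets a replicated clear,
while the same surface on Gen8 and a single-sampled surface on Gen9 are left
unchanged by the new check.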

Re: [Mesa-dev] [PATCH] r600g: Support TGSI_SEMANTIC_HELPER_INVOCATION

2015-11-16 Thread Marek Olšák

On Mon, Nov 16, 2015 at 6:03 PM, Ilia Mirkin  wrote:
> On Mon, Nov 16, 2015 at 8:31 AM, Nicolai Hähnle  wrote:
>> Hi Glenn,
>>
>> On 14.11.2015 00:11, Glenn Kennard wrote:
>>>
>>> On Fri, 13 Nov 2015 18:57:28 +0100, Nicolai Hähnle wrote:
>>>
>>>> On 13.11.2015 00:14, Glenn Kennard wrote:
>>>>>
>>>>> Signed-off-by: Glenn Kennard 
>>>>> ---
>>>>> Maybe there is a better way to check if a thread is a helper invocation?
>>>>
>>>>
>>>> Is ctx->face_gpr guaranteed to be initialized when
>>>> load_helper_invocation is called?
>>>>
>>>
>>> allocate_system_value_inputs() sets that if needed, and is called before
>>> parsing any opcodes.
>>
>>
>> Sorry, you're right, I missed the second change to the inputs array there.
>>
>>
>>>> Aside, I'm not sure I understand correctly what this is supposed to
>>>> do. The values you're querying are related to multi-sampling, but my
>>>> understanding has always been that helper invocations can also happen
>>>> without multi-sampling: you always want to process 2x2 quads of pixels
>>>> at a time to be able to compute derivatives for texture sampling. When
>>>> the boundary of primitive intersects such a quad, you get helper
>>>> invocations outside the primitive.
>>>>
>>>
>>> Non-MSAA buffers act just like 1 sample buffers with regards to the
>>> coverage mask supplied by the hardware, so helper invocations which have
>>> no coverage get a 0 for the mask value, and normal fragments get 1.
>>> Works with the piglit test case posted at least...
>>
>>
>> Here's why I'm still skeptical: According to the GLSL spec, the fragment
>> shader is only run once per pixel by default, even when MSAA is enabled.
>> _However_, if a shader statically accesses the SampleID, _then_ it must be
>> run once per fragment. The way I understand it, your change forces the
>> fragment shader to access SampleID, even when people ostensibly use
>> HelperInvocation in the hope of optimizing something.
>
> GPUs don't operate based on GLSL specs. Per-sample shading is enabled
> separately.

FYI, per-sample shading is controlled by
pipe_context::set_min_samples. Other states can't turn it on.

Marek
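
Glenn's description of the coverage mask can be sketched as follows. This assumes one mask bit per sample and treats a non-MSAA buffer as a 1-sample buffer, as he describes; the function name and signature are hypothetical and this is not the r600g implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A fragment is treated as a helper invocation when none of its samples
 * are covered.  Non-MSAA buffers are modelled as num_samples == 1.
 */
static bool
is_helper_invocation(uint32_t coverage_mask, unsigned num_samples)
{
   assert(num_samples >= 1 && num_samples <= 32);

   /* Only the low num_samples bits are meaningful. */
   uint32_t valid_bits =
      num_samples == 32 ? UINT32_MAX : (1u << num_samples) - 1u;

   return (coverage_mask & valid_bits) == 0;
}
```

Note that nothing here depends on per-sample shading being enabled: the mask is read once per fragment regardless.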

Re: [Mesa-dev] [PATCH 2/7] i965: Add lossless compression to surface format table

2015-11-16 Thread Chad Versace

On Mon 16 Nov 2015, Ben Widawsky wrote:
> On Fri, Nov 13, 2015 at 12:29:47PM -0800, Chad Versace wrote:
> > On Wed 11 Nov 2015, Ben Widawsky wrote:
> > > Background: Prior to Skylake and since Ivybridge, Intel hardware has had
> > > the ability to use an MCS (Multisample Control Surface) as auxiliary data
> > > in "compression" operations on the surface. This reduces memory
> > > bandwidth. This hardware was used for either MSAA compression or fast
> > > clear operations. On Gen8, a similar mechanism exists to allow the hiz
> > > buffer to be sampled from, and therefore this feature is sometimes
> > > referred to more generally as "AUX buffers".
> > > 
> > > Skylake adds the ability to have the display engine directly source
> > > compressed surfaces on top of the ability to sample from them. Inference
> > > dictates that enabling this display feature adds a restriction on the
> > > formats which can actually be compressed. The current set of surfaces
> > > seems to be a subset as compared to previous gens (see the next patch).
> > > Also, if I had to guess, I would guess that future gens add support for
> > > more surface formats. To make handling this a bit easier to read, and
> > > more future proof, the support for this is moved into the surface
> > > formats table.
> > > 
> > > Along with the modifications to the table, a helper function is also
> > > provided to determine if a surface is CCS compatible. Because fast
> > > clears are currently disabled on SKL, we can plumb the helper all the
> > > way through here, and not actually have anything break.
> > > 
> > > The logic in the table works a bit differently than the other columns in
> > > the table and therefore deserves a small mention. For most other
> > > features, the GEN which began implementing it is set, and it is assumed
> > > future gens also support this. For this feature, GEN9 actually
> > > eliminates support for certain formats. We could use this column to
> > > determine support for the similar feature on older generation hardware.
> > > Aside from that being an error-prone task which is unrelated to enabling
> > > this on GEN9, it becomes somewhat tricky to implement because of the
> > > fact that surface format support diminishes. You'd probably want another
> > > column to cleanly implement it.
> > > 
> > > 
> > > Requested-by: Chad Versace 
> > > Requested-by: Neil Roberts 
> > > Signed-off-by: Ben Widawsky 
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_context.h |   2 +
> > >  src/mesa/drivers/dri/i965/brw_surface_formats.c | 527 
> > > +---
> > >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c   |   7 +
> > >  3 files changed, 285 insertions(+), 251 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> > > b/src/mesa/drivers/dri/i965/brw_context.h
> > > index 4b2db61..6284c18 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > > @@ -1465,6 +1465,8 @@ void brw_upload_image_surfaces(struct brw_context 
> > > *brw,
> > >  /* brw_surface_formats.c */
> > >  bool brw_render_target_supported(struct brw_context *brw,
> > >   struct gl_renderbuffer *rb);
> > > +bool brw_losslessly_compressible_format(struct brw_context *brw,
> > > +uint32_t brw_format);
> > >  uint32_t brw_depth_format(struct brw_context *brw, mesa_format format);
> > >  mesa_format brw_lower_mesa_image_format(const struct brw_device_info 
> > > *devinfo,
> > >  mesa_format format);
> > > diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c 
> > > b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> > > index 97fff60..a7cdc13 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> > > @@ -39,14 +39,15 @@ struct surface_format_info {
> > > int input_vb;
> > > int streamed_output_vb;
> > > int color_processing;
> > > +   int lossless_compression_support;
> > 
> > There's no need to place "support" in the name. Every struct member is
> > a "support" member.
> > 
> 
> Fine.
> 
> > > const char *name;
> > >  };
> > >  
> > >  /* This macro allows us to write the table almost as it appears in the 
> > > PRM,
> > >   * while restructuring it to turn it into the C code we want.
> > >   */
> > > -#define SF(sampl, filt, shad, ck, rt, ab, vb, so, color, sf) \
> > > -   [BRW_SURFACEFORMAT_##sf] = { true, sampl, filt, shad, ck, rt, ab, vb, 
> > > so, color, #sf},
> > > +#define SF(sampl, filt, shad, ck, rt, ab, vb, so, color, ccs, sf) \
> > > +   [BRW_SURFACEFORMAT_##sf] = { true, sampl, filt, shad, ck, rt, ab, vb, 
> > > so, color, ccs, #sf},
> > >  
> > >
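
The "first GEN that implements it" convention from the commit message can be sketched like this. The gen-times-ten encoding, the sentinel, the row contents, and the helper name are assumptions made for illustration; the values are not taken from the PRM:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define UNSUPPORTED 999   /* hypothetical sentinel: no generation qualifies */

struct format_row {
   const char *name;
   int render_target;        /* first gen (times 10) able to render to it */
   int lossless_compression; /* first gen (times 10) able to use CCS */
};

/* Illustrative rows only. */
static const struct format_row rows[] = {
   { "FORMAT_A", 40, 90 },
   { "FORMAT_B", 60, UNSUPPORTED },
};

/* A column is a threshold: supported on that gen and every later one. */
static bool
column_supported(int gen, int first_gen_x10)
{
   assert(gen > 0);
   return gen * 10 >= first_gen_x10;
}
```

A threshold like this can only say "from this gen onward". It cannot express a format that older gens supported and GEN9 dropped, which is the reason the commit message gives for wanting a separate column if older hardware were ever covered.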

Re: [Mesa-dev] [PATCH 2/2] i965: Clean up context constant initialization code.

2015-11-16 Thread Jordan Justen

On 2015-11-12 15:38:52, Kenneth Graunke wrote:
> This was getting pretty out of hand, and with compute partially in place
> and tessellation on the way, it was only going to get worse.
> 
> This patch makes a "stage exists?" predicate and a "number of stages"
> count and uses them to clean up a lot of calculations.  We can just
> loop over shader stages and set things for the ones that exist.  For
> combined counts, we can just multiply by the number of stages.
> 
> It also tries to organize a little bit.
> 
> We should probably use _mesa_has_geometry_shaders/tessellation/compute
> here, but we can't because ctx->Version isn't initialized yet.  Perhaps
> that could be fixed in the future.
> 
> No change in "glxinfo -l" on Broadwell.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 138 
> ++--
>  1 file changed, 58 insertions(+), 80 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index 2db99c7..89533ae 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -322,64 +322,85 @@ static void
>  brw_initialize_context_constants(struct brw_context *brw)
>  {
> struct gl_context *ctx = &brw->ctx;
> +   const struct brw_compiler *compiler = brw->intelScreen->compiler;
> +
> +   bool stage_exists[MESA_SHADER_STAGES] = {
> +  [MESA_SHADER_VERTEX] = true,
> +  [MESA_SHADER_TESS_CTRL] = false,
> +  [MESA_SHADER_TESS_EVAL] = false,
> +  [MESA_SHADER_GEOMETRY] = brw->gen >= 6,
> +  [MESA_SHADER_FRAGMENT] = true,
> +  [MESA_SHADER_COMPUTE] = 
> _mesa_extension_override_enables.ARB_compute_shader,
> +   };
> +
> +   unsigned num_stages = 0;
> +   for (int i = 0; i < MESA_SHADER_STAGES; i++) {
> +  if (stage_exists[i])
> + num_stages++;
> +   }
>  
> unsigned max_samplers =
>brw->gen >= 8 || brw->is_haswell ? BRW_MAX_TEX_UNIT : 16;
>  
> +   ctx->Const.MaxDualSourceDrawBuffers = 1;
> +   ctx->Const.MaxDrawBuffers = BRW_MAX_DRAW_BUFFERS;
> +   ctx->Const.MaxCombinedShaderOutputResources =
> +  MAX_IMAGE_UNITS + BRW_MAX_DRAW_BUFFERS;
> +
> ctx->Const.QueryCounterBits.Timestamp = 36;
>  
> +   ctx->Const.MaxTextureCoordUnits = 8; /* Mesa limit */
> +   ctx->Const.MaxImageUnits = MAX_IMAGE_UNITS;
> +   ctx->Const.MaxRenderbufferSize = 8192;
> +   ctx->Const.MaxTextureLevels = MIN2(14 /* 8192 */, MAX_TEXTURE_LEVELS);
> +   ctx->Const.Max3DTextureLevels = 12; /* 2048 */
> +   ctx->Const.MaxCubeTextureLevels = 14; /* 8192 */
> +   ctx->Const.MaxArrayTextureLayers = brw->gen >= 7 ? 2048 : 512;
> +   ctx->Const.MaxTextureMbytes = 1536;
> +   ctx->Const.MaxTextureRectSize = 1 << 12;
> +   ctx->Const.MaxTextureMaxAnisotropy = 16.0;
> ctx->Const.StripTextureBorder = true;
> +   if (brw->gen >= 7)
> +  ctx->Const.MaxProgramTextureGatherComponents = 4;
> +   else if (brw->gen == 6)
> +  ctx->Const.MaxProgramTextureGatherComponents = 1;
>  
> ctx->Const.MaxUniformBlockSize = 65536;
> +
> for (int i = 0; i < MESA_SHADER_STAGES; i++) {
>struct gl_program_constants *prog = &ctx->Const.Program[i];
> +
> +  if (!stage_exists[i])
> + continue;
> +
> +  prog->MaxTextureImageUnits = max_samplers;
> +
>prog->MaxUniformBlocks = BRW_MAX_UBO;
>prog->MaxCombinedUniformComponents =
>   prog->MaxUniformComponents +
>   ctx->Const.MaxUniformBlockSize / 4 * prog->MaxUniformBlocks;
> +
> +  prog->MaxAtomicCounters = MAX_ATOMIC_COUNTERS;
> +  prog->MaxAtomicBuffers = BRW_MAX_ABO;
> +  prog->MaxImageUniforms = compiler->scalar_stage[i] ? BRW_MAX_IMAGES : 
> 0;
> +  prog->MaxShaderStorageBlocks = BRW_MAX_SSBO;
> }
>  
> -   ctx->Const.MaxDualSourceDrawBuffers = 1;
> -   ctx->Const.MaxDrawBuffers = BRW_MAX_DRAW_BUFFERS;
> -   ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits = 
> max_samplers;
> -   ctx->Const.MaxTextureCoordUnits = 8; /* Mesa limit */
> +   if (ctx->Extensions.ARB_compute_shader)
> +  ctx->Const.MaxShaderStorageBufferBindings += BRW_MAX_SSBO;

I think you should instead check stage_exists. We have some hardware
that supports ES 3.1, but not the desktop extension.

Reviewed-by: Jordan Justen 

> +
> +
> ctx->Const.MaxTextureUnits =
>MIN2(ctx->Const.MaxTextureCoordUnits,
> ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits);
> -   ctx->Const.Program[MESA_SHADER_VERTEX].MaxTextureImageUnits = 
> max_samplers;
> -   if (brw->gen >= 6)
> -  ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits = 
> max_samplers;
> -   else
> -  ctx->Const.Program[MESA_SHADER_GEOMETRY].MaxTextureImageUnits = 0;
> -   if (_mesa_extension_override_enables.ARB_compute_shader) {
> -  ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 
> BRW_MAX_TEX_UNIT;
> -  ctx->Const.MaxUniformBufferBindings +=
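
The "stage exists?" predicate and stage count from the commit message can be sketched in isolation. The enum and function names below are hypothetical stand-ins for the Mesa ones:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stage list mirroring the order used in the patch. */
enum stage {
   STAGE_VERTEX,
   STAGE_TESS_CTRL,
   STAGE_TESS_EVAL,
   STAGE_GEOMETRY,
   STAGE_FRAGMENT,
   STAGE_COMPUTE,
   STAGE_COUNT
};

/* Count the stages the hardware and driver actually expose. */
static unsigned
count_stages(const bool stage_exists[STAGE_COUNT])
{
   assert(stage_exists != NULL);

   unsigned n = 0;
   for (int i = 0; i < STAGE_COUNT; i++) {
      if (stage_exists[i])
         n++;
   }
   return n;
}

/* A combined limit is the per-stage limit times the number of stages. */
static unsigned
combined_limit(const bool stage_exists[STAGE_COUNT], unsigned per_stage)
{
   return per_stage * count_stages(stage_exists);
}
```

Keying every compute-related limit off stage_exists[STAGE_COMPUTE], as suggested in the review comment, keeps the per-stage loop and the combined counts consistent even when the desktop extension flag is not set.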

[Mesa-dev] [PATCH 1/2] mesa: Add KBL PCI IDs and platform information.

2015-11-16 Thread Sarah Sharp

Add PCI IDs for the Intel Kabylake platforms.  The IDs are taken
directly from the Linux kernel patches, which are under review:

http://lists.freedesktop.org/archives/intel-gfx/2015-October/078967.html
http://cgit.freedesktop.org/~vivijim/drm-intel/log/?h=kbl-upstream-v2

Please note that if this patch is backported, the following fixes will
need to be added before this patch:

commit 28ed1e08e8ba98e "i965/skl: Remove early platform support"
commit c1e38ad37042b0e "i965/skl: Use larger URB size where available."

Thanks to Ben for fixing a bug around setting urb.size, and being
patient with my questions about what the various fields mean.

Signed-off-by: Sarah Sharp 
Suggested-by: Ben Widawsky 
Tested-by: Rodrigo Vivi  (KBL-GT2)
---

 include/pci_ids/i965_pci_ids.h  | 22 +++
 src/mesa/drivers/dri/i965/brw_device_info.c | 60 +
 2 files changed, 82 insertions(+)

diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
index 8a42599..ea3cc08 100644
--- a/include/pci_ids/i965_pci_ids.h
+++ b/include/pci_ids/i965_pci_ids.h
@@ -124,6 +124,28 @@ CHIPSET(0x1921, skl_gt2, "Intel(R) Skylake ULT GT2F")
 CHIPSET(0x1926, skl_gt3, "Intel(R) Skylake ULT GT3")
 CHIPSET(0x192A, skl_gt3, "Intel(R) Skylake SRV GT3")
 CHIPSET(0x192B, skl_gt3, "Intel(R) Skylake Halo GT3")
+CHIPSET(0x5913, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
+CHIPSET(0x5915, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
+CHIPSET(0x5917, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
+CHIPSET(0x5906, kbl_gt1, "Intel(R) Kabylake GT1")
+CHIPSET(0x590E, kbl_gt1, "Intel(R) Kabylake GT1")
+CHIPSET(0x5902, kbl_gt1, "Intel(R) Kabylake GT1")
+CHIPSET(0x590B, kbl_gt1, "Intel(R) Kabylake GT1")
+CHIPSET(0x590A, kbl_gt1, "Intel(R) Kabylake GT1")
+CHIPSET(0x5916, kbl_gt2, "Intel(R) Kabylake GT2")
+CHIPSET(0x5921, kbl_gt2, "Intel(R) Kabylake GT2F")
+CHIPSET(0x591E, kbl_gt2, "Intel(R) Kabylake GT2")
+CHIPSET(0x5912, kbl_gt2, "Intel(R) Kabylake GT2")
+CHIPSET(0x591B, kbl_gt2, "Intel(R) Kabylake GT2")
+CHIPSET(0x591A, kbl_gt2, "Intel(R) Kabylake GT2")
+CHIPSET(0x591D, kbl_gt2, "Intel(R) Kabylake GT2")
+CHIPSET(0x5926, kbl_gt3, "Intel(R) Kabylake GT3")
+CHIPSET(0x592B, kbl_gt3, "Intel(R) Kabylake GT3")
+CHIPSET(0x592A, kbl_gt3, "Intel(R) Kabylake GT3")
+CHIPSET(0x5932, kbl_gt4, "Intel(R) Kabylake GT4")
+CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")
+CHIPSET(0x593A, kbl_gt4, "Intel(R) Kabylake GT4")
+CHIPSET(0x593D, kbl_gt4, "Intel(R) Kabylake GT4")
 CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherryview)")
 CHIPSET(0x22B1, chv, "Intel(R) HD Graphics (Cherryview)")
 CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c 
b/src/mesa/drivers/dri/i965/brw_device_info.c
index e86b530..aa7068c 100644
--- a/src/mesa/drivers/dri/i965/brw_device_info.c
+++ b/src/mesa/drivers/dri/i965/brw_device_info.c
@@ -358,6 +358,66 @@ static const struct brw_device_info brw_device_info_bxt = {
}
 };
 
+/*
+ * Note: for all KBL SKUs, the PRM says SKL for GS entries, not SKL+.
+ * There's no KBL entry. Using the default SKL (GEN9) GS entries value.
+ */
+
+/*
+ * Both SKL and KBL support a maximum of 64 threads per
+ * Pixel Shader Dispatch (PSD) unit.
+ */
+#define  KBL_MAX_THREADS_PER_PSD 64
+
+static const struct brw_device_info brw_device_info_kbl_gt1 = {
+   GEN9_FEATURES,
+   .gt = 1,
+
+   .max_cs_threads = 7 * 6,
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 2,
+   .urb.size = 192,
+};
+
+static const struct brw_device_info brw_device_info_kbl_gt1_5 = {
+   GEN9_FEATURES,
+   .gt = 1,
+
+   .max_cs_threads = 7 * 6,
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 3,
+};
+
+static const struct brw_device_info brw_device_info_kbl_gt2 = {
+   GEN9_FEATURES,
+   .gt = 2,
+
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 3,
+};
+
+static const struct brw_device_info brw_device_info_kbl_gt3 = {
+   GEN9_FEATURES,
+   .gt = 3,
+
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 6,
+};
+
+static const struct brw_device_info brw_device_info_kbl_gt4 = {
+   GEN9_FEATURES,
+   .gt = 4,
+
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 9,
+   /*
+* From the "L3 Allocation and Programming" documentation:
+*
+* "URB is limited to 1008KB due to programming restrictions.  This
+*  is not a restriction of the L3 implementation, but of the FF and
+*  other clients.  Therefore, in a GT4 implementation it is
+*  possible for the programmed allocation of the L3 data array to
+*  provide 3*384KB=1152KB for URB, but only 1008KB of this
+*  will be used."
+*/
+   .urb.size = 1008 / 3,
+};
+
 const struct brw_device_info *
 brw_get_device_info(int devid)
 {
-- 
2.3.0
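
The arithmetic behind the device-info entries above can be checked with a small sketch. Reading the multiplier as a count of PSD units and the divisor 3 as a slice count is an interpretation of the patch comments, not something the patch states outright; both helper names are hypothetical:

```c
#include <assert.h>

#define KBL_MAX_THREADS_PER_PSD  64    /* from the patch */
#define URB_PROGRAMMING_LIMIT_KB 1008  /* from the quoted L3 documentation */

/* Assumption: the multiplier in the patch is the number of PSD units. */
static unsigned
max_wm_threads(unsigned num_psd)
{
   assert(num_psd > 0);
   return KBL_MAX_THREADS_PER_PSD * num_psd;
}

/* Assumption: the URB budget is split evenly across slices and capped by
 * the programming limit.
 */
static unsigned
urb_kb_per_slice(unsigned slices, unsigned l3_kb_per_slice)
{
   assert(slices > 0);
   unsigned capped = URB_PROGRAMMING_LIMIT_KB / slices;
   return l3_kb_per_slice < capped ? l3_kb_per_slice : capped;
}
```

For the GT4 entry this gives 1008 / 3 = 336, against the 3 * 384 = 1152 that the L3 allocation could otherwise provide.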


[Mesa-dev] [PATCH 0/2] Mesa and DRM patches for Intel Kabylake platform

2015-11-16 Thread Sarah Sharp

Sarah Sharp (1):
  mesa: Add KBL PCI IDs and platform information.

 include/pci_ids/i965_pci_ids.h  | 22 +++
 src/mesa/drivers/dri/i965/brw_device_info.c | 60 +
 2 files changed, 82 insertions(+)

Rodrigo Vivi (1):
  intel/kbl: Add Kabylake PCI ids

 intel/intel_chipset.h | 57 ++-
 1 file changed, 56 insertions(+), 1 deletion(-)

-- 
2.3.0


[Mesa-dev] [PATCH 2/2] intel/kbl: Add Kabylake PCI ids

2015-11-16 Thread Sarah Sharp

From: Rodrigo Vivi 

Also, following the kernel definition, Kabylake is treated as Skylake.

Signed-off-by: Rodrigo Vivi 
Signed-off-by: Sarah Sharp 
---

 intel/intel_chipset.h | 57 ++-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/intel/intel_chipset.h b/intel/intel_chipset.h
index 253ea71..4bbad5c 100644
--- a/intel/intel_chipset.h
+++ b/intel/intel_chipset.h
@@ -181,6 +181,29 @@
 #define PCI_CHIP_SKYLAKE_SRV_GT1   0x190A
 #define PCI_CHIP_SKYLAKE_WKS_GT2   0x191D
 
+#define PCI_CHIP_KABYLAKE_ULT_GT2  0x5916
+#define PCI_CHIP_KABYLAKE_ULT_GT1_5 0x5913
+#define PCI_CHIP_KABYLAKE_ULT_GT1  0x5906
+#define PCI_CHIP_KABYLAKE_ULT_GT3  0x5926
+#define PCI_CHIP_KABYLAKE_ULT_GT2F 0x5921
+#define PCI_CHIP_KABYLAKE_ULX_GT1_5 0x5915
+#define PCI_CHIP_KABYLAKE_ULX_GT1  0x590E
+#define PCI_CHIP_KABYLAKE_ULX_GT2  0x591E
+#define PCI_CHIP_KABYLAKE_DT_GT2   0x5912
+#define PCI_CHIP_KABYLAKE_DT_GT1_5 0x5917
+#define PCI_CHIP_KABYLAKE_DT_GT1   0x5902
+#define PCI_CHIP_KABYLAKE_DT_GT4   0x5932
+#define PCI_CHIP_KABYLAKE_HALO_GT2 0x591B
+#define PCI_CHIP_KABYLAKE_HALO_GT4 0x593B
+#define PCI_CHIP_KABYLAKE_HALO_GT3 0x592B
+#define PCI_CHIP_KABYLAKE_HALO_GT1 0x590B
+#define PCI_CHIP_KABYLAKE_SRV_GT2  0x591A
+#define PCI_CHIP_KABYLAKE_SRV_GT3  0x592A
+#define PCI_CHIP_KABYLAKE_SRV_GT1  0x590A
+#define PCI_CHIP_KABYLAKE_SRV_GT4  0x593A
+#define PCI_CHIP_KABYLAKE_WKS_GT2  0x591D
+#define PCI_CHIP_KABYLAKE_WKS_GT4  0x593D
+
 #define PCI_CHIP_BROXTON_0 0x0A84
 #define PCI_CHIP_BROXTON_1 0x1A84
 #define PCI_CHIP_BROXTON_2 0x5A84
@@ -362,6 +385,37 @@
 (devid) == PCI_CHIP_SKYLAKE_HALO_GT3   || \
 (devid) == PCI_CHIP_SKYLAKE_SRV_GT3)
 
+#define IS_KBL_GT1(devid)  ((devid) == PCI_CHIP_KABYLAKE_ULT_GT1_5 || \
+(devid) == PCI_CHIP_KABYLAKE_ULX_GT1_5 || \
+(devid) == PCI_CHIP_KABYLAKE_DT_GT1_5  || \
+(devid) == PCI_CHIP_KABYLAKE_ULT_GT1   || \
+(devid) == PCI_CHIP_KABYLAKE_ULX_GT1   || \
+(devid) == PCI_CHIP_KABYLAKE_DT_GT1    || \
+(devid) == PCI_CHIP_KABYLAKE_HALO_GT1  || \
+(devid) == PCI_CHIP_KABYLAKE_SRV_GT1)
+
+#define IS_KBL_GT2(devid)  ((devid) == PCI_CHIP_KABYLAKE_ULT_GT2   || \
+(devid) == PCI_CHIP_KABYLAKE_ULT_GT2F  || \
+(devid) == PCI_CHIP_KABYLAKE_ULX_GT2   || \
+(devid) == PCI_CHIP_KABYLAKE_DT_GT2    || \
+(devid) == PCI_CHIP_KABYLAKE_HALO_GT2  || \
+(devid) == PCI_CHIP_KABYLAKE_SRV_GT2   || \
+(devid) == PCI_CHIP_KABYLAKE_WKS_GT2)
+
+#define IS_KBL_GT3(devid)  ((devid) == PCI_CHIP_KABYLAKE_ULT_GT3   || \
+(devid) == PCI_CHIP_KABYLAKE_HALO_GT3  || \
+(devid) == PCI_CHIP_KABYLAKE_SRV_GT3)
+
+#define IS_KBL_GT4(devid)  ((devid) == PCI_CHIP_KABYLAKE_DT_GT4    || \
+(devid) == PCI_CHIP_KABYLAKE_HALO_GT4  || \
+(devid) == PCI_CHIP_KABYLAKE_SRV_GT4   || \
+(devid) == PCI_CHIP_KABYLAKE_WKS_GT4)
+
+#define IS_KABYLAKE(devid) (IS_KBL_GT1(devid) || \
+IS_KBL_GT2(devid) || \
+IS_KBL_GT3(devid) || \
+IS_KBL_GT4(devid))
+
 #define IS_SKYLAKE(devid)  (IS_SKL_GT1(devid) || \
 IS_SKL_GT2(devid) || \
 IS_SKL_GT3(devid))
@@ -371,7 +425,8 @@
 (devid) == PCI_CHIP_BROXTON_2)
 
 #define IS_GEN9(devid) (IS_SKYLAKE(devid) || \
-IS_BROXTON(devid))
+IS_BROXTON(devid) || \
+IS_KABYLAKE(devid))
 
 #define IS_9XX(dev)(IS_GEN3(dev) || \
 IS_GEN4(dev) || \
-- 
2.3.0


Re: [Mesa-dev] [PATCH 14/36] glsl ubo/ssbo: Use enum to track current buffer access type

2015-11-16 Thread Jordan Justen

On 2015-11-16 03:06:37, Iago Toral wrote:
> On Sat, 2015-11-14 at 13:43 -0800, Jordan Justen wrote:
> > Signed-off-by: Jordan Justen 
> > Cc: Samuel Iglesias Gonsalvez 
> > Cc: Iago Toral Quiroga 
> > ---
> >  src/glsl/lower_ubo_reference.cpp | 26 +-
> >  1 file changed, 21 insertions(+), 5 deletions(-)
> > 
> > diff --git a/src/glsl/lower_ubo_reference.cpp 
> > b/src/glsl/lower_ubo_reference.cpp
> > index b74aa3d..41012db 100644
> > --- a/src/glsl/lower_ubo_reference.cpp
> > +++ b/src/glsl/lower_ubo_reference.cpp
> > @@ -162,6 +162,14 @@ public:
> > ir_call *ssbo_store(ir_rvalue *deref, ir_rvalue *offset,
> > unsigned write_mask);
> >  
> > +   enum {
> > +  ubo_load_access,
> > +  ssbo_load_access,
> > +  ssbo_store_access,
> > +  ssbo_get_array_length,
> 
> ssbo_get_array_length misses that it is for "unsized" arrays and does not
> include the "access" prefix that the other enum values have, which makes
> it a bit inconsistent. How about we name this
> 'ssbo_unsized_array_length_access'? or maybe 'ssbo_unsized_array_access'
> if we think the former is too long.
> 
> > +  ssbo_atomic_access,
> > +   } buffer_access_type;
> > +
> > void emit_access(bool is_write, ir_dereference *deref,
> >  ir_variable *base_offset, unsigned int deref_offset,
> >  bool row_major, int matrix_columns,
> > @@ -189,7 +197,6 @@ public:
> > struct gl_uniform_buffer_variable *ubo_var;
> > ir_rvalue *uniform_block;
> > bool progress;
> > -   bool is_shader_storage;
> >  };
> >  
> >  /**
> > @@ -339,10 +346,9 @@ 
> > lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
> > deref, _block_index);
> >  
> > /* Locate the block by interface name */
> > -   this->is_shader_storage = var->is_in_shader_storage_block();
> > unsigned num_blocks;
> > struct gl_uniform_block **blocks;
> > -   if (this->is_shader_storage) {
> > +   if (buffer_access_type != ubo_load_access) {
> 
> I think this file generally uses 'this->' to refer to class members (or
> at least this function does), so maybe we should keep that for
> consistency. The same in the other places where you use
> buffer_access_type.

I don't really agree with this, but I went ahead and changed it.

> That said, right now it seems that we only ever use buffer_access_type
> here and you always assign its value right before calling
> setup_for_load_or_store() so maybe it is better to just make it a
> function parameter instead of a class member? setup_for_load_or_store()
> already has a large number of parameters, so I am not super happy about
> the idea, but it looks more natural to me. What do you think?

This function is going to move to
lower_buffer_access::setup_buffer_access. That class doesn't know
about the buffer_access_type enum, since the enum remains part of the
lower_ubo_reference_visitor class.

Therefore, I think we need to keep it as a member variable, so the
insert_buffer_access virtual function implementation in
lower_ubo_reference_visitor can make use of it.

-Jordan

> 
> >num_blocks = shader->NumShaderStorageBlocks;
> >blocks = shader->ShaderStorageBlocks;
> > } else {
> > @@ -552,6 +558,10 @@ lower_ubo_reference_visitor::handle_rvalue(ir_rvalue 
> > **rvalue)
> > int matrix_columns;
> > unsigned packing = var->get_interface_type()->interface_packing;
> >  
> > +   buffer_access_type =
> > +  var->is_in_shader_storage_block() ?
> > +  ssbo_load_access : ubo_load_access;
> > +
> > /* Compute the offset to the start if the dereference as well as other
> >  * information we need to configure the write
> >  */
> > @@ -795,7 +805,7 @@ lower_ubo_reference_visitor::emit_access(bool is_write,
> >if (is_write)
> >   base_ir->insert_after(ssbo_store(deref, offset, write_mask));
> >else {
> > - if (!this->is_shader_storage) {
> > + if (buffer_access_type == ubo_load_access) {
> >   base_ir->insert_before(assign(deref->clone(mem_ctx, NULL),
> > ubo_load(deref->type, offset)));
> >   } else {
> > @@ -862,7 +872,7 @@ lower_ubo_reference_visitor::emit_access(bool is_write,
> >  
> >  base_ir->insert_after(ssbo_store(swizzle(deref, i, 1), 
> > chan_offset, 1));
> >   } else {
> > -if (!this->is_shader_storage) {
> > +if (buffer_access_type == ubo_load_access) {
> > base_ir->insert_before(assign(deref->clone(mem_ctx, NULL),
> >   ubo_load(deref_type, 
> > chan_offset),
> >   (1U << i)));
> > @@ -891,6 +901,8 @@ 
> > lower_ubo_reference_visitor::write_to_memory(ir_dereference *deref,
> > int matrix_columns;
> > unsigned packing =
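
The core of the change, replacing the is_shader_storage boolean with an access-type enum, can be sketched on its own. The patch is C++; this is written as plain C for brevity, uses the enumerator name proposed in the review rather than the one in the patch, and both helper functions are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

enum buffer_access_type {
   ubo_load_access,
   ssbo_load_access,
   ssbo_store_access,
   ssbo_unsized_array_length_access, /* name suggested in the review */
   ssbo_atomic_access,
};

/* Equivalent of the old "is_shader_storage" test: everything except a
 * UBO load goes through the shader storage block list.
 */
static bool
uses_shader_storage_blocks(enum buffer_access_type type)
{
   return type != ubo_load_access;
}

/* How handle_rvalue() picks the load flavour in the patch. */
static enum buffer_access_type
load_access_for(bool in_shader_storage_block)
{
   return in_shader_storage_block ? ssbo_load_access : ubo_load_access;
}
```

The enum carries strictly more information than the boolean, so the old test stays derivable while stores, atomics, and array-length queries become distinguishable.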

Re: [Mesa-dev] [PATCH 1/2] mesa: Add KBL PCI IDs and platform information.

2015-11-16 Thread Matt Turner

On Mon, Nov 16, 2015 at 4:24 PM, Sarah Sharp
 wrote:
> Add PCI IDs for the Intel Kabylake platforms.  The IDs are taken
> directly from the Linux kernel patches, which are under review:
>
> http://lists.freedesktop.org/archives/intel-gfx/2015-October/078967.html
> http://cgit.freedesktop.org/~vivijim/drm-intel/log/?h=kbl-upstream-v2
>
> Please note that if this patch is backported, the following fixes will
> need to be added before this patch:
>
> commit 28ed1e08e8ba98e "i965/skl: Remove early platform support"
> commit c1e38ad37042b0e "i965/skl: Use larger URB size where available."
>
> Thanks to Ben for fixing a bug around setting urb.size, and being
> patient with my questions about what the various fields mean.
>
> Signed-off-by: Sarah Sharp 
> Suggested-by: Ben Widawsky 
> Tested-by: Rodrigo Vivi  (KBL-GT2)
> ---
>
>  include/pci_ids/i965_pci_ids.h  | 22 +++
>  src/mesa/drivers/dri/i965/brw_device_info.c | 60 
> +
>  2 files changed, 82 insertions(+)
>
> diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
> index 8a42599..ea3cc08 100644
> --- a/include/pci_ids/i965_pci_ids.h
> +++ b/include/pci_ids/i965_pci_ids.h
> @@ -124,6 +124,28 @@ CHIPSET(0x1921, skl_gt2, "Intel(R) Skylake ULT GT2F")
>  CHIPSET(0x1926, skl_gt3, "Intel(R) Skylake ULT GT3")
>  CHIPSET(0x192A, skl_gt3, "Intel(R) Skylake SRV GT3")
>  CHIPSET(0x192B, skl_gt3, "Intel(R) Skylake Halo GT3")
> +CHIPSET(0x5913, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
> +CHIPSET(0x5915, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
> +CHIPSET(0x5917, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
> +CHIPSET(0x5906, kbl_gt1, "Intel(R) Kabylake GT1")
> +CHIPSET(0x590E, kbl_gt1, "Intel(R) Kabylake GT1")
> +CHIPSET(0x5902, kbl_gt1, "Intel(R) Kabylake GT1")
> +CHIPSET(0x590B, kbl_gt1, "Intel(R) Kabylake GT1")
> +CHIPSET(0x590A, kbl_gt1, "Intel(R) Kabylake GT1")
> +CHIPSET(0x5916, kbl_gt2, "Intel(R) Kabylake GT2")
> +CHIPSET(0x5921, kbl_gt2, "Intel(R) Kabylake GT2F")
> +CHIPSET(0x591E, kbl_gt2, "Intel(R) Kabylake GT2")
> +CHIPSET(0x5912, kbl_gt2, "Intel(R) Kabylake GT2")
> +CHIPSET(0x591B, kbl_gt2, "Intel(R) Kabylake GT2")
> +CHIPSET(0x591A, kbl_gt2, "Intel(R) Kabylake GT2")
> +CHIPSET(0x591D, kbl_gt2, "Intel(R) Kabylake GT2")
> +CHIPSET(0x5926, kbl_gt3, "Intel(R) Kabylake GT3")
> +CHIPSET(0x592B, kbl_gt3, "Intel(R) Kabylake GT3")
> +CHIPSET(0x592A, kbl_gt3, "Intel(R) Kabylake GT3")
> +CHIPSET(0x5932, kbl_gt4, "Intel(R) Kabylake GT4")
> +CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")
> +CHIPSET(0x593A, kbl_gt4, "Intel(R) Kabylake GT4")
> +CHIPSET(0x593D, kbl_gt4, "Intel(R) Kabylake GT4")
>  CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherryview)")
>  CHIPSET(0x22B1, chv, "Intel(R) HD Graphics (Cherryview)")
>  CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")

This doesn't apply, because it hasn't been rebased onto commit dde33fc.

I find it odd that GT1.5 comes before GT1 and that there's a GT2F in
the middle of the GT2s. Can we move GT1.5 between 1 and 2? I don't
know where GT2F should go.

Re: [Mesa-dev] [PATCH V2 05/12] glsl: add layout qualifier validation for the shader outside the parser

2015-11-16 Thread Timothy Arceri

On Tue, 2015-11-10 at 12:29 +, Emil Velikov wrote:
> Hi Tim,
> 
> On 8 November 2015 at 22:34, Timothy Arceri 
> wrote:
> > From: Timothy Arceri 
> > 
> > > This is in preparation for compile-time constant support; a later
> > > patch will remove the validation from the shader.
> > > 
> > > The global shader layout qualifiers will now mostly be validated in
> > > glsl_parser_extras.cpp.
> > > 
> > > In order to do validation at the later stage in
> > > glsl_parser_extras.cpp we need to temporarily add a field in
> > > ast_type_qualifier to keep track of the parser location; this will be
> > > removed in a following patch when we introduce a new type for storing
> > > the compile-time qualifiers.
> > > 
> > > Also, as the set_shader_inout_layout() function in glsl parser extras
> > > is normally called after all validation is done, we need to move the
> > > code that sets CompileStatus and InfoLog; otherwise the newly added
> > > error messages would be ignored.
> > ---
> >  src/glsl/ast_to_hir.cpp | 14 --
> >  src/glsl/ast_type.cpp   |  2 ++
> >  src/glsl/glsl_parser_extras.cpp | 37
> > -
> >  3 files changed, 46 insertions(+), 7 deletions(-)
> > 
> > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> > index 0cea607..5643c86 100644
> > --- a/src/glsl/ast_to_hir.cpp
> > +++ b/src/glsl/ast_to_hir.cpp
> > @@ -3544,10 +3544,19 @@ static void
> >  handle_tess_ctrl_shader_output_decl(struct _mesa_glsl_parse_state
> > *state,
> >  YYLTYPE loc, ir_variable *var)
> >  {
> > -   unsigned num_vertices = 0;
> > +   int num_vertices = 0;
> > 
> > if (state->tcs_output_vertices_specified) {
> >num_vertices = state->out_qualifier->vertices;
> > +  if (num_vertices <= 0) {
> > > + _mesa_glsl_error(&loc, state, "invalid vertices (%d)
> > specified",
> > +  num_vertices);
> > + return;
> > +  } else if ((unsigned) num_vertices > state
> > ->Const.MaxPatchVertices) {
> > > + _mesa_glsl_error(&loc, state, "vertices (%d) exceeds "
> > +  "GL_MAX_PATCH_VERTICES", num_vertices);
> > + return;
> > +  }
> > }
> > 
> > if (!var->type->is_array() && !var->data.patch) {
> > @@ -3561,7 +3570,8 @@ handle_tess_ctrl_shader_output_decl(struct
> > _mesa_glsl_parse_state *state,
> > if (var->data.patch)
> >return;
> > 
> > -   validate_layout_qualifier_vertex_count(state, loc, var,
> > num_vertices,
> > +   validate_layout_qualifier_vertex_count(state, loc, var,
> > +  (unsigned) num_vertices,
> > >                                            &state->tcs_output_size,
> >"tessellation control
> > shader output");
> >  }
> > diff --git a/src/glsl/ast_type.cpp b/src/glsl/ast_type.cpp
> > index 08a4504..53d1023 100644
> > --- a/src/glsl/ast_type.cpp
> > +++ b/src/glsl/ast_type.cpp
> > @@ -310,6 +310,7 @@ ast_type_qualifier::merge_out_qualifier(YYLTYPE
> > *loc,
> >  {
> > void *mem_ctx = state;
> > const bool r = this->merge_qualifier(loc, state, q);
> > +   this->loc = loc;
> > 
> > if (state->stage == MESA_SHADER_TESS_CTRL) {
> >node = new(mem_ctx) ast_tcs_output_layout(*loc, q.vertices);
> > @@ -329,6 +330,7 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE
> > *loc,
> > bool create_cs_ast = false;
> > ast_type_qualifier valid_in_mask;
> > valid_in_mask.flags.i = 0;
> > +   this->loc = loc;
> > 
> > switch (state->stage) {
> > case MESA_SHADER_TESS_EVAL:
> > diff --git a/src/glsl/glsl_parser_extras.cpp
> > b/src/glsl/glsl_parser_extras.cpp
> > index 2dba7d9..7d7f45c 100644
> > --- a/src/glsl/glsl_parser_extras.cpp
> > +++ b/src/glsl/glsl_parser_extras.cpp
> > @@ -947,6 +947,14 @@ _mesa_ast_process_interface_block(YYLTYPE
> > *locp,
> > 
> > if (state->stage == MESA_SHADER_GEOMETRY &&
> > state->has_explicit_attrib_stream()) {
> > +
> > +  if (state->out_qualifier->flags.q.explicit_stream) {
> > + if (state->out_qualifier->stream < 0) {
> > +_mesa_glsl_error(locp, state, "invalid stream %d
> > specified",
> > + state->out_qualifier->stream);
> > + }
> > +  }
> > +
> >/* Assign global layout's stream value. */
> >block->layout.flags.q.stream = 1;
> >block->layout.flags.q.explicit_stream = 0;
> > @@ -1615,7 +1623,7 @@ void ast_subroutine_list::print(void) const
> > 
> >  static void
> >  set_shader_inout_layout(struct gl_shader *shader,
> > -struct _mesa_glsl_parse_state *state)
> > +struct _mesa_glsl_parse_state *state)
> >  {
> You seem to be mixing the "validate" and "copy validated values"
> functions into one. This invalidates the (already not very
> descriptive)
> function name, and requires you to move

Re: [Mesa-dev] [PATCH 1/5] util/set: don't compare against deleted entries

2015-11-16 Thread Timothy Arceri

On Sat, 2015-11-14 at 21:59 -0500, Connor Abbott wrote:
> Not sure how this wasn't already caught by valgrind, but it fixes an
> issue with the vectorizer.

Can you give a more detailed description of the problem that is fixed? I'm
assuming it's something to do with the key_equals_function having issues
comparing to the deleted_key value?

> 
> Signed-off-by: Connor Abbott 
> ---
>  src/util/set.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/util/set.c b/src/util/set.c
> index f01f869..331ff58 100644
> --- a/src/util/set.c
> +++ b/src/util/set.c
> @@ -282,7 +282,8 @@ set_add(struct set *ht, uint32_t hash, const void *key)
> * If freeing of old keys is required to avoid memory leaks,
> * perform a search before inserting.
> */
> -  if (entry->hash == hash &&
> +  if (entry_is_present(entry) &&

You can use !entry_is_deleted(entry) here, as free entries will have already
caused the loop to break.

With these two comments addressed this and patch 2 are:

Reviewed-by: Timothy Arceri 

> +  entry->hash == hash &&
>ht->key_equals_function(key, entry->key)) {
>   entry->key = key;
>   return entry;
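
To illustrate the issue discussed in this thread, here is a minimal sketch of an open-addressing set with tombstones. It is not Mesa's util/set code: the toy_* names, the fixed table size and the use of strcmp() as the key comparison are all assumptions made for the example. The point it shows is that a deleted slot keeps its stale hash, so the hash test alone is not enough to keep the tombstone away from the comparison function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SET_SIZE 8 /* hypothetical fixed size; no rehashing in this sketch */

/* The address of this object marks a slot whose key was removed. */
static const char toy_deleted_marker = 0;
#define TOY_DELETED (&toy_deleted_marker)

struct toy_entry {
   uint32_t hash;
   const char *key; /* NULL = free slot, TOY_DELETED = tombstone */
};

struct toy_set {
   struct toy_entry table[TOY_SET_SIZE];
};

static bool
toy_entry_is_free(const struct toy_entry *e)
{
   return e->key == NULL;
}

static bool
toy_entry_is_deleted(const struct toy_entry *e)
{
   return e->key == TOY_DELETED;
}

/* Marks the slot as a tombstone; note that e->hash is left untouched. */
static void
toy_set_remove(struct toy_entry *e)
{
   e->key = TOY_DELETED;
}

/* Linear probing insert. Returns the slot used, or NULL if the table is full. */
static struct toy_entry *
toy_set_add(struct toy_set *set, uint32_t hash, const char *key)
{
   struct toy_entry *available = NULL;

   for (unsigned i = 0; i < TOY_SET_SIZE; i++) {
      struct toy_entry *e = &set->table[(hash + i) % TOY_SET_SIZE];

      if (toy_entry_is_free(e)) {
         /* A free slot ends the probe sequence. */
         if (available == NULL)
            available = e;
         break;
      }

      if (toy_entry_is_deleted(e)) {
         /* Remember the tombstone for reuse, but never compare against it. */
         if (available == NULL)
            available = e;
         continue;
      }

      /* Only live entries reach the key comparison. */
      if (e->hash == hash && strcmp(key, e->key) == 0) {
         e->key = key;
         return e;
      }
   }

   if (available != NULL) {
      available->hash = hash;
      available->key = key;
   }
   return available;
}
```

Because free slots break out of the loop before the comparison, the only extra case the comparison has to exclude is the tombstone, which is why a "not deleted" test is sufficient at that point.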
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] i965: Convert scalar_* flags to a scalar_stage array.

2015-11-16 Thread Kenneth Graunke

On Monday, November 16, 2015 10:23:22 AM Pohjolainen, Topi wrote:
> On Fri, Nov 13, 2015 at 11:29:00AM -0800, Kenneth Graunke wrote:
> > On Friday, November 13, 2015 10:06:23 AM Pohjolainen, Topi wrote:
> > > On Thu, Nov 12, 2015 at 03:38:51PM -0800, Kenneth Graunke wrote:
> > > > I was going to add scalar_tcs and scalar_tes flags, and then thought
> > > > better of it and decided to convert this to an array.  Simpler.
> > > > 
> > > > Signed-off-by: Kenneth Graunke 
> > > > ---
> > > >  src/mesa/drivers/dri/i965/brw_compiler.h  |  3 +--
> > > >  src/mesa/drivers/dri/i965/brw_context.c   |  2 +-
> > > >  src/mesa/drivers/dri/i965/brw_gs.c|  3 ++-
> > > >  src/mesa/drivers/dri/i965/brw_link.cpp| 11 +---
> > > >  src/mesa/drivers/dri/i965/brw_program.c   |  3 ++-
> > > >  src/mesa/drivers/dri/i965/brw_shader.cpp  | 31 
> > > > ++-
> > > >  src/mesa/drivers/dri/i965/brw_shader.h|  2 --
> > > >  src/mesa/drivers/dri/i965/brw_vec4.cpp|  4 +--
> > > >  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  2 +-
> > > >  src/mesa/drivers/dri/i965/brw_vs.c|  7 ++---
> > > >  10 files changed, 28 insertions(+), 40 deletions(-)
> > > > 
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
> > > > b/src/mesa/drivers/dri/i965/brw_compiler.h
> > > > index e3a26d6..3f54616 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_compiler.h
> > > > +++ b/src/mesa/drivers/dri/i965/brw_compiler.h
> > > > @@ -89,8 +89,7 @@ struct brw_compiler {
> > > > void (*shader_debug_log)(void *, const char *str, ...) 
> > > > PRINTFLIKE(2, 3);
> > > > void (*shader_perf_log)(void *, const char *str, ...) PRINTFLIKE(2, 
> > > > 3);
> > > >  
> > > > -   bool scalar_vs;
> > > > -   bool scalar_gs;
> > > > +   bool scalar_stage[MESA_SHADER_STAGES];
> > > > struct gl_shader_compiler_options 
> > > > glsl_compiler_options[MESA_SHADER_STAGES];
> > > >  };
> > > >  
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> > > > b/src/mesa/drivers/dri/i965/brw_context.c
> > > > index ac6045d..2db99c7 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_context.c
> > > > +++ b/src/mesa/drivers/dri/i965/brw_context.c
> > > > @@ -525,7 +525,7 @@ brw_initialize_context_constants(struct brw_context 
> > > > *brw)
> > > >ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms =
> > > >   BRW_MAX_IMAGES;
> > > >ctx->Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms =
> > > > - (brw->intelScreen->compiler->scalar_vs ? BRW_MAX_IMAGES : 0);
> > > > + (brw->intelScreen->compiler->scalar_stage[MESA_SHADER_VERTEX] 
> > > > ? BRW_MAX_IMAGES : 0);
> > > >ctx->Const.Program[MESA_SHADER_COMPUTE].MaxImageUniforms =
> > > >   BRW_MAX_IMAGES;
> > > >ctx->Const.MaxImageUnits = MAX_IMAGE_UNITS;
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
> > > > b/src/mesa/drivers/dri/i965/brw_gs.c
> > > > index ed0890f..ad5b242 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_gs.c
> > > > +++ b/src/mesa/drivers/dri/i965/brw_gs.c
> > > > @@ -87,7 +87,8 @@ brw_codegen_gs_prog(struct brw_context *brw,
> > > > prog_data.base.base.nr_image_params = gs->NumImages;
> > > >  
> > > > > brw_nir_setup_glsl_uniforms(gp->program.Base.nir, prog, 
> > > > > &gp->program.Base,
> > > > > -   &prog_data.base.base, 
> > > > > compiler->scalar_gs);
> > > > > +   &prog_data.base.base,
> > > > > +   
> > > > > compiler->scalar_stage[MESA_SHADER_GEOMETRY]);
> > > >  
> > > > GLbitfield64 outputs_written = gp->program.Base.OutputsWritten;
> > > >  
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp 
> > > > b/src/mesa/drivers/dri/i965/brw_link.cpp
> > > > index 2991173..14421d4 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_link.cpp
> > > > +++ b/src/mesa/drivers/dri/i965/brw_link.cpp
> > > > @@ -66,12 +66,14 @@ brw_lower_packing_builtins(struct brw_context *brw,
> > > > gl_shader_stage shader_type,
> > > > exec_list *ir)
> > > >  {
> > > > +   const struct brw_compiler *compiler = brw->intelScreen->compiler;
> > > > +
> > > > int ops = LOWER_PACK_SNORM_2x16
> > > > | LOWER_UNPACK_SNORM_2x16
> > > > | LOWER_PACK_UNORM_2x16
> > > > | LOWER_UNPACK_UNORM_2x16;
> > > >  
> > > > -   if (is_scalar_shader_stage(brw->intelScreen->compiler, 
> > > > shader_type)) {
> > > > +   if (compiler->scalar_stage[shader_type]) {
> > > >ops |= LOWER_UNPACK_UNORM_4x8
> > > > | LOWER_UNPACK_SNORM_4x8
> > > > | LOWER_PACK_UNORM_4x8
> > > > @@ -84,7 +86,7 @@ brw_lower_packing_builtins(struct brw_context *brw,
> > > > * lowering is needed. For SOA code, the Half2x16 ops must be
> > > > * scalarized.
> > > > */
> > > > -  if

[Mesa-dev] [PATCH 2/2] i965: Add INTEL_DEBUG=shader_time support for tessellation shaders.

2015-11-16 Thread Kenneth Graunke

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.h |  2 ++
 src/mesa/drivers/dri/i965/brw_program.c | 12 
 2 files changed, 14 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 4b2db61..8d6bc19 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -523,6 +523,8 @@ struct brw_tracked_state {
 enum shader_time_shader_type {
ST_NONE,
ST_VS,
+   ST_TCS,
+   ST_TES,
ST_GS,
ST_FS8,
ST_FS16,
diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index 2297fa6..f137c87 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -344,6 +344,8 @@ brw_report_shader_time(struct brw_context *brw)
 
   switch (type) {
   case ST_VS:
+  case ST_TCS:
+  case ST_TES:
   case ST_GS:
   case ST_FS8:
   case ST_FS16:
@@ -370,6 +372,8 @@ brw_report_shader_time(struct brw_context *brw)
 
   switch (type) {
   case ST_VS:
+  case ST_TCS:
+  case ST_TES:
   case ST_GS:
   case ST_FS8:
   case ST_FS16:
@@ -407,6 +411,12 @@ brw_report_shader_time(struct brw_context *brw)
   case ST_VS:
  stage = "vs";
  break;
+  case ST_TCS:
+ stage = "tcs";
+ break;
+  case ST_TES:
+ stage = "tes";
+ break;
   case ST_GS:
  stage = "gs";
  break;
@@ -430,6 +440,8 @@ brw_report_shader_time(struct brw_context *brw)
 
fprintf(stderr, "\n");
print_shader_time_line("total", "vs", 0, total_by_type[ST_VS], total);
+   print_shader_time_line("total", "tcs", 0, total_by_type[ST_TCS], total);
+   print_shader_time_line("total", "tes", 0, total_by_type[ST_TES], total);
print_shader_time_line("total", "gs", 0, total_by_type[ST_GS], total);
print_shader_time_line("total", "fs8", 0, total_by_type[ST_FS8], total);
print_shader_time_line("total", "fs16", 0, total_by_type[ST_FS16], total);
-- 
2.6.2


[Mesa-dev] [PATCH 1/2] i965: Add INTEL_DEBUG=tcs, tes and hs, ds flags for tessellation shaders.

2015-11-16 Thread Kenneth Graunke

Even though both tessellation shader stages must be used together, I
still think it makes sense to add separate debug flags for each stage.
It makes it possible to read the TCS/HS, rule out problems, then read
the TES/DS separately, without sifting through as much printed text.

I decided to add both the GL names (tcs/tes) and hardware names (hs/ds)
so they can be used interchangeably.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/intel_debug.c | 8 ++--
 src/mesa/drivers/dri/i965/intel_debug.h | 2 ++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_debug.c 
b/src/mesa/drivers/dri/i965/intel_debug.c
index c00d2e7..f53c4ab 100644
--- a/src/mesa/drivers/dri/i965/intel_debug.c
+++ b/src/mesa/drivers/dri/i965/intel_debug.c
@@ -75,6 +75,10 @@ static const struct debug_control debug_control[] = {
{ "cs",  DEBUG_CS },
{ "hex", DEBUG_HEX },
{ "nocompact",   DEBUG_NO_COMPACTION },
+   { "hs",  DEBUG_TCS },
+   { "tcs", DEBUG_TCS },
+   { "ds",  DEBUG_TES },
+   { "tes", DEBUG_TES },
{ NULL,0 }
 };
 
@@ -83,8 +87,8 @@ intel_debug_flag_for_shader_stage(gl_shader_stage stage)
 {
uint64_t flags[] = {
   [MESA_SHADER_VERTEX] = DEBUG_VS,
-  [MESA_SHADER_TESS_CTRL] = 0,
-  [MESA_SHADER_TESS_EVAL] = 0,
+  [MESA_SHADER_TESS_CTRL] = DEBUG_TCS,
+  [MESA_SHADER_TESS_EVAL] = DEBUG_TES,
   [MESA_SHADER_GEOMETRY] = DEBUG_GS,
   [MESA_SHADER_FRAGMENT] = DEBUG_WM,
   [MESA_SHADER_COMPUTE] = DEBUG_CS,
diff --git a/src/mesa/drivers/dri/i965/intel_debug.h 
b/src/mesa/drivers/dri/i965/intel_debug.h
index 98bd7e9..9c6030a 100644
--- a/src/mesa/drivers/dri/i965/intel_debug.h
+++ b/src/mesa/drivers/dri/i965/intel_debug.h
@@ -69,6 +69,8 @@ extern uint64_t INTEL_DEBUG;
 #define DEBUG_CS  (1ull << 33)
 #define DEBUG_HEX (1ull << 34)
 #define DEBUG_NO_COMPACTION   (1ull << 35)
+#define DEBUG_TCS (1ull << 36)
+#define DEBUG_TES (1ull << 37)
 
 #ifdef HAVE_ANDROID_PLATFORM
 #define LOG_TAG "INTEL-MESA"
-- 
2.6.2
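
As a rough illustration of how the GL names and the hardware names can alias the same bit, here is a small sketch of a comma-separated flag parser driven by a name table like the one in the patch. The parser itself is hypothetical (the patch does not show how Mesa parses INTEL_DEBUG), and the toy_* names are invented for the example; only the table layout and the bit positions mirror the diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit positions copied from the patch; the TOY_ prefix marks them as local. */
#define TOY_DEBUG_TCS (1ull << 36)
#define TOY_DEBUG_TES (1ull << 37)

struct toy_debug_control {
   const char *name;
   uint64_t flag;
};

/* Two names per stage: the hardware name and the GL name share one flag. */
static const struct toy_debug_control toy_debug_control[] = {
   { "hs",  TOY_DEBUG_TCS },
   { "tcs", TOY_DEBUG_TCS },
   { "ds",  TOY_DEBUG_TES },
   { "tes", TOY_DEBUG_TES },
   { NULL,  0 }
};

/* Hypothetical parser: ORs together the flags of every recognised token. */
static uint64_t
toy_parse_debug_string(const char *str)
{
   uint64_t flags = 0;

   while (*str != '\0') {
      size_t len = strcspn(str, ","); /* length of the current token */

      for (const struct toy_debug_control *c = toy_debug_control;
           c->name != NULL; c++) {
         if (strlen(c->name) == len && strncmp(str, c->name, len) == 0)
            flags |= c->flag;
      }

      str += len;
      if (*str == ',')
         str++; /* skip the separator */
   }

   return flags;
}
```

With this table, "hs" and "tcs" yield the same mask, and unknown tokens are silently ignored.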


Re: [Mesa-dev] [PATCH 29/36] glsl: Allow atomic functions to be used with shared variables

2015-11-16 Thread Timothy Arceri

Reviewed-by: Timothy Arceri 

Re: [Mesa-dev] [PATCH v5 5/7] glsl: Add precision information to ir_variable

2015-11-16 Thread Tapani Pälli




On 11/16/2015 01:29 PM, Samuel Iglesias Gonsálvez wrote:

Hello Ilia, Tapani:

I have reproduced the issue with a piglit test but not with the trace
uploaded in the bug report :-(

The piglit test was: bin/arb_shader_storage_buffer_object-maxblocks

I have uploaded a branch with some fixes to Igalia's mesa repo:

Git repo: https://github.com/Igalia/mesa.git
Branch: wip/siglesias/precision-fixes

But as this error might come from other initializations that I might
overlook:
* Ilia: Could you test if this issue is still happening to you? As I
cannot reproduce it locally, I might be forgetting something.
* Tapani: Could you do a quick run on CTS to check I have not broken
anything?


Sure thing, I'll run testing. FWIW one of the patches was identical to 
my fix sent for fixing tessellation shader problems:


http://lists.freedesktop.org/archives/mesa-dev/2015-November/100396.html


Thanks!

Sam



Re: [Mesa-dev] [PATCH 1/2] glsl: initialize precision when adding per vertex record fields

2015-11-16 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez 

On 16/11/15 07:44, Tapani Pälli wrote:
> Fixes issues with tessellation builtin variables since precision was
> introduced to IR with commit f84bc57d7dc02fceb805803131426c791eadeff9.
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/glsl/builtin_variables.cpp | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
> index b06c1bc..b927d50 100644
> --- a/src/glsl/builtin_variables.cpp
> +++ b/src/glsl/builtin_variables.cpp
> @@ -327,6 +327,7 @@ per_vertex_accumulator::add_field(int slot, const 
> glsl_type *type,
> this->fields[this->num_fields].centroid = 0;
> this->fields[this->num_fields].sample = 0;
> this->fields[this->num_fields].patch = 0;
> +   this->fields[this->num_fields].precision = GLSL_PRECISION_NONE;
> this->num_fields++;
>  }
>  
> 

Re: [Mesa-dev] [PATCH] nvc0: fix wrong value for NVC8_COMPUTE_CLASS

2015-11-16 Thread Samuel Pitoiset




On 11/16/2015 11:55 AM, Emil Velikov wrote:

On 9 October 2015 at 14:10, Samuel Pitoiset  wrote:

Compute class value for GF110+ is 0x91c0 and not 0x92c0. This fixes
compute support and MP performance counters on GF110.

Signed-off-by: Samuel Pitoiset 
---
  src/gallium/drivers/nouveau/nv_object.xml.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nv_object.xml.h 
b/src/gallium/drivers/nouveau/nv_object.xml.h
index 0a0e187..92c0633 100644
--- a/src/gallium/drivers/nouveau/nv_object.xml.h
+++ b/src/gallium/drivers/nouveau/nv_object.xml.h
@@ -197,7 +197,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
SOFTWARE.
  #define NV50_COMPUTE_CLASS 0x50c0
  #define NVA3_COMPUTE_CLASS 0x85c0
  #define NVC0_COMPUTE_CLASS 0x90c0
-#define NVC8_COMPUTE_CLASS 0x92c0
+#define NVC8_COMPUTE_CLASS 0x91c0

Worth updating the classic one (src/mesa/drivers/dri/nouveau/) as well ?


I don't think it's useful because the classic one is for < NV50.
But one day, we should re-generate those headers.



There is a nasty looking comment in nvc0_screen_compute_setup about
the above class. AFAICS, although updated, the define isn't used
anywhere, so I'm dubious how it fixes compute on GF110.


This actually only fixes the value of the NVC8_COMPUTE_CLASS. I did this 
patch before writing that comment in nvc0_screen_compute_setup().


In practice, GF100/GF110 should support 0x91c0, but for some weird 
reason an ILLEGAL_CLASS error appears in dmesg when using it.


Anyway, 0x91c0 only introduces some minor changes relative to 0x90c0. 
That's why we use 0x90c0 for all NV50 chipsets.





Cheers,
Emil



--
-Samuel

[Mesa-dev] [PATCH] i965: Add more MAX_*_URB_ENTRY_SIZE_BYTES #defines.

2015-11-16 Thread Kenneth Graunke

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_defines.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index 0b8de63..ade3ede 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -1938,8 +1938,14 @@ enum brw_message_target {
 
 /* Gen7 "GS URB Entry Allocation Size" is a U9-1 field, so the maximum gs_size
  * is 2^9, or 512.  It's counted in multiples of 64 bytes.
+ *
+ * Identical for VS, DS, and HS.
  */
 #define GEN7_MAX_GS_URB_ENTRY_SIZE_BYTES(512*64)
+#define GEN7_MAX_DS_URB_ENTRY_SIZE_BYTES(512*64)
+#define GEN7_MAX_HS_URB_ENTRY_SIZE_BYTES(512*64)
+#define GEN7_MAX_VS_URB_ENTRY_SIZE_BYTES(512*64)
+
 /* Gen6 "GS URB Entry Allocation Size" is defined as a number of 1024-bit
  * (128 bytes) URB rows and the maximum allowed value is 5 rows.
  */
-- 
2.6.2
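
The arithmetic behind the 512*64 value can be spelled out as follows. This sketch assumes that "U9-1" means a 9-bit unsigned field that stores the value minus one, which is how the comment in the patch reads; the toy_* helper names are invented for the example.

```c
#include <assert.h>

#define TOY_URB_UNIT_BYTES 64u /* entry sizes are counted in 64-byte units */

/* Largest size (in units) representable by a 9-bit "value minus one" field. */
static unsigned
toy_u9_minus1_max_units(void)
{
   unsigned max_raw = (1u << 9) - 1; /* 511 is the largest raw field value */
   return max_raw + 1;               /* the field stores size - 1 */
}

/* Raw field value for a given size in units (assumed valid range 1..512). */
static unsigned
toy_u9_minus1_encode(unsigned units)
{
   assert(units >= 1 && units <= toy_u9_minus1_max_units());
   return units - 1;
}

/* Entry size in bytes for a given size in units. */
static unsigned
toy_urb_entry_bytes(unsigned units)
{
   return units * TOY_URB_UNIT_BYTES;
}
```

Under that assumption the maximum is 512 units, encoded as 511, which corresponds to 512 * 64 = 32768 bytes.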


Re: [Mesa-dev] [PATCH 16/36] glsl ubo/ssbo: Add lower_buffer_access class

2015-11-16 Thread Jordan Justen

On 2015-11-16 04:27:55, Iago Toral wrote:
> On Sat, 2015-11-14 at 13:43 -0800, Jordan Justen wrote:
> > This class has code that will be shared by lower_ubo_reference and
> > lower_shared_reference. (lower_shared_reference will be used to
> > support compute shader shared variables.)
> > 
> > Signed-off-by: Jordan Justen 
> > Cc: Samuel Iglesias Gonsalvez 
> > Cc: Iago Toral Quiroga 
> > ---
> >  src/glsl/Makefile.sources|   1 +
> >  src/glsl/lower_buffer_access.cpp | 307 
> > +++
> >  src/glsl/lower_buffer_access.h   |  56 +++
> >  src/glsl/lower_ubo_reference.cpp | 180 +--
> >  4 files changed, 367 insertions(+), 177 deletions(-)
> >  create mode 100644 src/glsl/lower_buffer_access.cpp
> >  create mode 100644 src/glsl/lower_buffer_access.h
> > 
> > diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> > index d4b02c1..f2c95c0 100644
> > --- a/src/glsl/Makefile.sources
> > +++ b/src/glsl/Makefile.sources
> > @@ -155,6 +155,7 @@ LIBGLSL_FILES = \
> >   loop_analysis.h \
> >   loop_controls.cpp \
> >   loop_unroll.cpp \
> > + lower_buffer_access.cpp \
> >   lower_clip_distance.cpp \
> >   lower_const_arrays_to_uniforms.cpp \
> >   lower_discard.cpp \
> > diff --git a/src/glsl/lower_buffer_access.cpp 
> > b/src/glsl/lower_buffer_access.cpp
> > new file mode 100644
> > index 000..e0b5a2f
> > --- /dev/null
> > +++ b/src/glsl/lower_buffer_access.cpp
> > @@ -0,0 +1,307 @@
> > +/*
> > + * Copyright (c) 2015 Intel Corporation
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the 
> > "Software"),
> > + * to deal in the Software without restriction, including without 
> > limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice (including the 
> > next
> > + * paragraph) shall be included in all copies or substantial portions of 
> > the
> > + * Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
> > OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
> > OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + */
> > +
> > +/**
> > + * \file lower_buffer_access.cpp
> > + *
> > + * Helper for IR lowering pass to replace dereferences of buffer object 
> > based
> > + * shader variables with intrinsic function calls.
> > + *
> > + * This helper is used by lowering passes for UBOs, SSBOs and compute 
> > shader
> > + * shared variables.
> > + */
> > +
> > +#include "ir.h"
> > +#include "ir_builder.h"
> > +#include "ir_rvalue_visitor.h"
> > +#include "main/macros.h"
> > +#include "util/list.h"
> > +#include "glsl_parser_extras.h"
> > +#include "lower_buffer_access.h"
> > +
> > +using namespace ir_builder;
> > +
> > +namespace lower_buffer_access {
> > +
> > +static inline int
> > +writemask_for_size(unsigned n)
> > +{
> > +   return ((1 << n) - 1);
> > +}
> > +
> > +/**
> > + * Takes LHS and emits a series of assignments into its components
> > + * from the shared variable storage.
> 
> I find this part of the comment a bit confusing. This function breaks a
> dereference access into one or multiple accesses to the underlying
> buffer storage. Such dereference could be in a RHS expression, and in
> fact, that will always be the case for UBO and SSBO loads.

Hmm. I may have copied this comment from lower_ubo_reference some time
back. Anyway, I intended to use the current comment from
lower_ubo_reference:

/**
 * Takes a deref and recursively calls itself to break the deref down to the
 * point that the reads or writes generated are contiguous scalars or vectors.
 */
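
As a rough illustration of that description, the sketch below recursively breaks an aggregate down until only contiguous vectors remain and records one access per vector. It is not the code from the patch: the toy_type descriptor, the explicit strides and the output array are assumptions made for the example, and only the write-mask helper follows writemask_for_size() from the quoted diff.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { TOY_VECTOR, TOY_MATRIX, TOY_ARRAY };

/* Hypothetical type descriptor; real GLSL IR types are far richer. */
struct toy_type {
   enum toy_kind kind;
   unsigned components;            /* TOY_VECTOR: 1..4 components */
   unsigned count;                 /* TOY_MATRIX: columns, TOY_ARRAY: elements */
   unsigned stride;                /* bytes between columns or elements */
   const struct toy_type *element; /* column type or array element type */
};

/* One contiguous read or write that would be emitted. */
struct toy_access {
   unsigned offset;
   unsigned write_mask;
};

/* Same formula as writemask_for_size() in the patch: n low bits set. */
static unsigned
toy_writemask_for_size(unsigned n)
{
   return (1u << n) - 1;
}

/*
 * Appends the accesses for 'type' at byte 'offset' to out[], starting at
 * index n, and returns the new count. The caller must size out[] for the
 * total number of vectors in the type.
 */
static unsigned
toy_emit_access(const struct toy_type *type, unsigned offset,
                struct toy_access *out, unsigned n)
{
   if (type->kind == TOY_VECTOR) {
      /* Base case: a scalar or vector is already contiguous. */
      out[n].offset = offset;
      out[n].write_mask = toy_writemask_for_size(type->components);
      return n + 1;
   }

   /* Matrices and arrays recurse into each column or element. */
   for (unsigned i = 0; i < type->count; i++)
      n = toy_emit_access(type->element, offset + i * type->stride, out, n);

   return n;
}
```

For an array of two three-column matrices of vec4 with a 16-byte column stride, this yields six accesses at offsets 0, 16, 32, 48, 64 and 80, each with write mask 0xf.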

> > + * Recursively calls itself to break the deref down to the point that
> > + * the intrinsic calls are generated.
> > + */
> > +void
> > +lower_buffer_access::emit_access(bool is_write,
> > + ir_dereference *deref,
> > + ir_variable *base_offset,
> > + unsigned int deref_offset,
> > + bool row_major,
> > + int matrix_columns,
> > + unsigned int packing,
> > + unsigned int write_mask)
> > +{
> 
> Why not pass mem_ctx as

[Mesa-dev] [PATCH] i965: Add assertion for src_stencil payload size

2015-11-16 Thread Ben Widawsky

This helps address a coverity warning and prevents future questions about this
code.

Reported-by: Coverity (via Ilia)
Cc: Matt Turner 
Cc: Ilia Mirkin 
Signed-off-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 84b5920..995ab22 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3603,6 +3603,12 @@ lower_fb_write_logical_send(const fs_builder &bld, 
fs_inst *inst,
   assert(devinfo->gen >= 9);
   assert(bld.dispatch_width() != 16);
 
+  /* XXX: src_stencil is only available on gen9+. dst_depth is never
+   * available on gen9+. As such it's impossible to have both enabled at 
the
+   * same time and therefore length cannot overrun the array.
+   */
+  assert(length < 15);
+
   sources[length] = bld.vgrf(BRW_REGISTER_TYPE_UD);
   bld.exec_all().annotate("FB write OS")
  .emit(FS_OPCODE_PACK_STENCIL_REF, sources[length],
-- 
2.6.2


Re: [Mesa-dev] [PATCH] i965: Add assertion for src_stencil payload size

2015-11-16 Thread Matt Turner

Reviewed-by: Matt Turner 

Re: [Mesa-dev] [PATCH 1/2] nir: Add support for gl_HelperInvocation system value.

2015-11-16 Thread Tapani Pälli


Reviewed-by: Tapani Pälli 

On 11/14/2015 04:05 AM, Matt Turner wrote:

---
  src/glsl/nir/nir.c| 4 
  src/glsl/nir/nir_intrinsics.h | 1 +
  2 files changed, 5 insertions(+)

diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index bb7a5fa..974438b 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -1565,6 +1565,8 @@ nir_intrinsic_from_system_value(gl_system_value val)
return nir_intrinsic_load_tess_level_inner;
 case SYSTEM_VALUE_VERTICES_IN:
return nir_intrinsic_load_patch_vertices_in;
+   case SYSTEM_VALUE_HELPER_INVOCATION:
+  return nir_intrinsic_load_helper_invocation;
 default:
unreachable("system value does not directly correspond to intrinsic");
 }
@@ -1608,6 +1610,8 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin)
return SYSTEM_VALUE_TESS_LEVEL_INNER;
 case nir_intrinsic_load_patch_vertices_in:
return SYSTEM_VALUE_VERTICES_IN;
+   case nir_intrinsic_load_helper_invocation:
+  return SYSTEM_VALUE_HELPER_INVOCATION;
 default:
unreachable("intrinsic doesn't produce a system value");
 }
diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
index 36fb286..ca35b33 100644
--- a/src/glsl/nir/nir_intrinsics.h
+++ b/src/glsl/nir/nir_intrinsics.h
@@ -225,6 +225,7 @@ SYSTEM_VALUE(local_invocation_id, 3, 0)
  SYSTEM_VALUE(work_group_id, 3, 0)
  SYSTEM_VALUE(user_clip_plane, 4, 1) /* const_index[0] is user_clip_plane[idx] 
*/
  SYSTEM_VALUE(num_work_groups, 3, 0)
+SYSTEM_VALUE(helper_invocation, 1, 0)

  /*
   * The format of the indices depends on the type of the load.  For uniforms,



Re: [Mesa-dev] [PATCH 1/2] i965: Convert scalar_* flags to a scalar_stage array.

2015-11-16 Thread Pohjolainen, Topi

On Fri, Nov 13, 2015 at 11:29:00AM -0800, Kenneth Graunke wrote:
> On Friday, November 13, 2015 10:06:23 AM Pohjolainen, Topi wrote:
> > On Thu, Nov 12, 2015 at 03:38:51PM -0800, Kenneth Graunke wrote:
> > > I was going to add scalar_tcs and scalar_tes flags, and then thought
> > > better of it and decided to convert this to an array.  Simpler.
> > > 
> > > Signed-off-by: Kenneth Graunke 
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_compiler.h  |  3 +--
> > >  src/mesa/drivers/dri/i965/brw_context.c   |  2 +-
> > >  src/mesa/drivers/dri/i965/brw_gs.c|  3 ++-
> > >  src/mesa/drivers/dri/i965/brw_link.cpp| 11 +---
> > >  src/mesa/drivers/dri/i965/brw_program.c   |  3 ++-
> > >  src/mesa/drivers/dri/i965/brw_shader.cpp  | 31 
> > > ++-
> > >  src/mesa/drivers/dri/i965/brw_shader.h|  2 --
> > >  src/mesa/drivers/dri/i965/brw_vec4.cpp|  4 +--
> > >  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  2 +-
> > >  src/mesa/drivers/dri/i965/brw_vs.c|  7 ++---
> > >  10 files changed, 28 insertions(+), 40 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
> > > b/src/mesa/drivers/dri/i965/brw_compiler.h
> > > index e3a26d6..3f54616 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_compiler.h
> > > +++ b/src/mesa/drivers/dri/i965/brw_compiler.h
> > > @@ -89,8 +89,7 @@ struct brw_compiler {
> > > void (*shader_debug_log)(void *, const char *str, ...) PRINTFLIKE(2, 
> > > 3);
> > > void (*shader_perf_log)(void *, const char *str, ...) PRINTFLIKE(2, 
> > > 3);
> > >  
> > > -   bool scalar_vs;
> > > -   bool scalar_gs;
> > > +   bool scalar_stage[MESA_SHADER_STAGES];
> > > struct gl_shader_compiler_options 
> > > glsl_compiler_options[MESA_SHADER_STAGES];
> > >  };
> > >  
> > > diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> > > b/src/mesa/drivers/dri/i965/brw_context.c
> > > index ac6045d..2db99c7 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_context.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_context.c
> > > @@ -525,7 +525,7 @@ brw_initialize_context_constants(struct brw_context 
> > > *brw)
> > >ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms =
> > >   BRW_MAX_IMAGES;
> > >ctx->Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms =
> > > - (brw->intelScreen->compiler->scalar_vs ? BRW_MAX_IMAGES : 0);
> > > + (brw->intelScreen->compiler->scalar_stage[MESA_SHADER_VERTEX] ? 
> > > BRW_MAX_IMAGES : 0);
> > >ctx->Const.Program[MESA_SHADER_COMPUTE].MaxImageUniforms =
> > >   BRW_MAX_IMAGES;
> > >ctx->Const.MaxImageUnits = MAX_IMAGE_UNITS;
> > > diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
> > > b/src/mesa/drivers/dri/i965/brw_gs.c
> > > index ed0890f..ad5b242 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_gs.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_gs.c
> > > @@ -87,7 +87,8 @@ brw_codegen_gs_prog(struct brw_context *brw,
> > > prog_data.base.base.nr_image_params = gs->NumImages;
> > >  
> > > > brw_nir_setup_glsl_uniforms(gp->program.Base.nir, prog, 
> > > > &gp->program.Base,
> > > > -   &prog_data.base.base, 
> > > > compiler->scalar_gs);
> > > > +   &prog_data.base.base,
> > > > +   
> > > > compiler->scalar_stage[MESA_SHADER_GEOMETRY]);
> > >  
> > > GLbitfield64 outputs_written = gp->program.Base.OutputsWritten;
> > >  
> > > diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp 
> > > b/src/mesa/drivers/dri/i965/brw_link.cpp
> > > index 2991173..14421d4 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_link.cpp
> > > +++ b/src/mesa/drivers/dri/i965/brw_link.cpp
> > > @@ -66,12 +66,14 @@ brw_lower_packing_builtins(struct brw_context *brw,
> > > gl_shader_stage shader_type,
> > > exec_list *ir)
> > >  {
> > > +   const struct brw_compiler *compiler = brw->intelScreen->compiler;
> > > +
> > > int ops = LOWER_PACK_SNORM_2x16
> > > | LOWER_UNPACK_SNORM_2x16
> > > | LOWER_PACK_UNORM_2x16
> > > | LOWER_UNPACK_UNORM_2x16;
> > >  
> > > -   if (is_scalar_shader_stage(brw->intelScreen->compiler, shader_type)) {
> > > +   if (compiler->scalar_stage[shader_type]) {
> > >ops |= LOWER_UNPACK_UNORM_4x8
> > > | LOWER_UNPACK_SNORM_4x8
> > > | LOWER_PACK_UNORM_4x8
> > > @@ -84,7 +86,7 @@ brw_lower_packing_builtins(struct brw_context *brw,
> > > * lowering is needed. For SOA code, the Half2x16 ops must be
> > > * scalarized.
> > > */
> > > -  if (is_scalar_shader_stage(brw->intelScreen->compiler, 
> > > shader_type)) {
> > > +  if (compiler->scalar_stage[shader_type]) {
> > >   ops |= LOWER_PACK_HALF_2x16_TO_SPLIT
> > >   |  LOWER_UNPACK_HALF_2x16_TO_SPLIT;
> > >}
> > > @@ -103,6 +105,7 @@

Re: [Mesa-dev] [PATCH 07/11] i965: Move postprocess_nir to codegen time

2015-11-16 Thread Iago Toral

On Fri, 2015-11-13 at 07:34 -0800, Jason Ekstrand wrote:
> 
> On Nov 13, 2015 5:53 AM, "Iago Toral"  wrote:
> >
> > On Wed, 2015-11-11 at 17:26 -0800, Jason Ekstrand wrote:
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_fs.cpp  | 11
> +--
> > >  src/mesa/drivers/dri/i965/brw_nir.c   |  1 -
> > >  src/mesa/drivers/dri/i965/brw_vec4.cpp|  5 -
> > >  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  6 +-
> > >  4 files changed, 18 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > index ad94fa4..b8713ab 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > @@ -43,6 +43,7 @@
> > >  #include "brw_wm.h"
> > >  #include "brw_fs.h"
> > >  #include "brw_cs.h"
> > > +#include "brw_nir.h"
> > >  #include "brw_vec4_gs_visitor.h"
> > >  #include "brw_cfg.h"
> > >  #include "brw_dead_control_flow.h"
> > > @@ -5459,13 +5460,16 @@ brw_compile_fs(const struct brw_compiler
> *compiler, void *log_data,
> > > void *mem_ctx,
> > > const struct brw_wm_prog_key *key,
> > > struct brw_wm_prog_data *prog_data,
> > > -   const nir_shader *shader,
> > > +   const nir_shader *src_shader,
> > > struct gl_program *prog,
> > > int shader_time_index8, int shader_time_index16,
> > > bool use_rep_send,
> > > unsigned *final_assembly_size,
> > > char **error_str)
> > >  {
> > > +   nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
> > > +   brw_postprocess_nir(shader, compiler->devinfo, true);
> > > +
> >
> > Maybe it is a silly question, but why do we need to clone the shader
> to
> > do this?
> 
> Because brw_compile_foo may be called multiple times on the same
> shader source. Since brw_postprocess_nir alters the shader source, we
> need to make a copy.

Ok, trying to see if I get the big picture of the series:

So the situation before this change is that we were running
brw_postprocess_nir in brw_create_nir (so at link-time) before we ran
brw_compile_foo (at codegen/drawing time), and thus, we never had this
problem. We still had to fix codegen for texture rectangle when drawing
though, which we were doing with rescale_texcoord().

With this change, we handle texture rectangle in brw_postprocess_nir()
so we don't need rescale_texcoord() any more, however, this needs to be
done at codegen time anyway, so as a consequence, now we have to move
all of brw_postprocess_nir there and we have to clone the NIR shader.

Did I get it right?

If I did, then I wonder about the performance impact of this change,
since codegen happens when we draw and there are plenty of things
happening in brw_postprocess_nir (plus the cloning). Is it worth it?
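
The clone-before-mutate pattern being discussed can be sketched as follows. This is only an illustration under stated assumptions: toy_shader is a made-up stand-in for nir_shader, toy_postprocess() stands in for a destructive pass, and none of these names exist in Mesa. It shows why compiling the same source twice stays safe when each compile works on its own copy, at the cost of one copy per compile.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a shader: just a list of integer "instructions". */
struct toy_shader {
   unsigned num_instrs; /* assumed to be greater than zero in this sketch */
   int *instrs;
};

/* Deep copy, so later passes cannot touch the caller's shader. */
static struct toy_shader *
toy_shader_clone(const struct toy_shader *src)
{
   struct toy_shader *copy = malloc(sizeof(*copy));
   if (copy == NULL)
      return NULL;

   copy->num_instrs = src->num_instrs;
   copy->instrs = malloc(src->num_instrs * sizeof(*copy->instrs));
   if (copy->instrs == NULL) {
      free(copy);
      return NULL;
   }

   memcpy(copy->instrs, src->instrs, src->num_instrs * sizeof(*copy->instrs));
   return copy;
}

static void
toy_shader_free(struct toy_shader *shader)
{
   free(shader->instrs);
   free(shader);
}

/* Stand-in for a destructive pass: rewrites every instruction in place. */
static void
toy_postprocess(struct toy_shader *shader)
{
   for (unsigned i = 0; i < shader->num_instrs; i++)
      shader->instrs[i] *= 2;
}

/* Returns a checksum of the processed shader, or -1 on allocation failure. */
static int
toy_compile(const struct toy_shader *src_shader)
{
   struct toy_shader *shader = toy_shader_clone(src_shader);
   if (shader == NULL)
      return -1;

   toy_postprocess(shader); /* mutates only the private copy */

   int sum = 0;
   for (unsigned i = 0; i < shader->num_instrs; i++)
      sum += shader->instrs[i];

   toy_shader_free(shader);
   return sum;
}
```

Calling toy_compile() twice on the same source gives the same result both times, whereas running toy_postprocess() directly on the source would double the values again on the second call.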

> > > /* key->alpha_test_func means simulating alpha testing via
> discards,
> > >  * so the shader definitely kills pixels.
> > >  */
> > > @@ -5618,11 +5622,14 @@ brw_compile_cs(const struct brw_compiler
> *compiler, void *log_data,
> > > void *mem_ctx,
> > > const struct brw_cs_prog_key *key,
> > > struct brw_cs_prog_data *prog_data,
> > > -   const nir_shader *shader,
> > > +   const nir_shader *src_shader,
> > > int shader_time_index,
> > > unsigned *final_assembly_size,
> > > char **error_str)
> > >  {
> > > +   nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
> > > +   brw_postprocess_nir(shader, compiler->devinfo, true);
> > > +
> > > prog_data->local_size[0] = shader->info.cs.local_size[0];
> > > prog_data->local_size[1] = shader->info.cs.local_size[1];
> > > prog_data->local_size[2] = shader->info.cs.local_size[2];
> > > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c
> b/src/mesa/drivers/dri/i965/brw_nir.c
> > > index 21c2648..693b9cd 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_nir.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> > > @@ -391,7 +391,6 @@ brw_create_nir(struct brw_context *brw,
> > >
> > > brw_preprocess_nir(nir, is_scalar);
> > > brw_lower_nir(nir, devinfo, shader_prog, is_scalar);
> > > -   brw_postprocess_nir(nir, devinfo, is_scalar);
> > >
> > > return nir;
> > >  }
> > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > > index 8350a02..9f75bb6 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > > @@ -2028,13 +2028,16 @@ brw_compile_vs(const struct brw_compiler
> *compiler, void *log_data,
> > > void *mem_ctx,
> > > const struct brw_vs_prog_key *key,
> > > struct brw_vs_prog_data *prog_data,
> > > -   const nir_shader *shader,
> > > +   const nir_shader

Re: [Mesa-dev] [PATCH] radeonsi: enable optimal raster config setting for fiji (v2)

2015-11-16 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek

On Fri, Nov 13, 2015 at 10:53 PM, Alex Deucher  wrote:
> Requires proper kernel tiling configuration so check the tiling
> config registers.
>
> v2: send the right version of the patch
>
> Signed-off-by: Alex Deucher 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/gallium/drivers/radeonsi/si_state.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_state.c 
> b/src/gallium/drivers/radeonsi/si_state.c
> index 1b5ea35..f8168d3 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -3286,6 +3286,7 @@ si_write_harvested_raster_configs(struct si_context 
> *sctx,
>
>  static void si_init_config(struct si_context *sctx)
>  {
> +   struct si_screen *sscreen = sctx->screen;
> unsigned num_rb = MIN2(sctx->screen->b.info.r600_num_backends, 16);
> unsigned rb_mask = sctx->screen->b.info.si_backend_enabled_mask;
> unsigned raster_config, raster_config_1;
> @@ -3356,9 +3357,14 @@ static void si_init_config(struct si_context *sctx)
> raster_config_1 = 0x002e;
> break;
> case CHIP_FIJI:
> -   /* Fiji should be same as Hawaii, but that causes corruption 
> in some cases */
> -   raster_config = 0x1612; /* 0x3a00161a */
> -   raster_config_1 = 0x002a; /* 0x002e */
> +   if (sscreen->b.info.cik_macrotile_mode_array[0] == 
> 0x00e8) {
> +   /* old kernels with old tiling config */
> +   raster_config = 0x1612;
> +   raster_config_1 = 0x002a;
> +   } else {
> +   raster_config = 0x3a00161a;
> +   raster_config_1 = 0x002e;
> +   }
> break;
> case CHIP_TONGA:
> raster_config = 0x1612;
> --
> 1.8.3.1
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 00/11] i965/nir: Do texture rectangle lowering in NIR

2015-11-16 Thread Iago Toral

On Wed, 2015-11-11 at 17:27 -0800, Jason Ekstrand wrote:
> On Wed, Nov 11, 2015 at 5:23 PM, Jason Ekstrand  wrote:
> > On older hardware (Iron Lake and below), we can't support texture rectangle
> > natively.  Sandy Bridge through Haswell can support it but don't support
> > the GL_CLAMP wrap mode natively.  It isn't until Broadwell that GL_CLAMP is
> > supported together with GL_TEXTURE_RECTANGLE in hardware.  In the cases
> > where it isn't supported, we have to fake it by dividing by the texture
> > size.
> >
> > Previously, we had a rescale_texcoord function that added a uniform to hold
> > the texture coordinate and used that to rescale/clamp the texture coordinates.
> > For a while now, nir_lower_tex has been able to lower texture rectangle to
> > a textureSize and a regular texture2D operation.  This series makes i965
> > use the nir_lower_tex path instead.  Incidentally, this fixes texture
> > rectangle support in vertex and geometry shaders on Haswell and below.
> > (The backend lowering was only ever done in the FS backend.)
> >
> > Since this is the first time we're doing any sort of shader variants in
> > NIR, the first several passes add the infrastructure to do so.  Two of these
> > patches are from Ken, two are from Rob, and one (nir_clone itself) is my
> > rendition but heavily based on what Rob did only with less hashing.
> 
> Once again, git-send-email failed to send one of the patches
> (nir_clone).  You can find the whole thing on my freedesktop cgit:
> 
> http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/i965-nir-variants

Jason: I think I reviewed everything but the nir-clone patch. I figured
Connor or Rob would be better suited to review that one. That said, if
they can't review it promptly let me know and I'll give it a go.

Iago

> > Jason Ekstrand (7):
> >   nir: support to clone shaders
> >   i965/nir: Split shader optimization and lowering into three stages
> >   i965: Move postprocess_nir to codegen time
> >   nir/lower_tex: Report progress
> >   nir/lower_tex: Set the dest_type for txs instructions
> >   i965/fs: Don't allow SINT32 as a return type for resinfo
> >   i965: Use nir_lower_tex for texture coordinate lowering
> >
> > Kenneth Graunke (2):
> >   i965/nir: Add OPT() and OPT_V() macros for invoking NIR passes.
> >   i965/nir: Validate that NIR passes call nir_metadata_preserve().
> >
> > Rob Clark (2):
> >   nir: remove nir_variable::max_ifc_array_access
> >   nir: add array length field
> >
> >  src/glsl/Makefile.sources |   1 +
> >  src/glsl/nir/glsl_to_nir.cpp  |  14 +-
> >  src/glsl/nir/nir.c|   8 +
> >  src/glsl/nir/nir.h|  27 +-
> >  src/glsl/nir/nir_clone.c  | 671 
> > ++
> >  src/glsl/nir/nir_lower_tex.c  |  20 +-
> >  src/glsl/nir/nir_metadata.c   |  36 ++
> >  src/mesa/drivers/dri/i965/brw_fs.cpp  |  13 +-
> >  src/mesa/drivers/dri/i965/brw_fs.h|   3 -
> >  src/mesa/drivers/dri/i965/brw_fs_generator.cpp|  10 +-
> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp  |   4 +-
> >  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  | 125 
> >  src/mesa/drivers/dri/i965/brw_nir.c   | 268 +
> >  src/mesa/drivers/dri/i965/brw_nir.h   |  15 +
> >  src/mesa/drivers/dri/i965/brw_vec4.cpp|   7 +-
> >  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |   8 +-
> >  16 files changed, 966 insertions(+), 264 deletions(-)
> >  create mode 100644 src/glsl/nir/nir_clone.c
> >
> > --
> > 2.5.0.400.gff86faf
> >
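
The cover letter above describes faking rectangle textures by dividing the
coordinate by the texture size. A minimal sketch of that arithmetic follows,
assuming a 2D coordinate and an optional clamp to [0, 1] as a simple stand-in
for GL_CLAMP emulation. toy_lower_rect_coord is a hypothetical helper for
illustration; it is not nir_lower_tex or any Mesa function.

```c
#include <assert.h>

/* Hypothetical helper: convert an unnormalized rectangle-texture
 * coordinate into a normalized one by dividing by the texture size,
 * optionally clamping the result to [0, 1]. */
static void
toy_lower_rect_coord(const float coord[2], int width, int height,
                     int emulate_gl_clamp, float out[2])
{
   assert(width > 0 && height > 0);

   out[0] = coord[0] / (float)width;
   out[1] = coord[1] / (float)height;

   if (emulate_gl_clamp) {
      for (int i = 0; i < 2; i++) {
         if (out[i] < 0.0f)
            out[i] = 0.0f;
         if (out[i] > 1.0f)
            out[i] = 1.0f;
      }
   }
}
```

In the real lowering the size comes from a textureSize (txs) query emitted
in the shader rather than from a function argument, which is why the
lowering has to happen per variant once the sampler state is known.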

Re: [Mesa-dev] [PATCH 1/5] nv50: implement a basic compute support

2015-11-16 Thread Emil Velikov

Hi Samuel,

On 13 November 2015 at 00:04, Samuel Pitoiset  wrote:
> This adds the ability to launch simple compute kernels like the one I
> will use to read out MP performance counters in the upcoming patch.
>
> This compute support is based on the work of Francisco Jerez (aka curro)
> that he did as part of his EVoC project in 2011/2012 to get OpenCL
> working on Tesla. His original work can be found here:
> https://github.com/curro/mesa/commits/nv50-compute
>
> I made some improvements to the original code, such as fixing simultaneous
> use of 3D and COMPUTE, improving global buffer binding, and making
> the code closer to what nvc0 already does. This compute support has been
> tested by Pierre Moreau and myself with some compute kernels. This is a
> step towards OpenCL.
>
> Speaking of which, it seems that compute programs overlap fragment
> programs when both are in use. To fix this, we need to re-validate
> fragment programs when binding compute programs and vice versa.
>
> Note that textures, samplers and surfaces still need to be implemented.
>
> Signed-off-by: Samuel Pitoiset 
> Tested-by: Pierre Moreau 
> ---
>  src/gallium/drivers/nouveau/Makefile.sources   |   1 +
>  .../drivers/nouveau/codegen/nv50_ir_driver.h   |   1 +
>  src/gallium/drivers/nouveau/nv50/nv50_compute.c| 332 +++
>  .../drivers/nouveau/nv50/nv50_compute.xml.h| 444 
> +
>  src/gallium/drivers/nouveau/nv50/nv50_context.c|  30 +-
>  src/gallium/drivers/nouveau/nv50/nv50_context.h|  23 +-
>  src/gallium/drivers/nouveau/nv50/nv50_program.c|  24 +-
>  src/gallium/drivers/nouveau/nv50/nv50_program.h|   7 +
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c |  61 ++-
>  src/gallium/drivers/nouveau/nv50/nv50_screen.h |   8 +
>  src/gallium/drivers/nouveau/nv50/nv50_state.c  |  99 +
>  11 files changed, 1021 insertions(+), 9 deletions(-)
>  create mode 100644 src/gallium/drivers/nouveau/nv50/nv50_compute.c
>  create mode 100644 src/gallium/drivers/nouveau/nv50/nv50_compute.xml.h
>
> diff --git a/src/gallium/drivers/nouveau/Makefile.sources 
> b/src/gallium/drivers/nouveau/Makefile.sources
> index 83f8113..c2ff8e9 100644
> --- a/src/gallium/drivers/nouveau/Makefile.sources
> +++ b/src/gallium/drivers/nouveau/Makefile.sources
> @@ -64,6 +64,7 @@ NV50_C_SOURCES := \
> nv50/nv50_3ddefs.xml.h \
> nv50/nv50_3d.xml.h \
> nv50/nv50_blit.h \
> +   nv50/nv50_compute.c \
Please add the header to the list as well:
nv50/nv50_compute.xml.h

Thanks
Emil

Re: [Mesa-dev] [PATCH 10/10] radeon: count cs dwords separately for query begin and end

2015-11-16 Thread Marek Olšák

For the series:

Reviewed-by: Marek Olšák 

Marek

On Fri, Nov 13, 2015 at 5:10 PM, Nicolai Hähnle  wrote:
> This will be important for perfcounter queries.
> ---
>  src/gallium/drivers/radeon/r600_query.c | 33 
> +++--
>  src/gallium/drivers/radeon/r600_query.h |  3 ++-
>  2 files changed, 21 insertions(+), 15 deletions(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_query.c 
> b/src/gallium/drivers/radeon/r600_query.c
> index 4f89634..f8a30a2 100644
> --- a/src/gallium/drivers/radeon/r600_query.c
> +++ b/src/gallium/drivers/radeon/r600_query.c
> @@ -342,16 +342,18 @@ static struct pipe_query *r600_query_hw_create(struct 
> r600_common_context *rctx,
> case PIPE_QUERY_OCCLUSION_COUNTER:
> case PIPE_QUERY_OCCLUSION_PREDICATE:
> query->result_size = 16 * rctx->max_db;
> -   query->num_cs_dw = 6;
> +   query->num_cs_dw_begin = 6;
> +   query->num_cs_dw_end = 6;
> break;
> case PIPE_QUERY_TIME_ELAPSED:
> query->result_size = 16;
> -   query->num_cs_dw = 8;
> +   query->num_cs_dw_begin = 8;
> +   query->num_cs_dw_end = 8;
> query->flags = R600_QUERY_HW_FLAG_TIMER;
> break;
> case PIPE_QUERY_TIMESTAMP:
> query->result_size = 8;
> -   query->num_cs_dw = 8;
> +   query->num_cs_dw_end = 8;
> query->flags = R600_QUERY_HW_FLAG_TIMER |
>R600_QUERY_HW_FLAG_NO_START;
> break;
> @@ -361,13 +363,15 @@ static struct pipe_query *r600_query_hw_create(struct 
> r600_common_context *rctx,
> case PIPE_QUERY_SO_OVERFLOW_PREDICATE:
> /* NumPrimitivesWritten, PrimitiveStorageNeeded. */
> query->result_size = 32;
> -   query->num_cs_dw = 6;
> +   query->num_cs_dw_begin = 6;
> +   query->num_cs_dw_end = 6;
> query->stream = index;
> break;
> case PIPE_QUERY_PIPELINE_STATISTICS:
> /* 11 values on EG, 8 on R600. */
> query->result_size = (rctx->chip_class >= EVERGREEN ? 11 : 8) 
> * 16;
> -   query->num_cs_dw = 6;
> +   query->num_cs_dw_begin = 6;
> +   query->num_cs_dw_end = 6;
> break;
> default:
> assert(0);
> @@ -465,7 +469,9 @@ static void r600_query_hw_emit_start(struct 
> r600_common_context *ctx,
>
> r600_update_occlusion_query_state(ctx, query->b.type, 1);
> r600_update_prims_generated_query_state(ctx, query->b.type, 1);
> -   ctx->need_gfx_cs_space(&ctx->b, query->num_cs_dw * 2, TRUE);
> +
> +   ctx->need_gfx_cs_space(&ctx->b, query->num_cs_dw_begin + 
> query->num_cs_dw_end,
> +  TRUE);
>
> /* Get a new query buffer if needed. */
> if (query->buffer.results_end + query->result_size > 
> query->buffer.buf->b.b.width0) {
> @@ -482,10 +488,9 @@ static void r600_query_hw_emit_start(struct 
> r600_common_context *ctx,
> query->ops->emit_start(ctx, query, query->buffer.buf, va);
>
> if (query->flags & R600_QUERY_HW_FLAG_TIMER)
> -   ctx->num_cs_dw_timer_queries_suspend += query->num_cs_dw;
> +   ctx->num_cs_dw_timer_queries_suspend += query->num_cs_dw_end;
> else
> -   ctx->num_cs_dw_nontimer_queries_suspend += query->num_cs_dw;
> -
> +   ctx->num_cs_dw_nontimer_queries_suspend += 
> query->num_cs_dw_end;
>  }
>
>  static void r600_query_hw_do_emit_stop(struct r600_common_context *ctx,
> @@ -546,7 +551,7 @@ static void r600_query_hw_emit_stop(struct 
> r600_common_context *ctx,
>
> /* The queries which need begin already called this in begin_query. */
> if (query->flags & R600_QUERY_HW_FLAG_NO_START) {
> -   ctx->need_gfx_cs_space(&ctx->b, query->num_cs_dw, FALSE);
> +   ctx->need_gfx_cs_space(&ctx->b, query->num_cs_dw_end, FALSE);
> }
>
> /* emit end query */
> @@ -558,9 +563,9 @@ static void r600_query_hw_emit_stop(struct 
> r600_common_context *ctx,
>
> if (!(query->flags & R600_QUERY_HW_FLAG_NO_START)) {
> if (query->flags & R600_QUERY_HW_FLAG_TIMER)
> -   ctx->num_cs_dw_timer_queries_suspend -= 
> query->num_cs_dw;
> +   ctx->num_cs_dw_timer_queries_suspend -= 
> query->num_cs_dw_end;
> else
> -   ctx->num_cs_dw_nontimer_queries_suspend -= 
> query->num_cs_dw;
> +   ctx->num_cs_dw_nontimer_queries_suspend -= 
> query->num_cs_dw_end;
> }
>
> r600_update_occlusion_query_state(ctx, query->b.type, -1);
> @@ -980,14 +985,14 @@ static unsigned 
> r600_queries_num_cs_dw_for_resuming(struct r600_common_context *
>
> LIST_FOR_EACH_ENTRY(query,

Re: [Mesa-dev] [PATCH] radeonsi: use proper GRBM_GFX_INDEX offset for CI+

2015-11-16 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek

On Fri, Nov 13, 2015 at 10:22 PM, Alex Deucher  wrote:
> The offset is different on CI and newer.
>
> Signed-off-by: Alex Deucher 
> ---
>  src/gallium/drivers/radeonsi/si_state.c | 16 
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_state.c 
> b/src/gallium/drivers/radeonsi/si_state.c
> index ff4d612..14763f7 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -3259,21 +3259,29 @@ si_write_harvested_raster_configs(struct si_context 
> *sctx,
> }
> }
>
> -   /* GRBM_GFX_INDEX is privileged on VI */
> -   if (sctx->b.chip_class <= CIK)
> +   /* GRBM_GFX_INDEX has a different offset on SI and CI+ */
> +   if (sctx->b.chip_class < CIK)
> si_pm4_set_reg(pm4, GRBM_GFX_INDEX,
>SE_INDEX(se) | SH_BROADCAST_WRITES |
>INSTANCE_BROADCAST_WRITES);
> +   else
> +   si_pm4_set_reg(pm4, R_030800_GRBM_GFX_INDEX,
> +  S_030800_SE_INDEX(se) | 
> S_030800_SH_BROADCAST_WRITES(1) |
> +  S_030800_INSTANCE_BROADCAST_WRITES(1));
> si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 
> raster_config_se);
> if (sctx->b.chip_class >= CIK)
> si_pm4_set_reg(pm4, R_028354_PA_SC_RASTER_CONFIG_1, 
> raster_config_1);
> }
>
> -   /* GRBM_GFX_INDEX is privileged on VI */
> -   if (sctx->b.chip_class <= CIK)
> +   /* GRBM_GFX_INDEX has a different offset on SI and CI+ */
> +   if (sctx->b.chip_class < CIK)
> si_pm4_set_reg(pm4, GRBM_GFX_INDEX,
>SE_BROADCAST_WRITES | SH_BROADCAST_WRITES |
>INSTANCE_BROADCAST_WRITES);
> +   else
> +   si_pm4_set_reg(pm4, R_030800_GRBM_GFX_INDEX,
> +  S_030800_SE_BROADCAST_WRITES(1) | 
> S_030800_SH_BROADCAST_WRITES(1) |
> +  S_030800_INSTANCE_BROADCAST_WRITES(1));
>  }
>
>  static void si_init_config(struct si_context *sctx)
> --
> 1.8.3.1
>

Re: [Mesa-dev] [PATCH 16/36] glsl ubo/ssbo: Add lower_buffer_access class

2015-11-16 Thread Emil Velikov

Hi Jordan,

On 14 November 2015 at 21:43, Jordan Justen  wrote:
> This class has code that will be shared by lower_ubo_reference and
> lower_shared_reference. (lower_shared_reference will be used to
> support compute shader shared variables.)
>
> Signed-off-by: Jordan Justen 
> Cc: Samuel Iglesias Gonsalvez 
> Cc: Iago Toral Quiroga 
> ---
>  src/glsl/Makefile.sources|   1 +
>  src/glsl/lower_buffer_access.cpp | 307 
> +++
>  src/glsl/lower_buffer_access.h   |  56 +++
>  src/glsl/lower_ubo_reference.cpp | 180 +--
>  4 files changed, 367 insertions(+), 177 deletions(-)
>  create mode 100644 src/glsl/lower_buffer_access.cpp
>  create mode 100644 src/glsl/lower_buffer_access.h
>
> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index d4b02c1..f2c95c0 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -155,6 +155,7 @@ LIBGLSL_FILES = \
> loop_analysis.h \
> loop_controls.cpp \
> loop_unroll.cpp \
> +   lower_buffer_access.cpp \
Please add the header file to the list

+   lower_buffer_access.h \

Thanks
Emil

Re: [Mesa-dev] [PATCH 03/36] i965: Define state flag to signal that the URB size has been altered.

2015-11-16 Thread Jordan Justen

Reviewed-by: Jordan Justen 

On 2015-11-14 13:43:39, Jordan Justen wrote:
> From: Francisco Jerez 
> 
> This will make sure that we recalculate the URB layout anytime the URB
> size is modified by the L3 partitioning code.
> ---
>  src/mesa/drivers/dri/i965/brw_context.h  | 2 ++
>  src/mesa/drivers/dri/i965/brw_state_upload.c | 1 +
>  src/mesa/drivers/dri/i965/gen7_urb.c | 3 +++
>  3 files changed, 6 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri/i965/brw_context.h
> index 20d2dd0..ac05658 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -213,6 +213,7 @@ enum brw_state_id {
> BRW_STATE_VS_ATTRIB_WORKAROUNDS,
> BRW_STATE_COMPUTE_PROGRAM,
> BRW_STATE_CS_WORK_GROUPS,
> +   BRW_STATE_URB_SIZE,
> BRW_NUM_STATE_BITS
>  };
>  
> @@ -293,6 +294,7 @@ enum brw_state_id {
>  #define BRW_NEW_VS_ATTRIB_WORKAROUNDS   (1ull << 
> BRW_STATE_VS_ATTRIB_WORKAROUNDS)
>  #define BRW_NEW_COMPUTE_PROGRAM (1ull << BRW_STATE_COMPUTE_PROGRAM)
>  #define BRW_NEW_CS_WORK_GROUPS  (1ull << BRW_STATE_CS_WORK_GROUPS)
> +#define BRW_NEW_URB_SIZE(1ull << BRW_STATE_URB_SIZE)
>  
>  struct brw_state_flags {
> /** State update flags signalled by mesa internals */
> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
> b/src/mesa/drivers/dri/i965/brw_state_upload.c
> index 6f8daf6..aab5c91 100644
> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> @@ -618,6 +618,7 @@ static struct dirty_bit_map brw_bits[] = {
> DEFINE_BIT(BRW_NEW_VS_ATTRIB_WORKAROUNDS),
> DEFINE_BIT(BRW_NEW_COMPUTE_PROGRAM),
> DEFINE_BIT(BRW_NEW_CS_WORK_GROUPS),
> +   DEFINE_BIT(BRW_NEW_URB_SIZE),
> {0, 0, 0}
>  };
>  
> diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c 
> b/src/mesa/drivers/dri/i965/gen7_urb.c
> index 6916217..11a4f03 100644
> --- a/src/mesa/drivers/dri/i965/gen7_urb.c
> +++ b/src/mesa/drivers/dri/i965/gen7_urb.c
> @@ -153,6 +153,7 @@ gen7_upload_urb(struct brw_context *brw)
>  * skip the rest of the logic.
>  */
> if (!(brw->ctx.NewDriverState & BRW_NEW_CONTEXT) &&
> +   !(brw->ctx.NewDriverState & BRW_NEW_URB_SIZE) &&
> brw->urb.vsize == vs_size &&
> brw->urb.gs_present == gs_present &&
> brw->urb.gsize == gs_size) {
> @@ -176,6 +177,7 @@ gen7_upload_urb(struct brw_context *brw)
> unsigned chunk_size_bytes = 8192;
>  
> /* Determine the size of the URB in chunks.
> +* BRW_NEW_URB_SIZE
>  */
> unsigned urb_chunks = brw->urb.size * 1024 / chunk_size_bytes;
>  
> @@ -314,6 +316,7 @@ const struct brw_tracked_state gen7_urb = {
> .dirty = {
>.mesa = 0,
>.brw = BRW_NEW_CONTEXT |
> + BRW_NEW_URB_SIZE |
>   BRW_NEW_GEOMETRY_PROGRAM |
>   BRW_NEW_GS_PROG_DATA |
>   BRW_NEW_VS_PROG_DATA,
> -- 
> 2.6.2
> 
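
The dirty-bit mechanism this patch extends can be sketched on its own. The
following is a simplified illustration under stated assumptions: the toy_
names are hypothetical, only two state bits are modelled, and the cached
comparison is reduced to a single VS size. The 8192-byte chunk size and the
"size * 1024 / chunk size" computation are taken from the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down version of the state-id enum in the patch. */
enum toy_state_id {
   TOY_STATE_CONTEXT,
   TOY_STATE_URB_SIZE,
   TOY_NUM_STATE_BITS
};

/* One bit per state id, mirroring the (1ull << id) pattern. */
#define TOY_NEW_CONTEXT  (1ull << TOY_STATE_CONTEXT)
#define TOY_NEW_URB_SIZE (1ull << TOY_STATE_URB_SIZE)

struct toy_urb {
   unsigned vsize;   /* cached VS entry size */
   unsigned size_kb; /* total URB size in KB */
};

/* Returns false only when nothing relevant changed, so the upload can be
 * skipped. A URB size change forces a re-upload even if the cached entry
 * sizes still match. */
static bool
toy_urb_needs_upload(uint64_t new_state, const struct toy_urb *cached,
                     unsigned vs_size)
{
   if (!(new_state & TOY_NEW_CONTEXT) &&
       !(new_state & TOY_NEW_URB_SIZE) &&
       cached->vsize == vs_size)
      return false;

   return true;
}

/* Number of 8192-byte chunks in the URB, as in the quoted hunk. */
static unsigned
toy_urb_chunks(const struct toy_urb *urb)
{
   const unsigned chunk_size_bytes = 8192;
   return urb->size_kb * 1024 / chunk_size_bytes;
}
```

Without the extra flag check, a change to the URB size alone would hit the
early-out and the layout would keep using the stale chunk count.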

Re: [Mesa-dev] [PATCH v5 5/7] glsl: Add precision information to ir_variable

2015-11-16 Thread Samuel Iglesias Gonsálvez



On 13/11/15 21:38, Ilia Mirkin wrote:
> On Fri, Nov 13, 2015 at 2:37 PM, Ilia Mirkin  wrote:
>> Looks like valgrind hates this for some reason. I'm seeing lots of
>>
>> ==16821== Conditional jump or move depends on uninitialised value(s)
>> ==16821==at 0xA074D09: glsl_type::record_compare(glsl_type const*)
>> const (glsl_types.cpp:783)
>>
>> Where line 783 is:
>>
>>   if (this->fields.structure[i].precision
>>   != b->fields.structure[i].precision)
>>
>> This happens with the trace from
>> https://bugs.freedesktop.org/show_bug.cgi?id=92229 but I suspect it
>> happens with just about anything with structs.
> 
> I tried the following but no go. I'm giving up for now.
> 

OK, I can reproduce this valgrind error. I am going to debug it.

Thanks!

Sam

>   -ilia
> 
> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> index 51ea183..92f8b37 100644
> --- a/src/glsl/ast_to_hir.cpp
> +++ b/src/glsl/ast_to_hir.cpp
> @@ -6584,6 +6584,8 @@ ast_interface_block::hir(exec_list *instructions,
> earlier_per_vertex->fields.structure[j].sample;
>  fields[i].patch =
> earlier_per_vertex->fields.structure[j].patch;
> +fields[i].precision =
> +   earlier_per_vertex->fields.structure[j].precision;
>   }
>}
> 
> diff --git a/src/glsl/nir/glsl_types.cpp b/src/glsl/nir/glsl_types.cpp
> index 975b815..7345765 100644
> --- a/src/glsl/nir/glsl_types.cpp
> +++ b/src/glsl/nir/glsl_types.cpp
> @@ -124,6 +124,7 @@ glsl_type::glsl_type(const glsl_struct_field
> *fields, unsigned num_fields,
>this->fields.structure[i].sample = fields[i].sample;
>this->fields.structure[i].matrix_layout = fields[i].matrix_layout;
>this->fields.structure[i].patch = fields[i].patch;
> +  this->fields.structure[i].precision = fields[i].precision;
>this->fields.structure[i].image_read_only = fields[i].image_read_only;
>this->fields.structure[i].image_write_only = 
> fields[i].image_write_only;
>this->fields.structure[i].image_coherent = fields[i].image_coherent;
> diff --git a/src/glsl/nir/glsl_types.h b/src/glsl/nir/glsl_types.h
> index d841a32..f3a0cf8 100644
> --- a/src/glsl/nir/glsl_types.h
> +++ b/src/glsl/nir/glsl_types.h
> @@ -851,7 +851,7 @@ struct glsl_struct_field {
> 
> glsl_struct_field(const struct glsl_type *_type, const char *_name)
>: type(_type), name(_name), location(-1), interpolation(0), 
> centroid(0),
> -sample(0), matrix_layout(GLSL_MATRIX_LAYOUT_INHERITED), patch(0)
> +sample(0), matrix_layout(GLSL_MATRIX_LAYOUT_INHERITED),
> patch(0), precision(0)
> {
>/* empty */
> }
> 
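
The class of bug valgrind is reporting (a newly added field that some
construction or copy path never sets, later read by a comparison) can be
shown with a small standalone sketch. The toy_ names are hypothetical and
the struct is heavily reduced; this is not the actual glsl_struct_field
code, and it does not claim to identify where the real uninitialised value
comes from.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced version of a struct field description. */
struct toy_struct_field {
   const char *name;
   int location;
   unsigned patch;
   unsigned precision; /* the newly added member */
};

/* Every member gets a defined value, including the new one. */
static struct toy_struct_field
toy_field_init(const char *name)
{
   struct toy_struct_field f = {
      .name = name,
      .location = -1,
      .patch = 0,
      .precision = 0,
   };
   return f;
}

/* Member-wise copy: a path like this must be updated whenever a member
 * is added, otherwise the destination keeps an indeterminate value. */
static void
toy_field_copy(struct toy_struct_field *dst,
               const struct toy_struct_field *src)
{
   dst->name = src->name;
   dst->location = src->location;
   dst->patch = src->patch;
   dst->precision = src->precision;
}

/* Comparison in the spirit of record_compare(): reads every member. */
static bool
toy_field_equal(const struct toy_struct_field *a,
                const struct toy_struct_field *b)
{
   return strcmp(a->name, b->name) == 0 &&
          a->location == b->location &&
          a->patch == b->patch &&
          a->precision == b->precision;
}
```

Any construction or copy path that skips the new member leaves the
comparison depending on indeterminate memory, which is what memcheck flags
as a conditional jump on an uninitialised value.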

Re: [Mesa-dev] [PATCH 11/11] i965: Use nir_lower_tex for texture coordinate lowering

2015-11-16 Thread Iago Toral

On Wed, 2015-11-11 at 17:26 -0800, Jason Ekstrand wrote:
> Previously, we had a rescale_texcoords helper in the FS backend for
> handling rescaling of texture coordinates.  Now that we can do variants in
> NIR, we can use nir_lower_tex to do the rescaling for us.  This allows us
> to delete the i965-specific code and gives us proper TEXTURE_RECTANGLE and
> GL_CLAMP handling in vertex and geometry shaders.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp  |   2 +
>  src/mesa/drivers/dri/i965/brw_fs.h|   3 -
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp  |   4 +-
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  | 125 
> --
>  src/mesa/drivers/dri/i965/brw_nir.c   |  23 
>  src/mesa/drivers/dri/i965/brw_nir.h   |   6 ++
>  src/mesa/drivers/dri/i965/brw_vec4.cpp|   2 +
>  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |   2 +
>  8 files changed, 36 insertions(+), 131 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index b8713ab..c56cafe 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -5468,6 +5468,7 @@ brw_compile_fs(const struct brw_compiler *compiler, 
> void *log_data,
> char **error_str)
>  {
> nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
> +   brw_nir_apply_sampler_key(shader, compiler->devinfo, &key->tex, true);

This looks like it is part of the post-processing process. In fact, you
call this right before brw_postprocess_nir() for every stage. Why not
just add a key parameter to brw_postprocess_nir() and call
brw_nir_apply_sampler_key from there instead?

Either way,
Reviewed-by: Iago Toral Quiroga 

> brw_postprocess_nir(shader, compiler->devinfo, true);
>  
> /* key->alpha_test_func means simulating alpha testing via discards,
> @@ -5628,6 +5629,7 @@ brw_compile_cs(const struct brw_compiler *compiler, 
> void *log_data,
> char **error_str)
>  {
> nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
> +   brw_nir_apply_sampler_key(shader, compiler->devinfo, &key->tex, true);
> brw_postprocess_nir(shader, compiler->devinfo, true);
>  
> prog_data->local_size[0] = shader->info.cs.local_size[0];
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> b/src/mesa/drivers/dri/i965/brw_fs.h
> index 2dfcab1..8a181d7 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -217,8 +217,6 @@ public:
> void emit_interpolation_setup_gen4();
> void emit_interpolation_setup_gen6();
> void compute_sample_position(fs_reg dst, fs_reg int_sample_pos);
> -   fs_reg rescale_texcoord(fs_reg coordinate, int coord_components,
> -   bool is_rect, uint32_t sampler);
> void emit_texture(ir_texture_opcode op,
>   const glsl_type *dest_type,
>   fs_reg coordinate, int components,
> @@ -229,7 +227,6 @@ public:
>   fs_reg mcs,
>   int gather_component,
>   bool is_cube_array,
> - bool is_rect,
>   uint32_t sampler,
>   fs_reg sampler_reg);
> fs_reg emit_mcs_fetch(const fs_reg , unsigned components,
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 02b9f5b..3d83d7c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -2411,8 +2411,6 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, 
> nir_tex_instr *instr)
>  
> int gather_component = instr->component;
>  
> -   bool is_rect = instr->sampler_dim == GLSL_SAMPLER_DIM_RECT;
> -
> bool is_cube_array = instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE &&
>  instr->is_array;
>  
> @@ -2549,7 +2547,7 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, 
> nir_tex_instr *instr)
> emit_texture(op, dest_type, coordinate, instr->coord_components,
>  shadow_comparitor, lod, lod2, lod_components, sample_index,
>  tex_offset, mcs, gather_component,
> -is_cube_array, is_rect, sampler, sampler_reg);
> +is_cube_array, sampler, sampler_reg);
>  
> fs_reg dest = get_nir_dest(instr->dest);
> dest.type = this->result.type;
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 213c912..faf304c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -79,122 +79,6 @@ fs_visitor::emit_vs_system_value(int location)
> return reg;
>  }
>  
> -fs_reg
> -fs_visitor::rescale_texcoord(fs_reg coordinate, int coord_components,
> - bool is_rect, uint32_t sampler)
> -{
> -   bool needs_gl_clamp = true;
> -   fs_reg

Re: [Mesa-dev] [PATCH] nvc0: fix wrong value for NVC8_COMPUTE_CLASS

2015-11-16 Thread Emil Velikov

On 9 October 2015 at 14:10, Samuel Pitoiset  wrote:
> Compute class value for GF110+ is 0x91c0 and not 0x92c0. This fixes
> compute support and MP performance counters on GF110.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/nv_object.xml.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/nv_object.xml.h 
> b/src/gallium/drivers/nouveau/nv_object.xml.h
> index 0a0e187..92c0633 100644
> --- a/src/gallium/drivers/nouveau/nv_object.xml.h
> +++ b/src/gallium/drivers/nouveau/nv_object.xml.h
> @@ -197,7 +197,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
> SOFTWARE.
>  #define NV50_COMPUTE_CLASS 0x50c0
>  #define NVA3_COMPUTE_CLASS 0x85c0
>  #define NVC0_COMPUTE_CLASS 0x90c0
> -#define NVC8_COMPUTE_CLASS 0x92c0
> +#define NVC8_COMPUTE_CLASS 0x91c0
Worth updating the classic one (src/mesa/drivers/dri/nouveau/) as well?

There is a nasty-looking comment in nvc0_screen_compute_setup about
the above class. AFAICS, although the define is updated, it isn't used
anywhere, so I'm dubious how this fixes compute on GF110.

Cheers,
Emil

Re: [Mesa-dev] [PATCH 00/36] Computer shader shared variables

2015-11-16 Thread Lofstedt, Marta

I can confirm that this patch set does not cause any regressions in the GLES 3.1 
CTS tests on HSW and BDW.

> -Original Message-
> From: Justen, Jordan L
> Sent: Saturday, November 14, 2015 10:44 PM
> To: mesa-dev@lists.freedesktop.org
> Cc: Kristian Høgsberg Kristensen; Lofstedt, Marta; Palli, Tapani; Justen, 
> Jordan
> L
> Subject: [PATCH 00/36] Computer shader shared variables
> 
> git://people.freedesktop.org/~jljusten/mesa cs-shared-variables-v1
> http://patchwork.freedesktop.org/bundle/jljusten/cs-shared-variables-v1
> 
> Patches 1 - 13:
> 
>  * Rebased curro's "i965: L3 cache partitioning." (sent Sept 6)
> 
> Patches 14 - 19:
> 
>  * Rework lower_ubo_reference to allow code sharing with
>lower_shared_reference
> 
> Patches 20 - 28:
> 
>  * Add shared variable support for i965. Add lower_shared_reference,
>    which works similarly to lower_ubo_reference for SSBOs, except that it
>    merges all shared variables into one shared variable region (rather
>    than separate BOs, as SSBOs allow).
> Patches 29 - 36:
> 
>  * Add atomic support for shared variables on i965, which is
>    implemented similarly to SSBOs.
> 
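
The idea of merging all shared variables into one region can be sketched as
a simple offset assignment. This is an illustration only: the toy_ names
are hypothetical, and the layout rule (align up, then append) is an
assumption made for the example rather than the rule lower_shared_reference
actually applies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one shared variable. */
struct toy_shared_var {
   unsigned size;   /* size in bytes */
   unsigned align;  /* required alignment in bytes, must be > 0 */
   unsigned offset; /* filled in by the layout pass */
};

/* Round value up to the next multiple of align. */
static unsigned
toy_align_up(unsigned value, unsigned align)
{
   return (value + align - 1) / align * align;
}

/* Assign each variable an offset inside a single shared region and
 * return the total region size in bytes. */
static unsigned
toy_layout_shared_region(struct toy_shared_var *vars, size_t count)
{
   unsigned next = 0;

   for (size_t i = 0; i < count; i++) {
      assert(vars[i].align > 0);
      next = toy_align_up(next, vars[i].align);
      vars[i].offset = next;
      next += vars[i].size;
   }

   return next;
}
```

Once every variable has an offset, a load or store of a shared variable can
be expressed as an access at (region base + offset), which is the shape an
offset-based intrinsic needs.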
> On Ivy Bridge, this fixes several piglit and OpenGLES 3.1 CTS tests:
> 
>  * spec/arb_compute_shader/compiler/shared-atomics.comp: fail pass
>  * spec/arb_compute_shader/execution/shared-atomic: crash pass
>  * spec/arb_compute_shader/execution/simple-barrier: crash pass
> 
>  * es31-cts/compute_shader/atomic-case1: fail pass
>  * es31-cts/compute_shader/atomic-case3: fail pass
>  * es31-cts/compute_shader/shared-indexing: fail pass
>  * es31-cts/compute_shader/shared-max: fail pass
>  * es31-cts/compute_shader/shared-simple: fail pass
>  * es31-cts/compute_shader/shared-struct: fail pass
>  * es31-cts/compute_shader/work-group-size: fail pass
> 
> Francisco Jerez (13):
>   i965: Define symbolic constants for some useful L3 cache control
> registers.
>   i965: Keep track of whether LRI is allowed in the context struct.
>   i965: Define state flag to signal that the URB size has been altered.
>   i965/gen8: Don't add workaround bits to PIPE_CONTROL stalls if DC
> flush is set.
>   i965: Import tables enumerating the set of validated L3
> configurations.
>   i965: Implement programming of the L3 configuration.
>   i965/hsw: Enable L3 atomics.
>   i965: Implement selection of the closest L3 configuration based on a
> vector of weights.
>   i965: Calculate appropriate L3 partition weights for the current
> pipeline state.
>   i965: Implement L3 state atom.
>   i965: Add debug flag to print out the new L3 state during transitions.
>   i965: Work around L3 state leaks during context switches.
>   i965: Hook up L3 partitioning state atom.
> 
> Jordan Justen (23):
>   glsl ubo/ssbo: Use enum to track current buffer access type
>   glsl ubo/ssbo: Split buffer access to insert_buffer_access
>   glsl ubo/ssbo: Add lower_buffer_access class
>   glsl ubo/ssbo: Move is_dereferenced_thing_row_major into
> lower_buffer_access
>   glsl ubo/ssbo: Move common code into
> lower_buffer_access::setup_buffer_access
>   glsl: Add default matrix ordering in lower_buffer_access
>   glsl: Don't lower_variable_index_to_cond_assign for shared variables
>   glsl: Add lowering pass for shared variable references
>   nir: Translate glsl shared var load intrinsic to nir intrinsic
>   nir: Translate glsl shared var store intrinsic to nir intrinsic
>   i965: Disable vector splitting on shared variables
>   i965/fs: Handle nir shared variable load intrinsic
>   i965/fs: Handle nir shared variable store intrinsic function
>   i965: Enable shared local memory for CS shared variables
>   i965: Lower shared variable references to intrinsic calls
>   glsl: Allow atomic functions to be used with shared variables
>   glsl: Replace atomic_ssbo and ssbo_atomic with atomic
>   glsl: Check for SSBO variable in SSBO atomic lowering
>   glsl: Translate atomic intrinsic functions on shared variables
>   glsl: Buffer atomics are supported for compute shaders
>   glsl: Disable several optimizations on shared variables
>   nir: Add nir intrinsics for shared variable atomic operations
>   i965/nir: Implement shared variable atomic operations
> 
>  src/glsl/Makefile.sources  |   2 +
>  src/glsl/ast_function.cpp  |  18 +-
>  src/glsl/builtin_functions.cpp | 236 -
>  src/glsl/ir_optimization.h |   1 +
>  src/glsl/linker.cpp|   4 +
>  src/glsl/lower_buffer_access.cpp   | 565 
> +
>  src/glsl/lower_buffer_access.h |  72 +++
>  src/glsl/lower_shared_reference.cpp| 511 +++
>  src/glsl/lower_ubo_reference.cpp   | 536 +++
>  src/glsl/lower_variable_index_to_cond_assign.cpp   |   3 +
>  src/glsl/nir/glsl_to_nir.cpp   | 131 -
>

Re: [Mesa-dev] [PATCH v5 5/7] glsl: Add precision information to ir_variable

2015-11-16 Thread Samuel Iglesias Gonsálvez

Hello Ilia, Tapani:

I have reproduced the issue with a piglit test but not with the trace
uploaded in the bug report :-(

The piglit test was: bin/arb_shader_storage_buffer_object-maxblocks

I have uploaded a branch with some fixes to Igalia's mesa repo:

Git repo: https://github.com/Igalia/mesa.git
Branch: wip/siglesias/precision-fixes

But since this error might come from other initializations that I may
have overlooked:
* Ilia: Could you test whether this issue still happens for you? As I
cannot reproduce it locally, I might be forgetting something.
* Tapani: Could you do a quick CTS run to check that I have not broken
anything?

Thanks!

Sam
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/5] nv50: implement a basic compute support

2015-11-16 Thread Emil Velikov

On 16 November 2015 at 10:39, Emil Velikov  wrote:
> Hi Samuel,
>
> On 13 November 2015 at 00:04, Samuel Pitoiset  
> wrote:
>> This adds the ability to launch simple compute kernels like the one I
>> will use to read out MP performance counters in the upcoming patch.
>>
>> This compute support is based on the work of Francisco Jerez (aka curro)
>> that he did as part of his EVoC project in 2011/2012 to get OpenCL
>> working on Tesla. His original work can be found here:
>> https://github.com/curro/mesa/commits/nv50-compute
>>
>> I did some improvements on the original code, like fixing using both 3D
>> and COMPUTE simultaneously, improving global buffers binding, and making
>> the code closer to what nvc0 already does. This compute support has been
>> tested by Pierre Moreau and myself with some compute kernels. This is a
>> step towards OpenCL.
>>
>> Speaking about this, it seems like compute programs overlap fragment
>> programs when they are used both. To fix this, we need to re-validate
>> fragment programs when binding compute programs and vice versa.
>>
>> Note that, textures, samplers and surfaces still need to be implemented.
>>
>> Signed-off-by: Samuel Pitoiset 
>> Tested-by: Pierre Moreau 
>> ---
>>  src/gallium/drivers/nouveau/Makefile.sources   |   1 +
>>  .../drivers/nouveau/codegen/nv50_ir_driver.h   |   1 +
>>  src/gallium/drivers/nouveau/nv50/nv50_compute.c| 332 +++
>>  .../drivers/nouveau/nv50/nv50_compute.xml.h| 444 
>> +
>>  src/gallium/drivers/nouveau/nv50/nv50_context.c|  30 +-
>>  src/gallium/drivers/nouveau/nv50/nv50_context.h|  23 +-
>>  src/gallium/drivers/nouveau/nv50/nv50_program.c|  24 +-
>>  src/gallium/drivers/nouveau/nv50/nv50_program.h|   7 +
>>  src/gallium/drivers/nouveau/nv50/nv50_screen.c |  61 ++-
>>  src/gallium/drivers/nouveau/nv50/nv50_screen.h |   8 +
>>  src/gallium/drivers/nouveau/nv50/nv50_state.c  |  99 +
>>  11 files changed, 1021 insertions(+), 9 deletions(-)
>>  create mode 100644 src/gallium/drivers/nouveau/nv50/nv50_compute.c
>>  create mode 100644 src/gallium/drivers/nouveau/nv50/nv50_compute.xml.h
>>
>> diff --git a/src/gallium/drivers/nouveau/Makefile.sources 
>> b/src/gallium/drivers/nouveau/Makefile.sources
>> index 83f8113..c2ff8e9 100644
>> --- a/src/gallium/drivers/nouveau/Makefile.sources
>> +++ b/src/gallium/drivers/nouveau/Makefile.sources
>> @@ -64,6 +64,7 @@ NV50_C_SOURCES := \
>> nv50/nv50_3ddefs.xml.h \
>> nv50/nv50_3d.xml.h \
>> nv50/nv50_blit.h \
>> +   nv50/nv50_compute.c \
> Please add the header into the list
> nv50/nv50_compute.xml.h
>
Ouch, did not see that you've pushed the series. No need to do
anything - I've just committed a fix.

-Emil

Re: [Mesa-dev] [PATCH 1/5] nv50: implement a basic compute support

2015-11-16 Thread Samuel Pitoiset




On 11/16/2015 11:47 AM, Emil Velikov wrote:
> On 16 November 2015 at 10:39, Emil Velikov  wrote:
>> Hi Samuel,
>>
>> On 13 November 2015 at 00:04, Samuel Pitoiset  wrote:
>>> This adds the ability to launch simple compute kernels like the one I
>>> will use to read out MP performance counters in the upcoming patch.
>>>
>>> This compute support is based on the work of Francisco Jerez (aka curro)
>>> that he did as part of his EVoC project in 2011/2012 to get OpenCL
>>> working on Tesla. His original work can be found here:
>>> https://github.com/curro/mesa/commits/nv50-compute
>>>
>>> I did some improvements on the original code, like fixing using both 3D
>>> and COMPUTE simultaneously, improving global buffers binding, and making
>>> the code closer to what nvc0 already does. This compute support has been
>>> tested by Pierre Moreau and myself with some compute kernels. This is a
>>> step towards OpenCL.
>>>
>>> Speaking about this, it seems like compute programs overlap fragment
>>> programs when they are used both. To fix this, we need to re-validate
>>> fragment programs when binding compute programs and vice versa.
>>>
>>> Note that, textures, samplers and surfaces still need to be implemented.
>>>
>>> Signed-off-by: Samuel Pitoiset 
>>> Tested-by: Pierre Moreau 
>>> ---
>>>  src/gallium/drivers/nouveau/Makefile.sources   |   1 +
>>>  .../drivers/nouveau/codegen/nv50_ir_driver.h   |   1 +
>>>  src/gallium/drivers/nouveau/nv50/nv50_compute.c| 332 +++
>>>  .../drivers/nouveau/nv50/nv50_compute.xml.h| 444 +
>>>  src/gallium/drivers/nouveau/nv50/nv50_context.c|  30 +-
>>>  src/gallium/drivers/nouveau/nv50/nv50_context.h|  23 +-
>>>  src/gallium/drivers/nouveau/nv50/nv50_program.c|  24 +-
>>>  src/gallium/drivers/nouveau/nv50/nv50_program.h|   7 +
>>>  src/gallium/drivers/nouveau/nv50/nv50_screen.c |  61 ++-
>>>  src/gallium/drivers/nouveau/nv50/nv50_screen.h |   8 +
>>>  src/gallium/drivers/nouveau/nv50/nv50_state.c  |  99 +
>>>  11 files changed, 1021 insertions(+), 9 deletions(-)
>>>  create mode 100644 src/gallium/drivers/nouveau/nv50/nv50_compute.c
>>>  create mode 100644 src/gallium/drivers/nouveau/nv50/nv50_compute.xml.h
>>>
>>> diff --git a/src/gallium/drivers/nouveau/Makefile.sources 
>>> b/src/gallium/drivers/nouveau/Makefile.sources
>>> index 83f8113..c2ff8e9 100644
>>> --- a/src/gallium/drivers/nouveau/Makefile.sources
>>> +++ b/src/gallium/drivers/nouveau/Makefile.sources
>>> @@ -64,6 +64,7 @@ NV50_C_SOURCES := \
>>> nv50/nv50_3ddefs.xml.h \
>>> nv50/nv50_3d.xml.h \
>>> nv50/nv50_blit.h \
>>> +   nv50/nv50_compute.c \
>>
>> Please add the header into the list
>> nv50/nv50_compute.xml.h
>>
> Ouch, did not see that you've pushed the series. No need to do
> anything - I've just committed a fix.

Thank you for doing this change. I really forgot that!

> -Emil

--
-Samuel

Re: [Mesa-dev] [PATCH 14/36] glsl ubo/ssbo: Use enum to track current buffer access type

2015-11-16 Thread Iago Toral

On Sat, 2015-11-14 at 13:43 -0800, Jordan Justen wrote:
> Signed-off-by: Jordan Justen 
> Cc: Samuel Iglesias Gonsalvez 
> Cc: Iago Toral Quiroga 
> ---
>  src/glsl/lower_ubo_reference.cpp | 26 +-
>  1 file changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/src/glsl/lower_ubo_reference.cpp 
> b/src/glsl/lower_ubo_reference.cpp
> index b74aa3d..41012db 100644
> --- a/src/glsl/lower_ubo_reference.cpp
> +++ b/src/glsl/lower_ubo_reference.cpp
> @@ -162,6 +162,14 @@ public:
> ir_call *ssbo_store(ir_rvalue *deref, ir_rvalue *offset,
> unsigned write_mask);
>  
> +   enum {
> +  ubo_load_access,
> +  ssbo_load_access,
> +  ssbo_store_access,
> +  ssbo_get_array_length,

ssbo_get_array_length does not convey that it is for "unsized" arrays and
lacks the "access" suffix that the other enum values have, which makes
it a bit inconsistent. How about we name this
'ssbo_unsized_array_length_access'? Or maybe 'ssbo_unsized_array_access'
if we think the former is too long.

> +  ssbo_atomic_access,
> +   } buffer_access_type;
> +
> void emit_access(bool is_write, ir_dereference *deref,
>  ir_variable *base_offset, unsigned int deref_offset,
>  bool row_major, int matrix_columns,
> @@ -189,7 +197,6 @@ public:
> struct gl_uniform_buffer_variable *ubo_var;
> ir_rvalue *uniform_block;
> bool progress;
> -   bool is_shader_storage;
>  };
>  
>  /**
> @@ -339,10 +346,9 @@ 
> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
> deref, &nonconst_block_index);
>  
> /* Locate the block by interface name */
> -   this->is_shader_storage = var->is_in_shader_storage_block();
> unsigned num_blocks;
> struct gl_uniform_block **blocks;
> -   if (this->is_shader_storage) {
> +   if (buffer_access_type != ubo_load_access) {

I think this file generally uses 'this->' to refer to class members (or
at least this function does), so maybe we should keep that for
consistency. The same in the other places where you use
buffer_access_type.

That said, right now it seems that we only ever use buffer_access_type
here and you always assign its value right before calling
setup_for_load_or_store() so maybe it is better to just make it a
function parameter instead of a class member? setup_for_load_or_store()
already has a large number of parameters, so I am not super happy about
the idea, but it looks more natural to me. What do you think?
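
To make the parameter idea concrete, here is a minimal standalone sketch.
It is not Mesa code: fake_shader, block_list and select_blocks() are
hypothetical stand-ins, and the enum uses the rename suggested earlier in
this mail. It only shows the block selection keyed on an argument rather
than on visitor state:

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; none of these types are Mesa's.
enum buffer_access_type {
   ubo_load_access,
   ssbo_load_access,
   ssbo_store_access,
   ssbo_unsized_array_length_access,
   ssbo_atomic_access,
};

struct fake_shader {
   unsigned num_ubo_blocks;
   unsigned num_ssbo_blocks;
};

struct block_list {
   unsigned num_blocks;
   bool is_shader_storage;
};

// The access type arrives as an argument, so this function does not rely
// on a class member that every caller must remember to set beforehand.
static block_list
select_blocks(const fake_shader *shader, buffer_access_type access)
{
   assert(shader != nullptr);
   if (access != ubo_load_access)
      return { shader->num_ssbo_blocks, true };
   return { shader->num_ubo_blocks, false };
}
```

The trade-off is the one mentioned above: one more argument on a function
that already takes many, in exchange for no hidden state between the
callers and setup_for_load_or_store().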

>num_blocks = shader->NumShaderStorageBlocks;
>blocks = shader->ShaderStorageBlocks;
> } else {
> @@ -552,6 +558,10 @@ lower_ubo_reference_visitor::handle_rvalue(ir_rvalue 
> **rvalue)
> int matrix_columns;
> unsigned packing = var->get_interface_type()->interface_packing;
>  
> +   buffer_access_type =
> +  var->is_in_shader_storage_block() ?
> +  ssbo_load_access : ubo_load_access;
> +
> /* Compute the offset to the start if the dereference as well as other
>  * information we need to configure the write
>  */
> @@ -795,7 +805,7 @@ lower_ubo_reference_visitor::emit_access(bool is_write,
>if (is_write)
>   base_ir->insert_after(ssbo_store(deref, offset, write_mask));
>else {
> - if (!this->is_shader_storage) {
> + if (buffer_access_type == ubo_load_access) {
>   base_ir->insert_before(assign(deref->clone(mem_ctx, NULL),
> ubo_load(deref->type, offset)));
>   } else {
> @@ -862,7 +872,7 @@ lower_ubo_reference_visitor::emit_access(bool is_write,
>  
>  base_ir->insert_after(ssbo_store(swizzle(deref, i, 1), 
> chan_offset, 1));
>   } else {
> -if (!this->is_shader_storage) {
> +if (buffer_access_type == ubo_load_access) {
> base_ir->insert_before(assign(deref->clone(mem_ctx, NULL),
>   ubo_load(deref_type, 
> chan_offset),
>   (1U << i)));
> @@ -891,6 +901,8 @@ 
> lower_ubo_reference_visitor::write_to_memory(ir_dereference *deref,
> int matrix_columns;
> unsigned packing = var->get_interface_type()->interface_packing;
>  
> +   buffer_access_type = ssbo_store_access;
> +
> /* Compute the offset to the start if the dereference as well as other
>  * information we need to configure the write
>  */
> @@ -1068,6 +1080,8 @@ 
> lower_ubo_reference_visitor::process_ssbo_unsized_array_length(ir_rvalue 
> **rvalu
> unsigned packing = var->get_interface_type()->interface_packing;
> int unsized_array_stride = calculate_unsized_array_stride(deref, packing);
>  
> +   buffer_access_type = ssbo_get_array_length;
> +
> /* Compute the offset to the start if the dereference as well as other
>  * information we need to calculate the length.
>  */
> @@

Re: [Mesa-dev] [PATCH 15/36] glsl ubo/ssbo: Split buffer access to insert_buffer_access

2015-11-16 Thread Iago Toral

Looks good to me,

Reviewed-by: Iago Toral Quiroga 

On Sat, 2015-11-14 at 13:43 -0800, Jordan Justen wrote:
> This allows the code in emit_access to be generic enough to also be
> used for lowering shared variables.
> 
> Signed-off-by: Jordan Justen 
> Cc: Samuel Iglesias Gonsalvez 
> Cc: Iago Toral Quiroga 
> ---
>  src/glsl/lower_ubo_reference.cpp | 78 
> ++--
>  1 file changed, 43 insertions(+), 35 deletions(-)
> 
> diff --git a/src/glsl/lower_ubo_reference.cpp 
> b/src/glsl/lower_ubo_reference.cpp
> index 41012db..b8fcc8e 100644
> --- a/src/glsl/lower_ubo_reference.cpp
> +++ b/src/glsl/lower_ubo_reference.cpp
> @@ -170,6 +170,9 @@ public:
>ssbo_atomic_access,
> } buffer_access_type;
>  
> +   void insert_buffer_access(ir_dereference *deref, const glsl_type *type,
> + ir_rvalue *offset, unsigned mask, int channel);
> +
> void emit_access(bool is_write, ir_dereference *deref,
>  ir_variable *base_offset, unsigned int deref_offset,
>  bool row_major, int matrix_columns,
> @@ -689,6 +692,41 @@ lower_ubo_reference_visitor::ssbo_load(const struct 
> glsl_type *type,
> return new(mem_ctx) ir_call(sig, deref_result, &call_params);
>  }
>  
> +void
> +lower_ubo_reference_visitor::insert_buffer_access(ir_dereference *deref,
> +  const glsl_type *type,
> +  ir_rvalue *offset,
> +  unsigned mask,
> +  int channel)
> +{
> +   switch (buffer_access_type) {
> +   case ubo_load_access:
> +  base_ir->insert_before(assign(deref->clone(mem_ctx, NULL),
> +ubo_load(type, offset),
> +mask));
> +  break;
> +   case ssbo_load_access: {
> +  ir_call *load_ssbo = ssbo_load(type, offset);
> +  base_ir->insert_before(load_ssbo);
> +  ir_rvalue *value = 
> load_ssbo->return_deref->as_rvalue()->clone(mem_ctx, NULL);
> +  ir_assignment *assignment =
> + assign(deref->clone(mem_ctx, NULL), value, mask);
> +  base_ir->insert_before(assignment);
> +  break;
> +   }
> +   case ssbo_store_access:
> +  if (channel >= 0) {
> + base_ir->insert_after(ssbo_store(swizzle(deref, channel, 1),
> +  offset, 1));
> +  } else {
> + base_ir->insert_after(ssbo_store(deref, offset, mask));
> +  }
> +  break;
> +   default:
> +  unreachable("invalid buffer_access_type in insert_buffer_access");
> +   }
> +}
> +
>  static inline int
>  writemask_for_size(unsigned n)
>  {
> @@ -802,19 +840,9 @@ lower_ubo_reference_visitor::emit_access(bool is_write,
> if (!row_major) {
>ir_rvalue *offset =
>   add(base_offset, new(mem_ctx) ir_constant(deref_offset));
> -  if (is_write)
> - base_ir->insert_after(ssbo_store(deref, offset, write_mask));
> -  else {
> - if (buffer_access_type == ubo_load_access) {
> - base_ir->insert_before(assign(deref->clone(mem_ctx, NULL),
> -   ubo_load(deref->type, offset)));
> - } else {
> -ir_call *load_ssbo = ssbo_load(deref->type, offset);
> -base_ir->insert_before(load_ssbo);
> -ir_rvalue *value = 
> load_ssbo->return_deref->as_rvalue()->clone(mem_ctx, NULL);
> -base_ir->insert_before(assign(deref->clone(mem_ctx, NULL), 
> value));
> - }
> -  }
> +  unsigned mask =
> + is_write ? write_mask : (1 << deref->type->vector_elements) - 1;
> +  insert_buffer_access(deref, deref->type, offset, mask, -1);
> } else {
>unsigned N = deref->type->is_double() ? 8 : 4;
>  
> @@ -863,28 +891,8 @@ lower_ubo_reference_visitor::emit_access(bool is_write,
>   ir_rvalue *chan_offset =
>  add(base_offset,
>  new(mem_ctx) ir_constant(deref_offset + i * matrix_stride));
> - if (is_write) {
> -/* If the component is not in the writemask, then don't
> - * store any value.
> - */
> -if (!((1 << i) & write_mask))
> -   continue;
> -
> -base_ir->insert_after(ssbo_store(swizzle(deref, i, 1), 
> chan_offset, 1));
> - } else {
> -if (buffer_access_type == ubo_load_access) {
> -   base_ir->insert_before(assign(deref->clone(mem_ctx, NULL),
> - ubo_load(deref_type, 
> chan_offset),
> - (1U << i)));
> -} else {
> -   ir_call *load_ssbo = ssbo_load(deref_type, chan_offset);
> -   base_ir->insert_before(load_ssbo);
> -   ir_rvalue

Re: [Mesa-dev] [PATCH 16/36] glsl ubo/ssbo: Add lower_buffer_access class

2015-11-16 Thread Iago Toral

On Mon, 2015-11-16 at 17:52 -0800, Jordan Justen wrote:
> On 2015-11-16 04:27:55, Iago Toral wrote:
> > On Sat, 2015-11-14 at 13:43 -0800, Jordan Justen wrote:
> > > This class has code that will be shared by lower_ubo_reference and
> > > lower_shared_reference. (lower_shared_reference will be used to
> > > support compute shader shared variables.)
> > > 
> > > Signed-off-by: Jordan Justen 
> > > Cc: Samuel Iglesias Gonsalvez 
> > > Cc: Iago Toral Quiroga 
> > > ---
> > >  src/glsl/Makefile.sources|   1 +
> > >  src/glsl/lower_buffer_access.cpp | 307 
> > > +++
> > >  src/glsl/lower_buffer_access.h   |  56 +++
> > >  src/glsl/lower_ubo_reference.cpp | 180 +--
> > >  4 files changed, 367 insertions(+), 177 deletions(-)
> > >  create mode 100644 src/glsl/lower_buffer_access.cpp
> > >  create mode 100644 src/glsl/lower_buffer_access.h
> > > 
> > > diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> > > index d4b02c1..f2c95c0 100644
> > > --- a/src/glsl/Makefile.sources
> > > +++ b/src/glsl/Makefile.sources
> > > @@ -155,6 +155,7 @@ LIBGLSL_FILES = \
> > >   loop_analysis.h \
> > >   loop_controls.cpp \
> > >   loop_unroll.cpp \
> > > + lower_buffer_access.cpp \
> > >   lower_clip_distance.cpp \
> > >   lower_const_arrays_to_uniforms.cpp \
> > >   lower_discard.cpp \
> > > diff --git a/src/glsl/lower_buffer_access.cpp 
> > > b/src/glsl/lower_buffer_access.cpp
> > > new file mode 100644
> > > index 000..e0b5a2f
> > > --- /dev/null
> > > +++ b/src/glsl/lower_buffer_access.cpp
> > > @@ -0,0 +1,307 @@
> > > +/*
> > > + * Copyright (c) 2015 Intel Corporation
> > > + *
> > > + * Permission is hereby granted, free of charge, to any person obtaining 
> > > a
> > > + * copy of this software and associated documentation files (the 
> > > "Software"),
> > > + * to deal in the Software without restriction, including without 
> > > limitation
> > > + * the rights to use, copy, modify, merge, publish, distribute, 
> > > sublicense,
> > > + * and/or sell copies of the Software, and to permit persons to whom the
> > > + * Software is furnished to do so, subject to the following conditions:
> > > + *
> > > + * The above copyright notice and this permission notice (including the 
> > > next
> > > + * paragraph) shall be included in all copies or substantial portions of 
> > > the
> > > + * Software.
> > > + *
> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
> > > EXPRESS OR
> > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
> > > MERCHANTABILITY,
> > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT 
> > > SHALL
> > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
> > > OTHER
> > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
> > > ARISING
> > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > > + * DEALINGS IN THE SOFTWARE.
> > > + */
> > > +
> > > +/**
> > > + * \file lower_buffer_access.cpp
> > > + *
> > > + * Helper for IR lowering pass to replace dereferences of buffer object 
> > > based
> > > + * shader variables with intrinsic function calls.
> > > + *
> > > + * This helper is used by lowering passes for UBOs, SSBOs and compute 
> > > shader
> > > + * shared variables.
> > > + */
> > > +
> > > +#include "ir.h"
> > > +#include "ir_builder.h"
> > > +#include "ir_rvalue_visitor.h"
> > > +#include "main/macros.h"
> > > +#include "util/list.h"
> > > +#include "glsl_parser_extras.h"
> > > +#include "lower_buffer_access.h"
> > > +
> > > +using namespace ir_builder;
> > > +
> > > +namespace lower_buffer_access {
> > > +
> > > +static inline int
> > > +writemask_for_size(unsigned n)
> > > +{
> > > +   return ((1 << n) - 1);
> > > +}
> > > +
> > > +/**
> > > + * Takes LHS and emits a series of assignments into its components
> > > + * from the shared variable storage.
> > 
> > I find this part of the comment a bit confusing. This function breaks a
> > dereference access into one or multiple accesses to the underlying
> > buffer storage. Such dereference could be in a RHS expression, and in
> > fact, that will always be the case for UBO and SSBO loads.
> 
> Hmm. I may have copied this comment from lower_ubo_reference some time
> back. Anyway, I intended to use the current comment from
> lower_ubo_reference:
> 
> /**
>  * Takes a deref and recursively calls itself to break the deref down to the
>  * point that the reads or writes generated are contiguous scalars or vectors.
>  */

Yeah, that looks better.

> > > + * Recursively calls itself to break the deref down to the point that
> > > + * the intrinsic calls are generated.
> > > + */
> > > +void
> > > +lower_buffer_access::emit_access(bool is_write,
> > > + ir_dereference *deref,
> > > +

Re: [Mesa-dev] [PATCH 1/2] i965: Add INTEL_DEBUG=tcs, tes and hs, ds flags for tessellation shaders.

2015-11-16 Thread Tapani Pälli


Both patches are

Reviewed-by: Tapani Pälli 

On 11/17/2015 02:36 AM, Kenneth Graunke wrote:

Even though both tessellation shader stages must be used together, I
still think it makes sense to add separate debug flags for each stage.
It makes it possible to read the TCS/HS, rule out problems, then read
the TES/DS separately, without sifting through as much printed text.

I decided to add both the GL names (tcs/tes) and hardware names (hs/ds)
so they can be used interchangeably.

Signed-off-by: Kenneth Graunke 
---
  src/mesa/drivers/dri/i965/intel_debug.c | 8 ++--
  src/mesa/drivers/dri/i965/intel_debug.h | 2 ++
  2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_debug.c 
b/src/mesa/drivers/dri/i965/intel_debug.c
index c00d2e7..f53c4ab 100644
--- a/src/mesa/drivers/dri/i965/intel_debug.c
+++ b/src/mesa/drivers/dri/i965/intel_debug.c
@@ -75,6 +75,10 @@ static const struct debug_control debug_control[] = {
 { "cs",  DEBUG_CS },
 { "hex", DEBUG_HEX },
 { "nocompact",   DEBUG_NO_COMPACTION },
+   { "hs",  DEBUG_TCS },
+   { "tcs", DEBUG_TCS },
+   { "ds",  DEBUG_TES },
+   { "tes", DEBUG_TES },
 { NULL,0 }
  };

@@ -83,8 +87,8 @@ intel_debug_flag_for_shader_stage(gl_shader_stage stage)
  {
 uint64_t flags[] = {
[MESA_SHADER_VERTEX] = DEBUG_VS,
-  [MESA_SHADER_TESS_CTRL] = 0,
-  [MESA_SHADER_TESS_EVAL] = 0,
+  [MESA_SHADER_TESS_CTRL] = DEBUG_TCS,
+  [MESA_SHADER_TESS_EVAL] = DEBUG_TES,
[MESA_SHADER_GEOMETRY] = DEBUG_GS,
[MESA_SHADER_FRAGMENT] = DEBUG_WM,
[MESA_SHADER_COMPUTE] = DEBUG_CS,
diff --git a/src/mesa/drivers/dri/i965/intel_debug.h 
b/src/mesa/drivers/dri/i965/intel_debug.h
index 98bd7e9..9c6030a 100644
--- a/src/mesa/drivers/dri/i965/intel_debug.h
+++ b/src/mesa/drivers/dri/i965/intel_debug.h
@@ -69,6 +69,8 @@ extern uint64_t INTEL_DEBUG;
  #define DEBUG_CS  (1ull << 33)
  #define DEBUG_HEX (1ull << 34)
  #define DEBUG_NO_COMPACTION   (1ull << 35)
+#define DEBUG_TCS (1ull << 36)
+#define DEBUG_TES (1ull << 37)

  #ifdef HAVE_ANDROID_PLATFORM
  #define LOG_TAG "INTEL-MESA"
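
As a rough illustration of the aliasing described in the commit message,
the sketch below maps both the GL names and the hardware names onto the
same bits. Only the four table entries and the two bit positions come
from the patch; parse_debug() is a hypothetical parser written for this
example and is not the function the driver uses:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct debug_control {
   const char *name;
   uint64_t flag;
};

static const uint64_t DEBUG_TCS = UINT64_C(1) << 36;
static const uint64_t DEBUG_TES = UINT64_C(1) << 37;

// Two names per stage: hardware name (hs/ds) and GL name (tcs/tes).
static const debug_control debug_table[] = {
   { "hs",  DEBUG_TCS },
   { "tcs", DEBUG_TCS },
   { "ds",  DEBUG_TES },
   { "tes", DEBUG_TES },
   { nullptr, 0 }
};

// Hypothetical parser: ORs together the flags of every comma-separated
// name found in the table and silently ignores unknown names.
static uint64_t
parse_debug(const char *str)
{
   assert(str != nullptr);
   uint64_t flags = 0;
   while (*str) {
      std::size_t len = std::strcspn(str, ",");
      for (const debug_control *c = debug_table; c->name; c++) {
         if (std::strlen(c->name) == len &&
             std::strncmp(str, c->name, len) == 0)
            flags |= c->flag;
      }
      str += len;
      if (*str == ',')
         str++;
   }
   return flags;
}
```

With a table like this, "hs" and "tcs" yield the same flag value, and
"tcs,tes" enables both stages.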



Re: [Mesa-dev] [PATCH] mesa: error out in indirect draw when vertex bindings mismatch

2015-11-16 Thread Tapani Pälli




On 11/16/2015 03:12 PM, Samuel Iglesias Gonsálvez wrote:



On 13/11/15 16:55, Tapani Pälli wrote:

On 11/13/2015 03:40 PM, Samuel Iglesias Gonsálvez wrote:


On 13/11/15 11:32, Tapani Pälli wrote:

Patch adds additional mask for tracking which vertex buffer bindings
are set. This array can be directly compared to which vertex arrays
are enabled and should match when drawing.

Fixes following CTS tests:

 ES31-CTS.draw_indirect.negative-noVBO-arrays
 ES31-CTS.draw_indirect.negative-noVBO-elements

Signed-off-by: Tapani Pälli 
---
   src/mesa/main/api_validate.c | 13 +
   src/mesa/main/mtypes.h   |  3 +++
   src/mesa/main/varray.c   |  5 +
   3 files changed, 21 insertions(+)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index a490189..e82e89a 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -710,6 +710,19 @@ valid_draw_indirect(struct gl_context *ctx,
 return GL_FALSE;
  }
   +   /* From OpenGL ES 3.1 spec. section 10.5:
+* "An INVALID_OPERATION error is generated if zero is bound to
+* VERTEX_ARRAY_BINDING, DRAW_INDIRECT_BUFFER or to any enabled
+* vertex array."
+*
+* Here we check that vertex buffer bindings match with enabled
+* vertex arrays.
+*/
+   if (ctx->Array.VAO->_Enabled != ctx->Array.VAO->VertexBindingMask) {
+  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(No VBO bound)", name);
+  return GL_FALSE;
+   }
+
  if (!_mesa_valid_prim_mode(ctx, mode, name))
 return GL_FALSE;
   diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 4efdf1e..6c6187f 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1419,6 +1419,9 @@ struct gl_vertex_array_object
  /** Vertex buffer bindings */
  struct gl_vertex_buffer_binding VertexBinding[VERT_ATTRIB_MAX];
   +   /** Mask indicating which binding points are set. */
+   GLbitfield64 VertexBindingMask;
+
  /** Mask of VERT_BIT_* values indicating which arrays are
enabled */
  GLbitfield64 _Enabled;
   diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
index 887d0c0..0a94c5a 100644
--- a/src/mesa/main/varray.c
+++ b/src/mesa/main/varray.c
@@ -174,6 +174,11 @@ bind_vertex_buffer(struct gl_context *ctx,
 binding->Offset = offset;
 binding->Stride = stride;
   +  if (vbo == ctx->Shared->NullBufferObj)
+ vao->VertexBindingMask &= ~VERT_BIT(index);
+  else
+ vao->VertexBindingMask |= VERT_BIT(index);
+

Shouldn't it be VERT_BIT_GENERIC()?


I used VERT_BIT because that is used when enabling vertex arrays and
this mask should match that one.



For that reason, I think it is VERT_BIT_GENERIC(). See:

http://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/main/varray.c#n759

Or am I missing something?


In bind_vertex_buffer, 'index' already includes the offset that
VERT_BIT_GENERIC adds. If VERT_BIT_GENERIC were used here, the offset
would be added twice and the mask would not match; using VERT_BIT gives
an exact match because the offset is already included.
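
A small standalone sketch of that arithmetic follows. The lowercase
helpers mirror the roles of the VERT_BIT, VERT_ATTRIB_GENERIC and
VERT_BIT_GENERIC macros but are written for this example, and the value
17 for the first generic attribute slot is an assumption used only for
illustration:

```cpp
#include <cassert>
#include <cstdint>

// Assumed for illustration: generic attributes start after the
// fixed-function attribute slots.
static const unsigned VERT_ATTRIB_GENERIC0 = 17;

static inline uint64_t
vert_bit(unsigned attrib)
{
   assert(attrib < 64);
   return UINT64_C(1) << attrib;
}

static inline unsigned
vert_attrib_generic(unsigned g)
{
   return VERT_ATTRIB_GENERIC0 + g;
}

static inline uint64_t
vert_bit_generic(unsigned g)
{
   return vert_bit(vert_attrib_generic(g));
}

// Enabling generic array g sets this bit in the enabled mask.
static uint64_t
enabled_mask_for(unsigned g)
{
   return vert_bit_generic(g);
}

// The binding code is handed an index that already carries the generic
// offset, so the plain bit is the one that matches the enabled mask.
static uint64_t
binding_mask_for(unsigned index)
{
   return vert_bit(index);
}
```

For generic attribute 3 the index seen by the binding code is 20 under
this assumption: vert_bit(20) equals the enabled bit, while
vert_bit_generic(20) would set bit 37 and the two masks would differ.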




Sam


Sam


 vao->NewArrays |= binding->_BoundArrays;
  }
   }






[Mesa-dev] [PATCH 3/6] glsl: initialize data.precision value in ir_variable constructor

2015-11-16 Thread Samuel Iglesias Gonsálvez

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/glsl/ir.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index 8933b23..8b5ba71 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -1676,6 +1676,7 @@ ir_variable::ir_variable(const struct glsl_type *type, 
const char *name,
this->data.interpolation = INTERP_QUALIFIER_NONE;
this->data.max_array_access = 0;
this->data.atomic.offset = 0;
+   this->data.precision = GLSL_PRECISION_NONE;
this->data.image_read_only = false;
this->data.image_write_only = false;
this->data.image_coherent = false;
-- 
2.5.0


[Mesa-dev] [PATCH 6/6] glsl: copy each field's precision information in glsl_types's structure constructor

2015-11-16 Thread Samuel Iglesias Gonsálvez

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/glsl/nir/glsl_types.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/glsl/nir/glsl_types.cpp b/src/glsl/nir/glsl_types.cpp
index 975b815..9cc3715 100644
--- a/src/glsl/nir/glsl_types.cpp
+++ b/src/glsl/nir/glsl_types.cpp
@@ -129,6 +129,7 @@ glsl_type::glsl_type(const glsl_struct_field *fields, 
unsigned num_fields,
   this->fields.structure[i].image_coherent = fields[i].image_coherent;
   this->fields.structure[i].image_volatile = fields[i].image_volatile;
   this->fields.structure[i].image_restrict = fields[i].image_restrict;
+  this->fields.structure[i].precision = fields[i].precision;
}
 
    mtx_unlock(&glsl_type::mutex);
-- 
2.5.0


[Mesa-dev] [PATCH 1/6] nir: reduce memory footprint of glsl_struct_field's precision

2015-11-16 Thread Samuel Iglesias Gonsálvez

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/glsl/nir/glsl_types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/nir/glsl_types.h b/src/glsl/nir/glsl_types.h
index d841a32..2d44059 100644
--- a/src/glsl/nir/glsl_types.h
+++ b/src/glsl/nir/glsl_types.h
@@ -837,7 +837,7 @@ struct glsl_struct_field {
/**
 * Precision qualifier
 */
-   unsigned precision;
+   unsigned precision:2;
 
/**
 * Image qualifiers, applicable to buffer variables defined in shader
-- 
2.5.0
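
For context on why a two-bit field helps, a minimal sketch under stated
assumptions: the two structs below are cut-down, hypothetical versions of
glsl_struct_field, and the four precision values (none, high, medium,
low) are assumed to be numbered 0 to 3. A full-width unsigned in the
middle of a run of bitfields usually forces the neighbours into separate
storage units, while a 2-bit field can share one:

```cpp
#include <cassert>

// Assumed numbering of the four precision values; two bits cover 0..3.
static const unsigned PRECISION_NONE = 0;
static const unsigned PRECISION_HIGH = 1;
static const unsigned PRECISION_MEDIUM = 2;
static const unsigned PRECISION_LOW = 3;

// Before: a full-width member splits the run of bitfields around it.
struct field_wide {
   unsigned interpolation:2;
   unsigned centroid:1;
   unsigned precision;
   unsigned image_read_only:1;
};

// After: the precision shares a storage unit with its neighbours.
struct field_packed {
   unsigned interpolation:2;
   unsigned centroid:1;
   unsigned precision:2;
   unsigned image_read_only:1;
};

// Stores a precision value in the 2-bit field and reads it back.
static unsigned
roundtrip_precision(unsigned value)
{
   assert(value <= PRECISION_LOW);
   field_packed f = {};
   f.precision = value;
   return f.precision;
}
```

Bitfield layout is implementation-defined, so the exact sizes vary; on
common ABIs the packed struct is expected to be smaller than the wide one.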


[Mesa-dev] [PATCH 5/6] glsl: copy each field's precision information from the old gl_PerVertex interface block

2015-11-16 Thread Samuel Iglesias Gonsálvez

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/glsl/ast_to_hir.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index f529243..97554cb 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -6603,6 +6603,8 @@ ast_interface_block::hir(exec_list *instructions,
earlier_per_vertex->fields.structure[j].sample;
 fields[i].patch =
earlier_per_vertex->fields.structure[j].patch;
+fields[i].precision =
+   earlier_per_vertex->fields.structure[j].precision;
  }
   }
 
-- 
2.5.0


[Mesa-dev] [PATCH 2/6] glsl/nir: initialize precision field in glsl_struct_field constructor

2015-11-16 Thread Samuel Iglesias Gonsálvez

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/glsl/nir/glsl_types.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glsl/nir/glsl_types.h b/src/glsl/nir/glsl_types.h
index 2d44059..d8a999a 100644
--- a/src/glsl/nir/glsl_types.h
+++ b/src/glsl/nir/glsl_types.h
@@ -851,7 +851,8 @@ struct glsl_struct_field {
 
glsl_struct_field(const struct glsl_type *_type, const char *_name)
   : type(_type), name(_name), location(-1), interpolation(0), centroid(0),
-sample(0), matrix_layout(GLSL_MATRIX_LAYOUT_INHERITED), patch(0)
+sample(0), matrix_layout(GLSL_MATRIX_LAYOUT_INHERITED), patch(0),
+precision(GLSL_PRECISION_NONE)
{
   /* empty */
}
-- 
2.5.0
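
As an illustration of what this series does to the precision qualifier, here is a minimal sketch that combines the 2-bit bitfield from patch 3/6, the constructor initialization from this patch, and the per-field copy from patches 4/6 and 5/6. The names `sketch_precision`, `sketch_field` and `copy_precision` are hypothetical stand-ins, not Mesa types, and the enumerator values are assumptions chosen only so that all four fit in two bits.

```cpp
#include <cassert>

// Hypothetical stand-ins for the GLSL precision values; the numeric
// values are assumptions chosen so that all four fit in 2 bits.
enum sketch_precision {
   SKETCH_PRECISION_NONE = 0,
   SKETCH_PRECISION_HIGH,
   SKETCH_PRECISION_MEDIUM,
   SKETCH_PRECISION_LOW
};

// Simplified field descriptor; this is not the real glsl_struct_field.
struct sketch_field {
   const char *name;
   unsigned precision:2;   // 2 bits can hold the values 0..3

   // Without the explicit initializer the bitfield would start out
   // with an indeterminate value.
   explicit sketch_field(const char *_name)
      : name(_name), precision(SKETCH_PRECISION_NONE)
   {
      /* empty */
   }
};

// Carry the qualifier across, in the same way the other per-field
// qualifiers (sample, patch, ...) are copied in the patches.
inline void copy_precision(sketch_field &dst, const sketch_field &src)
{
   dst.precision = src.precision;
}
```

A freshly constructed `sketch_field` reports `SKETCH_PRECISION_NONE` until a qualifier is copied into it, which is the property the constructor change is meant to guarantee.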


[Mesa-dev] [PATCH 4/6] glsl: copy each field's precision information when generating varying variables

2015-11-16 Thread Samuel Iglesias Gonsálvez

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/glsl/builtin_variables.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index b927d50..fc7a3c3 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -1187,6 +1187,7 @@ builtin_variable_generator::generate_varyings()
  var->data.centroid = fields[i].centroid;
  var->data.sample = fields[i].sample;
  var->data.patch = fields[i].patch;
+ var->data.precision = fields[i].precision;
  var->init_interface_type(per_vertex_out_type);
   }
}
-- 
2.5.0


Re: [Mesa-dev] [PATCH 07/11] i965: Move postprocess_nir to codegen time

2015-11-16 Thread Iago Toral

Hi Jason,

On Mon, 2015-11-16 at 07:50 -0800, Jason Ekstrand wrote:
> 
> On Nov 16, 2015 2:01 AM, "Iago Toral"  wrote:
> >
> > On Fri, 2015-11-13 at 07:34 -0800, Jason Ekstrand wrote:
> > >
> > > On Nov 13, 2015 5:53 AM, "Iago Toral"  wrote:
> > > >
> > > > On Wed, 2015-11-11 at 17:26 -0800, Jason Ekstrand wrote:
> > > > > ---
> > > > >  src/mesa/drivers/dri/i965/brw_fs.cpp  | 11
> > > +--
> > > > >  src/mesa/drivers/dri/i965/brw_nir.c   |  1 -
> > > > >  src/mesa/drivers/dri/i965/brw_vec4.cpp|  5 -
> > > > >  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  6 +-
> > > > >  4 files changed, 18 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > > > index ad94fa4..b8713ab 100644
> > > > > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > > > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > > > @@ -43,6 +43,7 @@
> > > > >  #include "brw_wm.h"
> > > > >  #include "brw_fs.h"
> > > > >  #include "brw_cs.h"
> > > > > +#include "brw_nir.h"
> > > > >  #include "brw_vec4_gs_visitor.h"
> > > > >  #include "brw_cfg.h"
> > > > >  #include "brw_dead_control_flow.h"
> > > > > @@ -5459,13 +5460,16 @@ brw_compile_fs(const struct
> brw_compiler
> > > *compiler, void *log_data,
> > > > > void *mem_ctx,
> > > > > const struct brw_wm_prog_key *key,
> > > > > struct brw_wm_prog_data *prog_data,
> > > > > -   const nir_shader *shader,
> > > > > +   const nir_shader *src_shader,
> > > > > struct gl_program *prog,
> > > > > int shader_time_index8, int
> shader_time_index16,
> > > > > bool use_rep_send,
> > > > > unsigned *final_assembly_size,
> > > > > char **error_str)
> > > > >  {
> > > > > +   nir_shader *shader = nir_shader_clone(mem_ctx,
> src_shader);
> > > > > +   brw_postprocess_nir(shader, compiler->devinfo, true);
> > > > > +
> > > >
> > > > Maybe it is a silly question, but why do we need to clone the
> shader
> > > to
> > > > do this?
> > >
> > > Because brw_compile_foo may be called multiple times on the same
> > > shader source. Since brw_postprocess_nir alters the shader source,
> we
> > > need to make a copy.
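
The clone-before-mutate pattern described in the answer above can be sketched as follows. This is only an illustration under stated assumptions: `sketch_shader`, `sketch_clone`, `sketch_postprocess` and `sketch_compile` are hypothetical stand-ins for `nir_shader`, `nir_shader_clone`, `brw_postprocess_nir` and `brw_compile_foo`, and a shader is reduced to a list of instruction names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a NIR shader: just a list of instruction names.
struct sketch_shader {
   std::vector<std::string> instrs;
};

// Stand-in for nir_shader_clone(): returns an independent deep copy.
inline std::unique_ptr<sketch_shader> sketch_clone(const sketch_shader &src)
{
   return std::unique_ptr<sketch_shader>(new sketch_shader(src));
}

// Stand-in for brw_postprocess_nir(): mutates the shader in place.
inline void sketch_postprocess(sketch_shader &s)
{
   s.instrs.push_back("lowered_for_backend");
}

// Stand-in for brw_compile_foo(): may be called many times on one source.
// Returns the instruction count of the post-processed private copy.
inline std::size_t sketch_compile(const sketch_shader &src)
{
   std::unique_ptr<sketch_shader> shader = sketch_clone(src);
   sketch_postprocess(*shader);   // only the copy is modified
   return shader->instrs.size();
}
```

Calling `sketch_compile` twice on the same source gives the same result both times and leaves the source untouched. If the post-processing ran directly on the source, the second call would see an already-lowered shader.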
> >
> > Ok, trying to see if I get the big picture of the series:
> >
> > So the situation before this change is that we were running
> > brw_postprocess_nir in brw_create_nir (so at link-time) before we
> ran
> > brw_compile_foo (at codegen/drawing time), and thus, we never had
> this
> > problem. We still had to fix codegen for texture rectangle when
> drawing
> > though, which we were doing with rescale_texcoord().
> >
> > With this change, we handle texture rectangle in
> brw_postprocess_nir()
> > so we don't need rescale_texcoord() any more, however, this needs to
> be
> > done at codegen time anyway, so as a consequence, now we have to
> move
> > all of brw_postprocess_nir there and we have to clone the NIR
> shader.
> >
> > Did I get it right?
> 
> It actually happens in brw_apply_foo_key, but yes.
> 
> > If I did then I wonder about the performance impact of this change,
> > since codegen happens when we draw and there are plenty of things
> > happening in brw_postprocess_nir (plus the cloning). Is it worth it?
> 
> That's a very hard question to answer definitively.  However, here are
> a few data-points:

Thanks for the very detailed reply:

>   a) At one point we were doing all our NIR stuff on-demand.  No one
> complained about the performance impact.
> 
>   b) We pre-compile for the common case at link time so we will only
> hit this at draw time if we actually need a recompile.

Yeah, this is true. Although it looks like with this we are stepping a
bit further into the performance unpredictability issues that have
usually been a concern with OpenGL drivers.

>   c) While brw_postprocess_nir looks like it does a lot of stuff, it
> calls a fixed number of mostly linear-time passes.  The most expensive
> is almost certainly the out-of-SSA pass and that one is on the
> order of register allocation in the back-end:
> http://people.freedesktop.org/~cwabbott0/perf-shader-db-nir.svg

Yeah, that could be true.
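
To make point (b) above concrete, here is a rough sketch of a variant cache that precompiles a guessed key at link time and only compiles again at draw time on a cache miss. Everything in it is hypothetical: `sketch_key` and `sketch_variant_cache` are not i965 structures, and the single `rect_texture` flag stands in for whatever state a real program key carries.

```cpp
#include <cassert>
#include <map>

// Hypothetical program key: the state a compiled variant depends on.
struct sketch_key {
   bool rect_texture;
   bool operator<(const sketch_key &o) const
   {
      return rect_texture < o.rect_texture;
   }
};

struct sketch_variant_cache {
   std::map<sketch_key, int> variants;   // key -> compiled variant id
   int compiles = 0;                     // number of real compiles so far

   // Return the variant for `key`, compiling only on a cache miss.
   int get(const sketch_key &key)
   {
      std::map<sketch_key, int>::iterator it = variants.find(key);
      if (it != variants.end())
         return it->second;
      // Stands in for clone + postprocess + codegen.
      int id = ++compiles;
      variants.insert(std::make_pair(key, id));
      return id;
   }

   // Link time: precompile the guessed common-case key.
   void link() { get(sketch_key{false}); }
};
```

With this shape, a draw that uses the precompiled key costs a lookup only. The NIR post-processing cost discussed here is paid again only when a draw needs a key that was not guessed at link time.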

> Will it have an effect? Yes.  Will it be that bad? I don't think so.
> It also has some advantages.  In the case of texture-rectangle, it
> lets us delete some fairly nasty code in the fs back-end compiler and
> gives us support (without porting that nasty code) in the vec4
> back-end.  In the case of texture swizzle, it lets us share code
> between the backends, cleans up the backends, and gives NIR a chance
> to optimize the swizzle.  I think this last point is important.  There
> are some cases such as "if (...) { a = tex } else { a = tex }" where
> we end up with a pipeline-stalling move right after both tex
> operations that the FS backend has a lot

Re: [Mesa-dev] [PATCH 11/11] i965: Use nir_lower_tex for texture coordinate lowering

2015-11-16 Thread Iago Toral

On Mon, 2015-11-16 at 07:55 -0800, Jason Ekstrand wrote:
> On Mon, Nov 16, 2015 at 6:27 AM, Iago Toral  wrote:
> > On Mon, 2015-11-16 at 11:33 +0100, Iago Toral wrote:
> >> On Wed, 2015-11-11 at 17:26 -0800, Jason Ekstrand wrote:
> >> > Previously, we had a rescale_texcoords helper in the FS backend for
> >> > handling rescaling of texture coordinates.  Now that we can do variants 
> >> > in
> >> > NIR, we can use nir_lower_tex to do the rescaling for us.  This allows us
> >> > to delete the i965-specific code and gives us proper TEXTURE_RECTANGLE 
> >> > and
> >> > GL_CLAMP handling in vertex and geometry shaders.
> >> > ---
> >> >  src/mesa/drivers/dri/i965/brw_fs.cpp  |   2 +
> >> >  src/mesa/drivers/dri/i965/brw_fs.h|   3 -
> >> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp  |   4 +-
> >> >  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  | 125 
> >> > --
> >> >  src/mesa/drivers/dri/i965/brw_nir.c   |  23 
> >> >  src/mesa/drivers/dri/i965/brw_nir.h   |   6 ++
> >> >  src/mesa/drivers/dri/i965/brw_vec4.cpp|   2 +
> >> >  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |   2 +
> >> >  8 files changed, 36 insertions(+), 131 deletions(-)
> >> >
> >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> >> > b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> > index b8713ab..c56cafe 100644
> >> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> > @@ -5468,6 +5468,7 @@ brw_compile_fs(const struct brw_compiler 
> >> > *compiler, void *log_data,
> >> > char **error_str)
> >> >  {
> >> > nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
> >> > +   brw_nir_apply_sampler_key(shader, compiler->devinfo, &key->tex, 
> >> > true);
> >>
> >> This looks like it is part of the post-processing process. In fact, you
> >> call this right before brw_postprocess_nir() for every stage. Why not
> >> just add a key parameter to brw_postprocess_nir() and call
> >> brw_nir_apply_sampler_key from there instead?
> 
> Well, right now, brw_nir_apply_sampler_key is used for all stages.
> However, if we do more variant stuff in NIR, we'll need to have
> brw_nir_apply_vs_key, brw_nir_apply_fs_key, etc. and we can't put it
> in postprocess_nir.  I didn't want to join it prematurely.

Oh right, that makes sense.

> >> Either way,
> >> Reviewed-by: Iago Toral Quiroga 
> 
> Thanks!
> 
> >> > brw_postprocess_nir(shader, compiler->devinfo, true);
> >> >
> >> > /* key->alpha_test_func means simulating alpha testing via discards,
> >> > @@ -5628,6 +5629,7 @@ brw_compile_cs(const struct brw_compiler 
> >> > *compiler, void *log_data,
> >> > char **error_str)
> >> >  {
> >> > nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
> >> > +   brw_nir_apply_sampler_key(shader, compiler->devinfo, &key->tex, 
> >> > true);
> >> > brw_postprocess_nir(shader, compiler->devinfo, true);
> >> >
> >> > prog_data->local_size[0] = shader->info.cs.local_size[0];
> >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> >> > b/src/mesa/drivers/dri/i965/brw_fs.h
> >> > index 2dfcab1..8a181d7 100644
> >> > --- a/src/mesa/drivers/dri/i965/brw_fs.h
> >> > +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> >> > @@ -217,8 +217,6 @@ public:
> >> > void emit_interpolation_setup_gen4();
> >> > void emit_interpolation_setup_gen6();
> >> > void compute_sample_position(fs_reg dst, fs_reg int_sample_pos);
> >> > -   fs_reg rescale_texcoord(fs_reg coordinate, int coord_components,
> >> > -   bool is_rect, uint32_t sampler);
> >> > void emit_texture(ir_texture_opcode op,
> >> >   const glsl_type *dest_type,
> >> >   fs_reg coordinate, int components,
> >> > @@ -229,7 +227,6 @@ public:
> >> >   fs_reg mcs,
> >> >   int gather_component,
> >> >   bool is_cube_array,
> >> > - bool is_rect,
> >> >   uint32_t sampler,
> >> >   fs_reg sampler_reg);
> >> > fs_reg emit_mcs_fetch(const fs_reg &coordinate, unsigned components,
> >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> >> > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >> > index 02b9f5b..3d83d7c 100644
> >> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >> > @@ -2411,8 +2411,6 @@ fs_visitor::nir_emit_texture(const fs_builder 
> >> > &bld, nir_tex_instr *instr)
> >> >
> >> > int gather_component = instr->component;
> >> >
> >> > -   bool is_rect = instr->sampler_dim == GLSL_SAMPLER_DIM_RECT;
> >> > -
> >> > bool is_cube_array = instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE &&
> >> >  instr->is_array;
> >> >
> >> > @@ -2549,7 +2547,7 @@ fs_visitor::nir_emit_texture(const fs_builder 
> >> > &bld, nir_tex_instr *instr)
>

Re: [Mesa-dev] [PATCH 14/36] glsl ubo/ssbo: Use enum to track current buffer access type

2015-11-16 Thread Iago Toral

On Mon, 2015-11-16 at 16:50 -0800, Jordan Justen wrote:
> On 2015-11-16 03:06:37, Iago Toral wrote:
> > On Sat, 2015-11-14 at 13:43 -0800, Jordan Justen wrote:
> > > Signed-off-by: Jordan Justen 
> > > Cc: Samuel Iglesias Gonsalvez 
> > > Cc: Iago Toral Quiroga 
> > > ---
> > >  src/glsl/lower_ubo_reference.cpp | 26 +-
> > >  1 file changed, 21 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/src/glsl/lower_ubo_reference.cpp 
> > > b/src/glsl/lower_ubo_reference.cpp
> > > index b74aa3d..41012db 100644
> > > --- a/src/glsl/lower_ubo_reference.cpp
> > > +++ b/src/glsl/lower_ubo_reference.cpp
> > > @@ -162,6 +162,14 @@ public:
> > > ir_call *ssbo_store(ir_rvalue *deref, ir_rvalue *offset,
> > > unsigned write_mask);
> > >  
> > > +   enum {
> > > +  ubo_load_access,
> > > +  ssbo_load_access,
> > > +  ssbo_store_access,
> > > +  ssbo_get_array_length,
> > 
> > ssbo_get_array_length does not say that it is for "unsized" arrays and
> > lacks the "access" suffix that the other enum values have, which makes
> > it a bit inconsistent. How about we name this
> > 'ssbo_unsized_array_length_access'? Or maybe 'ssbo_unsized_array_access'
> > if we think the former is too long.
> > 
> > > +  ssbo_atomic_access,
> > > +   } buffer_access_type;
> > > +
> > > void emit_access(bool is_write, ir_dereference *deref,
> > >  ir_variable *base_offset, unsigned int deref_offset,
> > >  bool row_major, int matrix_columns,
> > > @@ -189,7 +197,6 @@ public:
> > > struct gl_uniform_buffer_variable *ubo_var;
> > > ir_rvalue *uniform_block;
> > > bool progress;
> > > -   bool is_shader_storage;
> > >  };
> > >  
> > >  /**
> > > @@ -339,10 +346,9 @@ 
> > > lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
> > > deref, &nonconst_block_index);
> > >  
> > > /* Locate the block by interface name */
> > > -   this->is_shader_storage = var->is_in_shader_storage_block();
> > > unsigned num_blocks;
> > > struct gl_uniform_block **blocks;
> > > -   if (this->is_shader_storage) {
> > > +   if (buffer_access_type != ubo_load_access) {
> > 
> > I think this file generally uses 'this->' to refer to class members (or
> > at least this function does), so maybe we should keep that for
> > consistency. The same in the other places where you use
> > buffer_access_type.
> 
> I don't really agree with this, but I went ahead and changed it.
> 
> > That said, right now it seems that we only ever use buffer_access_type
> > here and you always assign its value right before calling
> > setup_for_load_or_store() so maybe it is better to just make it a
> > function parameter instead of a class member? setup_for_load_or_store()
> > already has a large number of parameters, so I am not super happy about
> > the idea, but it looks more natural to me. What do you think?
> 
> This function is going to move to
> lower_buffer_access::setup_buffer_access. This class doesn't know
> about enum of buffer_access_type, since that remains part of the
> lower_ubo_reference_visitor class.
> 
> Therefore, I think we need to keep it as a member variable, so the
> insert_buffer_access virtual function implementation in
> lower_ubo_reference_visitor can make use of it.

We could define the buffer_access_type enum in the header file of
lower_buffer_access so we can use it from both lower_ubo_reference.cpp
and lower_buffer_access, so it should be possible, if we think it is a
good idea. In any case, setup_buffer_access has a lot of parameters
already... so feel free to ignore this comment if you like your
implementation better.
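
A minimal sketch of the alternative discussed above, with the enum declared where both classes can see it and passed as a parameter rather than stored as a member. The enumerator names follow the patch, except for the unsized-array entry, which uses the name suggested in review. `sketch_shader_blocks`, `sketch_num_blocks` and `sketch_load_access` are hypothetical helpers, not functions from lower_ubo_reference.cpp.

```cpp
#include <cassert>

// Would live in a shared header in this variant of the design.
enum sketch_buffer_access_type {
   ubo_load_access,
   ssbo_load_access,
   ssbo_store_access,
   ssbo_unsized_array_length_access,
   ssbo_atomic_access
};

// Simplified shader: only the two block counts matter for the sketch.
struct sketch_shader_blocks {
   unsigned num_uniform_blocks;
   unsigned num_shader_storage_blocks;
};

// The access type arrives as a parameter instead of being read from a
// member that the caller had to set beforehand.
inline unsigned sketch_num_blocks(const sketch_shader_blocks &shader,
                                  sketch_buffer_access_type access)
{
   // Everything except a UBO load targets a shader storage block.
   if (access != ubo_load_access)
      return shader.num_shader_storage_blocks;
   return shader.num_uniform_blocks;
}

// Mirrors the selection done in handle_rvalue() in the patch.
inline sketch_buffer_access_type sketch_load_access(bool in_shader_storage_block)
{
   return in_shader_storage_block ? ssbo_load_access : ubo_load_access;
}
```

The cost of this variant is one more parameter on a function that already takes many, which is the trade-off raised in the review.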

With the other changes I suggested,
Reviewed-by: Iago Toral Quiroga 

> -Jordan
> 
> > 
> > >num_blocks = shader->NumShaderStorageBlocks;
> > >blocks = shader->ShaderStorageBlocks;
> > > } else {
> > > @@ -552,6 +558,10 @@ lower_ubo_reference_visitor::handle_rvalue(ir_rvalue 
> > > **rvalue)
> > > int matrix_columns;
> > > unsigned packing = var->get_interface_type()->interface_packing;
> > >  
> > > +   buffer_access_type =
> > > +  var->is_in_shader_storage_block() ?
> > > +  ssbo_load_access : ubo_load_access;
> > > +
> > > /* Compute the offset to the start if the dereference as well as other
> > >  * information we need to configure the write
> > >  */
> > > @@ -795,7 +805,7 @@ lower_ubo_reference_visitor::emit_access(bool 
> > > is_write,
> > >if (is_write)
> > >   base_ir->insert_after(ssbo_store(deref, offset, write_mask));
> > >else {
> > > - if (!this->is_shader_storage) {
> > > + if (buffer_access_type == ubo_load_access) {
> > >   base_ir->insert_before(assign(deref->clone(mem_ctx, NULL),
> > > ubo_load(deref->type, 
> > > offset)));

Re: [Mesa-dev] [PATCH] mesa: error out in indirect draw when vertex bindings mismatch

2015-11-16 Thread Samuel Iglesias Gonsálvez



On 17/11/15 07:38, Tapani Pälli wrote:
> 
> 
> On 11/16/2015 03:12 PM, Samuel Iglesias Gonsálvez wrote:
>>
>>
>> On 13/11/15 16:55, Tapani Pälli wrote:
>>> On 11/13/2015 03:40 PM, Samuel Iglesias Gonsálvez wrote:

>>>> On 13/11/15 11:32, Tapani Pälli wrote:
>>>>> Patch adds additional mask for tracking which vertex buffer bindings
>>>>> are set. This array can be directly compared to which vertex arrays
>>>>> are enabled and should match when drawing.
>>>>>
>>>>> Fixes following CTS tests:
>>>>>
>>>>>  ES31-CTS.draw_indirect.negative-noVBO-arrays
>>>>>  ES31-CTS.draw_indirect.negative-noVBO-elements
>>>>>
>>>>> Signed-off-by: Tapani Pälli 
>>>>> ---
>>>>>   src/mesa/main/api_validate.c | 13 +
>>>>>   src/mesa/main/mtypes.h   |  3 +++
>>>>>   src/mesa/main/varray.c   |  5 +
>>>>>   3 files changed, 21 insertions(+)
>>>>>
>>>>> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
>>>>> index a490189..e82e89a 100644
>>>>> --- a/src/mesa/main/api_validate.c
>>>>> +++ b/src/mesa/main/api_validate.c
>>>>> @@ -710,6 +710,19 @@ valid_draw_indirect(struct gl_context *ctx,
>>>>>  return GL_FALSE;
>>>>>   }
>>>>>
>>>>> +   /* From OpenGL ES 3.1 spec. section 10.5:
>>>>> +* "An INVALID_OPERATION error is generated if zero is bound to
>>>>> +* VERTEX_ARRAY_BINDING, DRAW_INDIRECT_BUFFER or to any enabled
>>>>> +* vertex array."
>>>>> +*
>>>>> +* Here we check that vertex buffer bindings match with enabled
>>>>> +* vertex arrays.
>>>>> +*/
>>>>> +   if (ctx->Array.VAO->_Enabled != ctx->Array.VAO->VertexBindingMask) {
>>>>> +  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(No VBO bound)", name);
>>>>> +  return GL_FALSE;
>>>>> +   }
>>>>> +
>>>>>   if (!_mesa_valid_prim_mode(ctx, mode, name))
>>>>>  return GL_FALSE;
>>>>>
>>>>> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>>>>> index 4efdf1e..6c6187f 100644
>>>>> --- a/src/mesa/main/mtypes.h
>>>>> +++ b/src/mesa/main/mtypes.h
>>>>> @@ -1419,6 +1419,9 @@ struct gl_vertex_array_object
>>>>>   /** Vertex buffer bindings */
>>>>>   struct gl_vertex_buffer_binding VertexBinding[VERT_ATTRIB_MAX];
>>>>>
>>>>> +   /** Mask indicating which binding points are set. */
>>>>> +   GLbitfield64 VertexBindingMask;
>>>>> +
>>>>>   /** Mask of VERT_BIT_* values indicating which arrays are
>>>>> enabled */
>>>>>   GLbitfield64 _Enabled;
>>>>>
>>>>> diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
>>>>> index 887d0c0..0a94c5a 100644
>>>>> --- a/src/mesa/main/varray.c
>>>>> +++ b/src/mesa/main/varray.c
>>>>> @@ -174,6 +174,11 @@ bind_vertex_buffer(struct gl_context *ctx,
>>>>>  binding->Offset = offset;
>>>>>  binding->Stride = stride;
>>>>>
>>>>> +  if (vbo == ctx->Shared->NullBufferObj)
>>>>> + vao->VertexBindingMask &= ~VERT_BIT(index);
>>>>> +  else
>>>>> + vao->VertexBindingMask |= VERT_BIT(index);
>>>>> +
>>>> Shouldn't it be VERT_BIT_GENERIC()?
>>>
>>> I used VERT_BIT because that is used when enabling vertex arrays and
>>> this mask should match that one.
>>>
>>
>> For that reason, I think it is VERT_BIT_GENERIC(). See:
>>
>> http://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/main/varray.c#n759
>>
>> Or am I missing something?
> 
> In bind_vertex_buffer, 'index' already includes the offset added by
> VERT_BIT_GENERIC. If VERT_BIT_GENERIC were used, it would add the offset
> twice and the mask would not match; using VERT_BIT gives an exact match
> because 'index' already carries the offset.
> 

OK, you are right. This patch is:

Reviewed-by: Samuel Iglesias Gonsálvez 

Thanks!

Sam
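
The double-offset argument above can be sketched in isolation. This is an illustration only: `SKETCH_GENERIC0` is an assumed value (the real `VERT_ATTRIB_GENERIC0` comes from Mesa's headers and may differ), and `sketch_vao`, `sketch_enable_generic` and `sketch_bind` are hypothetical helpers. The sketch also assumes the enable path is handed a generic attribute number while the bind path is handed an index that already includes the generic offset, as described in the reply.

```cpp
#include <cassert>
#include <cstdint>

// Assumed value for illustration; not taken from Mesa's headers.
constexpr unsigned SKETCH_GENERIC0 = 16;

constexpr uint64_t sketch_vert_bit(unsigned i)
{
   return uint64_t(1) << i;
}

constexpr uint64_t sketch_vert_bit_generic(unsigned g)
{
   return sketch_vert_bit(SKETCH_GENERIC0 + g);
}

struct sketch_vao {
   uint64_t enabled = 0;        // which arrays are enabled
   uint64_t binding_mask = 0;   // which binding points have a buffer
};

// Enable path: takes a generic attribute number, so it adds the offset.
inline void sketch_enable_generic(sketch_vao &vao, unsigned generic_index)
{
   vao.enabled |= sketch_vert_bit_generic(generic_index);
}

// Bind path: `index` already includes the generic offset, so the plain
// bit macro is the one that lines up with `enabled`.
inline void sketch_bind(sketch_vao &vao, unsigned index, bool has_buffer)
{
   if (has_buffer)
      vao.binding_mask |= sketch_vert_bit(index);
   else
      vao.binding_mask &= ~sketch_vert_bit(index);
}
```

Applying `sketch_vert_bit_generic` to an index that already carries the offset would set a bit `SKETCH_GENERIC0` positions too high, so the two masks would never compare equal.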

> 
>> Sam
>>
>>>> Sam
>>>>
>>>>>  vao->NewArrays |= binding->_BoundArrays;
>>>>>   }
>>>>>  }
>>>>>
>>>
>>>
> 

Re: [Mesa-dev] [PATCH] mesa: error out in indirect draw when vertex bindings mismatch

2015-11-16 Thread Tapani Pälli




On 11/16/2015 08:55 AM, Tapani Pälli wrote:
> On 11/13/2015 07:18 PM, Fredrik Höglund wrote:
>> On Friday 13 November 2015, Tapani Pälli wrote:
>>> Patch adds additional mask for tracking which vertex buffer bindings
>>> are set. This array can be directly compared to which vertex arrays
>>> are enabled and should match when drawing.
>>>
>>> Fixes following CTS tests:
>>>
>>> ES31-CTS.draw_indirect.negative-noVBO-arrays
>>> ES31-CTS.draw_indirect.negative-noVBO-elements
>>>
>>> Signed-off-by: Tapani Pälli 
>>> ---
>>>   src/mesa/main/api_validate.c | 13 +
>>>   src/mesa/main/mtypes.h   |  3 +++
>>>   src/mesa/main/varray.c   |  5 +
>>>   3 files changed, 21 insertions(+)
>>>
>>> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
>>> index a490189..e82e89a 100644
>>> --- a/src/mesa/main/api_validate.c
>>> +++ b/src/mesa/main/api_validate.c
>>> @@ -710,6 +710,19 @@ valid_draw_indirect(struct gl_context *ctx,
>>> return GL_FALSE;
>>>  }
>>>
>>> +   /* From OpenGL ES 3.1 spec. section 10.5:
>>> +* "An INVALID_OPERATION error is generated if zero is bound to
>>> +* VERTEX_ARRAY_BINDING, DRAW_INDIRECT_BUFFER or to any enabled
>>> +* vertex array."
>>> +*
>>> +* Here we check that vertex buffer bindings match with enabled
>>> +* vertex arrays.
>>> +*/
>>> +   if (ctx->Array.VAO->_Enabled != ctx->Array.VAO->VertexBindingMask) {
>>
>> This test only works when the enabled vertex arrays are associated with
>> their default vertex buffer binding points.
>
> Could you elaborate on this? Is there some existing test or app that
> would do this? It would be great for testing purposes; all the indirect
> draw rendering CTS tests pass with this change.

Sorry, my question did not make sense. What I meant was: do you know of
an app that would fail this test, to help with debugging and fixing the issue?

>>> +  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(No VBO bound)", name);
>>> +  return GL_FALSE;
>>> +   }
>>> +
>>>  if (!_mesa_valid_prim_mode(ctx, mode, name))
>>> return GL_FALSE;
>>>
>>> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>>> index 4efdf1e..6c6187f 100644
>>> --- a/src/mesa/main/mtypes.h
>>> +++ b/src/mesa/main/mtypes.h
>>> @@ -1419,6 +1419,9 @@ struct gl_vertex_array_object
>>>  /** Vertex buffer bindings */
>>>  struct gl_vertex_buffer_binding VertexBinding[VERT_ATTRIB_MAX];
>>>
>>> +   /** Mask indicating which binding points are set. */
>>> +   GLbitfield64 VertexBindingMask;
>>> +
>>>  /** Mask of VERT_BIT_* values indicating which arrays are
>>> enabled */
>>>  GLbitfield64 _Enabled;
>>>
>>> diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
>>> index 887d0c0..0a94c5a 100644
>>> --- a/src/mesa/main/varray.c
>>> +++ b/src/mesa/main/varray.c
>>> @@ -174,6 +174,11 @@ bind_vertex_buffer(struct gl_context *ctx,
>>> binding->Offset = offset;
>>> binding->Stride = stride;
>>>
>>> +  if (vbo == ctx->Shared->NullBufferObj)
>>> + vao->VertexBindingMask &= ~VERT_BIT(index);
>>> +  else
>>> + vao->VertexBindingMask |= VERT_BIT(index);
>>> +
>>> vao->NewArrays |= binding->_BoundArrays;
>>>  }
>>>   }
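
Fredrik's objection in this thread, that the mask comparison only works when enabled arrays use their default binding points, can be illustrated with a small model. This is a sketch under assumptions, not Mesa code: `sketch_vao2` and both check functions are hypothetical, the attribute count is kept small for readability, and `sketch_all_enabled_have_buffer` is one possible alternative check rather than the fix that was eventually adopted.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned SKETCH_MAX_ATTRIBS = 8;   // hypothetical, small on purpose

struct sketch_vao2 {
   uint64_t enabled = 0;        // bit i: attribute array i is enabled
   uint64_t binding_mask = 0;   // bit j: binding point j has a buffer
   unsigned attrib_binding[SKETCH_MAX_ATTRIBS];   // attribute -> binding point

   sketch_vao2()
   {
      // Default state: attribute i sources from binding point i.
      for (unsigned i = 0; i < SKETCH_MAX_ATTRIBS; i++)
         attrib_binding[i] = i;
   }
};

// The check from the patch: compare the two masks directly.
inline bool sketch_masks_match(const sketch_vao2 &vao)
{
   return vao.enabled == vao.binding_mask;
}

// Alternative: follow each enabled attribute to its binding point.
inline bool sketch_all_enabled_have_buffer(const sketch_vao2 &vao)
{
   for (unsigned i = 0; i < SKETCH_MAX_ATTRIBS; i++) {
      if (!(vao.enabled & (uint64_t(1) << i)))
         continue;
      if (!(vao.binding_mask & (uint64_t(1) << vao.attrib_binding[i])))
         return false;
   }
   return true;
}
```

If attribute 0 is enabled but remapped to binding point 3, and a buffer is bound there, the direct comparison reports a mismatch even though every enabled array has a buffer, while the per-attribute walk accepts the state.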



