date:20160209

[Mesa-dev] Are gallium unpack_rgba_8unorm/pack_rgba_8unorm safe for in-place conversion ?

2016-02-09 Thread Axel Davy


Hi,

We'd need to do some formats conversion in gallium nine, and if possible 
we would like to do them in-place.


unpack_rgba_8unorm/pack_rgba_8unorm doesn't seem to explicitly allow 
in-place conversion,

but the generated code seems to be fine with that.

Can we rely on these functions to be safe for in-place conversion ? 
Could we add that somewhere as requirement ?


CC-ing vmware guys, as they probably know.

Yours,

Axel Davy
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radeonsi: enable denorms for 64-bit and 16-bit floats

2016-02-09 Thread Matt Arsenault


> On Feb 9, 2016, at 11:23, Tom Stellard  wrote:
> 
> We should still add +fp64-denormals even if the backend doesn't do
> anything with it now.

This is the default, so it doesn’t really matter anyway.

-Matt___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 93813] Incorrect viewport range when GL_CLIP_ORIGIN is GL_UPPER_LEFT

2016-02-09 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=93813

Brian Paul  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #10 from Brian Paul  ---
Should be fixed with commit fe14110f359b0665cb0c09aa14f13a5ebb33b1bc

There's also a new piglit test for glClipControl + glViewport.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Are gallium unpack_rgba_8unorm/pack_rgba_8unorm safe for in-place conversion ?

2016-02-09 Thread Ian Romanick

On 02/09/2016 11:30 AM, Jose Fonseca wrote:
> On 09/02/16 19:17, Axel Davy wrote:
>> Hi,
>>
>> We'd need to do some formats conversion in gallium nine, and if possible
>> we would like to do them in-place.
>>
>> unpack_rgba_8unorm/pack_rgba_8unorm doesn't seem to explicitly allow
>> in-place conversion,
>> but the generated code seems to be fine with that.
>>
>> Can we rely on these functions to be safe for in-place conversion ?
>> Could we add that somewhere as requirement ?
>>
>> CC-ing vmware guys, as they probably know.
>>
>> Yours,
>>
>> Axel Davy
> 
> It might work in practice now, but I don't think this is same it should
> be dependent upon.
> 
> Take BGRA <-> RGBA: currently we read the dword, swizzle, and write
> dword back, but it's an implementation detail.  One could imagine at
> some point the implementation be changed to read/write bytes at a time,
> which would break if done in place.
> 
> 
> My recommendation is, if you want to do this things in place: malloc a
> temp with the size of a row of pixels, convert into temp row then memcpy
> into src.

The other option is to make this explicit and add a unit test to enforce
the behavior.

> You can even add util_convert_format_inplace() helper if you want.
> 
> And if there's any hot-path (e.g., BGRA<->RGBA) just add a special code
> path to this new util_convert_format_inplace that avoids the temp copy.
> 
> Jose
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 02/23] i965: Use miptree non-aligned dimensions directly for x-tiled

2016-02-09 Thread Ben Widawsky

On Mon, Feb 08, 2016 at 06:51:22PM +0200, Topi Pohjolainen wrote:
> The logic in intel_miptree_create() uses the local copies
> for 64-byte aligned equivalent but only for stencil buffers which
> in turn are never x-tiled. This makes the logic a little more
> explicit and helps to keep subsequent patches easier to read.
> 
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 0edd59f..033f4c6 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -698,7 +698,7 @@ intel_miptree_create(struct brw_context *brw,
>mt->tiling = I915_TILING_X;
>drm_intel_bo_unreference(mt->bo);
>mt->bo = drm_intel_bo_alloc_tiled(brw->bufmgr, "miptree",
> -  total_width, total_height, mt->cpp,
> +  mt->total_width, mt->total_height, mt->cpp,
>>tiling, , alloc_flags);
>mt->pitch = pitch;
> }

Maybe you can just move the stencil alignment down into the if statement and
make it even better isolated. Right now I think it's a little confusing to have
the local variables defined and unused except gen < 6 + stencil. Either way:
Reviewed-by: Ben Widawsky 


BTW, maybe you can see if this still makes sense:
https://patchwork.freedesktop.org/patch/56792/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 01/23] i965: Let caller of intel_miptree_create_layout() decide msaa layout

2016-02-09 Thread Ben Widawsky

On Mon, Feb 08, 2016 at 06:51:21PM +0200, Topi Pohjolainen wrote:
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 15 +++
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 108dd87..0edd59f 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -64,8 +64,11 @@ intel_miptree_alloc_mcs(struct brw_context *brw,
>   */
>  static enum intel_msaa_layout
>  compute_msaa_layout(struct brw_context *brw, mesa_format format,
> -bool disable_aux_buffers)
> +unsigned num_samples, bool disable_aux_buffers)
>  {
> +   if (num_samples <= 1)
> +  return INTEL_MSAA_LAYOUT_NONE;
> +
> /* Prior to Gen7, all MSAA surfaces used IMS layout. */
> if (brw->gen < 7)
>return INTEL_MSAA_LAYOUT_IMS;
> @@ -299,6 +302,7 @@ intel_miptree_create_layout(struct brw_context *brw,
>  GLuint height0,
>  GLuint depth0,
>  GLuint num_samples,
> +enum intel_msaa_layout msaa_layout,

Is there a reason why you decided not to roll this into layout flags? It seems
to fit into that pretty well IMO.

>  uint32_t layout_flags)
>  {
> struct intel_mipmap_tree *mt = calloc(sizeof(*mt), 1);
> @@ -343,13 +347,11 @@ intel_miptree_create_layout(struct brw_context *brw,
> mt->cpp = _mesa_get_format_bytes(format);
> mt->num_samples = num_samples;
> mt->compressed = _mesa_is_format_compressed(format);
> -   mt->msaa_layout = INTEL_MSAA_LAYOUT_NONE;
> +   mt->msaa_layout = msaa_layout;
> mt->refcount = 1;
>  
> if (num_samples > 1) {
>/* Adjust width/height/depth for MSAA */
> -  mt->msaa_layout = compute_msaa_layout(brw, format,
> -mt->disable_aux_buffers);
>if (mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) {
>   /* From the Ivybridge PRM, Volume 1, Part 1, page 108:
>* "If the surface is multisampled and it is a depth or stencil
> @@ -636,6 +638,8 @@ intel_miptree_create(struct brw_context *brw,
> mt = intel_miptree_create_layout(brw, target, format,
>  first_level, last_level, width0,
>  height0, depth0, num_samples,
> +compute_msaa_layout(brw, format,
> +num_samples, false),
>  layout_flags);
> /*
>  * pitch == 0 || height == 0  indicates the null texture
> @@ -743,6 +747,7 @@ intel_miptree_create_for_bo(struct brw_context *brw,
> struct intel_mipmap_tree *mt;
> uint32_t tiling, swizzle;
> GLenum target;
> +   const bool disable_aux_buffers = layout_flags & 
> MIPTREE_LAYOUT_DISABLE_AUX;
>  
> drm_intel_bo_get_tiling(bo, , );
>  
> @@ -769,6 +774,8 @@ intel_miptree_create_for_bo(struct brw_context *brw,
> mt = intel_miptree_create_layout(brw, target, format,
>  0, 0,
>  width, height, depth, 0,
> +compute_msaa_layout(brw, format, 0,
> +disable_aux_buffers),

If you roll in to layout flags, you just pass layout flags to
compute_msaa_layout()

>  layout_flags);
> if (!mt)
>return NULL;

Patch makes sense to me though, and I don't spot anything wrong with the
mechanics. I'd prefer you use layout flags, but either way it's:
Reviewed-by: Ben Widawsky 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] [v2] i965: Make sure we blit a full compressed block

2016-02-09 Thread Ian Romanick

On 02/09/2016 11:58 AM, Ben Widawsky wrote:
> On Mon, Feb 08, 2016 at 09:48:54PM -0800, Jason Ekstrand wrote:
>> On Sat, Feb 6, 2016 at 6:11 PM, Ben Widawsky 
>> wrote:
>>
>>> This fixes an assertion failure in [at least] one of the Unreal Engine
>>> Linux
>>> demo/games that uses DXT1 compression. Specifically, the "Vehicle Game".
>>>
>>> At some point, the game ends up trying to blit mip level whose size is 2x2,
>>> which is smaller than a DXT1 block. As a result, the assertion in the blit
>>> path
>>> is triggered. It should be safe to simply make sure we align the width and
>>> height, which is sadly an example of compression being less efficient.
>>>
>>> NOTE: The demo seems to work fine without the assert, and therefore release
>>> builds of mesa wouldn't stumble over this. Perhaps there is some
>>> unnoticeable
>>> corruption, but I had trouble spotting it.
>>>
>>> Thanks to Jason for looking at my backtrace and figuring out what was
>>> going on.
>>>
>>> v2: Use NPOT alignment to make sure ASTC is handled properly (Ilia)
>>> Remove comment about how this doesn't fix other bugs, because it does.
>>>
>>> Cc: Jason Ekstrand 
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93358
>>> Signed-off-by: Ben Widawsky 
>>> ---
>>>  src/mesa/drivers/dri/i965/intel_copy_image.c | 13 +
>>>  1 file changed, 13 insertions(+)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/intel_copy_image.c
>>> b/src/mesa/drivers/dri/i965/intel_copy_image.c
>>> index 0a3337e..42bd7ff 100644
>>> --- a/src/mesa/drivers/dri/i965/intel_copy_image.c
>>> +++ b/src/mesa/drivers/dri/i965/intel_copy_image.c
>>> @@ -212,6 +212,7 @@ intel_copy_image_sub_data(struct gl_context *ctx,
>>> struct brw_context *brw = brw_context(ctx);
>>> struct intel_mipmap_tree *src_mt, *dst_mt;
>>> unsigned src_level, dst_level;
>>> +   GLuint bw, bh;
>>>
>>> if (_mesa_meta_CopyImageSubData_uncompressed(ctx,
>>>  src_image,
>>> src_renderbuffer,
>>> @@ -275,6 +276,18 @@ intel_copy_image_sub_data(struct gl_context *ctx,
>>> intel_miptree_all_slices_resolve_depth(brw, dst_mt);
>>> intel_miptree_resolve_color(brw, dst_mt);
>>>
>>> +   _mesa_get_format_block_size(src_mt->format, , );
>>> +   /* It's legal to have a WxH that's smaller than a compressed block.
>>> This
>>> +* happens for example when you are using a higher level LOD. For this
>>> case,
>>> +* we still want to copy the entire block, or else the decompression
>>> will be
>>> +* incorrect.
>>> +*/
>>> +   if (src_width < bw)
>>> +  src_width = ALIGN_NPOT(src_width, bw);
>>> +
>>> +   if (src_height < bh)
>>> +  src_height = ALIGN_NPOT(src_height, bh);
>>>
>>
>> I've been going back-and-forth as to whether or not this is the right place
>> to do this or if we wanted it further up or down the stack.  At the end of
>> the day, I concluded that it's as good a place as any.
>>
>> Reviewed-by: Jason Ekstrand 
>> Cc "11.0 11.1" 
> 
> I left out stable intentionally because we will only hit the assert in debug
> builds, and AFAICT, we'll only get minor corruption by this not being there -
> but it's up to you. I just figured I'd share my thoughts in case they weren't
> clear. I can add stable if you still think it wise.

I think it's a good candidate for stable.  It's pretty low risk, and it
fixes something real.

Is there a piglit test? :)

> Thanks for review.
> 
>>
>> --Jason
>>
>>
>>> +
>>> if (copy_image_with_blitter(brw, src_mt, src_level,
>>> src_x, src_y, src_z,
>>> dst_mt, dst_level,
>>> --
>>> 2.7.0
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Are gallium unpack_rgba_8unorm/pack_rgba_8unorm safe for in-place conversion ?

2016-02-09 Thread Axel Davy


Hi again,

Actually, "util_format_translate" seems to fit our needs.
It could probably be optimised better (8unorm fitting format->ARGB could 
avoid using tmp buffer for example),

but that should be ok for our needs.

Yours,

Axel Davy

On 09/02/2016 20:17, Axel Davy wrote:

Hi,

We'd need to do some formats conversion in gallium nine, and if 
possible we would like to do them in-place.


unpack_rgba_8unorm/pack_rgba_8unorm doesn't seem to explicitly allow 
in-place conversion,

but the generated code seems to be fine with that.

Can we rely on these functions to be safe for in-place conversion ? 
Could we add that somewhere as requirement ?


CC-ing vmware guys, as they probably know.

Yours,

Axel Davy
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/4] glsl: apply align layout qualifier rules to block offsets

2016-02-09 Thread Ian Romanick

On 02/09/2016 08:56 AM, Ian Romanick wrote:
> On 02/09/2016 06:06 AM, Samuel Iglesias Gonsálvez wrote:
>>
>> On Tue, 2016-01-12 at 20:34 +1100, Timothy Arceri wrote:
>>> From Section 4.4.5 (Uniform and Shader Storage Block Layout
>>> Qualifiers) of the OpenGL 4.50 spec:
>>>
>>>   "The align qualifier makes the start of each block member have a
>>>   minimum byte alignment.  It does not affect the internal layout
>>>   within each member, which will still follow the std140 or std430
>>>   rules. The specified alignment must be a power of 2, or a
>>>   compile-time error results.
>>>
>>>   The actual alignment of a member will be the greater of the
>>>   specified align alignment and the standard (e.g., std140) base
>>>   alignment for the member's type. The actual offset of a member is
>>>   computed as follows: If offset was declared, start with that
>>>   offset, otherwise start with the next available offset. If the
>>>   resulting offset is not a multiple of the actual alignment,
>>>   increase it to the first offset that is a multiple of the actual
>>>   alignment. This results in the actual offset the member will have.
>>>
>>>   When align is applied to an array, it affects only the start of
>>>   the array, not the array's internal stride. Both an offset and an
>>>   align qualifier can be specified on a declaration.
>>>
>>>   The align qualifier, when used on a block, has the same effect as
>>>   qualifying each member with the same align value as declared on
>>>   the block, and gets the same compile-time results and errors as if
>>>   this had been done. As described in general earlier, an individual
>>>   member can specify its own align, which overrides the block-level
>>>   align, but just for that member.
>>> ---
>>>  src/glsl/ast_to_hir.cpp | 51
>>> ++---
>>>  1 file changed, 48 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
>>> index f5da771..fd05fd7 100644
>>> --- a/src/glsl/ast_to_hir.cpp
>>> +++ b/src/glsl/ast_to_hir.cpp
>>> @@ -6212,7 +6212,8 @@
>>> ast_process_struct_or_iface_block_members(exec_list *instructions,
>>>ir_variable_mode var_mode,
>>>ast_type_qualifier
>>> *layout,
>>>unsigned block_stream,
>>> -  unsigned expl_location)
>>> +  unsigned expl_location,
>>> +  unsigned expl_align)
>>>  {
>>> unsigned decl_count = 0;
>>> unsigned next_offset = 0;
>>> @@ -6466,6 +6467,34 @@
>>> ast_process_struct_or_iface_block_members(exec_list *instructions,
>>>  }
>>>   } else {
>>>  fields[i].offset = -1;
>>> + }
>>> +
>>> + if (qual->flags.q.explicit_align || expl_align != 0) {
>>> +unsigned offset = fields[i].offset != -1 ?
>>> fields[i].offset :
>>> +   next_offset;
>>> +if (align == 0 || size == 0) {
>>> +   _mesa_glsl_error(, state, "align can only be used
>>> with "
>>> +"std430 and std140 layouts");
>>> +} else if (qual->flags.q.explicit_align) {
>>> +   unsigned member_align;
>>> +   if (process_qualifier_constant(state, , "align",
>>> +  qual->align,
>>> _align)) {
>>> +  if (member_align == 0 ||
>>
>> I have modified ubo-block-align-zero.vert piglit test to set the align
>> qualifier only to block members and tested with align = 0. The shader
>> compiles on NVIDIA proprietary driver.
>>
>> According to the spec, the specified alignment must be a power of 2.
>> However align = 0 could have different interpretations (for example,
>> when applied to a block member, it would mean to ignore block's align
>> value). As I am not sure about this case...
>>
>> Have you checked if align = 0 is invalid?
> 
> I looked at the ARB_enhanced_layouts spec, and it doesn't provide any
> real guidance.  My gut tells me that align=0 is not valid because the
> spec doesn't say what it means.  Either way, I have submitted a spec bug:
> 
> https://www.khronos.org/bugzilla/show_bug.cgi?id=1461
> 
> Does glslang allow layout(align=0)?  AMD's closed-source driver?

According to https://www.khronos.org/bugzilla/show_bug.cgi?id=1461#c1
the Khronos reference compiler does not allow align=0, so, lacking any
contradictory evidence from the spec, I'm happy to not allow it either.

Now I get to figure out how to add a conformance test for this... >:)

>> Sam
>>
>>> +  member_align & (member_align - 1)) {
>>> + _mesa_glsl_error(, state, "align layout
>>> qualifier "
>>> +  "in not a power of 2");
>>> +  } else {
>>> +

Re: [Mesa-dev] [PATCH 3/5] nir: Do opt_algebraic in reverse order.

2016-02-09 Thread Ian Romanick

On 02/08/2016 04:21 PM, Matt Turner wrote:
> On Mon, Feb 8, 2016 at 3:57 PM, Ian Romanick  wrote:
>> On 02/04/2016 05:47 PM, Matt Turner wrote:
>>> Walking the SSA definitions in order means that we consider the smallest
>>> algebraic optimizations before larger optimizations. So if a smaller
>>> rule is part of a larger rule, the smaller one will happen first,
>>> preventing the larger one from happening.
>>
>> Doesn't that just mean that our "larger pattern" space is somehow
>> incompletely?  This seems to indicate that applications could (but
>> probably don't?) open-code these patterns and we'll miss them.
> 
> I don't understand the question. Does my reply to Eduardo perhaps
> answer your question?

No.  I understood what he was asking about.  Let me provide a somewhat
contrived example.  Say we had only the following set of algebraic
optimizations:

   (('fadd', ('fmul', a, b), c), ('ffma', a, b, c)),
   # Replace c*b - c*a + a with flrp
   (('fsub', ('fmul', a, b), ('fadd', ('fmul', c, a), a)),
('flrp', a, b, c)),

Without your patch, the first optimization would occur, and the second
would never happen.  With your patch, the larger tree is examined first,
so the second optimization patter matches.

However, an application could still open code (a*b)-fma(c, a, a), and
we'll miss it with or without your patch.  The patch prevents the
optimizer from fooling itself, but the application can still fool it.
Are we better off adding more patterns so that the application can't
fool it too?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radeonsi: enable denorms for 64-bit and 16-bit floats

2016-02-09 Thread Tom Stellard

On Mon, Feb 08, 2016 at 09:38:32PM +0100, Marek Olšák wrote:
> On Mon, Feb 8, 2016 at 5:08 PM, Tom Stellard  wrote:
> > On Sat, Feb 06, 2016 at 01:15:42PM +0100, Marek Olšák wrote:
> >> From: Marek Olšák 
> >>
> >> This fixes FP16 conversion instructions for VI, which has 16-bit floats,
> >> but not SI & CI, which can't disable denorms for those instructions.
> >
> > Do you know why this fixes FP16 conversions?  What does the OpenGL
> > spec say about denormal handing?
> 
> Yes, I know why. The patch explain everything as far as I can see
> though. What isn't clear?
> 

I got it now.

> SI & CI: Don't support FP16. FP16 conversions are hardcoded to emit
> and accept FP16 denormals.
> VI: Supports FP16. FP16 denormal support is now configurable and
> affects FP16 conversions as well.(shared setting with FP64).
> 
> OpenGL doesn't require denormals. Piglit does. I think this is
> incorrect piglit behavior.
> 
> >
> >> ---
> >>  src/gallium/drivers/radeonsi/si_shader.c| 14 ++
> >>  src/gallium/drivers/radeonsi/si_state_shaders.c | 18 --
> >>  src/gallium/drivers/radeonsi/sid.h  |  3 +++
> >>  3 files changed, 29 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
> >> b/src/gallium/drivers/radeonsi/si_shader.c
> >> index a4680ce..3f1db70 100644
> >> --- a/src/gallium/drivers/radeonsi/si_shader.c
> >> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> >> @@ -4155,6 +4155,20 @@ int si_compile_llvm(struct si_screen *sscreen,
> >>
> >>   si_shader_binary_read_config(binary, conf, 0);
> >>
> >> + /* Enable 64-bit and 16-bit denormals, because there is no 
> >> performance
> >> +  * cost.
> >> +  *
> >> +  * If denormals are enabled, all floating-point output modifiers are
> >> +  * ignored.
> >> +  *
> >> +  * Don't enable denormals for 32-bit floats, because:
> >> +  * - Floating-point output modifiers would be ignored by the hw.
> >> +  * - Some opcodes don't support denormals, such as v_mad_f32. We 
> >> would
> >> +  *   have to stop using those.
> >> +  * - SI & CI would be very slow.
> >> +  */
> >> + conf->float_mode |= V_00B028_FP_64_DENORMS;
> >> +
> >
> > Do SI/CI support fp64 denorms?  If so, won't this hurt performance?
> 
> Yes, they do. Fp64 denorms don't hurt performance. Only fp32 denorms
> do on SI & CI.
> 
> >
> > We should tell the compiler we are enabling fp-64 denorms by adding
> > +fp64-denormals to the feature string.  It would also be better to
> > read the float_mode value from the config registers emitted by the
> > compiler.
> 
> Yes, I agree, but LLVM only sets these parameters for compute or even
> HSA-only kernels, not for graphics shaders. We need to set the mode
> for all users _now_, not in 6 months. Last time I looked,
> +fp64-denormals had no effect on graphics shaders.
> 

We should still add +fp64-denormals even if the backend doesn't do
anything with it now.  This will make it easier if we have to use this
feature string to enable fix a bug in the backend,
because we will just be able to update LLVM.

I don't have a problem hard-coding float_mode for now, but once LLVM is
emitting the correct thing, we should pull the value from LLVM.

-Tom

> Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] winsys/radeon: fix a wrong NUM_TILE_PIPES value from the kernel

2016-02-09 Thread Alex Deucher

On Tue, Feb 9, 2016 at 12:47 PM, Marek Olšák  wrote:
> On Tue, Feb 9, 2016 at 6:17 PM, Alexandre Demers
>  wrote:
>>> +/* The kernel returns 12 for some cards for an unknown
>>> reason.
>>> + * I thought this was supposed to be a power of two.
>>> + */
>>> +if (ws->gen == DRV_SI && ws->info.num_tile_pipes == 12)
>>> +ws->info.num_tile_pipes = 8;
>>> +
>>
>> I may be late in the conversation, but shouldn't we have a look at why the
>> value reported by the kernel is wrong for "some" cards? Which ones and why
>> should be identified. It seems to be limited to Southern Islands as far as
>> we know for now, which limits the scope for now.
>>
>> Also, about the patch itself, even if only some cards were reported to be
>> problematic, why would we limit it to "ws->gen == DRV_SI"? Any cards
>> reporting a wrong value should be treated the same way by mapping its value
>> from 12 to 8, no?
>
> No. Only one card is affected (Tahiti or Pitcairn, I don't remember
> which one). No other card reports 12.
>
> There is no point in looking into why the value is wrong and I haven't
> been able to find where the value had come from. It's part of the
> kernel ABI now anyway. Userspace won't use it anymore.

I did it when I added SI support.  I think I get the value from tcore
or some hw features header.  I don't remember the details.  Anyway, I
think it may have been a hw value (related to the number of memory
channels) whereas from a sw perspective, we'd use 8 for tiling
computations.  E.g., when we set up rdev->config.si.tile_config which
is what we used to use, we set it to 8.

Alex
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] [v2] i965: Make sure we blit a full compressed block

2016-02-09 Thread Ben Widawsky

On Tue, Feb 09, 2016 at 12:10:18PM -0800, Ian Romanick wrote:
> On 02/09/2016 11:58 AM, Ben Widawsky wrote:
> > On Mon, Feb 08, 2016 at 09:48:54PM -0800, Jason Ekstrand wrote:
> >> On Sat, Feb 6, 2016 at 6:11 PM, Ben Widawsky 
> >> wrote:
> >>
> >>> This fixes an assertion failure in [at least] one of the Unreal Engine
> >>> Linux
> >>> demo/games that uses DXT1 compression. Specifically, the "Vehicle Game".
> >>>
> >>> At some point, the game ends up trying to blit mip level whose size is 
> >>> 2x2,
> >>> which is smaller than a DXT1 block. As a result, the assertion in the blit
> >>> path
> >>> is triggered. It should be safe to simply make sure we align the width and
> >>> height, which is sadly an example of compression being less efficient.
> >>>
> >>> NOTE: The demo seems to work fine without the assert, and therefore 
> >>> release
> >>> builds of mesa wouldn't stumble over this. Perhaps there is some
> >>> unnoticeable
> >>> corruption, but I had trouble spotting it.
> >>>
> >>> Thanks to Jason for looking at my backtrace and figuring out what was
> >>> going on.
> >>>
> >>> v2: Use NPOT alignment to make sure ASTC is handled properly (Ilia)
> >>> Remove comment about how this doesn't fix other bugs, because it does.
> >>>
> >>> Cc: Jason Ekstrand 
> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93358
> >>> Signed-off-by: Ben Widawsky 
> >>> ---
> >>>  src/mesa/drivers/dri/i965/intel_copy_image.c | 13 +
> >>>  1 file changed, 13 insertions(+)
> >>>
> >>> diff --git a/src/mesa/drivers/dri/i965/intel_copy_image.c
> >>> b/src/mesa/drivers/dri/i965/intel_copy_image.c
> >>> index 0a3337e..42bd7ff 100644
> >>> --- a/src/mesa/drivers/dri/i965/intel_copy_image.c
> >>> +++ b/src/mesa/drivers/dri/i965/intel_copy_image.c
> >>> @@ -212,6 +212,7 @@ intel_copy_image_sub_data(struct gl_context *ctx,
> >>> struct brw_context *brw = brw_context(ctx);
> >>> struct intel_mipmap_tree *src_mt, *dst_mt;
> >>> unsigned src_level, dst_level;
> >>> +   GLuint bw, bh;
> >>>
> >>> if (_mesa_meta_CopyImageSubData_uncompressed(ctx,
> >>>  src_image,
> >>> src_renderbuffer,
> >>> @@ -275,6 +276,18 @@ intel_copy_image_sub_data(struct gl_context *ctx,
> >>> intel_miptree_all_slices_resolve_depth(brw, dst_mt);
> >>> intel_miptree_resolve_color(brw, dst_mt);
> >>>
> >>> +   _mesa_get_format_block_size(src_mt->format, , );
> >>> +   /* It's legal to have a WxH that's smaller than a compressed block.
> >>> This
> >>> +* happens for example when you are using a higher level LOD. For this
> >>> case,
> >>> +* we still want to copy the entire block, or else the decompression
> >>> will be
> >>> +* incorrect.
> >>> +*/
> >>> +   if (src_width < bw)
> >>> +  src_width = ALIGN_NPOT(src_width, bw);
> >>> +
> >>> +   if (src_height < bh)
> >>> +  src_height = ALIGN_NPOT(src_height, bh);
> >>>
> >>
> >> I've been going back-and-forth as to whether or not this is the right place
> >> to do this or if we wanted it further up or down the stack.  At the end of
> >> the day, I concluded that it's as good a place as any.
> >>
> >> Reviewed-by: Jason Ekstrand 
> >> Cc "11.0 11.1" 
> > 
> > I left out stable intentionally because we will only hit the assert in debug
> > builds, and AFAICT, we'll only get minor corruption by this not being there 
> > -
> > but it's up to you. I just figured I'd share my thoughts in case they 
> > weren't
> > clear. I can add stable if you still think it wise.
> 
> I think it's a good candidate for stable.  It's pretty low risk, and it
> fixes something real.
> 
> Is there a piglit test? :)
> 

Anuj is working on it.

> > Thanks for review.
> > 
> >>
> >> --Jason
> >>
> >>
> >>> +
> >>> if (copy_image_with_blitter(brw, src_mt, src_level,
> >>> src_x, src_y, src_z,
> >>> dst_mt, dst_level,
> >>> --
> >>> 2.7.0
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: remove unrequired forward declaration

2016-02-09 Thread Kenneth Graunke

On Saturday, January 30, 2016 10:50:12 AM PST Timothy Arceri wrote:
> This was added in 2548092ad80156a4 although I don't see why as it
> was already in the linker.h header.
> ---
>  src/compiler/glsl/linker.cpp | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
> index 7f41a4e..73a4b6a 100644
> --- a/src/compiler/glsl/linker.cpp
> +++ b/src/compiler/glsl/linker.cpp
> @@ -82,8 +82,6 @@
>  #include "main/enums.h"
>  
>  
> -void linker_error(gl_shader_program *, const char *, ...);
> -
>  namespace {
>  
>  /**
> 

Yeah, certainly doesn't seem necessary.

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] st/mesa: check ureg_create() retval in create_pbo_upload_vs()

2016-02-09 Thread Samuel Pitoiset

This avoids a possible NULL dereference because ureg_create() might
return a NULL pointer.

Spotted by coverity.

Signed-off-by: Samuel Pitoiset 
Cc: Nicolai Hähnle 
---
 src/mesa/state_tracker/st_cb_texture.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/state_tracker/st_cb_texture.c 
b/src/mesa/state_tracker/st_cb_texture.c
index e9ac9a6..bdd2f63 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -1132,6 +1132,8 @@ create_pbo_upload_vs(struct st_context *st)
struct ureg_dst out_layer;
 
ureg = ureg_create(TGSI_PROCESSOR_VERTEX);
+   if (!ureg)
+  return NULL;
 
in_pos = ureg_DECL_vs_input(ureg, TGSI_SEMANTIC_POSITION);
 
-- 
2.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: validate arrays of arrays on empty type delclarations

2016-02-09 Thread Timothy Arceri

On Tue, 2016-02-09 at 00:50 -0800, Kenneth Graunke wrote:
> On Friday, February 5, 2016 1:08:19 PM PST Timothy Arceri wrote:
> > Fixes:
> > dEQP-
> > GLES31.functional.shaders.arrays_of_arrays.invalid.empty_declaratio
> > n_without_var_name_fragment
> > dEQP-
> > GLES31.functional.shaders.arrays_of_arrays.invalid.empty_declaratio
> > n_without_var_name_vertex
> > 
> > Cc: Ilia Mirkin 
> > ---
> >  src/compiler/glsl/ast_to_hir.cpp | 63 --
> > --
> >  1 file changed, 38 insertions(+), 25 deletions(-)
> > 
> > diff --git a/src/compiler/glsl/ast_to_hir.cpp
> > b/src/compiler/glsl/ast_to_hir.cpp
> > index 6890172..5df3a85 100644
> > --- a/src/compiler/glsl/ast_to_hir.cpp
> > +++ b/src/compiler/glsl/ast_to_hir.cpp
> > @@ -4267,33 +4267,46 @@ ast_declarator_list::hir(exec_list
> > *instructions,
> >   _mesa_glsl_error(, state,
> >    "invalid type `%s' in empty
> > declaration",
> >    type_name);
> > -  } else if (decl_type->base_type == GLSL_TYPE_ATOMIC_UINT) {
> > - /* Empty atomic counter declarations are allowed and
> > useful
> > -  * to set the default offset qualifier.
> > -  */
> > - return NULL;
> > -  } else if (this->type->qualifier.precision !=
> > ast_precision_none) {
> > - if (this->type->specifier->structure != NULL) {
> > -_mesa_glsl_error(, state,
> > - "precision qualifiers can't be
> > applied "
> > - "to structures");
> > - } else {
> > -static const char *const precision_names[] = {
> > -   "highp",
> > -   "highp",
> > -   "mediump",
> > -   "lowp"
> > -};
> > +  } else {
> > + if (decl_type->base_type == GLSL_TYPE_ARRAY) {
> > +/* From Section 4.12 (Empty Declarations) of the GLSL
> > 4.5 spec:
> > + *
> > + *"The combinations of types and qualifiers that
> > cause
> > + *compile-time or link-time errors are the same
> > whether or not
> > + *the declaration is empty."
> > + */
> > +validate_array_dimensions(decl_type, state, );
> > + }
> 
> Hi Tim,
> 
> I'm a bit puzzled by the if ladders here.  I originally expected to
> see
> 
>    if (decl_type == NULL) {
>   ...
>    } else if (decl_type->base_type == GLSL_TYPE_ARRAY) {
>   ...
>    } else if (decl_type->base_type == GLSL_TYPE_ATOMIC_UINT) {
>   ...
>    } else if (this->type->qualifier.precision != ast_precision_none)
> {
>   ...
>    } else if (this->type->specifier->structure == NULL) {
>   ...
>    }
> 
> instead of everything but the first case indented a level, with two
> checks of decl_type->base_type without an else if.  I guess the way
> you structured it, the precision check happens for arrays, and the
> "empty declaration" warning would happen too?  Presumably at least
> the last one is useful?
> 
> Honestly, the precision check seems a bit weird and out of place here
> (even before your patch) - it gets skipped for ATOMIC_UINT (why?),
> and explicitly complains about structs, but not arrays?
> 
> It seems like what we really want is:
> 
> if (decl_type == NULL) {
>    error: bogus type...
> } else {
> if (this->type->qualifier.precision != ast_precision_none) {
>    if (decl_type isn't glsl_type::float_type / int_type) {
>   error; precision qualifiers can only be applied to
>   float or int AFAICT...
>    } else {
>   warn: empty declaration with precision qualifier; wrong
>    }
> }
> 
> if (decl_type->base_type == GLSL_TYPE_ARRAY) {
>    ...validate the array things...
> } else if (decl_type->base_type != OGLSL_TYPE_ATOMIC_UINT &&
>    this->type->specifier->structure != NULL) {
>    warn: empty declaration
> }
> }
> 
> This would raise an error for precision qualifiers on arrays, too,
> which I think we want - and also for vec2/ivec2/uint/etc, which I
> also think we want (though outside the scope of your original patch).
> (Or, maybe forbidding vectors is done elsewhere?)
> 
> What do you think?  ast->hir code is rather messy :(

Yeah I was trying to make it work for the precision case  when we have
arrays and possibly to hit the structs warning too although I didn't
think about that case too much.

I'm probably missing something but the logic of your code looks the
same as mine just backwards. To me the code looked broken already,
there are many rules not checked for here and I was just trying to fix
the dEQP tests. IMO this stuff is only really usefull for passing
conformance tests.

With your change it now warns for empty atomics but the comment says
that they are actually useful so maybe we wouldn't want to warn. The
other thing

Re: [Mesa-dev] [PATCH] glsl: Disallow transform feedback varyings with compute shaders.

2016-02-09 Thread Timothy Arceri

On Tue, 2016-02-09 at 02:12 -0800, Kenneth Graunke wrote:
> If the only stage is MESA_SHADER_COMPUTE, we should complain that
> there's nothing coming out of the geometry shader stage just as
> we would if the first stage were MESA_SHADER_FRAGMENT.
> 
> Also, it's valid for tessellation shaders to be the stage producing
> transform feedback varyings, so mention those in the compiler error.
> 
> Found by inspection.
> 
> Signed-off-by: Kenneth Graunke 

Reviewed-by: Timothy Arceri 

> ---
>  src/compiler/glsl/linker.cpp | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/src/compiler/glsl/linker.cpp
> b/src/compiler/glsl/linker.cpp
> index f1ac53a..3f871ab 100644
> --- a/src/compiler/glsl/linker.cpp
> +++ b/src/compiler/glsl/linker.cpp
> @@ -4452,9 +4452,10 @@ link_shaders(struct gl_context *ctx, struct
> gl_shader_program *prog)
> * non-zero, but the program object has no vertex or
> geometry
> * shader;
> */
> -  if (first == MESA_SHADER_FRAGMENT) {
> +  if (first >= MESA_SHADER_FRAGMENT) {
>   linker_error(prog, "Transform feedback varyings specified,
> but "
> -  "no vertex or geometry shader is present.\n");
> +  "no vertex, tessellation, or geometry shader
> is "
> +  "present.\n");
>   goto done;
>    }
>  
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: validate arrays of arrays on empty type delclarations

2016-02-09 Thread Kenneth Graunke

On Friday, February 5, 2016 1:08:19 PM PST Timothy Arceri wrote:
> Fixes:
> dEQP-GLES31.functional.shaders.arrays_of_arrays.invalid.empty_declaration_without_var_name_fragment
> dEQP-GLES31.functional.shaders.arrays_of_arrays.invalid.empty_declaration_without_var_name_vertex
> 
> Cc: Ilia Mirkin 
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 63 
> 
>  1 file changed, 38 insertions(+), 25 deletions(-)
> 
> diff --git a/src/compiler/glsl/ast_to_hir.cpp 
> b/src/compiler/glsl/ast_to_hir.cpp
> index 6890172..5df3a85 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -4267,33 +4267,46 @@ ast_declarator_list::hir(exec_list *instructions,
>   _mesa_glsl_error(, state,
>"invalid type `%s' in empty declaration",
>type_name);
> -  } else if (decl_type->base_type == GLSL_TYPE_ATOMIC_UINT) {
> - /* Empty atomic counter declarations are allowed and useful
> -  * to set the default offset qualifier.
> -  */
> - return NULL;
> -  } else if (this->type->qualifier.precision != ast_precision_none) {
> - if (this->type->specifier->structure != NULL) {
> -_mesa_glsl_error(, state,
> - "precision qualifiers can't be applied "
> - "to structures");
> - } else {
> -static const char *const precision_names[] = {
> -   "highp",
> -   "highp",
> -   "mediump",
> -   "lowp"
> -};
> +  } else {
> + if (decl_type->base_type == GLSL_TYPE_ARRAY) {
> +/* From Section 4.12 (Empty Declarations) of the GLSL 4.5 spec:
> + *
> + *"The combinations of types and qualifiers that cause
> + *compile-time or link-time errors are the same whether or 
> not
> + *the declaration is empty."
> + */
> +validate_array_dimensions(decl_type, state, );
> + }

Hi Tim,

I'm a bit puzzled by the if ladders here.  I originally expected to see

   if (decl_type == NULL) {
  ...
   } else if (decl_type->base_type == GLSL_TYPE_ARRAY) {
  ...
   } else if (decl_type->base_type == GLSL_TYPE_ATOMIC_UINT) {
  ...
   } else if (this->type->qualifier.precision != ast_precision_none) {
  ...
   } else if (this->type->specifier->structure == NULL) {
  ...
   }

instead of everything but the first case indented a level, with two
checks of decl_type->base_type without an else if.  I guess the way
you structured it, the precision check happens for arrays, and the
"empty declaration" warning would happen too?  Presumably at least
the last one is useful?

Honestly, the precision check seems a bit weird and out of place here
(even before your patch) - it gets skipped for ATOMIC_UINT (why?),
and explicitly complains about structs, but not arrays?

It seems like what we really want is:

if (decl_type == NULL) {
   error: bogus type...
} else {
if (this->type->qualifier.precision != ast_precision_none) {
   if (decl_type isn't glsl_type::float_type / int_type) {
  error; precision qualifiers can only be applied to
  float or int AFAICT...
   } else {
  warn: empty declaration with precision qualifier; wrong
   }
}

if (decl_type->base_type == GLSL_TYPE_ARRAY) {
   ...validate the array things...
} else if (decl_type->base_type != OGLSL_TYPE_ATOMIC_UINT &&
   this->type->specifier->structure != NULL) {
   warn: empty declaration
}
}

This would raise an error for precision qualifiers on arrays, too,
which I think we want - and also for vec2/ivec2/uint/etc, which I
also think we want (though outside the scope of your original patch).
(Or, maybe forbidding vectors is done elsewhere?)

What do you think?  ast->hir code is rather messy :(

> -_mesa_glsl_warning(, state,
> -   "empty declaration with precision qualifier, "
> -   "to set the default precision, use "
> -   "`precision %s %s;'",
> -   
> precision_names[this->type->qualifier.precision],
> -   type_name);
> + if (decl_type->base_type == GLSL_TYPE_ATOMIC_UINT) {
> +/* Empty atomic counter declarations are allowed and useful
> + * to set the default offset qualifier.
> + */
> +return NULL;
> + } else if (this->type->qualifier.precision != ast_precision_none) {
> +if (this->type->specifier->structure != NULL) {
> +   _mesa_glsl_error(, state,
> +"precision qualifiers can't be applied "
> +"to

Re: [Mesa-dev] [PATCH V3 1/2] glsl: don't attempt to link empty program

2016-02-09 Thread Kenneth Graunke

On Tuesday, January 26, 2016 11:46:19 AM PST Timothy Arceri wrote:
> Previously an empty program would go through the entire
> link_shaders() function and we would have to be careful
> not to cause a segfault.
> 
> In core profile also now set link_status to false by
> generating an error, it was previously set to true.
> 
> From Section 7.3 (PROGRAM OBJECTS) of the OpenGL 4.5 spec:
> 
>"Linking can fail for a variety of reasons as specified in the
>OpenGL Shading Language Specification, as well as any of the
>following reasons:
> 
> - No shader objects are attached to program."
> 
> V2: Only generate an error in core profile and add spec quote (Ian)
> 
> V3: generate error in ES too, remove previous check which was only
> applying the rule to GL 4.5/ES 3.1 and above. My understand is that
> this spec change is clarifying previously undefined behaviour and
> therefore should be applied retrospectively. The ES CTS tests for
> this are in ES 2 I suspect it was passing because it would have
> generated an error for not having both a vertex and fragment shader.
> 
> Cc: Ian Romanick 
> Cc: Tapani Pälli 
> Cc: Ilia Mirkin 
> ---
>  src/glsl/linker.cpp | 43 ++-
>  1 file changed, 22 insertions(+), 21 deletions(-)
> 
> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
> index 665..79039a0 100644
> --- a/src/glsl/linker.cpp
> +++ b/src/glsl/linker.cpp
> @@ -4105,14 +4105,34 @@ disable_varying_optimizations_for_sso(struct 
> gl_shader_program *prog)
>  void
>  link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
>  {
> +   prog->Validated = false;
> +   prog->_Used = false;
> +
> +   /* Section 7.3 (Program Objects) of the OpenGL 4.5 Core Profile spec says:
> +*
> +* "Linking can fail for a variety of reasons as specified in the
> +* OpenGL Shading Language Specification, as well as any of the
> +* following reasons:
> +*
> +* - No shader objects are attached to program."
> +*
> +* The Compatibility Profile specification does not list the error.  In
> +* Compatibility Profile missing shader stages are replaced by
> +* fixed-function.  This applies to the case where all stages are
> +* missing.
> +*/
> +   if (prog->NumShaders == 0) {
> +  if (ctx->API != API_OPENGL_COMPAT)
> + linker_error(prog, "no shaders attached to the program\n");
> +  return;
> +   }
> +
> tfeedback_decl *tfeedback_decls = NULL;
> unsigned num_tfeedback_decls = prog->TransformFeedback.NumVarying;
>  
> void *mem_ctx = ralloc_context(NULL); // temporary linker context
>  
> prog->LinkStatus = true; /* All error paths will set this to false */
> -   prog->Validated = false;
> -   prog->_Used = false;
>  
> prog->ARB_fragment_coord_conventions_enable = false;
>  
> @@ -4162,25 +4182,6 @@ link_shaders(struct gl_context *ctx, struct 
> gl_shader_program *prog)
> prog->Version = max_version;
> prog->IsES = is_es_prog;
>  
> -   /* From OpenGL 4.5 Core specification (7.3 Program Objects):
> -* "Linking can fail for a variety of reasons as specified in the 
> OpenGL
> -* Shading Language Specification, as well as any of the following
> -* reasons:
> -*
> -* * No shader objects are attached to program.
> -*
> -* ..."
> -*
> -* Same rule applies for OpenGL ES >= 3.1.
> -*/
> -
> -   if (prog->NumShaders == 0 &&
> -   ((ctx->API == API_OPENGL_CORE && ctx->Version >= 45) ||
> -(ctx->API == API_OPENGLES2 && ctx->Version >= 31))) {
> -  linker_error(prog, "No shader objects are attached to program.\n");
> -  goto done;
> -   }
> -
> /* Some shaders have to be linked with some other shaders present.
>  */
> if (num_shaders[MESA_SHADER_GEOMETRY] > 0 &&

This patch seems good to me.  It's nice to short-circuit the NumShaders
== 0 case so we don't have to think about it later.  Taking the new text
as a clarification makes a ton of sense.

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH V2 2/2] glsl: clean up and fix bug in varying linking rules

2016-02-09 Thread Kenneth Graunke

On Tuesday, February 2, 2016 12:20:01 PM PST Timothy Arceri wrote:
> The existing code was very hard to follow and has been the source
> of at least 3 bugs in the past year.
> 
> The existing code also has a bug for SSO where if we have a
> multi-stage SSO for example a tes -> gs program, if we try to use
> transform feedback with gs the existing code would look for the
> transform feedback varyings in the tes stage and fail as it can't
> find them.
> 
> V2: Add more code comments, always try to remove unused inputs
> to the first stage.
> 
> Cc: Ilia Mirkin 
> Cc: Chris Forbes 
> ---
>  src/compiler/glsl/linker.cpp | 137 
> ---
>  1 file changed, 63 insertions(+), 74 deletions(-)
> 
> diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
> index fdfdcaa..09323bb 100644
> --- a/src/compiler/glsl/linker.cpp
> +++ b/src/compiler/glsl/linker.cpp
> @@ -4461,91 +4461,80 @@ link_shaders(struct gl_context *ctx, struct 
> gl_shader_program *prog)
>   goto done;
> }
>  
> -   /* Linking the stages in the opposite order (from fragment to vertex)
> -* ensures that inter-shader outputs written to in an earlier stage are
> -* eliminated if they are (transitively) not used in a later stage.
> +   /* If there is no fragment shader we need to set transform feedback.
> +*
> +* For SSO we need also need to assign output locations, we assign them
> +* here because we need to do it for both single stage programs and multi
> +* stage programs.
>  */
> -   int next;
> -
> -   if (first < MESA_SHADER_FRAGMENT) {
> -  gl_shader *const sh = prog->_LinkedShaders[last];
> -
> -  if (first != MESA_SHADER_VERTEX) {
> - /* There was no vertex shader, but we still have to assign varying
> -  * locations for use by tessellation/geometry shader inputs in SSO.
> -  *
> -  * If the shader is not separable (i.e., prog->SeparateShader is
> -  * false), linking will have already failed when first is not
> -  * MESA_SHADER_VERTEX.
> -  */
> - if (!assign_varying_locations(ctx, mem_ctx, prog,
> -   NULL, prog->_LinkedShaders[first],
> -   num_tfeedback_decls, tfeedback_decls))
> -goto done;
> -  }
> -
> -  if (last != MESA_SHADER_FRAGMENT &&
> - (num_tfeedback_decls != 0 || prog->SeparateShader)) {
> - /* There was no fragment shader, but we still have to assign varying
> -  * locations for use by transform feedback.
> -  */
> - if (!assign_varying_locations(ctx, mem_ctx, prog,
> -   sh, NULL,
> -   num_tfeedback_decls, tfeedback_decls))
> -goto done;
> -  }
> -
> -  do_dead_builtin_varyings(ctx, sh, NULL,
> -   num_tfeedback_decls, tfeedback_decls);
> +   if (last < MESA_SHADER_FRAGMENT &&
> +   (num_tfeedback_decls != 0 || prog->SeparateShader)) {
> +  if (!assign_varying_locations(ctx, mem_ctx, prog,
> +prog->_LinkedShaders[last], NULL,
> +num_tfeedback_decls, tfeedback_decls))
> + goto done;
> +   }
>  
> -  remove_unused_shader_inputs_and_outputs(prog->SeparateShader, sh,
> +   if (last <= MESA_SHADER_FRAGMENT) {

I don't understand the point of last <= MESA_SHADER_FRAGMENT here.
The only other option is MESA_SHADER_COMPUTE, which has no inputs or
outputs, so calling this should be harmless.

> +  /* Remove unused varyings from the first/last stage unless SSO */
> +  remove_unused_shader_inputs_and_outputs(prog->SeparateShader,
> +  prog->_LinkedShaders[first],
> +  ir_var_shader_in);
> +  remove_unused_shader_inputs_and_outputs(prog->SeparateShader,
> +  prog->_LinkedShaders[last],
>ir_var_shader_out);

I see, assign_varying_locations calls this for internal stages, when
there's a producer and consumer - this just calls it for the boundary
shaders in the !SSO case.  Makes sense.

This is much cleaner - nice work!

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/3] glsl: small tidy up now that link_shaders() exits early with 0 shaders

2016-02-09 Thread Kenneth Graunke

On Saturday, January 30, 2016 8:43:52 AM PST Timothy Arceri wrote:
> ---
> 
>  This applies on top of:
> 
>  http://patchwork.freedesktop.org/patch/71619/
> 
>  src/compiler/glsl/linker.cpp | 10 --
>  1 file changed, 4 insertions(+), 6 deletions(-)

Nice!

This series is:
Reviewed-by: Kenneth Graunke 

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/4] mesa/image: Make _mesa_clip_readpixels() work with renderbuffers

2016-02-09 Thread Lofstedt, Marta

Thanks Nanley, 
I confirm that this patch-set fix the Bugzilla:92193

/Marta

> -Original Message-
> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On
> Behalf Of Nanley Chery
> Sent: Monday, February 8, 2016 11:38 PM
> To: mesa-dev@lists.freedesktop.org
> Cc: 11.0 11.1 ; Chery, Nanley G
> 
> Subject: [Mesa-dev] [PATCH 1/4] mesa/image: Make
> _mesa_clip_readpixels() work with renderbuffers
> 
> From: Nanley Chery 
> 
> v2: Use gl_renderbuffer::{Width,Height} (Jason)
> 
> Cc: "11.0 11.1" 
> Signed-off-by: Nanley Chery 
> ---
>  src/mesa/main/image.c | 22 +-
>  1 file changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/src/mesa/main/image.c b/src/mesa/main/image.c index
> e79e3e6..99f253c 100644
> --- a/src/mesa/main/image.c
> +++ b/src/mesa/main/image.c
> @@ -670,7 +670,7 @@ _mesa_clip_drawpixels(const struct gl_context *ctx,
>   * so that the image region is entirely within the window bounds.
>   * Note: this is different from _mesa_clip_drawpixels() in that the
>   * scissor box is ignored, and we use the bounds of the current readbuffer
> - * surface.
> + * surface or the attached image.
>   *
>   * \return  GL_TRUE if region to read is in bounds
>   *  GL_FALSE if region is completely out of bounds (nothing to read)
> @@ -682,6 +682,18 @@ _mesa_clip_readpixels(const struct gl_context *ctx,
>struct gl_pixelstore_attrib *pack)  {
> const struct gl_framebuffer *buffer = ctx->ReadBuffer;
> +   struct gl_renderbuffer *rb = buffer->_ColorReadBuffer;
> +   GLsizei clip_width;
> +   GLsizei clip_height;
> +
> +   if (rb) {
> +  clip_width = rb->Width;
> +  clip_height = rb->Height;
> +   } else {
> +  clip_width = buffer->Width;
> +  clip_height = buffer->Height;
> +   }
> +
> 
> if (pack->RowLength == 0) {
>pack->RowLength = *width;
> @@ -694,8 +706,8 @@ _mesa_clip_readpixels(const struct gl_context *ctx,
>*srcX = 0;
> }
> /* right clipping */
> -   if (*srcX + *width > (GLsizei) buffer->Width)
> -  *width -= (*srcX + *width - buffer->Width);
> +   if (*srcX + *width > clip_width)
> +  *width -= (*srcX + *width - clip_width);
> 
> if (*width <= 0)
>return GL_FALSE;
> @@ -707,8 +719,8 @@ _mesa_clip_readpixels(const struct gl_context *ctx,
>*srcY = 0;
> }
> /* top clipping */
> -   if (*srcY + *height > (GLsizei) buffer->Height)
> -  *height -= (*srcY + *height - buffer->Height);
> +   if (*srcY + *height > clip_height)
> +  *height -= (*srcY + *height - clip_height);
> 
> if (*height <= 0)
>return GL_FALSE;
> --
> 2.7.0
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir: remove unused nir_variable fields

2016-02-09 Thread Kenneth Graunke

On Tuesday, February 2, 2016 11:53:57 AM PST Timothy Arceri wrote:
> These are used in GLSL IR to removed unused varyings and match
> transform feedback variables. There is no need to use these in NIR.
> ---
>  src/compiler/nir/glsl_to_nir.cpp |  2 --
>  src/compiler/nir/nir.h   | 18 --
>  2 files changed, 20 deletions(-)
> 
> diff --git a/src/compiler/nir/glsl_to_nir.cpp 
> b/src/compiler/nir/glsl_to_nir.cpp
> index 4b76d23..b9e27e9 100644
> --- a/src/compiler/nir/glsl_to_nir.cpp
> +++ b/src/compiler/nir/glsl_to_nir.cpp
> @@ -364,8 +364,6 @@ nir_visitor::visit(ir_variable *ir)
> var->data.explicit_binding = ir->data.explicit_binding;
> var->data.has_initializer = ir->data.has_initializer;
> var->data.location_frac = ir->data.location_frac;
> -   var->data.from_named_ifc_block_array = 
> ir->data.from_named_ifc_block_array;
> -   var->data.from_named_ifc_block_nonarray = 
> ir->data.from_named_ifc_block_nonarray;
>  
> switch (ir->data.depth_layout) {
> case ir_depth_layout_none:
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index aec75fb..3fc87ae 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -224,24 +224,6 @@ typedef struct nir_variable {
>unsigned location_frac:2;
>  
>/**
> -   * Non-zero if this variable was created by lowering a named interface
> -   * block which was not an array.
> -   *
> -   * Note that this variable and \c from_named_ifc_block_array will never
> -   * both be non-zero.
> -   */
> -  unsigned from_named_ifc_block_nonarray:1;
> -
> -  /**
> -   * Non-zero if this variable was created by lowering a named interface
> -   * block which was an array.
> -   *
> -   * Note that this variable and \c from_named_ifc_block_nonarray will 
> never
> -   * both be non-zero.
> -   */
> -  unsigned from_named_ifc_block_array:1;
> -
> -  /**
> * \brief Layout qualifier for gl_FragDepth.
> *
> * This is not equal to \c ir_depth_layout_none if and only if this
> 


Yep, it doesn't make much sense to have these post-linking.

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] glsl: Disallow transform feedback varyings with compute shaders.

2016-02-09 Thread Kenneth Graunke

If the only stage is MESA_SHADER_COMPUTE, we should complain that
there's nothing coming out of the geometry shader stage just as
we would if the first stage were MESA_SHADER_FRAGMENT.

Also, it's valid for tessellation shaders to be the stage producing
transform feedback varyings, so mention those in the compiler error.

Found by inspection.

Signed-off-by: Kenneth Graunke 
Cc: Timothy Arceri 
---
 src/compiler/glsl/linker.cpp | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index f1ac53a..3f871ab 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4452,9 +4452,10 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
* non-zero, but the program object has no vertex or geometry
* shader;
*/
-  if (first == MESA_SHADER_FRAGMENT) {
+  if (first >= MESA_SHADER_FRAGMENT) {
  linker_error(prog, "Transform feedback varyings specified, but "
-  "no vertex or geometry shader is present.\n");
+  "no vertex, tessellation, or geometry shader is "
+  "present.\n");
  goto done;
   }
 
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/4] glsl: apply align layout qualifier rules to block offsets

2016-02-09 Thread Samuel Iglesias Gonsálvez


On Tue, 2016-01-12 at 20:34 +1100, Timothy Arceri wrote:
> From Section 4.4.5 (Uniform and Shader Storage Block Layout
> Qualifiers) of the OpenGL 4.50 spec:
> 
>   "The align qualifier makes the start of each block member have a
>   minimum byte alignment.  It does not affect the internal layout
>   within each member, which will still follow the std140 or std430
>   rules. The specified alignment must be a power of 2, or a
>   compile-time error results.
> 
>   The actual alignment of a member will be the greater of the
>   specified align alignment and the standard (e.g., std140) base
>   alignment for the member's type. The actual offset of a member is
>   computed as follows: If offset was declared, start with that
>   offset, otherwise start with the next available offset. If the
>   resulting offset is not a multiple of the actual alignment,
>   increase it to the first offset that is a multiple of the actual
>   alignment. This results in the actual offset the member will have.
> 
>   When align is applied to an array, it affects only the start of
>   the array, not the array's internal stride. Both an offset and an
>   align qualifier can be specified on a declaration.
> 
>   The align qualifier, when used on a block, has the same effect as
>   qualifying each member with the same align value as declared on
>   the block, and gets the same compile-time results and errors as if
>   this had been done. As described in general earlier, an individual
>   member can specify its own align, which overrides the block-level
>   align, but just for that member.
> ---
>  src/glsl/ast_to_hir.cpp | 51
> ++---
>  1 file changed, 48 insertions(+), 3 deletions(-)
> 
> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> index f5da771..fd05fd7 100644
> --- a/src/glsl/ast_to_hir.cpp
> +++ b/src/glsl/ast_to_hir.cpp
> @@ -6212,7 +6212,8 @@
> ast_process_struct_or_iface_block_members(exec_list *instructions,
>    ir_variable_mode var_mode,
>    ast_type_qualifier
> *layout,
>    unsigned block_stream,
> -  unsigned expl_location)
> +  unsigned expl_location,
> +  unsigned expl_align)
>  {
> unsigned decl_count = 0;
> unsigned next_offset = 0;
> @@ -6466,6 +6467,34 @@
> ast_process_struct_or_iface_block_members(exec_list *instructions,
>  }
>   } else {
>  fields[i].offset = -1;
> + }
> +
> + if (qual->flags.q.explicit_align || expl_align != 0) {
> +unsigned offset = fields[i].offset != -1 ?
> fields[i].offset :
> +   next_offset;
> +if (align == 0 || size == 0) {
> +   _mesa_glsl_error(, state, "align can only be used
> with "
> +"std430 and std140 layouts");
> +} else if (qual->flags.q.explicit_align) {
> +   unsigned member_align;
> +   if (process_qualifier_constant(state, , "align",
> +  qual->align,
> _align)) {
> +  if (member_align == 0 ||

I have modified ubo-block-align-zero.vert piglit test to set the align
qualifier only to block members and tested with align = 0. The shader
compiles on NVIDIA proprietary driver.

According to the spec, the specified alignment must be a power of 2.
However align = 0 could have different interpretations (for example,
when applied to a block member, it would mean to ignore block's align
value). As I am not sure about this case...

Have you checked if align = 0 is invalid?

Sam

> +  member_align & (member_align - 1)) {
> + _mesa_glsl_error(, state, "align layout
> qualifier "
> +  "in not a power of 2");
> +  } else {
> + fields[i].offset = glsl_align(offset,
> member_align);
> + next_offset = glsl_align(fields[i].offset +
> size, align);
> +  }
> +   }
> +} else {
> +   fields[i].offset = glsl_align(offset, expl_align);
> +   next_offset = glsl_align(fields[i].offset + size,
> align);
> +}
> + }
> +
> + if (!qual->flags.q.explicit_offset) {
>  if (align != 0 && size != 0)
> next_offset = glsl_align(next_offset + size, align);
>   }
> @@ -6590,7 +6619,8 @@ ast_struct_specifier::hir(exec_list
> *instructions,
>  ir_var_auto,
>  layout,
>  0, /* for interface
> only */
> -expl_location);
> +

Re: [Mesa-dev] [PATCH] st/mesa: check ureg_create() retval in create_pbo_upload_vs()

2016-02-09 Thread Nicolai Hähnle


On 09.02.2016 05:40, Samuel Pitoiset wrote:

This avoids a possible NULL dereference because ureg_create() might
return a NULL pointer.

Spotted by coverity.


Thanks.

Reviewed-by: Nicolai Hähnle 



Signed-off-by: Samuel Pitoiset 
Cc: Nicolai Hähnle 
---
  src/mesa/state_tracker/st_cb_texture.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/src/mesa/state_tracker/st_cb_texture.c 
b/src/mesa/state_tracker/st_cb_texture.c
index e9ac9a6..bdd2f63 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -1132,6 +1132,8 @@ create_pbo_upload_vs(struct st_context *st)
 struct ureg_dst out_layer;

 ureg = ureg_create(TGSI_PROCESSOR_VERTEX);
+   if (!ureg)
+  return NULL;

 in_pos = ureg_DECL_vs_input(ureg, TGSI_SEMANTIC_POSITION);



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] st/mesa: don't allocate bitmap drawing state until needed

2016-02-09 Thread Nicolai Hähnle


On 08.02.2016 12:07, Brian Paul wrote:

Most apps don't use glBitmap so don't allocate the bitmap cache or
gallium state objects/shaders/etc until the first call to st_Bitmap().
---
  src/mesa/state_tracker/st_cb_bitmap.c | 145 ++
  src/mesa/state_tracker/st_cb_bitmap.h |   3 -
  src/mesa/state_tracker/st_context.c   |   1 -
  3 files changed, 77 insertions(+), 72 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_bitmap.c 
b/src/mesa/state_tracker/st_cb_bitmap.c
index c26ee7f..ca1dfab 100644
--- a/src/mesa/state_tracker/st_cb_bitmap.c
+++ b/src/mesa/state_tracker/st_cb_bitmap.c
@@ -497,8 +497,9 @@ create_cache_trans(struct st_context *st)
  void
  st_flush_bitmap_cache(struct st_context *st)
  {
-   if (!st->bitmap.cache->empty) {
-  struct bitmap_cache *cache = st->bitmap.cache;
+   struct bitmap_cache *cache = st->bitmap.cache;
+
+   if (cache && !st->bitmap.cache->empty) {
struct pipe_context *pipe = st->pipe;
struct pipe_sampler_view *sv;

@@ -617,6 +618,76 @@ accum_bitmap(struct gl_context *ctx,
  }


+/**
+ * One-time init for drawing bitmaps.
+ */
+static void
+init_bitmap_state(struct st_context *st)
+{
+   struct pipe_sampler_state *sampler = >bitmap.samplers[0];
+   struct pipe_context *pipe = st->pipe;
+   struct pipe_screen *screen = pipe->screen;
+
+   /* This function should only be called once */
+   assert(st->bitmap.cache == NULL);
+
+   /* alloc bitmap cache object */
+   st->bitmap.cache = ST_CALLOC_STRUCT(bitmap_cache);


What if the allocation fails? I know the old code didn't check this 
either, but...


Apart from this and the comment by Gustaw, the series is

Reviewed-by: Nicolai Hähnle 


+
+   /* init sampler state once */
+   memset(sampler, 0, sizeof(*sampler));
+   sampler->wrap_s = PIPE_TEX_WRAP_CLAMP;
+   sampler->wrap_t = PIPE_TEX_WRAP_CLAMP;
+   sampler->wrap_r = PIPE_TEX_WRAP_CLAMP;
+   sampler->min_img_filter = PIPE_TEX_FILTER_NEAREST;
+   sampler->min_mip_filter = PIPE_TEX_MIPFILTER_NONE;
+   sampler->mag_img_filter = PIPE_TEX_FILTER_NEAREST;
+   st->bitmap.samplers[1] = *sampler;
+   st->bitmap.samplers[1].normalized_coords = 1;
+
+   /* init baseline rasterizer state once */
+   memset(>bitmap.rasterizer, 0, sizeof(st->bitmap.rasterizer));
+   st->bitmap.rasterizer.half_pixel_center = 1;
+   st->bitmap.rasterizer.bottom_edge_rule = 1;
+   st->bitmap.rasterizer.depth_clip = 1;
+
+   /* find a usable texture format */
+   if (screen->is_format_supported(screen, PIPE_FORMAT_I8_UNORM,
+   PIPE_TEXTURE_2D, 0,
+   PIPE_BIND_SAMPLER_VIEW)) {
+  st->bitmap.tex_format = PIPE_FORMAT_I8_UNORM;
+   }
+   else if (screen->is_format_supported(screen, PIPE_FORMAT_A8_UNORM,
+PIPE_TEXTURE_2D, 0,
+PIPE_BIND_SAMPLER_VIEW)) {
+  st->bitmap.tex_format = PIPE_FORMAT_A8_UNORM;
+   }
+   else if (screen->is_format_supported(screen, PIPE_FORMAT_L8_UNORM,
+PIPE_TEXTURE_2D, 0,
+PIPE_BIND_SAMPLER_VIEW)) {
+  st->bitmap.tex_format = PIPE_FORMAT_L8_UNORM;
+   }
+   else {
+  /* XXX support more formats */
+  assert(0);
+   }
+
+   /* Create the vertex shader */
+   {
+  const uint semantic_names[] = { TGSI_SEMANTIC_POSITION,
+  TGSI_SEMANTIC_COLOR,
+st->needs_texcoord_semantic ? TGSI_SEMANTIC_TEXCOORD :
+  TGSI_SEMANTIC_GENERIC };
+  const uint semantic_indexes[] = { 0, 0, 0 };
+  st->bitmap.vs = util_make_vertex_passthrough_shader(st->pipe, 3,
+  semantic_names,
+  semantic_indexes,
+  FALSE);
+   }
+
+   reset_cache(st);
+}
+

  /**
   * Called via ctx->Driver.Bitmap()
@@ -632,6 +703,10 @@ st_Bitmap(struct gl_context *ctx, GLint x, GLint y,
 assert(width > 0);
 assert(height > 0);

+   if (!st->bitmap.cache) {
+  init_bitmap_state(st);
+   }
+
 /* We only need to validate state of the st dirty flags are set or
  * any non-_NEW_PROGRAM_CONSTANTS mesa flags are set.  The VS we use
  * for bitmap drawing uses no constants and the FS constants are
@@ -641,19 +716,6 @@ st_Bitmap(struct gl_context *ctx, GLint x, GLint y,
st_validate_state(st);
 }

-   if (!st->bitmap.vs) {
-  /* create pass-through vertex shader now */
-  const uint semantic_names[] = { TGSI_SEMANTIC_POSITION,
-  TGSI_SEMANTIC_COLOR,
-st->needs_texcoord_semantic ? TGSI_SEMANTIC_TEXCOORD :
-  TGSI_SEMANTIC_GENERIC };
-  const uint semantic_indexes[] = { 0, 0, 0 };
-  st->bitmap.vs = util_make_vertex_passthrough_shader(st->pipe,

Re: [Mesa-dev] [PATCH 1/4] glsl: add component to has_layout() helper

2016-02-09 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez 

On Tue, 2016-01-12 at 20:34 +1100, Timothy Arceri wrote:
> I don't think this will do much as it's a compiler error
> to use component without location which is already in the
> table but its good to be consistent.
> ---
>  src/glsl/ast_type.cpp | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/glsl/ast_type.cpp b/src/glsl/ast_type.cpp
> index 7330a34..6d6f88f 100644
> --- a/src/glsl/ast_type.cpp
> +++ b/src/glsl/ast_type.cpp
> @@ -73,6 +73,7 @@ ast_type_qualifier::has_layout() const
>    || this->flags.q.column_major
>    || this->flags.q.row_major
>    || this->flags.q.packed
> +  || this->flags.q.explicit_component
>    || this->flags.q.explicit_location
>    || this->flags.q.explicit_index
>    || this->flags.q.explicit_binding

signature.asc
Description: This is a digitally signed message part
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 4/4] docs: mark align layout qualifier as DONE

2016-02-09 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez 

On Tue, 2016-01-12 at 20:34 +1100, Timothy Arceri wrote:
> ---
>  docs/GL3.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/docs/GL3.txt b/docs/GL3.txt
> index c64784c..1377eb8 100644
> --- a/docs/GL3.txt
> +++ b/docs/GL3.txt
> @@ -181,7 +181,7 @@ GL 4.4, GLSL 4.40:
>    GL_ARB_enhanced_layouts  in progress
> (Timothy)
>    - compile-time constant expressions  DONE
>    - explicit byte offsets for blocks   DONE
> -  - forced alignment within blocks in progress
> +  - forced alignment within blocks DONE
>    - specified vec4-slot component numbers  DONE
>    - specified transform/feedback layoutin progress
>    - input/output block locations   DONE

signature.asc
Description: This is a digitally signed message part
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/4] glsl: parse align layout qualifier

2016-02-09 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez 

On Tue, 2016-01-12 at 20:34 +1100, Timothy Arceri wrote:
> ---
>  src/glsl/ast.h  | 11 +++
>  src/glsl/ast_type.cpp   |  4 
>  src/glsl/glsl_parser.yy | 11 +++
>  3 files changed, 26 insertions(+)
> 
> diff --git a/src/glsl/ast.h b/src/glsl/ast.h
> index e22deed..0b6871a 100644
> --- a/src/glsl/ast.h
> +++ b/src/glsl/ast.h
> @@ -479,6 +479,12 @@ struct ast_type_qualifier {
>    unsigned pixel_center_integer:1;
>    /*@}*/
>  
> + /**
> +  * Flag set if GL_ARB_enhanced_layouts "align" layout
> qualifier is
> +  * used.
> +  */
> + unsigned explicit_align:1;
> +
>    /**
>     * Flag set if GL_ARB_explicit_attrib_location "location"
> layout
>     * qualifier is used.
> @@ -583,6 +589,11 @@ struct ast_type_qualifier {
> /** Precision of the type (highp/medium/lowp). */
> unsigned precision:2;
>  
> +   /**
> +* Alignment specified via GL_ARB_enhanced_layouts "align" layout
> qualifier
> +*/
> +   ast_expression *align;
> +
> /** Geometry shader invocations for GL_ARB_gpu_shader5. */
> ast_layout_expression *invocations;
>  
> diff --git a/src/glsl/ast_type.cpp b/src/glsl/ast_type.cpp
> index 6d6f88f..f4e51b8 100644
> --- a/src/glsl/ast_type.cpp
> +++ b/src/glsl/ast_type.cpp
> @@ -73,6 +73,7 @@ ast_type_qualifier::has_layout() const
>    || this->flags.q.column_major
>    || this->flags.q.row_major
>    || this->flags.q.packed
> +  || this->flags.q.explicit_align
>    || this->flags.q.explicit_component
>    || this->flags.q.explicit_location
>    || this->flags.q.explicit_index
> @@ -268,6 +269,9 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
>  
> this->flags.i |= q.flags.i;
>  
> +   if (q.flags.q.explicit_align)
> +  this->align = q.align;
> +
> if (q.flags.q.explicit_location)
>    this->location = q.location;
>  
> diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
> index b2b94f4..83bebe9 100644
> --- a/src/glsl/glsl_parser.yy
> +++ b/src/glsl/glsl_parser.yy
> @@ -1464,6 +1464,17 @@ layout_qualifier_id:
>    "GLSL 4.40 or ARB_enhanced_layouts");
>    }
>  
> +  if (match_layout_qualifier("align", $1, state) == 0) {
> + if (!state->has_enhanced_layouts()) {
> +_mesa_glsl_error(& @1, state,
> + "align qualifier requires "
> + "GLSL 4.40 or ARB_enhanced_layouts");
> + } else {
> +$$.flags.q.explicit_align = 1;
> +$$.align = $3;
> + }
> +  }
> +
>    if (match_layout_qualifier("location", $1, state) == 0) {
>   $$.flags.q.explicit_location = 1;
>  

signature.asc
Description: This is a digitally signed message part
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 22/23] i965: Add helper for checking for lossless compressible

2016-02-09 Thread Pohjolainen, Topi

On Mon, Feb 08, 2016 at 06:51:42PM +0200, Topi Pohjolainen wrote:
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 21 +
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  3 +++
>  2 files changed, 24 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 1fd2654..59961f2 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -270,6 +270,27 @@ intel_miptree_supports_non_msrt_fast_clear(struct 
> brw_context *brw,
>return true;
>  }
>  
> +bool
> +intel_miptree_supports_lossless_compressed(mesa_format format)
> +{
> +   /* For now compression is only enabled for integer formats even though
> +* there exist supported floating point formats also. This is a heuristic
> +* decision based on current public benchmarks. In none of the cases these
> +* formats provided any improvement but a few cases were seen to regress.
> +* Hence these are left to to be enabled in the future when they are known
> +* to improve things.
> +*/
> +   if (!_mesa_is_format_integer_color(format))

This is wrong afterall, it doesn't take GL_UNSIGNED_NORMALIZED into
account which is definitely needed. The performance numbers are valid though,
I got them having an explicit check against MESA_FORMAT_RGBA_FLOAT16 here
allowing MESA_FORMAT_R8G8B8A8_UNORM through that makes the difference in
the benches.

> +  return false;
> +
> +   /* In principle, fast clear mechanism and lossless compression go hand in
> +* hand. However, fast clear can be also used to clear srgb surfaces by
> +* using equivalent linear format. This trick, however, can't be extended
> +* to be used with lossless compression and therefore a check is needed to
> +* see if the format really is linear.
> +*/
> +   return _mesa_get_srgb_format_linear(format) == format;
> +}
>  
>  /**
>   * Determine depth format corresponding to a depth+stencil format,
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> index f05436d..3a1ecd2 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> @@ -684,6 +684,9 @@ intel_miptree_supports_non_msrt_fast_clear(struct 
> brw_context *brw,
> const struct intel_mipmap_tree 
> *mt);
>  
>  bool
> +intel_miptree_supports_lossless_compressed(mesa_format format);
> +
> +bool
>  intel_miptree_alloc_non_msrt_mcs(struct brw_context *brw,
>   struct intel_mipmap_tree *mt);
>  
> -- 
> 2.5.0
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 00/12] Some more Nine fixes

2016-02-09 Thread Emil Velikov

On 7 February 2016 at 23:13, Axel Davy  wrote:
> A few more patches I'd like to get in 11.2.
>
> There a few cleanup patches and some fixes.
>
> The last patch fixes build with llvm 32 bits
> when it isn't built with -mstackrealign.
> Basically Apps have a 4 byte aligned stack,
> and it needs to be converted at some point
> to 16 byte aligned stack to have SSE code
> and llvm work correctly. I think the better
> is to just realign at d3d entry points.
>
> Any suggestion whether that last patch should
> be sent to mesa stable or not ?
>
As is the patch is a bit large, I'm thinking if it wouldn't be better
to get it in if we get (m)any reports ?
Although I won't object if you/others want to squeeze it in.


>   st/nine: Align stack for entry points
>
As mentioned on IRC, one can just add the alignment to the definition
of WINAP in d3dtypes.h. Other, non nine, users of the macro, should
not include d3dtypes.h so we ought to be fine. Alternatively ...


>  create mode 100644 src/gallium/state_trackers/nine/nine_alignment.h
>
Please add this new file to the list in
src/gallium/state_trackers/nine/Makefile.sources

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 93813] Incorrect viewport range when GL_CLIP_ORIGIN is GL_UPPER_LEFT

2016-02-09 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=93813

--- Comment #6 from Jose Fonseca  ---
(In reply to Mathias Fröhlich from comment #4)
> I believe that the piglit tests don't uses non zero y values for
> the viewport. May be we want to add a test there also.

Definitely.  And in fact James example seems an excellent way to verify it.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH V2 2/2] glsl: clean up and fix bug in varying linking rules

2016-02-09 Thread Timothy Arceri

On Tue, 2016-02-09 at 01:29 -0800, Kenneth Graunke wrote:
> On Tuesday, February 2, 2016 12:20:01 PM PST Timothy Arceri wrote:
> > The existing code was very hard to follow and has been the source
> > of at least 3 bugs in the past year.
> > 
> > The existing code also has a bug for SSO where if we have a
> > multi-stage SSO for example a tes -> gs program, if we try to use
> > transform feedback with gs the existing code would look for the
> > transform feedback varyings in the tes stage and fail as it can't
> > find them.
> > 
> > V2: Add more code comments, always try to remove unused inputs
> > to the first stage.
> > 
> > Cc: Ilia Mirkin 
> > Cc: Chris Forbes 
> > ---
> >  src/compiler/glsl/linker.cpp | 137 ---
> > 
> >  1 file changed, 63 insertions(+), 74 deletions(-)
> > 
> > diff --git a/src/compiler/glsl/linker.cpp
> > b/src/compiler/glsl/linker.cpp
> > index fdfdcaa..09323bb 100644
> > --- a/src/compiler/glsl/linker.cpp
> > +++ b/src/compiler/glsl/linker.cpp
> > @@ -4461,91 +4461,80 @@ link_shaders(struct gl_context *ctx, struct
> > gl_shader_program *prog)
> >   goto done;
> > }
> >  
> > -   /* Linking the stages in the opposite order (from fragment to
> > vertex)
> > -* ensures that inter-shader outputs written to in an earlier
> > stage are
> > -* eliminated if they are (transitively) not used in a later
> > stage.
> > +   /* If there is no fragment shader we need to set transform
> > feedback.
> > +*
> > +* For SSO we need also need to assign output locations, we
> > assign them
> > +* here because we need to do it for both single stage programs
> > and multi
> > +* stage programs.
> >  */
> > -   int next;
> > -
> > -   if (first < MESA_SHADER_FRAGMENT) {
> > -  gl_shader *const sh = prog->_LinkedShaders[last];
> > -
> > -  if (first != MESA_SHADER_VERTEX) {
> > - /* There was no vertex shader, but we still have to
> > assign varying
> > -  * locations for use by tessellation/geometry shader
> > inputs in SSO.
> > -  *
> > -  * If the shader is not separable (i.e., prog-
> > >SeparateShader is
> > -  * false), linking will have already failed when first is
> > not
> > -  * MESA_SHADER_VERTEX.
> > -  */
> > - if (!assign_varying_locations(ctx, mem_ctx, prog,
> > -   NULL, prog-
> > >_LinkedShaders[first],
> > -   num_tfeedback_decls,
> > tfeedback_decls))
> > -goto done;
> > -  }
> > -
> > -  if (last != MESA_SHADER_FRAGMENT &&
> > - (num_tfeedback_decls != 0 || prog->SeparateShader)) {
> > - /* There was no fragment shader, but we still have to
> > assign varying
> > -  * locations for use by transform feedback.
> > -  */
> > - if (!assign_varying_locations(ctx, mem_ctx, prog,
> > -   sh, NULL,
> > -   num_tfeedback_decls,
> > tfeedback_decls))
> > -goto done;
> > -  }
> > -
> > -  do_dead_builtin_varyings(ctx, sh, NULL,
> > -   num_tfeedback_decls,
> > tfeedback_decls);
> > +   if (last < MESA_SHADER_FRAGMENT &&
> > +   (num_tfeedback_decls != 0 || prog->SeparateShader)) {
> > +  if (!assign_varying_locations(ctx, mem_ctx, prog,
> > +prog->_LinkedShaders[last],
> > NULL,
> > +num_tfeedback_decls,
> > tfeedback_decls))
> > + goto done;
> > +   }
> >  
> > -  remove_unused_shader_inputs_and_outputs(prog-
> > >SeparateShader, sh,
> > +   if (last <= MESA_SHADER_FRAGMENT) {
> 
> I don't understand the point of last <= MESA_SHADER_FRAGMENT here.
> The only other option is MESA_SHADER_COMPUTE, which has no inputs or
> outputs, so calling this should be harmless.

Since the existing code was so hard to follow I wrote this by seeing
how little code I could get away with and testing until all regressions
went away. Anyway it wasn't harmless when compute reached this code,
although I don't recall exactly what the problem was as it was a couple
of weeks ago.

> 
> > +  /* Remove unused varyings from the first/last stage unless
> > SSO */
> > +  remove_unused_shader_inputs_and_outputs(prog-
> > >SeparateShader,
> > +  prog-
> > >_LinkedShaders[first],
> > +  ir_var_shader_in);
> > +  remove_unused_shader_inputs_and_outputs(prog-
> > >SeparateShader,
> > +  prog-
> > >_LinkedShaders[last],
> >    ir_var_shader_out);
> 
> I see, assign_varying_locations calls this for internal stages, when
> there's a producer and consumer - this just calls it for the boundary
> shaders in the !SSO case.

[Mesa-dev] [Bug 93813] Incorrect viewport range when GL_CLIP_ORIGIN is GL_UPPER_LEFT

2016-02-09 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=93813

Jose Fonseca  changed:

   What|Removed |Added

 CC||bri...@vmware.com,
   ||jfons...@vmware.com,
   ||srol...@vmware.com

--- Comment #5 from Jose Fonseca  ---
(In reply to Ilia Mirkin from comment #3)
> A bit of potentially-relevant spec quote from ARB_clip_control:
> 
> 12) Does setting the clip control origin to GL_UPPER_LEFT change the
> origin of the window coordinate system use for commands such as
> glViewport, glScissor, glWindowPos2i, and glReadPixels?
> 
> RESOLVED:  No.
> 
> The (x,y) window space location passed to these commands have the
> (0,0) origin at the lower left corner of the window, independent
> of the state of the clip control origin.
> 
> Still unsure whether that rules in favor of Mesa or of the test case :)
> Hopefully someone more knowledgeable can weigh in.

Hi Ilia,

First of all, let me warn that I'm not an expert on this extension.  I
appreciate your perseverance getting to the bottom at this though.  In fact, we
(at VMware) actually ran into issues using this extension, but the person most
knowledgeable about it is currently on PTO.

FWIW, I looked into this, and the way I interpret it, the text of issue 12) is
actually conclusive evidence that Mesa is *wrong*:

- The attached draws a red rectangle covering the whole viewport, as xyzw
coordinates before clipping will be

-1.0, -1.0, 0.5, 1.0
 3.0, -1.0, 0.5, 1.0
-1.0,  3.0, 0.5, 1.0
 3.0,  3.0, 0.5, 1.0

  hence the expected rendering is that all pixels inside viewport will be red,
all outside will be black (the clear color)

- Issue 12) states that the _window coordinate system_ for glViewport should be
the same

- Hence, we should get the _exact_ same rendering, whether we comment or not
the 

glClipControl(GL_UPPER_LEFT, GL_ZERO_TO_ONE)

  call in the attached example.

  That is, it should doesn't matter where's the clip origin, as the primitive
completely cover the viewport, and all fragments have the same color.  

- But that's not what's happening:

  - if one comments out glClipControl one gets the viewport pixels painted red

  - if one does not comment the glClipControl only some viewport pixels are
painted red, but some are painted black, which really shouldn't happen.




For reference, this are a few modifications I did to the attached sample to
better understand it:

--- fdo121191.c.orig2016-02-09 11:37:33.122145252 +
+++ fdo121191.c2016-02-09 11:40:23.597820907 +
@@ -68,7 +68,8 @@
 typedef void (APIENTRYP PFNGLCLIPCONTROLPROC) (GLenum origin, GLenum
depth);
 const auto glClipControl = reinterpret_cast(
 SDL_GL_GetProcAddress("glClipControl"));
-glClipControl(GL_UPPER_LEFT, GL_ZERO_TO_ONE);
+// XXX: We should get the same rendering regardless we have 1 or 0 here
+if (1) glClipControl(GL_UPPER_LEFT, GL_ZERO_TO_ONE);

 GLuint texture;
 glGenTextures(1, );
@@ -160,7 +161,12 @@

 assert(glGetError() == GL_NO_ERROR);

-std::cout << colours[0] << ", " << colours[2] << "\n";
+for (unsigned y = 0; y < textureSizeY; ++y) {
+for (unsigned x = 0; x < textureSizeX; ++x) {
+std::cout << colours[y*textureSizeX + x] << "\t";
+}
+std::cout << "\n";
+}
 // This was outside the viewport, so should be black from the clear
 assert(colours[0] == 0.0f);
 // This was inside the viewport, so should be red from the draw

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 93813] Incorrect viewport range when GL_CLIP_ORIGIN is GL_UPPER_LEFT

2016-02-09 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=93813

--- Comment #7 from Brian Paul  ---
I think Ilia's fix in comment #1 is correct.

BTW, the piglit clip-control test currently passes with Mesa but fails with
nvidia's driver.  I'll try to take a closer look.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH V3 1/2] glsl: don't attempt to link empty program

2016-02-09 Thread Ian Romanick

On 01/25/2016 06:33 PM, Timothy Arceri wrote:
> On Mon, 2016-01-25 at 16:58 -0800, Ian Romanick wrote:
>> On 01/25/2016 04:46 PM, Timothy Arceri wrote:
>>> Previously an empty program would go through the entire
>>> link_shaders() function and we would have to be careful
>>> not to cause a segfault.
>>>
>>> In core profile also now set link_status to false by
>>> generating an error, it was previously set to true.
>>>
 From Section 7.3 (PROGRAM OBJECTS) of the OpenGL 4.5 spec:
>>>
>>>"Linking can fail for a variety of reasons as specified in the
>>>OpenGL Shading Language Specification, as well as any of the
>>>following reasons:
>>>
>>> - No shader objects are attached to program."
>>>
>>> V2: Only generate an error in core profile and add spec quote (Ian)
>>>
>>> V3: generate error in ES too, remove previous check which was only
>>> applying the rule to GL 4.5/ES 3.1 and above. My understand is that
>>> this spec change is clarifying previously undefined behaviour and
>>> therefore should be applied retrospectively. The ES CTS tests for
>>> this are in ES 2 I suspect it was passing because it would have
>>> generated an error for not having both a vertex and fragment
>>> shader.
>>>
>>> Cc: Ian Romanick 
>>> Cc: Tapani Pälli 
>>> Cc: Ilia Mirkin 
>>> ---
>>>  src/glsl/linker.cpp | 43 ++---
>>> --
>>>  1 file changed, 22 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
>>> index 665..79039a0 100644
>>> --- a/src/glsl/linker.cpp
>>> +++ b/src/glsl/linker.cpp
>>> @@ -4105,14 +4105,34 @@
>>> disable_varying_optimizations_for_sso(struct gl_shader_program
>>> *prog)
>>>  void
>>>  link_shaders(struct gl_context *ctx, struct gl_shader_program
>>> *prog)
>>>  {
>>> +   prog->Validated = false;
>>> +   prog->_Used = false;
>>> +
>>> +   /* Section 7.3 (Program Objects) of the OpenGL 4.5 Core Profile
>>> spec says:
>>> +*
>>> +* "Linking can fail for a variety of reasons as specified
>>> in the
>>> +* OpenGL Shading Language Specification, as well as any of
>>> the
>>> +* following reasons:
>>> +*
>>> +* - No shader objects are attached to program."
>>> +*
>>> +* The Compatibility Profile specification does not list the
>>> error.  In
>>> +* Compatibility Profile missing shader stages are replaced by
>>> +* fixed-function.  This applies to the case where all stages
>>> are
>>> +* missing.
>>> +*/
>>> +   if (prog->NumShaders == 0) {
>>> +  if (ctx->API != API_OPENGL_COMPAT)
>>> + linker_error(prog, "no shaders attached to the
>>> program\n");
>>> +  return;
>>
>> Does glsl-link-empty-prog-02 still pass?  Since prog->LinkStatus is
>> false, I think that will cause problems.  Won't glUseProgram generate
>> GL_INVALID_OPERATION?
> 
> It does pass, seems something sets the link flag to true before we get
> here. Maybe the fallback code?
> 
> Anyway it makes sense to also move the prog->LinkStatus = true; to the
> top so I've done that locally.

With the 'prog->LinkStatus = true;' moved, this patch is

Reviewed-by: Ian Romanick 

>>> +   }
>>> +
>>> tfeedback_decl *tfeedback_decls = NULL;
>>> unsigned num_tfeedback_decls = prog-
 TransformFeedback.NumVarying;
>>>  
>>> void *mem_ctx = ralloc_context(NULL); // temporary linker
>>> context
>>>  
>>> prog->LinkStatus = true; /* All error paths will set this to
>>> false */
>>> -   prog->Validated = false;
>>> -   prog->_Used = false;
>>>  
>>> prog->ARB_fragment_coord_conventions_enable = false;
>>>  
>>> @@ -4162,25 +4182,6 @@ link_shaders(struct gl_context *ctx, struct
>>> gl_shader_program *prog)
>>> prog->Version = max_version;
>>> prog->IsES = is_es_prog;
>>>  
>>> -   /* From OpenGL 4.5 Core specification (7.3 Program Objects):
>>> -* "Linking can fail for a variety of reasons as specified
>>> in the OpenGL
>>> -* Shading Language Specification, as well as any of the
>>> following
>>> -* reasons:
>>> -*
>>> -* * No shader objects are attached to program.
>>> -*
>>> -* ..."
>>> -*
>>> -* Same rule applies for OpenGL ES >= 3.1.
>>> -*/
>>> -
>>> -   if (prog->NumShaders == 0 &&
>>> -   ((ctx->API == API_OPENGL_CORE && ctx->Version >= 45) ||
>>> -(ctx->API == API_OPENGLES2 && ctx->Version >= 31))) {
>>> -  linker_error(prog, "No shader objects are attached to
>>> program.\n");
>>> -  goto done;
>>> -   }
>>> -
>>> /* Some shaders have to be linked with some other shaders
>>> present.
>>>  */
>>> if (num_shaders[MESA_SHADER_GEOMETRY] > 0 &&
>>>
>>

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] GLSL IR is no longer cool where to from here?

2016-02-09 Thread Ian Romanick

On 02/05/2016 06:57 PM, Timothy Arceri wrote:
> For the past couple of months I've been working away solely in the
> wasteland that is GLSL IR and one things seems clear. No one wants to
> review this code anymore. A lot of the original developers have either
> moved on or are busy with other things.
> 
> The difference between sending a patch with nir: ... vs glsl: ... is
> very noticable.
> 
> Its not impossible to get reviews for patches, especially if they are a
> small part of a bigger series not just confined to GLSL IR, but
> anything involving a refactoring can be difficult as no-one wants to
> relearn how this code works, step up to the ast code and things are
> even worse.
> 
> So I guess the discussion I'm trying to kick off is, with the CTS, dEPQ
> and piglit all saying no regressions (or even reporting fixes \0/).
> Should one still be forced to go around hassling people for a rubber
> stamped r-b? Or can we relax the criteria for pushing bug-fix/refactor
> type patches for GLSL-IR?
> 
> The other thing I've consider is maybe this I just don't review enough
> patches for people to reciprocate, although I've made an effort to
> review patches where I feel I can since being employed to work on Mesa
> so hopefully thats not it. Maybe it's time for Ken to run his script
> again :-P

At this point, most of it is just twitchy, annoying code that only a
couple people still really know.

I think one or two lazy people (eh-hem... maybe just one) need to get
off their butts and start actively reviewing your patches.  Please don't
start pushing unreviewed code.  That never turns out well.  There have
been enough reviewed patches this year already that have had problems...
some of them leading to reverts.  We don't need unreviewed code too.

> Tim
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 6/7] glx: remove incorrect refcounting of DRIdrawable

2016-02-09 Thread Ian Romanick

On 02/08/2016 05:11 PM, Ian Romanick wrote:
> On 02/05/2016 01:11 PM, Miklós Máté wrote:
>> dri drawables must never be released when unbound from a context
>> as long as their corresponding glx objects (window, pixmap, pbuffer)
>> still exist
> 
> I'd really like to have Kristian weigh in, since DRI2 was his design,
> and this is all his code being affected.  That said, I'm a little unsure
> about this change.
> 
> The DRI drawables and associated resources will still get freed when the
> application eventually calls glXDestroyPBuffer or glXDestroyWindow.
> 
> But I'm not 100% sure what will happen with GLX 1.2 applications...
> there are no glXDestroy* functions in GLX 1.2.  In that case the
> application will have a soft leak because calling XDestroyWindow won't
> do anything on the client (or server?) for the DRI resources.  I suspect
> that's what this code was trying to prevent.

And this commit affirms that:

commit bf69ce37f0dcbb479078ee676d5100ac63e20750
Author: Stéphane Marchesin 
Date:   Wed Jun 15 15:09:12 2011 -0700

glx: implement drawable refcounting.

The current dri context unbind logic will leak drawables until the process
dies (they will then get released by the GEM code). There are two ways to 
fix
this: either always call driReleaseDrawables every time we unbind a context
(but that costs us round trips to the X server at getbuffers() time) or
implement proper drawable refcounting. This patch implements the latter.

Signed-off-by: Antoine Labour 
Signed-off-by: Stéphane Marchesin 
Reviewed-by: Adam Jackson 

Since we don't have any way to know when the corresponding GLX object
ceases to exist, we don't know when it's safe to destroy the DRI
object.  I don't know how we can not leak and not prematurely destroy
the object. :(

>> this fixes fd.o bug #93955 and disappearing characters in KotOR when
>> soft shadows are enabled
> 
> I think this is also a candidate for stable.  Reference this in the
> commit message like:
> 
> This fixes disappearing characters in KotOR when soft shadows are enabled.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93955
> Cc: "11.1" 
> 
>> ---
>>  src/glx/dri2_glx.c   |  4 
>>  src/glx/dri3_glx.c   |  4 
>>  src/glx/dri_common.c | 38 --
>>  src/glx/dri_glx.c|  4 
>>  src/glx/drisw_glx.c  |  4 
>>  src/glx/glxclient.h  |  1 -
>>  6 files changed, 55 deletions(-)
>>
>> diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
>> index 7710349..ebc878f 100644
>> --- a/src/glx/dri2_glx.c
>> +++ b/src/glx/dri2_glx.c
>> @@ -122,8 +122,6 @@ dri2_destroy_context(struct glx_context *context)
>> struct dri2_context *pcp = (struct dri2_context *) context;
>> struct dri2_screen *psc = (struct dri2_screen *) context->psc;
>>  
>> -   driReleaseDrawables(>base);
>> -
>> free((char *) context->extensions);
>>  
>> (*psc->core->destroyContext) (pcp->driContext);
>> @@ -145,8 +143,6 @@ dri2_bind_context(struct glx_context *context, struct 
>> glx_context *old,
>> pdraw = (struct dri2_drawable *) driFetchDrawable(context, draw);
>> pread = (struct dri2_drawable *) driFetchDrawable(context, read);
>>  
>> -   driReleaseDrawables(>base);
>> -
>> if (pdraw)
>>dri_draw = pdraw->driDrawable;
>> else if (draw != None)
>> diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
>> index 6054ffc..38f5799 100644
>> --- a/src/glx/dri3_glx.c
>> +++ b/src/glx/dri3_glx.c
>> @@ -186,8 +186,6 @@ dri3_destroy_context(struct glx_context *context)
>> struct dri3_context *pcp = (struct dri3_context *) context;
>> struct dri3_screen *psc = (struct dri3_screen *) context->psc;
>>  
>> -   driReleaseDrawables(>base);
>> -
>> free((char *) context->extensions);
>>  
>> (*psc->core->destroyContext) (pcp->driContext);
>> @@ -206,8 +204,6 @@ dri3_bind_context(struct glx_context *context, struct 
>> glx_context *old,
>> pdraw = (struct dri3_drawable *) driFetchDrawable(context, draw);
>> pread = (struct dri3_drawable *) driFetchDrawable(context, read);
>>  
>> -   driReleaseDrawables(>base);
>> -
>> if (pdraw == NULL || pread == NULL)
>>return GLXBadDrawable;
>>  
>> diff --git a/src/glx/dri_common.c b/src/glx/dri_common.c
>> index 6728d38..2d334ab 100644
>> --- a/src/glx/dri_common.c
>> +++ b/src/glx/dri_common.c
>> @@ -404,7 +404,6 @@ driFetchDrawable(struct glx_context *gc, GLXDrawable 
>> glxDrawable)
>>return NULL;
>>  
>> if (__glxHashLookup(priv->drawHash, glxDrawable, (void *) ) == 0) {
>> -  pdraw->refcount ++;
>>return pdraw;
>> }
>>  
>> @@ -420,47 +419,10 @@ driFetchDrawable(struct glx_context *gc, GLXDrawable 
>> glxDrawable)
>>(*pdraw->destroyDrawable) (pdraw);
>>return NULL;
>> }
>> -   pdraw->refcount = 1;
>>  
>> return pdraw;
>>  }
>>  
>>

[Mesa-dev] [Bug 93813] Incorrect viewport range when GL_CLIP_ORIGIN is GL_UPPER_LEFT

2016-02-09 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=93813

--- Comment #8 from Ilia Mirkin  ---
Mathias, Jose, and Brian -- thanks to all of you for taking a look. Sounds like
everyone's in agreement that Mesa is wrong.

I'll get the ball rolling on throwing something into piglit and sending out my
patch from comment #1 as a proper patch.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mesa: fix incorrect viewport position when GL_CLIP_ORIGIN = GL_LOWER_LEFT

2016-02-09 Thread Brian Paul

Ilia Mirkin found/fixed the mistake.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93813
Cc: "11.1" 
---
 src/mesa/main/viewport.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/viewport.c b/src/mesa/main/viewport.c
index 7d891429..681e46b 100644
--- a/src/mesa/main/viewport.c
+++ b/src/mesa/main/viewport.c
@@ -456,11 +456,11 @@ _mesa_get_viewport_xform(struct gl_context *ctx, unsigned 
i,
translate[0] = half_width + x;
if (ctx->Transform.ClipOrigin == GL_UPPER_LEFT) {
   scale[1] = -half_height;
-  translate[1] = half_height - y;
} else {
   scale[1] = half_height;
-  translate[1] = half_height + y;
}
+   translate[1] = half_height + y;
+
if (ctx->Transform.ClipDepthMode == GL_NEGATIVE_ONE_TO_ONE) {
   scale[2] = 0.5 * (f - n);
   translate[2] = 0.5 * (n + f);
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/4] glsl: apply align layout qualifier rules to block offsets

2016-02-09 Thread Ian Romanick

On 02/09/2016 06:06 AM, Samuel Iglesias Gonsálvez wrote:
> 
> On Tue, 2016-01-12 at 20:34 +1100, Timothy Arceri wrote:
>> From Section 4.4.5 (Uniform and Shader Storage Block Layout
>> Qualifiers) of the OpenGL 4.50 spec:
>>
>>   "The align qualifier makes the start of each block member have a
>>   minimum byte alignment.  It does not affect the internal layout
>>   within each member, which will still follow the std140 or std430
>>   rules. The specified alignment must be a power of 2, or a
>>   compile-time error results.
>>
>>   The actual alignment of a member will be the greater of the
>>   specified align alignment and the standard (e.g., std140) base
>>   alignment for the member's type. The actual offset of a member is
>>   computed as follows: If offset was declared, start with that
>>   offset, otherwise start with the next available offset. If the
>>   resulting offset is not a multiple of the actual alignment,
>>   increase it to the first offset that is a multiple of the actual
>>   alignment. This results in the actual offset the member will have.
>>
>>   When align is applied to an array, it affects only the start of
>>   the array, not the array's internal stride. Both an offset and an
>>   align qualifier can be specified on a declaration.
>>
>>   The align qualifier, when used on a block, has the same effect as
>>   qualifying each member with the same align value as declared on
>>   the block, and gets the same compile-time results and errors as if
>>   this had been done. As described in general earlier, an individual
>>   member can specify its own align, which overrides the block-level
>>   align, but just for that member.
>> ---
>>  src/glsl/ast_to_hir.cpp | 51
>> ++---
>>  1 file changed, 48 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
>> index f5da771..fd05fd7 100644
>> --- a/src/glsl/ast_to_hir.cpp
>> +++ b/src/glsl/ast_to_hir.cpp
>> @@ -6212,7 +6212,8 @@
>> ast_process_struct_or_iface_block_members(exec_list *instructions,
>>ir_variable_mode var_mode,
>>ast_type_qualifier
>> *layout,
>>unsigned block_stream,
>> -  unsigned expl_location)
>> +  unsigned expl_location,
>> +  unsigned expl_align)
>>  {
>> unsigned decl_count = 0;
>> unsigned next_offset = 0;
>> @@ -6466,6 +6467,34 @@
>> ast_process_struct_or_iface_block_members(exec_list *instructions,
>>  }
>>   } else {
>>  fields[i].offset = -1;
>> + }
>> +
>> + if (qual->flags.q.explicit_align || expl_align != 0) {
>> +unsigned offset = fields[i].offset != -1 ?
>> fields[i].offset :
>> +   next_offset;
>> +if (align == 0 || size == 0) {
>> +   _mesa_glsl_error(, state, "align can only be used
>> with "
>> +"std430 and std140 layouts");
>> +} else if (qual->flags.q.explicit_align) {
>> +   unsigned member_align;
>> +   if (process_qualifier_constant(state, , "align",
>> +  qual->align,
>> _align)) {
>> +  if (member_align == 0 ||
> 
> I have modified ubo-block-align-zero.vert piglit test to set the align
> qualifier only to block members and tested with align = 0. The shader
> compiles on NVIDIA proprietary driver.
> 
> According to the spec, the specified alignment must be a power of 2.
> However align = 0 could have different interpretations (for example,
> when applied to a block member, it would mean to ignore block's align
> value). As I am not sure about this case...
> 
> Have you checked if align = 0 is invalid?

I looked at the ARB_enhanced_layouts spec, and it doesn't provide any
real guidance.  My gut tells me that align=0 is not valid because the
spec doesn't say what it means.  Either way, I have submitted a spec bug:

https://www.khronos.org/bugzilla/show_bug.cgi?id=1461

Does glslang allow layout(align=0)?  AMD's closed-source driver?

> Sam
> 
>> +  member_align & (member_align - 1)) {
>> + _mesa_glsl_error(, state, "align layout
>> qualifier "
>> +  "in not a power of 2");
>> +  } else {
>> + fields[i].offset = glsl_align(offset,
>> member_align);
>> + next_offset = glsl_align(fields[i].offset +
>> size, align);
>> +  }
>> +   }
>> +} else {
>> +   fields[i].offset = glsl_align(offset, expl_align);
>> +   next_offset = glsl_align(fields[i].offset + size,
>> align);
>> +}
>> + }
>> +
>> + if (!qual->flags.q.explicit_offset)

Re: [Mesa-dev] [PATCH 1/4] glsl: add component to has_layout() helper

2016-02-09 Thread Ian Romanick

Patches 1, 2, and 4 are

Reviewed-by: Ian Romanick 

Pending the result of the Khronos spec bug, patch 3 is also R-b.

On 01/12/2016 01:34 AM, Timothy Arceri wrote:
> I don't think this will do much as it's a compiler error
> to use component without location which is already in the
> table but its good to be consistent.
> ---
>  src/glsl/ast_type.cpp | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/glsl/ast_type.cpp b/src/glsl/ast_type.cpp
> index 7330a34..6d6f88f 100644
> --- a/src/glsl/ast_type.cpp
> +++ b/src/glsl/ast_type.cpp
> @@ -73,6 +73,7 @@ ast_type_qualifier::has_layout() const
>|| this->flags.q.column_major
>|| this->flags.q.row_major
>|| this->flags.q.packed
> +  || this->flags.q.explicit_component
>|| this->flags.q.explicit_location
>|| this->flags.q.explicit_index
>|| this->flags.q.explicit_binding
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] winsys/radeon: fix a wrong NUM_TILE_PIPES value from the kernel

2016-02-09 Thread Alexandre Demers

> +/* The kernel returns 12 for some cards for an unknown
reason.
> + * I thought this was supposed to be a power of two.
> + */
> +if (ws->gen == DRV_SI && ws->info.num_tile_pipes == 12)
> +ws->info.num_tile_pipes = 8;
> +

I may be late in the conversation, but shouldn't we have a look at why the
value reported by the kernel is wrong for "some" cards? Which ones and why
should be identified. It seems to be limited to Southern Islands as far as
we know for now, which limits the scope for now.

Also, about the patch itself, even if only some cards were reported to be
problematic, why would we limit it to "ws->gen == DRV_SI"? Any cards
reporting a wrong value should be treated the same way by mapping its value
from 12 to 8, no?

My late two cents here.
Alexandre Demers
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 93813] Incorrect viewport range when GL_CLIP_ORIGIN is GL_UPPER_LEFT

2016-02-09 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=93813

--- Comment #9 from Mathias Fröhlich  ---
Brian,

I had the same experience with the nvidia driver back when I wrote the test
some time ago.
But if I recall correctly, then the nvidia driver does agree with the test in
any of these tested state combinations. But the piglit test wildly switches
between the state combinations in between draws which probably exposes an
internal update problem in the blob.
May be you will prove me wrong, but keep that in mind. I believe in the end I
could get the expected from the binary blob by only changing clip control at
the beginning of a new frame or something similar.

Greetings

Mathias

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/7] Fixes for SW:KotOR (v2)

2016-02-09 Thread Miklós Máté


On 02/06/2016 12:11 AM, Matt Turner wrote:

Thanks a ton for contributing this. This is really cool.

I've replied to a couple of patches with a lot of style comments.
These issues appear in a lot of places I didn't point out (it was
getting a bit repetitive). A summary is

  - Use spaces around operators
  - Use BSD-style function declarations, with qualifiers and return
type on their own line
  - Align line-wrapped arguments and parameters with the open paranthesis
  - Use braces in nested if-statements
  - Don't indent case/default labels past switch()
  - Don't add code that is commented out
  - Use spaces, not tabs

Thanks. I look forward to porting this to NIR and adding i965 support
at some point. :)
Thanks for your thorough review. I started working on fixing the coding 
style to match the project.
However, I have to argue with you on main/state.c (in 1/7): I created 
that code with copy, so the
usage of tabs and the && placement matches the style of the other if 
blocks in that function. Should I

fix them all?

MM
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] winsys/radeon: fix a wrong NUM_TILE_PIPES value from the kernel

2016-02-09 Thread Alexandre Demers

On Tue, 9 Feb 2016 at 15:17 Alex Deucher  wrote:

> On Tue, Feb 9, 2016 at 12:47 PM, Marek Olšák  wrote:
> > On Tue, Feb 9, 2016 at 6:17 PM, Alexandre Demers
> >  wrote:
> >>> +/* The kernel returns 12 for some cards for an unknown
> >>> reason.
> >>> + * I thought this was supposed to be a power of two.
> >>> + */
> >>> +if (ws->gen == DRV_SI && ws->info.num_tile_pipes == 12)
> >>> +ws->info.num_tile_pipes = 8;
> >>> +
> >>
> >> I may be late in the conversation, but shouldn't we have a look at why
> the
> >> value reported by the kernel is wrong for "some" cards? Which ones and
> why
> >> should be identified. It seems to be limited to Southern Islands as far
> as
> >> we know for now, which limits the scope for now.
> >>
> >> Also, about the patch itself, even if only some cards were reported to
> be
> >> problematic, why would we limit it to "ws->gen == DRV_SI"? Any cards
> >> reporting a wrong value should be treated the same way by mapping its
> value
> >> from 12 to 8, no?
> >
> > No. Only one card is affected (Tahiti or Pitcairn, I don't remember
> > which one). No other card reports 12.
> >
> > There is no point in looking into why the value is wrong and I haven't
> > been able to find where the value had come from. It's part of the
> > kernel ABI now anyway. Userspace won't use it anymore.
>
> I did it when I added SI support.  I think I get the value from tcore
> or some hw features header.  I don't remember the details.  Anyway, I
> think it may have been a hw value (related to the number of memory
> channels) whereas from a sw perspective, we'd use 8 for tiling
> computations.  E.g., when we set up rdev->config.si.tile_config which
> is what we used to use, we set it to 8.
>
> Alex
>

Ok, I had a look at what you were pointing. There was even a comment in
si_gpu_init() about tile_config for the case of the problematic value:

rdev->config.si.tile_config = 0;
switch (rdev->config.si.num_tile_pipes) {
[...]
case 8:
default:
/* XXX what about 12? */
rdev->config.si.tile_config |= (3 << 0);
break;
[...]

Should we just clarify the comment in Mesa's committed patch according to
Alex's explanation?
Alexandre Demers
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 04/23] i965: Don't try to create aux buffer for non-msrt aux-buffer

2016-02-09 Thread Ben Widawsky

On Mon, Feb 08, 2016 at 06:51:24PM +0200, Topi Pohjolainen wrote:
> In addition to simply calling miptree_create() the higher level
> call intel_miptree_create() also considers if the buffer should
> be associated with an auxiliary buffer based on the given format.
> 
> Here we are allocating an auxiliary buffer which in turn has such
> format that would mislead intel_miptree_create_layout() later on
> to try to associate the auxiliary buffer with an auxiliary buffer.
> To prevent this the actual buffer creation logic was split out
> into its own function. Lets invoke that instead.
> 
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 21 +++--
>  1 file changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index d655de8..6c447ba 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -1550,16 +1550,17 @@ intel_miptree_alloc_non_msrt_mcs(struct brw_context 
> *brw,
> if (brw->gen >= 8) {
>layout_flags |= MIPTREE_LAYOUT_FORCE_HALIGN16;
> }
> -   mt->mcs_mt = intel_miptree_create(brw,
> - mt->target,
> - format,
> - mt->first_level,
> - mt->last_level,
> - mcs_width,
> - mcs_height,
> - mt->logical_depth0,
> - 0 /* num_samples */,
> - layout_flags);
> +   mt->mcs_mt = miptree_create(brw,
> +   mt->target,
> +   format,
> +   mt->first_level,
> +   mt->last_level,
> +   mcs_width,
> +   mcs_height,
> +   mt->logical_depth0,
> +   0 /* num_samples */,
> +   INTEL_MSAA_LAYOUT_NONE,
> +   layout_flags);
>  
> return mt->mcs_mt;
>  }
> -- 

Same comment as patch 1, I guess - but the logic of it is:
Reviewed-by: Ben Widawsky 

I think with this you can fix the check for num_samples in gen8_surface_state.c
as well. I think it looks a bit better to check the format/layout as opposed to
num_samples in that case. Up to you.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/3] st/mesa: use MAX3() macro, as we do for sampler view code below

2016-02-09 Thread Brian Paul

---
 src/mesa/state_tracker/st_cb_drawpixels.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c 
b/src/mesa/state_tracker/st_cb_drawpixels.c
index 9c1eba4..7096bd2 100644
--- a/src/mesa/state_tracker/st_cb_drawpixels.c
+++ b/src/mesa/state_tracker/st_cb_drawpixels.c
@@ -650,7 +650,8 @@ draw_textured_quad(struct gl_context *ctx, GLint x, GLint 
y, GLfloat z,
   if (fpv) {
  /* drawing a color image */
  const struct pipe_sampler_state *samplers[PIPE_MAX_SAMPLERS];
- uint num = MAX2(MAX2(fpv->drawpix_sampler, fpv->pixelmap_sampler) + 1,
+ uint num = MAX3(fpv->drawpix_sampler + 1,
+ fpv->pixelmap_sampler + 1,
  st->state.num_samplers[PIPE_SHADER_FRAGMENT]);
  uint i;
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/3] st/mesa: clarify some texture target code in st_cb_drawpix.c

2016-02-09 Thread Brian Paul

Use st->internal_target instead of PIPE_TEXTURE_2D when choosing the
texture format.  Probably no real difference, but let's be consistent.

Simplify a test when determining whether we need normalized texcoords.

Add a new assertion.
---
 src/mesa/state_tracker/st_cb_drawpixels.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c 
b/src/mesa/state_tracker/st_cb_drawpixels.c
index 7096bd2..fd58886 100644
--- a/src/mesa/state_tracker/st_cb_drawpixels.c
+++ b/src/mesa/state_tracker/st_cb_drawpixels.c
@@ -358,8 +358,8 @@ make_texture(struct st_context *st,
   GLenum intFormat = internal_format(ctx, format, type);
 
   pipeFormat = st_choose_format(st, intFormat, format, type,
-PIPE_TEXTURE_2D, 0, PIPE_BIND_SAMPLER_VIEW,
-FALSE);
+st->internal_target, 0,
+PIPE_BIND_SAMPLER_VIEW, FALSE);
   assert(pipeFormat != PIPE_FORMAT_NONE);
}
 
@@ -556,7 +556,9 @@ draw_textured_quad(struct gl_context *ctx, GLint x, GLint 
y, GLfloat z,
struct cso_context *cso = st->cso_context;
GLfloat x0, y0, x1, y1;
GLsizei maxSize;
-   boolean normalized = sv[0]->texture->target != PIPE_TEXTURE_RECT;
+   boolean normalized = sv[0]->texture->target == PIPE_TEXTURE_2D;
+
+   assert(sv[0]->texture->target == st->internal_target);
 
/* limit checks */
/* XXX if DrawPixels image is larger than max texture size, break
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/3] st/mesa: fix bitmap texture target code and simplify tex sampler state

2016-02-09 Thread Brian Paul

Bitmaps may be drawn with a PIPE_TEXTURE_2D or PIPE_TEXTURE_RECT resource
as determined at context creation by checking if PIPE_CAP_NPOT_TEXTURES is
supported.  But many places in the bitmap code were hard-coded to use
PIPE_TEXTURE_2D.  Use st->internal_target instead.

I think an older NV chip is the only case where a gallium driver does not
support NPOT textures.  Bitmap drawing was probably broken for that GPU.

Also, we only need one sampler state with texcoord normalization set up
according to st->internal_target.
---
 src/mesa/state_tracker/st_cb_bitmap.c | 32 
 src/mesa/state_tracker/st_context.h   |  2 +-
 2 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_bitmap.c 
b/src/mesa/state_tracker/st_cb_bitmap.c
index 34809ad..627b8cb 100644
--- a/src/mesa/state_tracker/st_cb_bitmap.c
+++ b/src/mesa/state_tracker/st_cb_bitmap.c
@@ -251,8 +251,7 @@ setup_render_state(struct gl_context *ctx,
   for (i = 0; i < st->state.num_samplers[PIPE_SHADER_FRAGMENT]; i++) {
  samplers[i] = >state.samplers[PIPE_SHADER_FRAGMENT][i];
   }
-  samplers[fpv->bitmap_sampler] =
- >bitmap.samplers[sv->texture->target != PIPE_TEXTURE_RECT];
+  samplers[fpv->bitmap_sampler] = >bitmap.sampler;
   cso_set_samplers(cso, PIPE_SHADER_FRAGMENT, num,
(const struct pipe_sampler_state **) samplers);
}
@@ -438,7 +437,7 @@ reset_cache(struct st_context *st)
assert(!cache->texture);
 
/* allocate a new texture */
-   cache->texture = st_texture_create(st, PIPE_TEXTURE_2D,
+   cache->texture = st_texture_create(st, st->internal_target,
   st->bitmap.tex_format, 0,
   BITMAP_CACHE_WIDTH, BITMAP_CACHE_HEIGHT,
   1, 1, 0,
@@ -624,26 +623,27 @@ accum_bitmap(struct gl_context *ctx,
 static void
 init_bitmap_state(struct st_context *st)
 {
-   struct pipe_sampler_state *sampler = >bitmap.samplers[0];
struct pipe_context *pipe = st->pipe;
struct pipe_screen *screen = pipe->screen;
 
/* This function should only be called once */
assert(st->bitmap.cache == NULL);
 
+   assert(st->internal_target == PIPE_TEXTURE_2D ||
+  st->internal_target == PIPE_TEXTURE_RECT);
+
/* alloc bitmap cache object */
st->bitmap.cache = ST_CALLOC_STRUCT(bitmap_cache);
 
/* init sampler state once */
-   memset(sampler, 0, sizeof(*sampler));
-   sampler->wrap_s = PIPE_TEX_WRAP_CLAMP;
-   sampler->wrap_t = PIPE_TEX_WRAP_CLAMP;
-   sampler->wrap_r = PIPE_TEX_WRAP_CLAMP;
-   sampler->min_img_filter = PIPE_TEX_FILTER_NEAREST;
-   sampler->min_mip_filter = PIPE_TEX_MIPFILTER_NONE;
-   sampler->mag_img_filter = PIPE_TEX_FILTER_NEAREST;
-   st->bitmap.samplers[1] = *sampler;
-   st->bitmap.samplers[1].normalized_coords = 1;
+   memset(>bitmap.sampler, 0, sizeof(st->bitmap.sampler));
+   st->bitmap.sampler.wrap_s = PIPE_TEX_WRAP_CLAMP;
+   st->bitmap.sampler.wrap_t = PIPE_TEX_WRAP_CLAMP;
+   st->bitmap.sampler.wrap_r = PIPE_TEX_WRAP_CLAMP;
+   st->bitmap.sampler.min_img_filter = PIPE_TEX_FILTER_NEAREST;
+   st->bitmap.sampler.min_mip_filter = PIPE_TEX_MIPFILTER_NONE;
+   st->bitmap.sampler.mag_img_filter = PIPE_TEX_FILTER_NEAREST;
+   st->bitmap.sampler.normalized_coords = st->internal_target == 
PIPE_TEXTURE_2D;
 
/* init baseline rasterizer state once */
memset(>bitmap.rasterizer, 0, sizeof(st->bitmap.rasterizer));
@@ -653,17 +653,17 @@ init_bitmap_state(struct st_context *st)
 
/* find a usable texture format */
if (screen->is_format_supported(screen, PIPE_FORMAT_I8_UNORM,
-   PIPE_TEXTURE_2D, 0,
+   st->internal_target, 0,
PIPE_BIND_SAMPLER_VIEW)) {
   st->bitmap.tex_format = PIPE_FORMAT_I8_UNORM;
}
else if (screen->is_format_supported(screen, PIPE_FORMAT_A8_UNORM,
-PIPE_TEXTURE_2D, 0,
+st->internal_target, 0,
 PIPE_BIND_SAMPLER_VIEW)) {
   st->bitmap.tex_format = PIPE_FORMAT_A8_UNORM;
}
else if (screen->is_format_supported(screen, PIPE_FORMAT_L8_UNORM,
-PIPE_TEXTURE_2D, 0,
+st->internal_target, 0,
 PIPE_BIND_SAMPLER_VIEW)) {
   st->bitmap.tex_format = PIPE_FORMAT_L8_UNORM;
}
diff --git a/src/mesa/state_tracker/st_context.h 
b/src/mesa/state_tracker/st_context.h
index 352e795..9a80f4b 100644
--- a/src/mesa/state_tracker/st_context.h
+++ b/src/mesa/state_tracker/st_context.h
@@ -182,7 +182,7 @@ struct st_context
/** for glBitmap */
struct {
   struct pipe_rasterizer_state rasterizer;
-  struct pipe_sampler_state samplers[2];
+  struct pipe_sampler_state sampler;
   enum pipe_format

Re: [Mesa-dev] Are gallium unpack_rgba_8unorm/pack_rgba_8unorm safe for in-place conversion ?

2016-02-09 Thread Axel Davy


On 09/02/2016 20:30, Jose Fonseca wrote:

On 09/02/16 19:17, Axel Davy wrote:

Hi,

We'd need to do some formats conversion in gallium nine, and if possible
we would like to do them in-place.

unpack_rgba_8unorm/pack_rgba_8unorm doesn't seem to explicitly allow
in-place conversion,
but the generated code seems to be fine with that.

Can we rely on these functions to be safe for in-place conversion ?
Could we add that somewhere as requirement ?

CC-ing vmware guys, as they probably know.

Yours,

Axel Davy


It might work in practice now, but I don't think this is same it 
should be dependent upon.


Take BGRA <-> RGBA: currently we read the dword, swizzle, and write 
dword back, but it's an implementation detail.  One could imagine at 
some point the implementation be changed to read/write bytes at a 
time, which would break if done in place.



My recommendation is, if you want to do this things in place: malloc a 
temp with the size of a row of pixels, convert into temp row then 
memcpy into src.


You can even add util_convert_format_inplace() helper if you want.

And if there's any hot-path (e.g., BGRA<->RGBA) just add a special 
code path to this new util_convert_format_inplace that avoids the temp 
copy.


Jose


Thanks for the answer !

I decided to use util_format_translate, which solves my problem (I had 
thought of
doing about the same thing by converting to argb, and then convert 
in place to the dst format - of course

restricting with 32 bits formats -).
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/7] Fixes for SW:KotOR (v2)

2016-02-09 Thread Ian Romanick

On 02/09/2016 12:57 PM, Miklós Máté wrote:
> On 02/06/2016 12:11 AM, Matt Turner wrote:
>> Thanks a ton for contributing this. This is really cool.
>>
>> I've replied to a couple of patches with a lot of style comments.
>> These issues appear in a lot of places I didn't point out (it was
>> getting a bit repetitive). A summary is
>>
>>   - Use spaces around operators
>>   - Use BSD-style function declarations, with qualifiers and return
>> type on their own line
>>   - Align line-wrapped arguments and parameters with the open paranthesis
>>   - Use braces in nested if-statements
>>   - Don't indent case/default labels past switch()
>>   - Don't add code that is commented out
>>   - Use spaces, not tabs
>>
>> Thanks. I look forward to porting this to NIR and adding i965 support
>> at some point. :)
> Thanks for your thorough review. I started working on fixing the coding
> style to match the project.
> However, I have to argue with you on main/state.c (in 1/7): I created
> that code with copy, so the
> usage of tabs and the && placement matches the style of the other if
> blocks in that function. Should I
> fix them all?

You don't have to, but if you do, it should be a separate patch.

There's a lot of code in Mesa that uses deprecated styles.  We haven't
felt like it's worth the effort to go fix all of it at once.  We've
taken the approach of using the current style in all new code and
changed code.  Eventually it will all get up-to-date. :)

This has the side effect that you can tell how long it has been since
someone touched the code just by looking at it.  Sometimes that's a
warning sign.

> MM
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Are gallium unpack_rgba_8unorm/pack_rgba_8unorm safe for in-place conversion ?

2016-02-09 Thread Jose Fonseca


On 09/02/16 19:17, Axel Davy wrote:

Hi,

We'd need to do some formats conversion in gallium nine, and if possible
we would like to do them in-place.

unpack_rgba_8unorm/pack_rgba_8unorm doesn't seem to explicitly allow
in-place conversion,
but the generated code seems to be fine with that.

Can we rely on these functions to be safe for in-place conversion ?
Could we add that somewhere as requirement ?

CC-ing vmware guys, as they probably know.

Yours,

Axel Davy


It might work in practice now, but I don't think this is same it should 
be dependent upon.


Take BGRA <-> RGBA: currently we read the dword, swizzle, and write 
dword back, but it's an implementation detail.  One could imagine at 
some point the implementation be changed to read/write bytes at a time, 
which would break if done in place.



My recommendation is, if you want to do this things in place: malloc a 
temp with the size of a row of pixels, convert into temp row then memcpy 
into src.


You can even add util_convert_format_inplace() helper if you want.

And if there's any hot-path (e.g., BGRA<->RGBA) just add a special code 
path to this new util_convert_format_inplace that avoids the temp copy.


Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] winsys/radeon: fix a wrong NUM_TILE_PIPES value from the kernel

2016-02-09 Thread Alexandre Demers

On Tue, 9 Feb 2016 at 12:47 Marek Olšák  wrote:

> On Tue, Feb 9, 2016 at 6:17 PM, Alexandre Demers
>  wrote:
> >> +/* The kernel returns 12 for some cards for an unknown
> >> reason.
> >> + * I thought this was supposed to be a power of two.
> >> + */
> >> +if (ws->gen == DRV_SI && ws->info.num_tile_pipes == 12)
> >> +ws->info.num_tile_pipes = 8;
> >> +
> >
> > I may be late in the conversation, but shouldn't we have a look at why
> the
> > value reported by the kernel is wrong for "some" cards? Which ones and
> why
> > should be identified. It seems to be limited to Southern Islands as far
> as
> > we know for now, which limits the scope for now.
> >
> > Also, about the patch itself, even if only some cards were reported to be
> > problematic, why would we limit it to "ws->gen == DRV_SI"? Any cards
> > reporting a wrong value should be treated the same way by mapping its
> value
> > from 12 to 8, no?
>
> No. Only one card is affected (Tahiti or Pitcairn, I don't remember
> which one). No other card reports 12.
>
> There is no point in looking into why the value is wrong and I haven't
> been able to find where the value had come from. It's part of the
> kernel ABI now anyway. Userspace won't use it anymore.
>
> Marek
>
Well, meanwhile, I went on and I had a look at the kernel settings. Here is
the answer and the "problem":

This was returned by radeon_info_ioctl(), case RADEON_INFO_NUM_TILE_PIPES,
else if (rdev->family >= CHIP_TAHITI) *value =
rdev->config.si.max_tile_pipes;
(
http://lxr.free-electrons.com/source/drivers/gpu/drm/radeon/radeon_kms.c#L353
)

Searching where max_tile_pipes was set, it seems the value comes from
si_gpu_init(), case CHIP_TAHITI, rdev->config.si.max_tile_pipes = 12
(
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/si.c
)

This is probably wrong, all other GPU having a power of 2 value (I had a
look at ni.c, si.c, evergreen.c). Either that or we have a special case for
Tahiti.

Also, if this is a special case, while comparing how things works between
si.c and cik.c, I saw that (si | cik)_tiling_mode_table_init() were not
exactly mapping gpus the same way: si.c uses the family, while cik.c uses
the max_tile_pipes value and defaults any value over 8 to be treated as 16.

If this can be of any help / reflection...
Alexandre Demers
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3] workarounds: Update workaround names and platforms

2016-02-09 Thread Matt Turner

On Tue, Feb 9, 2016 at 9:44 AM, Sameer Kibey  wrote:
> Update the format in which workarounds are documented
> in the source code. This allows mesa to be parsed
> by the list-workarounds utility in intel-gpu-tools.

I don't know that I find this valuable.

Ben touched on one concern -- keeping it updated. But I have another,
and that's whether the information is accurate, or useful at all.


> Signed-off-by: Sameer Kibey 
> ---
>  src/mesa/drivers/dri/i965/brw_binding_tables.c | 3 ++-
>  src/mesa/drivers/dri/i965/brw_blorp.cpp| 2 ++
>  src/mesa/drivers/dri/i965/brw_defines.h| 2 ++
>  src/mesa/drivers/dri/i965/brw_eu_emit.c| 2 ++
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 6 ++
>  src/mesa/drivers/dri/i965/brw_pipe_control.c   | 2 ++
>  src/mesa/drivers/dri/i965/gen6_queryobj.c  | 5 +++--
>  src/mesa/drivers/dri/i965/gen8_depth_state.c   | 7 ---
>  src/mesa/drivers/dri/i965/intel_batchbuffer.c  | 2 +-
>  9 files changed, 24 insertions(+), 7 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
> b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> index f3a0310..6dd35dd 100644
> --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
> +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> @@ -54,13 +54,14 @@ static uint32_t
>  reserve_hw_bt_space(struct brw_context *brw, unsigned bytes)
>  {
> /* From the Broadwell PRM, Volume 16, "Workarounds",
> -* WaStateBindingTableOverfetch:
>  * "HW over-fetches two cache lines of binding table indices.  When
>  *  using the resource streamer, SW needs to pad binding table pointer
>  *  updates with an additional two cache lines."
>  *
>  * Cache lines are 64 bytes, so we subtract 128 bytes from the size of
>  * the binding table pool buffer.
> +*
> +* WaStateBindingTableOverfetch:hsw,bdw,chv,bxt
>  */
> if (brw->hw_bt_pool.next_offset + bytes >= brw->hw_bt_pool.bo->size - 
> 128) {
>gen7_reset_hw_bt_pool_offsets(brw);
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp 
> b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> index 1bc6d15..f798e29 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> @@ -318,6 +318,8 @@ brw_hiz_op_params::brw_hiz_op_params(struct 
> intel_mipmap_tree *mt,
>  * SURFACE_STATE.Surface_Horizontal_Alignment should be 4 for Z24 
> surfaces,
>  * not 8. But commit 1f112cc increased the alignment from 4 to 8, which
>  * prevents the clobbering.
> +*
> +* WaHizAmbiguate8x4Aligned:hsw
>  */
> depth.width = ALIGN(depth.width, 8);
> depth.height = ALIGN(depth.height, 4);
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index 01e0c99..2146172 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -1762,6 +1762,8 @@ enum brw_message_target {
>   * WaForceEnableNonCoherent, HDC memory access may have been overridden by 
> the
>   * kernel to be non-coherent (matching the behavior of the same BTI on
>   * pre-Gen8 hardware) and BTI 255 may actually be an alias for BTI 253.
> + *
> + * WaForceEnableNonCoherent:bdw,chv,skl,kbl

This comment is just explaining that 255 on Gen8+ doesn't mean the
same thing as it did before. It passingly mentions
WaForceEnableNonCoherent, but that's a kernel driver workaround --
nothing in Mesa.

Moreover, when I look that up in the workarounds database it says it
applies to all BDW, CHV, KBL, but only SKL until D0. I'm not sure if
we even care about D0. Was that publicly released? Mentioning that it
applies to "skl" doesn't tell the whole story and might lead to us
applying it when it's not needed.

>   */
>  #define GEN8_BTI_STATELESS_IA_COHERENT   255
>  #define GEN8_BTI_STATELESS_NON_COHERENT  253
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index 35d8039..7a6179a 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -1891,6 +1891,8 @@ void brw_CMP(struct brw_codegen *p,
>  *
>  * It also applies to other Gen7 platforms (IVB, BYT) even though it isn't
>  * mentioned on their work-arounds pages.
> +*
> +* WaCMPInstNullDstForcesThreadSwitch:ivb,hsw,vlv

Not related to your patch, but the ivbgt2 Bug database has not worked
for me... ever? Try the link in the workarounds database for this one.
Does it work for you? Maybe you've made contacts during this project
that might know something?

>  */
> if (devinfo->gen == 7) {
>if (dest.file == BRW_ARCHITECTURE_REGISTER_FILE &&
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index 1916a99..3d19605 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@

Re: [Mesa-dev] [PATCH] [v2] i965: Make sure we blit a full compressed block

2016-02-09 Thread Ben Widawsky

On Mon, Feb 08, 2016 at 09:48:54PM -0800, Jason Ekstrand wrote:
> On Sat, Feb 6, 2016 at 6:11 PM, Ben Widawsky 
> wrote:
> 
> > This fixes an assertion failure in [at least] one of the Unreal Engine
> > Linux
> > demo/games that uses DXT1 compression. Specifically, the "Vehicle Game".
> >
> > At some point, the game ends up trying to blit mip level whose size is 2x2,
> > which is smaller than a DXT1 block. As a result, the assertion in the blit
> > path
> > is triggered. It should be safe to simply make sure we align the width and
> > height, which is sadly an example of compression being less efficient.
> >
> > NOTE: The demo seems to work fine without the assert, and therefore release
> > builds of mesa wouldn't stumble over this. Perhaps there is some
> > unnoticeable
> > corruption, but I had trouble spotting it.
> >
> > Thanks to Jason for looking at my backtrace and figuring out what was
> > going on.
> >
> > v2: Use NPOT alignment to make sure ASTC is handled properly (Ilia)
> > Remove comment about how this doesn't fix other bugs, because it does.
> >
> > Cc: Jason Ekstrand 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93358
> > Signed-off-by: Ben Widawsky 
> > ---
> >  src/mesa/drivers/dri/i965/intel_copy_image.c | 13 +
> >  1 file changed, 13 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i965/intel_copy_image.c
> > b/src/mesa/drivers/dri/i965/intel_copy_image.c
> > index 0a3337e..42bd7ff 100644
> > --- a/src/mesa/drivers/dri/i965/intel_copy_image.c
> > +++ b/src/mesa/drivers/dri/i965/intel_copy_image.c
> > @@ -212,6 +212,7 @@ intel_copy_image_sub_data(struct gl_context *ctx,
> > struct brw_context *brw = brw_context(ctx);
> > struct intel_mipmap_tree *src_mt, *dst_mt;
> > unsigned src_level, dst_level;
> > +   GLuint bw, bh;
> >
> > if (_mesa_meta_CopyImageSubData_uncompressed(ctx,
> >  src_image,
> > src_renderbuffer,
> > @@ -275,6 +276,18 @@ intel_copy_image_sub_data(struct gl_context *ctx,
> > intel_miptree_all_slices_resolve_depth(brw, dst_mt);
> > intel_miptree_resolve_color(brw, dst_mt);
> >
> > +   _mesa_get_format_block_size(src_mt->format, , );
> > +   /* It's legal to have a WxH that's smaller than a compressed block.
> > This
> > +* happens for example when you are using a higher level LOD. For this
> > case,
> > +* we still want to copy the entire block, or else the decompression
> > will be
> > +* incorrect.
> > +*/
> > +   if (src_width < bw)
> > +  src_width = ALIGN_NPOT(src_width, bw);
> > +
> > +   if (src_height < bh)
> > +  src_height = ALIGN_NPOT(src_height, bh);
> >
> 
> I've been going back-and-forth as to whether or not this is the right place
> to do this or if we wanted it further up or down the stack.  At the end of
> the day, I concluded that it's as good a place as any.
> 
> Reviewed-by: Jason Ekstrand 
> Cc "11.0 11.1" 

I left out stable intentionally because we will only hit the assert in debug
builds, and AFAICT, we'll only get minor corruption by this not being there -
but it's up to you. I just figured I'd share my thoughts in case they weren't
clear. I can add stable if you still think it wise.

Thanks for review.

> 
> --Jason
> 
> 
> > +
> > if (copy_image_with_blitter(brw, src_mt, src_level,
> > src_x, src_y, src_z,
> > dst_mt, dst_level,
> > --
> > 2.7.0
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >

> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: fix incorrect viewport position when GL_CLIP_ORIGIN = GL_LOWER_LEFT

2016-02-09 Thread Roland Scheidegger

Am 09.02.2016 um 18:03 schrieb Brian Paul:
> Ilia Mirkin found/fixed the mistake.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93813
> Cc: "11.1" 
> ---
>  src/mesa/main/viewport.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/main/viewport.c b/src/mesa/main/viewport.c
> index 7d891429..681e46b 100644
> --- a/src/mesa/main/viewport.c
> +++ b/src/mesa/main/viewport.c
> @@ -456,11 +456,11 @@ _mesa_get_viewport_xform(struct gl_context *ctx, 
> unsigned i,
> translate[0] = half_width + x;
> if (ctx->Transform.ClipOrigin == GL_UPPER_LEFT) {
>scale[1] = -half_height;
> -  translate[1] = half_height - y;
> } else {
>scale[1] = half_height;
> -  translate[1] = half_height + y;
> }
> +   translate[1] = half_height + y;
> +
> if (ctx->Transform.ClipDepthMode == GL_NEGATIVE_ONE_TO_ONE) {
>scale[2] = 0.5 * (f - n);
>translate[2] = 0.5 * (n + f);
> 

With the caveat that I always get confused by the top/bottom flip stuff,
this looks right to me...

Reviewed-by: Roland Scheidegger 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 07/23] i965/gen9: Allow halign == 16 also for losslessly compressed

2016-02-09 Thread Ben Widawsky

On Mon, Feb 08, 2016 at 06:51:27PM +0200, Topi Pohjolainen wrote:
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index fe525c3..6f46385 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -530,7 +530,8 @@ intel_miptree_create_layout(struct brw_context *brw,
> if (intel_miptree_supports_non_msrt_fast_clear(brw, mt)) {
>if (brw->gen >= 9 || (brw->gen == 8 && num_samples <= 1))
>   layout_flags |= MIPTREE_LAYOUT_FORCE_HALIGN16;
> -   } else if (brw->gen >= 9 && num_samples > 1) {
> +   } else if (brw->gen >= 9 &&
> +  (num_samples > 1 || mt->msaa_layout == INTEL_MSAA_LAYOUT_CSS)) 
> {

It seems like *if* we don't create a separate CSS type, we could just change
this to layout == INTEL_MSAA_LAYOUT_CMS. I'm still waiting to be convinced by a
later patch that we need the separate type.

Maybe even if we keep the CSS type, make this:
mt->msaa_layout == INTEL_MSAA_LAYOUT_CSS ||
mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS

Also, I think either this patch, or the last patch should update the 
if (mt->disable_aux_buffers) case just below this.

Anyway, looks fine for now other than my opinions of questionable importance.

>layout_flags |= MIPTREE_LAYOUT_FORCE_HALIGN16;
> } else {
>/* For now, nothing else has this requirement */
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] st/mesa: clarify some texture target code in st_cb_drawpix.c

2016-02-09 Thread Roland Scheidegger

Am 09.02.2016 um 23:49 schrieb Brian Paul:
> Use st->internal_target instead of PIPE_TEXTURE_2D when choosing the
> texture format.  Probably no real difference, but let's be consistent.
> 
> Simplify a test when determining whether we need normalized texcoords.
> 
> Add a new assertion.
> ---
>  src/mesa/state_tracker/st_cb_drawpixels.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c 
> b/src/mesa/state_tracker/st_cb_drawpixels.c
> index 7096bd2..fd58886 100644
> --- a/src/mesa/state_tracker/st_cb_drawpixels.c
> +++ b/src/mesa/state_tracker/st_cb_drawpixels.c
> @@ -358,8 +358,8 @@ make_texture(struct st_context *st,
>GLenum intFormat = internal_format(ctx, format, type);
>  
>pipeFormat = st_choose_format(st, intFormat, format, type,
> -PIPE_TEXTURE_2D, 0, 
> PIPE_BIND_SAMPLER_VIEW,
> -FALSE);
> +st->internal_target, 0,
> +PIPE_BIND_SAMPLER_VIEW, FALSE);
>assert(pipeFormat != PIPE_FORMAT_NONE);
> }
>  
> @@ -556,7 +556,9 @@ draw_textured_quad(struct gl_context *ctx, GLint x, GLint 
> y, GLfloat z,
> struct cso_context *cso = st->cso_context;
> GLfloat x0, y0, x1, y1;
> GLsizei maxSize;
> -   boolean normalized = sv[0]->texture->target != PIPE_TEXTURE_RECT;
> +   boolean normalized = sv[0]->texture->target == PIPE_TEXTURE_2D;
> +
> +   assert(sv[0]->texture->target == st->internal_target);
>  
> /* limit checks */
> /* XXX if DrawPixels image is larger than max texture size, break
> 

For the series:
Reviewed-by: Roland Scheidegger 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [V2 PATCH] meta: Fix the pbo usage in meta for GLES{1, 2} contexts

2016-02-09 Thread Anuj Phogat

OpenGL ES 1.0 doesn't support using GL_STREAM_DRAW and both
ES 1.0 and 2.0 don't support GL_STREAM_READ in glBufferData().
So, handle it correctly by calling the _mesa_meta_begin()
before create_texture_for_pbo().

V2: Remove the changes related to allocate_storage. (Ian)

Cc: Ian Romanick 
Cc: "11.1" 
Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/common/meta_tex_subimage.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/common/meta_tex_subimage.c 
b/src/mesa/drivers/common/meta_tex_subimage.c
index 4adaad7..4d6f571 100644
--- a/src/mesa/drivers/common/meta_tex_subimage.c
+++ b/src/mesa/drivers/common/meta_tex_subimage.c
@@ -211,20 +211,22 @@ _mesa_meta_pbo_TexSubImage(struct gl_context *ctx, GLuint 
dims,
 */
image_height = packing->ImageHeight == 0 ? height : packing->ImageHeight;
 
+   _mesa_meta_begin(ctx, ~(MESA_META_PIXEL_TRANSFER |
+   MESA_META_PIXEL_STORE));
+
pbo_tex_image = create_texture_for_pbo(ctx, create_pbo,
   GL_PIXEL_UNPACK_BUFFER,
   dims, width, height, depth,
   format, type, pixels, packing,
   , _tex);
-   if (!pbo_tex_image)
+   if (!pbo_tex_image) {
+  _mesa_meta_end(ctx);
   return false;
+   }
 
if (allocate_storage)
   ctx->Driver.AllocTextureImageBuffer(ctx, tex_image);
 
-   _mesa_meta_begin(ctx, ~(MESA_META_PIXEL_TRANSFER |
-   MESA_META_PIXEL_STORE));
-
_mesa_GenFramebuffers(2, fbos);
_mesa_BindFramebuffer(GL_READ_FRAMEBUFFER, fbos[0]);
_mesa_BindFramebuffer(GL_DRAW_FRAMEBUFFER, fbos[1]);
@@ -346,15 +348,18 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
GLuint dims,
 */
image_height = packing->ImageHeight == 0 ? height : packing->ImageHeight;
 
+   _mesa_meta_begin(ctx, ~(MESA_META_PIXEL_TRANSFER |
+   MESA_META_PIXEL_STORE));
+
pbo_tex_image = create_texture_for_pbo(ctx, false, GL_PIXEL_PACK_BUFFER,
   dims, width, height, depth,
   format, type, pixels, packing,
   , _tex);
-   if (!pbo_tex_image)
-  return false;
 
-   _mesa_meta_begin(ctx, ~(MESA_META_PIXEL_TRANSFER |
-   MESA_META_PIXEL_STORE));
+   if (!pbo_tex_image) {
+  _mesa_meta_end(ctx);
+  return false;
+   }
 
/* GL_CLAMP_FRAGMENT_COLOR doesn't affect ReadPixels and GettexImage */
if (ctx->Extensions.ARB_color_buffer_float)
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 11/23] i965/gen8: Remove dead assertion

2016-02-09 Thread Ben Widawsky

On Mon, Feb 08, 2016 at 06:51:31PM +0200, Topi Pohjolainen wrote:
> The assertion is inside a condition mandating num_samples > 1 and
> therefore the first half of the constraint is always met. The
> second half in turn would only be applicable for single sampled
> case and moreover it is trying to falsely check against surface
> type instead of format.


Oops.

> Subsequent patches will introduce proper support for the lossless
> compression and dropping this here makes the patches a little
> simpler.
> 
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/gen8_surface_state.c | 6 --
>  1 file changed, 6 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
> b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> index 0df25d2..fc8f701 100644
> --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> @@ -243,12 +243,6 @@ gen8_emit_texture_surface_state(struct brw_context *brw,
> */
>if (brw->gen >= 9 || mt->num_samples == 1)
>   assert(mt->halign == 16);
> -
> -  if (brw->gen >= 9) {
> - assert(mt->num_samples > 1 ||
> -brw_losslessly_compressible_format(brw, surf_type));
> -  }
> -
> }
>  
> uint32_t *surf = allocate_surface_state(brw, surf_offset, surf_index);

Seems fine to drop the assertion entirely instead of correcting it. 
You could mention the two patches which touched this part of the code if you
wanted, or perhaps you realized you reviewed both of them :P
Reviewed-by: Ben Widawsky 

The one that used type instead of format:
commit 6fa1130cd21926cdd4ae86aa12ee3f5c0bb5ba33
Author: Ben Widawsky 
Date:   Tue Oct 13 20:50:21 2015 -0700

i965/skl: skip fast clears for certain surface formats

The one that didn't drop the duplicated assertion:
commit eb291d7013eef64c33826f9cc0006c89adcf4e53
Author: Neil Roberts 
Date:   Tue Nov 24 17:59:28 2015 +0100

i965/gen8+: Don't upload the MCS buffer for single-sampled textures
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] i965/gen9: Compression support for single-sampled

2016-02-09 Thread Ben Widawsky

On Mon, Feb 08, 2016 at 06:51:20PM +0200, Topi Pohjolainen wrote:
> This series enables compression for single sampled color surfaces,
> also referred to as "lossless compression". This is yet only for
> driver internal use easing pressure on memory bandwidth and caches
> when writing, blending and sampling surfaces uing gpu.
> 
> As a side effect the need for color buffer resolves after fast
> clears is also decreased. Current understanding is that sampling
> engine doesn't understand meta data (auxiliary buffer) for single
> sampled fast cleared surfaces. However, if the meta data is written
> with lossless compression enabled, even sampling engine is capable
> of reading both the color buffer and the auxiliary, and resolves
> can be omitted in those case.
> 
> The final enabling patch is dependent on earlier two-patch series
> fixing state restore mechanism in i965-meta operations.
> 
> There are some performance numbers available in the final commit.
> 
> Topi Pohjolainen (23):
>   i965: Let caller of intel_miptree_create_layout() decide msaa layout
>   i965: Use miptree non-aligned dimensions directly for x-tiled
>   i965: Separate miptree creation from auxiliary buffer setup
>   i965: Don't try to create aux buffer for non-msrt aux-buffer
>   i965: Stop considering if msrt aux buffers need aux buffer
>   i965/gen9: Add buffer layout representing lossless compression
>   i965/gen9: Allow halign == 16 also for losslessly compressed
>   i965: Allow fast clear to be used with lossless compression
>   i965: Add resolve option for lossless compression
>   i965: Use constant pointer when checking for compression
>   i965/gen8: Remove dead assertion
>   i965: Refactor resolving of auxiliary mode
>   i965: Resolve color buffer also in lossless compression case
>   i965: Add means for limiting color resolves
>   i965: Add a flag telling color resolve pass to ignore CCS_E
>   i965: Add a few assertions on lossless compression
>   i965: Set buffer cleared after actually clearing it
>   i965/gen9: Prepare surface state setup for lossless compression
>   i965/gen9: Refactor msrt mcs initialization
>   i965: Expose logic telling if non-msrt mcs is supported
>   i965/gen9: Setup MCS for compressed texture surfaces
>   i965: Add helper for checking for lossless compressible
>   i965/gen9: Enable lossless compression
> 
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp|  21 +-
>  src/mesa/drivers/dri/i965/brw_context.c |  22 ++-
>  src/mesa/drivers/dri/i965/brw_context.h |   2 +-
>  src/mesa/drivers/dri/i965/brw_defines.h |   2 +
>  src/mesa/drivers/dri/i965/brw_meta_fast_clear.c |  16 +-
>  src/mesa/drivers/dri/i965/brw_surface_formats.c |   2 +-
>  src/mesa/drivers/dri/i965/brw_tex_layout.c  |   2 +
>  src/mesa/drivers/dri/i965/gen8_surface_state.c  |  90 +
>  src/mesa/drivers/dri/i965/intel_blit.c  |   4 +-
>  src/mesa/drivers/dri/i965/intel_copy_image.c|   4 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c   | 249 
> +---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h   |  32 ++-
>  src/mesa/drivers/dri/i965/intel_pixel_bitmap.c  |   2 +-
>  src/mesa/drivers/dri/i965/intel_pixel_read.c|   2 +-
>  src/mesa/drivers/dri/i965/intel_tex_image.c |   2 +-
>  src/mesa/drivers/dri/i965/intel_tex_subimage.c  |   2 +-
>  16 files changed, 318 insertions(+), 136 deletions(-)
> 

Need to take a break, my head hurts.
In addition to the comments I already left, 5, 8, 9, 10, and 11 are:
Reviewed-by: Ben Widawsky 

I think they might change at least a bit if you consider the feedback I've given
so far, but it's really up to you.

8 is kind of useless on its own IMO, but whatever you like.

I'd say you should push 10, and 11 now.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 06/23] i965/gen9: Add buffer layout representing lossless compression

2016-02-09 Thread Ben Widawsky

On Mon, Feb 08, 2016 at 06:51:26PM +0200, Topi Pohjolainen wrote:
> Skylake introduces compression support also for the single-sampled
> color buffers. Similarly to the multi-sampled case the color buffer
> will be associated with an auxiliary surface tracking the
> compression state.
> 
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  | 15 +++
>  src/mesa/drivers/dri/i965/brw_tex_layout.c|  2 ++
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  4 
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  9 +
>  4 files changed, 30 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
> b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> index c7cb394..9ca33b4 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> @@ -1162,6 +1162,11 @@ brw_blorp_blit_program::encode_msaa(unsigned 
> num_samples,
>SWAP_XY_AND_XPYP();
>s_is_zero = true;
>break;
> +   case INTEL_MSAA_LAYOUT_CSS:
> +  /* This is impossible combination, blorp is supported only on
> +   * gen < 8 while CSS is only supported from gen 9 onwards.
> +   */
> +  unreachable("Blorp does not support lossless compression");
> }
>  }
>  
> @@ -1239,6 +1244,11 @@ brw_blorp_blit_program::decode_msaa(unsigned 
> num_samples,
>s_is_zero = false;
>SWAP_XY_AND_XPYP();
>break;
> +   case INTEL_MSAA_LAYOUT_CSS:
> +  /* This is impossible combination, blorp is supported only on
> +   * gen < 8 while CSS is only supported from gen 9 onwards.
> +   */
> +  unreachable("Blorp does not support lossless compression");
> }
>  }
>  
> @@ -1642,6 +1652,11 @@ brw_blorp_blit_program::texel_fetch(struct brw_reg dst)
>   texture_lookup(dst, SHADER_OPCODE_TXF, gen7_ld_args,
>  ARRAY_SIZE(gen7_ld_args));
>   break;
> +  case INTEL_MSAA_LAYOUT_CSS:
> + /* This is impossible combination, blorp is supported only on
> +  * gen < 8 while CSS is only supported from gen 9 onwards.
> +  */
> +  unreachable("Blorp does not support lossless compression");

Does blorp use tabs? This looks weird, but meh.

>}
>break;
> default:
> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
> b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> index a294829..2366bfa 100644
> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> @@ -686,6 +686,8 @@ intel_miptree_set_total_width_height(struct brw_context 
> *brw,
>case INTEL_MSAA_LAYOUT_CMS:
>   brw_miptree_layout_texture_array(brw, mt);
>   break;
> +  case INTEL_MSAA_LAYOUT_CSS:
> + assert(brw->gen >= 9);
>case INTEL_MSAA_LAYOUT_NONE:
>case INTEL_MSAA_LAYOUT_IMS:
>   if (gen9_use_linear_1d_layout(brw, mt))
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index d40a529..fe525c3 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -460,6 +460,8 @@ intel_miptree_create_layout(struct brw_context *brw,
>case INTEL_MSAA_LAYOUT_CMS:
>   mt->array_layout = ALL_SLICES_AT_EACH_LOD;
>   break;
> +  case INTEL_MSAA_LAYOUT_CSS:
> +  unreachable("Lossless compression is only support for gen9+");
>}
> }
>  
> @@ -1051,6 +1053,8 @@ intel_miptree_match_image(struct intel_mipmap_tree *mt,
>case INTEL_MSAA_LAYOUT_CMS:
>   level_depth /= mt->num_samples;
>   break;
> +  case INTEL_MSAA_LAYOUT_CSS:
> + unreachable("Lossless compression is only for single-sampled");
>}
> }
>  
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> index 64f73ea..0970fd5 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> @@ -198,6 +198,15 @@ enum intel_msaa_layout
>  * @see PRM section "Compressed Multisampled Surfaces"
>  */
> INTEL_MSAA_LAYOUT_CMS,
> +
> +   /**
> +* Compressed Singlesample Surface. The color values are stored in one
> +* slice and an auxiliary buffer is used to track compression state just
> +* as in the Compressed Multisample case.
> +*
> +* @see section "Lossless Compression"
> +*/
> +   INTEL_MSAA_LAYOUT_CSS,

I understand why you called this CSS, but I do think it's a shame since the docs
to seem to clearly define this as "Color Control Surface (CCS)". I'm also not
yet certain why this differs from INTEL_MSAA_LAYOUT_CMS, but I assume I'll get
to that later in the series.

I won't argue much about this, since I remember how frustrated I got when I was
working on fast clears, but, just consider it.

>  };
>  
>  
> -- 
> 2.5.0
> 
>

Re: [Mesa-dev] [PATCH] winsys/radeon: fix a wrong NUM_TILE_PIPES value from the kernel

2016-02-09 Thread Marek Olšák

On Tue, Feb 9, 2016 at 11:38 PM, Alexandre Demers
 wrote:
> On Tue, 9 Feb 2016 at 15:17 Alex Deucher  wrote:
>>
>> On Tue, Feb 9, 2016 at 12:47 PM, Marek Olšák  wrote:
>> > On Tue, Feb 9, 2016 at 6:17 PM, Alexandre Demers
>> >  wrote:
>> >>> +/* The kernel returns 12 for some cards for an unknown
>> >>> reason.
>> >>> + * I thought this was supposed to be a power of two.
>> >>> + */
>> >>> +if (ws->gen == DRV_SI && ws->info.num_tile_pipes == 12)
>> >>> +ws->info.num_tile_pipes = 8;
>> >>> +
>> >>
>> >> I may be late in the conversation, but shouldn't we have a look at why
>> >> the
>> >> value reported by the kernel is wrong for "some" cards? Which ones and
>> >> why
>> >> should be identified. It seems to be limited to Southern Islands as far
>> >> as
>> >> we know for now, which limits the scope for now.
>> >>
>> >> Also, about the patch itself, even if only some cards were reported to
>> >> be
>> >> problematic, why would we limit it to "ws->gen == DRV_SI"? Any cards
>> >> reporting a wrong value should be treated the same way by mapping its
>> >> value
>> >> from 12 to 8, no?
>> >
>> > No. Only one card is affected (Tahiti or Pitcairn, I don't remember
>> > which one). No other card reports 12.
>> >
>> > There is no point in looking into why the value is wrong and I haven't
>> > been able to find where the value had come from. It's part of the
>> > kernel ABI now anyway. Userspace won't use it anymore.
>>
>> I did it when I added SI support.  I think I get the value from tcore
>> or some hw features header.  I don't remember the details.  Anyway, I
>> think it may have been a hw value (related to the number of memory
>> channels) whereas from a sw perspective, we'd use 8 for tiling
>> computations.  E.g., when we set up rdev->config.si.tile_config which
>> is what we used to use, we set it to 8.
>>
>> Alex
>
>
> Ok, I had a look at what you were pointing. There was even a comment in
> si_gpu_init() about tile_config for the case of the problematic value:
>
> rdev->config.si.tile_config = 0;
> switch (rdev->config.si.num_tile_pipes) {
> [...]
> case 8:
> default:
> /* XXX what about 12? */
> rdev->config.si.tile_config |= (3 << 0);
> break;
> [...]
>
> Should we just clarify the comment in Mesa's committed patch according to
> Alex's explanation?

I don't think it's necessary. The value should be equal to the pipe
config setting in the tile mode array. That's the only thing that
matters here. Everything else is irrelevant AFAIK.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nir: Remove the const_offset from nir_tex_instr

2016-02-09 Thread Jason Ekstrand

When NIR was originally drafted, there was no easy way to determine if
something was constant or not.  The result was that we had lots of
special-casing for constant values such as this.  Now that load_const
instructions are SSA-only, it's really easy to find constants and this
isn't really needed anymore.
---
 src/compiler/nir/glsl_to_nir.cpp   | 16 +
 src/compiler/nir/nir.h |  3 ---
 src/compiler/nir/nir_clone.c   |  1 -
 src/compiler/nir/nir_instr_set.c   |  3 ---
 src/compiler/nir/nir_print.c   | 14 
 .../drivers/freedreno/ir3/ir3_compiler_nir.c   | 15 -
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 26 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 21 -
 8 files changed, 27 insertions(+), 72 deletions(-)

diff --git a/src/compiler/nir/glsl_to_nir.cpp b/src/compiler/nir/glsl_to_nir.cpp
index 3db2775..eaf71d0 100644
--- a/src/compiler/nir/glsl_to_nir.cpp
+++ b/src/compiler/nir/glsl_to_nir.cpp
@@ -1824,7 +1824,7 @@ nir_visitor::visit(ir_texture *ir)
   num_srcs++;
if (ir->shadow_comparitor != NULL)
   num_srcs++;
-   if (ir->offset != NULL && ir->offset->as_constant() == NULL)
+   if (ir->offset != NULL)
   num_srcs++;
 
nir_tex_instr *instr = nir_tex_instr_create(this->shader, num_srcs);
@@ -1881,16 +1881,10 @@ nir_visitor::visit(ir_texture *ir)
   /* we don't support multiple offsets yet */
   assert(ir->offset->type->is_vector() || ir->offset->type->is_scalar());
 
-  ir_constant *const_offset = ir->offset->as_constant();
-  if (const_offset != NULL) {
- for (unsigned i = 0; i < const_offset->type->vector_elements; i++)
-instr->const_offset[i] = const_offset->value.i[i];
-  } else {
- instr->src[src_number].src =
-nir_src_for_ssa(evaluate_rvalue(ir->offset));
- instr->src[src_number].src_type = nir_tex_src_offset;
- src_number++;
-  }
+  instr->src[src_number].src =
+ nir_src_for_ssa(evaluate_rvalue(ir->offset));
+  instr->src[src_number].src_type = nir_tex_src_offset;
+  src_number++;
}
 
switch (ir->op) {
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index a4dbfde..294b18e 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -949,9 +949,6 @@ typedef struct {
 */
bool is_new_style_shadow;
 
-   /* constant offset - must be 0 if the offset source is used */
-   int const_offset[4];
-
/* gather component selector */
unsigned component : 2;
 
diff --git a/src/compiler/nir/nir_clone.c b/src/compiler/nir/nir_clone.c
index 5eff743..58ba2eb 100644
--- a/src/compiler/nir/nir_clone.c
+++ b/src/compiler/nir/nir_clone.c
@@ -355,7 +355,6 @@ clone_tex(clone_state *state, const nir_tex_instr *tex)
ntex->is_array = tex->is_array;
ntex->is_shadow = tex->is_shadow;
ntex->is_new_style_shadow = tex->is_new_style_shadow;
-   memcpy(ntex->const_offset, tex->const_offset, sizeof(ntex->const_offset));
ntex->component = tex->component;
ntex->sampler_index = tex->sampler_index;
ntex->sampler_array_size = tex->sampler_array_size;
diff --git a/src/compiler/nir/nir_instr_set.c b/src/compiler/nir/nir_instr_set.c
index d3f939f..880a814 100644
--- a/src/compiler/nir/nir_instr_set.c
+++ b/src/compiler/nir/nir_instr_set.c
@@ -152,7 +152,6 @@ hash_tex(uint32_t hash, const nir_tex_instr *instr)
hash = HASH(hash, instr->is_array);
hash = HASH(hash, instr->is_shadow);
hash = HASH(hash, instr->is_new_style_shadow);
-   hash = HASH(hash, instr->const_offset);
unsigned component = instr->component;
hash = HASH(hash, component);
hash = HASH(hash, instr->sampler_index);
@@ -302,8 +301,6 @@ nir_instrs_equal(const nir_instr *instr1, const nir_instr 
*instr2)
   tex1->is_array != tex2->is_array ||
   tex1->is_shadow != tex2->is_shadow ||
   tex1->is_new_style_shadow != tex2->is_new_style_shadow ||
-  memcmp(tex1->const_offset, tex2->const_offset,
- sizeof(tex1->const_offset)) != 0 ||
   tex1->component != tex2->component ||
  tex1->sampler_index != tex2->sampler_index ||
  tex1->sampler_array_size != tex2->sampler_array_size) {
diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c
index 48ecb48..370eb8b 100644
--- a/src/compiler/nir/nir_print.c
+++ b/src/compiler/nir/nir_print.c
@@ -612,20 +612,6 @@ print_tex_instr(nir_tex_instr *instr, print_state *state)
   fprintf(fp, ", ");
}
 
-   bool has_nonzero_offset = false;
-   for (unsigned i = 0; i < 4; i++) {
-  if (instr->const_offset[i] != 0) {
- has_nonzero_offset = true;
- break;
-  }
-   }
-
-   if (has_nonzero_offset) {
-  fprintf(fp, "[%i %i %i %i] (offset), ",
-  instr->const_offset[0], instr->const_offset[1],
-

[Mesa-dev] [PATCH] Better explain why we are lowering the num_tile_pipes value for TAHITI

2016-02-09 Thread Alexandre Demers

Signed-off-by: Alexandre Demers 
---
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index 49c310c..aab81f9 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -405,8 +405,9 @@ static boolean do_winsys_init(struct radeon_drm_winsys *ws)
 radeon_get_drm_value(ws->fd, RADEON_INFO_NUM_TILE_PIPES, NULL,
  >info.num_tile_pipes);
 
-/* The kernel returns 12 for some cards for an unknown reason.
- * I thought this was supposed to be a power of two.
+/* Tahiti have a max_tile_pipes of 12 exceptionally. However, we 
+ * work with power of two, so let's set it to the nearest power of 
+ * two value.
  */
 if (ws->gen == DRV_SI && ws->info.num_tile_pipes == 12)
 ws->info.num_tile_pipes = 8;
-- 
2.7.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 03/23] i965: Separate miptree creation from auxiliary buffer setup

2016-02-09 Thread Ben Widawsky

On Mon, Feb 08, 2016 at 06:51:23PM +0200, Topi Pohjolainen wrote:
> Currently the logic allocating and setting up miptrees is closely
> combined with decision making when to re-allocate buffers in
> X-tiled layout and when to associate colors with auxiliary buffers.
> 
> These auxiliary buffers are in turn also represented as miptrees
> and are created by the same miptree creation logic calling itself
> recursively. This means considering in vain if the auxiliary buffers
> should be represented in X-tiled layout or if they should be
> associated with auxiliary buffers again.
> While this is somewhat unnecessary, this doesn't impose any problems
> currently. Miptrees for auxiliary buffers are created as simgle-sampled
> fusing the consideration for multi-sampled compression auxiliary
> buffers. The format in turn is such that is not applicable for
> single-sampled fast clears (that would require accompaning auxiliary
> buffer).
> But once the driver starts to support lossless compression of color
> buffers the auxiliary buffer will have a format that would itself
> be applicable for lossless compression. This would be rather
> difficult and ugly to detect in the current miptree creation logic,
> and therefore this patch seeks to separate the association logic
> from the general allocation and setup steps.
> 
> Signed-off-by: Topi Pohjolainen 

Shameless plug. I think we reached the same conclusion about the existing code
:-). I forgot what I actually did, but I'm pretty sure I at least did this
stuff, and maybe some more. I'd love it if you could see if anything is useful
there.  https://patchwork.freedesktop.org/patch/56792/

> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 75 
> +--
>  1 file changed, 49 insertions(+), 26 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 033f4c6..d655de8 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -611,17 +611,18 @@ intel_get_yf_ys_bo_size(struct intel_mipmap_tree *mt, 
> unsigned *alignment,
> return size;
>  }
>  
> -struct intel_mipmap_tree *
> -intel_miptree_create(struct brw_context *brw,
> - GLenum target,
> - mesa_format format,
> - GLuint first_level,
> - GLuint last_level,
> - GLuint width0,
> - GLuint height0,
> - GLuint depth0,
> - GLuint num_samples,
> - uint32_t layout_flags)
> +static struct intel_mipmap_tree *
> +miptree_create(struct brw_context *brw,
> +   GLenum target,
> +   mesa_format format,
> +   GLuint first_level,
> +   GLuint last_level,
> +   GLuint width0,
> +   GLuint height0,
> +   GLuint depth0,
> +   GLuint num_samples,
> +   enum intel_msaa_layout msaa_layout,
> +   uint32_t layout_flags)
>  {
> struct intel_mipmap_tree *mt;
> mesa_format tex_format = format;
> @@ -638,9 +639,7 @@ intel_miptree_create(struct brw_context *brw,
> mt = intel_miptree_create_layout(brw, target, format,
>  first_level, last_level, width0,
>  height0, depth0, num_samples,
> -compute_msaa_layout(brw, format,
> -num_samples, false),
> -layout_flags);
> +msaa_layout, layout_flags);
> /*
>  * pitch == 0 || height == 0  indicates the null texture
>  */
> @@ -658,12 +657,8 @@ intel_miptree_create(struct brw_context *brw,
>total_height = ALIGN(total_height, 64);
> }
>  
> -   bool y_or_x = false;
> -
> -   if (mt->tiling == (I915_TILING_Y | I915_TILING_X)) {
> -  y_or_x = true;
> +   if (mt->tiling == (I915_TILING_Y | I915_TILING_X))
>mt->tiling = I915_TILING_Y;
> -   }
>  
> if (layout_flags & MIPTREE_LAYOUT_ACCELERATED_UPLOAD)
>alloc_flags |= BO_ALLOC_FOR_RENDER;
> @@ -686,12 +681,46 @@ intel_miptree_create(struct brw_context *brw,
> }
>  
> mt->pitch = pitch;
> +   mt->offset = 0;
> +
> +   if (!mt->bo) {
> +  intel_miptree_release();
> +  return NULL;
> +   }
> +
> +   return mt;
> +}
> +
> +struct intel_mipmap_tree *
> +intel_miptree_create(struct brw_context *brw,
> + GLenum target,
> + mesa_format format,
> + GLuint first_level,
> + GLuint last_level,
> + GLuint width0,
> + GLuint height0,
> + GLuint depth0,
> + GLuint num_samples,
> + uint32_t layout_flags)
> +{

Re: [Mesa-dev] [PATCH 12/23] i965: Refactor resolving of auxiliary mode

2016-02-09 Thread Ben Widawsky

On Mon, Feb 08, 2016 at 06:51:32PM +0200, Topi Pohjolainen wrote:
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/gen8_surface_state.c | 62 
> --
>  1 file changed, 29 insertions(+), 33 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
> b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> index fc8f701..0a52815 100644
> --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> @@ -197,6 +197,28 @@ gen8_emit_fast_clear_color(struct brw_context *brw,
>surf[7] |= mt->fast_clear_color_value;
>  }
>  
> +static uint32_t
> +gen8_get_aux_mode(const struct brw_context *brw,
> +  const struct intel_mipmap_tree *mt,
> +  uint32_t surf_type)
> +{
> +   if (mt->mcs_mt == NULL)
> +  return GEN8_SURFACE_AUX_MODE_NONE;

I do have some patches that kind of change this for aux_hiz support. I store the
AUX surface type in the aux_buf data structure. This is mostly a reminder to
myself. I think that actually works really well for your stuff here, but it
requires several of my patches before
aux-buf-v2 branch: i965: Save aux_mode at creation and use it

> +
> +   /*
> +* From the BDW PRM, Volume 2d, page 260 (RENDER_SURFACE_STATE):
> +* "When MCS is enabled for non-MSRT, HALIGN_16 must be used"
> +*
> +* From the hardware spec for GEN9:
> +* "When Auxiliary Surface Mode is set to AUX_CCS_D or AUX_CCS_E, HALIGN
> +*  16 must be used."
> +*/
> +   if (brw->gen >= 9 || mt->num_samples == 1)
> +  assert(mt->halign == 16);
> +
> +   return GEN8_SURFACE_AUX_MODE_MCS;
> +}
> +

There was a pretty good ascii table in intel_miptree_create_layout() that should
perhaps be moved here with this comment.

>  static void
>  gen8_emit_texture_surface_state(struct brw_context *brw,
>  struct intel_mipmap_tree *mt,
> @@ -209,13 +231,13 @@ gen8_emit_texture_surface_state(struct brw_context *brw,
>  bool rw, bool for_gather)
>  {
> const unsigned depth = max_layer - min_layer;
> -   struct intel_mipmap_tree *aux_mt = NULL;
> -   uint32_t aux_mode = GEN8_SURFACE_AUX_MODE_NONE;
> +   struct intel_mipmap_tree *aux_mt = mt->mcs_mt;
> uint32_t mocs_wb = brw->gen >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB;
> int surf_index = surf_offset - >wm.base.surf_offset[0];
> unsigned tiling_mode, pitch;
> const unsigned tr_mode = surface_tiling_resource_mode(mt->tr_mode);
> const uint32_t surf_type = translate_tex_target(target);
> +   uint32_t aux_mode = gen8_get_aux_mode(brw, mt, surf_type);
>  
> if (mt->format == MESA_FORMAT_S_UINT8) {
>tiling_mode = GEN8_SURFACE_TILING_W;
> @@ -229,20 +251,9 @@ gen8_emit_texture_surface_state(struct brw_context *brw,
>  * buffer should always have been resolved before it is used as a texture
>  * so there is no need for it.
>  */
> -   if (mt->mcs_mt && mt->num_samples > 1) {
> -  aux_mt = mt->mcs_mt;
> -  aux_mode = GEN8_SURFACE_AUX_MODE_MCS;
> -
> -  /*
> -   * From the BDW PRM, Volume 2d, page 260 (RENDER_SURFACE_STATE):
> -   * "When MCS is enabled for non-MSRT, HALIGN_16 must be used"
> -   *
> -   * From the hardware spec for GEN9:
> -   * "When Auxiliary Surface Mode is set to AUX_CCS_D or AUX_CCS_E, 
> HALIGN
> -   *  16 must be used."
> -   */
> -  if (brw->gen >= 9 || mt->num_samples == 1)
> - assert(mt->halign == 16);
> +   if (mt->num_samples <= 1) {
> +  aux_mt = NULL;
> +  aux_mode = GEN8_SURFACE_AUX_MODE_NONE;

hmm. Shouldn't this consideration just be part of gen8_get_aux_mode()? See my
aux-buf-v2 branch for where I rework this function a bit (i965/gen8: Consolidate
MCS logic). I think that patch might make sense before this one, and then expand
gen8_get_aux_mode() to include num_samples.

I possible you change this all later, and its moot.

> }
>  
> uint32_t *surf = allocate_surface_state(brw, surf_offset, surf_index);
> @@ -418,8 +429,6 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
> struct gl_context *ctx = >ctx;
> struct intel_renderbuffer *irb = intel_renderbuffer(rb);
> struct intel_mipmap_tree *mt = irb->mt;
> -   struct intel_mipmap_tree *aux_mt = NULL;
> -   uint32_t aux_mode = GEN8_SURFACE_AUX_MODE_NONE;
> unsigned width = mt->logical_width0;
> unsigned height = mt->logical_height0;
> unsigned pitch = mt->pitch;
> @@ -472,21 +481,8 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
> __func__, _mesa_get_format_name(rb_format));
> }
>  
> -   if (mt->mcs_mt) {
> -  aux_mt = mt->mcs_mt;
> -  aux_mode = GEN8_SURFACE_AUX_MODE_MCS;
> -
> -  /*
> -   * From the BDW PRM, Volume 2d, page 260 (RENDER_SURFACE_STATE):
> -   * "When MCS is enabled for non-MSRT, HALIGN_16 must be used"
> -   *
> -   * From the

[Mesa-dev] [PATCH 1/4] i965: Split brw_upload_texture_surfaces into compute/render atoms.

2016-02-09 Thread Kenneth Graunke

When uploading state for the compute pipeline, we don't want to
look at VS/TCS/TES/GS/FS programs, as they might be stale, and
aren't relevant anyway.  Likewise, the render pipeline shouldn't
look at CS.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93790
Signed-off-by: Kenneth Graunke 
Reviewed-by: Jordan Justen 
---
 src/mesa/drivers/dri/i965/brw_state.h|  1 +
 src/mesa/drivers/dri/i965/brw_state_upload.c |  4 +--
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 38 +++-
 3 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index f44ccd6..6b85eac 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -76,6 +76,7 @@ extern const struct brw_tracked_state brw_tcs_samplers;
 extern const struct brw_tracked_state brw_tes_samplers;
 extern const struct brw_tracked_state brw_gs_samplers;
 extern const struct brw_tracked_state brw_cs_samplers;
+extern const struct brw_tracked_state brw_cs_texture_surfaces;
 extern const struct brw_tracked_state brw_vs_ubo_surfaces;
 extern const struct brw_tracked_state brw_vs_abo_surfaces;
 extern const struct brw_tracked_state brw_vs_image_surfaces;
diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index ee75ca8..a91d074 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -280,7 +280,7 @@ static const struct brw_tracked_state *gen7_compute_atoms[] 
=
_cs_pull_constants,
_cs_ubo_surfaces,
_cs_abo_surfaces,
-   _texture_surfaces,
+   _cs_texture_surfaces,
_cs_work_groups_surface,
_cs_samplers,
_cs_state,
@@ -395,7 +395,7 @@ static const struct brw_tracked_state *gen8_compute_atoms[] 
=
_cs_pull_constants,
_cs_ubo_surfaces,
_cs_abo_surfaces,
-   _texture_surfaces,
+   _cs_texture_surfaces,
_cs_work_groups_surface,
_cs_samplers,
_cs_state,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 5ab2f7f..708be0e 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -872,16 +872,12 @@ brw_update_texture_surfaces(struct brw_context *brw)
/* BRW_NEW_FRAGMENT_PROGRAM */
struct gl_program *fs = (struct gl_program *) brw->fragment_program;
 
-   /* BRW_NEW_COMPUTE_PROGRAM */
-   struct gl_program *cs = (struct gl_program *) brw->compute_program;
-
/* _NEW_TEXTURE */
update_stage_texture_surfaces(brw, vs, >vs.base, false);
update_stage_texture_surfaces(brw, tcs, >tcs.base, false);
update_stage_texture_surfaces(brw, tes, >tes.base, false);
update_stage_texture_surfaces(brw, gs, >gs.base, false);
update_stage_texture_surfaces(brw, fs, >wm.base, false);
-   update_stage_texture_surfaces(brw, cs, >cs.base, false);
 
/* emit alternate set of surface state for gather. this
 * allows the surface format to be overriden for only the
@@ -897,8 +893,6 @@ brw_update_texture_surfaces(struct brw_context *brw)
  update_stage_texture_surfaces(brw, gs, >gs.base, true);
   if (fs && fs->UsesGather)
  update_stage_texture_surfaces(brw, fs, >wm.base, true);
-  if (cs && cs->UsesGather)
- update_stage_texture_surfaces(brw, cs, >cs.base, true);
}
 
brw->ctx.NewDriverState |= BRW_NEW_SURFACES;
@@ -908,7 +902,6 @@ const struct brw_tracked_state brw_texture_surfaces = {
.dirty = {
   .mesa = _NEW_TEXTURE,
   .brw = BRW_NEW_BATCH |
- BRW_NEW_COMPUTE_PROGRAM |
  BRW_NEW_FRAGMENT_PROGRAM |
  BRW_NEW_FS_PROG_DATA |
  BRW_NEW_GEOMETRY_PROGRAM |
@@ -923,6 +916,37 @@ const struct brw_tracked_state brw_texture_surfaces = {
.emit = brw_update_texture_surfaces,
 };
 
+static void
+brw_update_cs_texture_surfaces(struct brw_context *brw)
+{
+   /* BRW_NEW_COMPUTE_PROGRAM */
+   struct gl_program *cs = (struct gl_program *) brw->compute_program;
+
+   /* _NEW_TEXTURE */
+   update_stage_texture_surfaces(brw, cs, >cs.base, false);
+
+   /* emit alternate set of surface state for gather. this
+* allows the surface format to be overriden for only the
+* gather4 messages.
+*/
+   if (brw->gen < 8) {
+  if (cs && cs->UsesGather)
+ update_stage_texture_surfaces(brw, cs, >cs.base, true);
+   }
+
+   brw->ctx.NewDriverState |= BRW_NEW_SURFACES;
+}
+
+const struct brw_tracked_state brw_cs_texture_surfaces = {
+   .dirty = {
+  .mesa = _NEW_TEXTURE,
+  .brw = BRW_NEW_BATCH |
+ BRW_NEW_COMPUTE_PROGRAM,
+   },
+   .emit = brw_update_cs_texture_surfaces,
+};
+
+
 void
 brw_upload_ubo_surfaces(struct brw_context *brw,
struct gl_shader *shader,
-- 
2.7.0

___
mesa-dev

[Mesa-dev] [PATCH 4/4] i965: Make brw_clear_cache flag all the bits on both pipelines.

2016-02-09 Thread Kenneth Graunke

Setting brw->ctx.NewDriverState and brw->ctx.NewGLState affects
the dirty bits for the current pipeline.  But, we need to flag
everything dirty on *both* pipelines, so that when we switch
back, we'll realize our programs are stale and re-upload them.

To accomplish this, flag the saved state for both pipelines.
Only one of them should matter, but this way we don't have to
check which we need to set.  It's harmless to set the other.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93790
Signed-off-by: Kenneth Graunke 
Reviewed-by: Jordan Justen 
---
 src/mesa/drivers/dri/i965/brw_state_cache.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state_cache.c 
b/src/mesa/drivers/dri/i965/brw_state_cache.c
index cac06fa..ce178aa 100644
--- a/src/mesa/drivers/dri/i965/brw_state_cache.c
+++ b/src/mesa/drivers/dri/i965/brw_state_cache.c
@@ -387,8 +387,12 @@ brw_clear_cache(struct brw_context *brw, struct brw_cache 
*cache)
/* We need to make sure that the programs get regenerated, since
 * any offsets leftover in brw_context will no longer be valid.
 */
-   brw->NewGLState |= ~0;
-   brw->ctx.NewDriverState |= ~0ull;
+   brw->NewGLState = ~0;
+   brw->ctx.NewDriverState = ~0ull;
+   brw->state.pipelines[BRW_RENDER_PIPELINE].mesa = ~0;
+   brw->state.pipelines[BRW_RENDER_PIPELINE].brw = ~0ull;
+   brw->state.pipelines[BRW_COMPUTE_PIPELINE].mesa = ~0;
+   brw->state.pipelines[BRW_COMPUTE_PIPELINE].brw = ~0ull;
intel_batchbuffer_flush(brw);
 }
 
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/4] i965: Make L3 partitioning code only use the current pipeline's shaders.

2016-02-09 Thread Kenneth Graunke

When uploading state for the compute pipeline, we don't want to
look at VS/TCS/TES/GS/FS programs, as they might be stale, and
aren't relevant anyway.  Likewise, the render pipeline shouldn't
look at CS.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93790
Signed-off-by: Kenneth Graunke 
Reviewed-by: Jordan Justen 
---
 src/mesa/drivers/dri/i965/gen7_l3_state.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c 
b/src/mesa/drivers/dri/i965/gen7_l3_state.c
index c4babc2..b2e9306 100644
--- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
@@ -307,7 +307,12 @@ get_pipeline_state_l3_weights(const struct brw_context 
*brw)
};
bool needs_dc = false, needs_slm = false;
 
-   for (unsigned i = 0; i < ARRAY_SIZE(stage_states); i++) {
+   unsigned first_stage = MESA_SHADER_VERTEX;
+   unsigned last_stage = MESA_SHADER_FRAGMENT;
+   if (brw->last_pipeline == BRW_COMPUTE_PIPELINE)
+  first_stage = last_stage = MESA_SHADER_COMPUTE;
+
+   for (unsigned i = first_stage; i <= last_stage; i++) {
   const struct gl_shader_program *prog =
  brw->ctx._Shader->CurrentProgram[stage_states[i]->stage];
   const struct brw_stage_prog_data *prog_data = stage_states[i]->prog_data;
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/4] i965: Consider tessellation in get_pipeline_state_l3_weights.

2016-02-09 Thread Kenneth Graunke

I think this was just missed; Curro and I were probably writing
code simultaneously and forgot to combine them at the end.

Signed-off-by: Kenneth Graunke 
Reviewed-by: Jordan Justen 
---
 src/mesa/drivers/dri/i965/gen7_l3_state.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c 
b/src/mesa/drivers/dri/i965/gen7_l3_state.c
index 0c1813f..c4babc2 100644
--- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
@@ -298,7 +298,12 @@ static struct brw_l3_weights
 get_pipeline_state_l3_weights(const struct brw_context *brw)
 {
const struct brw_stage_state *stage_states[] = {
-  >vs.base, >gs.base, >wm.base, >cs.base
+  [MESA_SHADER_VERTEX] = >vs.base,
+  [MESA_SHADER_TESS_CTRL] = >tcs.base,
+  [MESA_SHADER_TESS_EVAL] = >tes.base,
+  [MESA_SHADER_GEOMETRY] = >gs.base,
+  [MESA_SHADER_FRAGMENT] = >wm.base,
+  [MESA_SHADER_COMPUTE] = >cs.base
};
bool needs_dc = false, needs_slm = false;
 
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 13/23] i965: Resolve color buffer also in lossless compression case

2016-02-09 Thread Ben Widawsky

On Mon, Feb 08, 2016 at 06:51:33PM +0200, Topi Pohjolainen wrote:
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 6f46385..6ec02d8 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -2038,8 +2038,10 @@ intel_miptree_resolve_color(struct brw_context *brw,
> case INTEL_FAST_CLEAR_STATE_UNRESOLVED:
> case INTEL_FAST_CLEAR_STATE_CLEAR:
>/* Fast color clear resolves only make sense for non-MSAA buffers. */
> -  if (mt->msaa_layout == INTEL_MSAA_LAYOUT_NONE)
> +  if (mt->msaa_layout == INTEL_MSAA_LAYOUT_NONE ||
> +  mt->msaa_layout == INTEL_MSAA_LAYOUT_CSS) {
>   brw_meta_resolve_color(brw, mt);
> +  }
>break;
> }
>  }

Ah, now I see why it helps to have a different type than CMS. Although, if you
made the msaa_layout type be CMS, it still works, I think:
case INTEL_FAST_CLEAR_STATE_CLEAR:
   if (mt->num_samples < 2)
  brw_meta_resolve_color(brw, mt);

I think that works?

This is just for you to consider. I can live with whatever you think is best.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir: Remove the const_offset from nir_tex_instr

2016-02-09 Thread Connor Abbott

Did you make sure the other drivers and ttn don't use this? Assming that,

Reviewed-by: Connor Abbott 

I really should get to your other series, but I've been busy with
school stuff and whatnot and I've been too lazy -- sorry!

On Tue, Feb 9, 2016 at 8:08 PM, Jason Ekstrand  wrote:
> When NIR was originally drafted, there was no easy way to determine if
> something was constant or not.  The result was that we had lots of
> special-casing for constant values such as this.  Now that load_const
> instructions are SSA-only, it's really easy to find constants and this
> isn't really needed anymore.
> ---
>  src/compiler/nir/glsl_to_nir.cpp   | 16 +
>  src/compiler/nir/nir.h |  3 ---
>  src/compiler/nir/nir_clone.c   |  1 -
>  src/compiler/nir/nir_instr_set.c   |  3 ---
>  src/compiler/nir/nir_print.c   | 14 
>  .../drivers/freedreno/ir3/ir3_compiler_nir.c   | 15 -
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 26 
> +-
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 21 -
>  8 files changed, 27 insertions(+), 72 deletions(-)
>
> diff --git a/src/compiler/nir/glsl_to_nir.cpp 
> b/src/compiler/nir/glsl_to_nir.cpp
> index 3db2775..eaf71d0 100644
> --- a/src/compiler/nir/glsl_to_nir.cpp
> +++ b/src/compiler/nir/glsl_to_nir.cpp
> @@ -1824,7 +1824,7 @@ nir_visitor::visit(ir_texture *ir)
>num_srcs++;
> if (ir->shadow_comparitor != NULL)
>num_srcs++;
> -   if (ir->offset != NULL && ir->offset->as_constant() == NULL)
> +   if (ir->offset != NULL)
>num_srcs++;
>
> nir_tex_instr *instr = nir_tex_instr_create(this->shader, num_srcs);
> @@ -1881,16 +1881,10 @@ nir_visitor::visit(ir_texture *ir)
>/* we don't support multiple offsets yet */
>assert(ir->offset->type->is_vector() || ir->offset->type->is_scalar());
>
> -  ir_constant *const_offset = ir->offset->as_constant();
> -  if (const_offset != NULL) {
> - for (unsigned i = 0; i < const_offset->type->vector_elements; i++)
> -instr->const_offset[i] = const_offset->value.i[i];
> -  } else {
> - instr->src[src_number].src =
> -nir_src_for_ssa(evaluate_rvalue(ir->offset));
> - instr->src[src_number].src_type = nir_tex_src_offset;
> - src_number++;
> -  }
> +  instr->src[src_number].src =
> + nir_src_for_ssa(evaluate_rvalue(ir->offset));
> +  instr->src[src_number].src_type = nir_tex_src_offset;
> +  src_number++;
> }
>
> switch (ir->op) {
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index a4dbfde..294b18e 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -949,9 +949,6 @@ typedef struct {
>  */
> bool is_new_style_shadow;
>
> -   /* constant offset - must be 0 if the offset source is used */
> -   int const_offset[4];
> -
> /* gather component selector */
> unsigned component : 2;
>
> diff --git a/src/compiler/nir/nir_clone.c b/src/compiler/nir/nir_clone.c
> index 5eff743..58ba2eb 100644
> --- a/src/compiler/nir/nir_clone.c
> +++ b/src/compiler/nir/nir_clone.c
> @@ -355,7 +355,6 @@ clone_tex(clone_state *state, const nir_tex_instr *tex)
> ntex->is_array = tex->is_array;
> ntex->is_shadow = tex->is_shadow;
> ntex->is_new_style_shadow = tex->is_new_style_shadow;
> -   memcpy(ntex->const_offset, tex->const_offset, sizeof(ntex->const_offset));
> ntex->component = tex->component;
> ntex->sampler_index = tex->sampler_index;
> ntex->sampler_array_size = tex->sampler_array_size;
> diff --git a/src/compiler/nir/nir_instr_set.c 
> b/src/compiler/nir/nir_instr_set.c
> index d3f939f..880a814 100644
> --- a/src/compiler/nir/nir_instr_set.c
> +++ b/src/compiler/nir/nir_instr_set.c
> @@ -152,7 +152,6 @@ hash_tex(uint32_t hash, const nir_tex_instr *instr)
> hash = HASH(hash, instr->is_array);
> hash = HASH(hash, instr->is_shadow);
> hash = HASH(hash, instr->is_new_style_shadow);
> -   hash = HASH(hash, instr->const_offset);
> unsigned component = instr->component;
> hash = HASH(hash, component);
> hash = HASH(hash, instr->sampler_index);
> @@ -302,8 +301,6 @@ nir_instrs_equal(const nir_instr *instr1, const nir_instr 
> *instr2)
>tex1->is_array != tex2->is_array ||
>tex1->is_shadow != tex2->is_shadow ||
>tex1->is_new_style_shadow != tex2->is_new_style_shadow ||
> -  memcmp(tex1->const_offset, tex2->const_offset,
> - sizeof(tex1->const_offset)) != 0 ||
>tex1->component != tex2->component ||
>   tex1->sampler_index != tex2->sampler_index ||
>   tex1->sampler_array_size != tex2->sampler_array_size) {
> diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c
> index 48ecb48..370eb8b 100644
> ---

Re: [Mesa-dev] [PATCH] nir: Remove the const_offset from nir_tex_instr

2016-02-09 Thread Jason Ekstrand

On Feb 9, 2016 7:06 PM, "Connor Abbott"  wrote:
>
> Did you make sure the other drivers and ttn don't use this? Assming that,

Well, I deleted it from the header and the other drivers built so...

> Reviewed-by: Connor Abbott 
>
> I really should get to your other series, but I've been busy with
> school stuff and whatnot and I've been too lazy -- sorry!
>
> On Tue, Feb 9, 2016 at 8:08 PM, Jason Ekstrand 
wrote:
> > When NIR was originally drafted, there was no easy way to determine if
> > something was constant or not.  The result was that we had lots of
> > special-casing for constant values such as this.  Now that load_const
> > instructions are SSA-only, it's really easy to find constants and this
> > isn't really needed anymore.
> > ---
> >  src/compiler/nir/glsl_to_nir.cpp   | 16 +
> >  src/compiler/nir/nir.h |  3 ---
> >  src/compiler/nir/nir_clone.c   |  1 -
> >  src/compiler/nir/nir_instr_set.c   |  3 ---
> >  src/compiler/nir/nir_print.c   | 14 
> >  .../drivers/freedreno/ir3/ir3_compiler_nir.c   | 15 -
> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 26
+-
> >  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 21
-
> >  8 files changed, 27 insertions(+), 72 deletions(-)
> >
> > diff --git a/src/compiler/nir/glsl_to_nir.cpp
b/src/compiler/nir/glsl_to_nir.cpp
> > index 3db2775..eaf71d0 100644
> > --- a/src/compiler/nir/glsl_to_nir.cpp
> > +++ b/src/compiler/nir/glsl_to_nir.cpp
> > @@ -1824,7 +1824,7 @@ nir_visitor::visit(ir_texture *ir)
> >num_srcs++;
> > if (ir->shadow_comparitor != NULL)
> >num_srcs++;
> > -   if (ir->offset != NULL && ir->offset->as_constant() == NULL)
> > +   if (ir->offset != NULL)
> >num_srcs++;
> >
> > nir_tex_instr *instr = nir_tex_instr_create(this->shader, num_srcs);
> > @@ -1881,16 +1881,10 @@ nir_visitor::visit(ir_texture *ir)
> >/* we don't support multiple offsets yet */
> >assert(ir->offset->type->is_vector() ||
ir->offset->type->is_scalar());
> >
> > -  ir_constant *const_offset = ir->offset->as_constant();
> > -  if (const_offset != NULL) {
> > - for (unsigned i = 0; i < const_offset->type->vector_elements;
i++)
> > -instr->const_offset[i] = const_offset->value.i[i];
> > -  } else {
> > - instr->src[src_number].src =
> > -nir_src_for_ssa(evaluate_rvalue(ir->offset));
> > - instr->src[src_number].src_type = nir_tex_src_offset;
> > - src_number++;
> > -  }
> > +  instr->src[src_number].src =
> > + nir_src_for_ssa(evaluate_rvalue(ir->offset));
> > +  instr->src[src_number].src_type = nir_tex_src_offset;
> > +  src_number++;
> > }
> >
> > switch (ir->op) {
> > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > index a4dbfde..294b18e 100644
> > --- a/src/compiler/nir/nir.h
> > +++ b/src/compiler/nir/nir.h
> > @@ -949,9 +949,6 @@ typedef struct {
> >  */
> > bool is_new_style_shadow;
> >
> > -   /* constant offset - must be 0 if the offset source is used */
> > -   int const_offset[4];
> > -
> > /* gather component selector */
> > unsigned component : 2;
> >
> > diff --git a/src/compiler/nir/nir_clone.c b/src/compiler/nir/nir_clone.c
> > index 5eff743..58ba2eb 100644
> > --- a/src/compiler/nir/nir_clone.c
> > +++ b/src/compiler/nir/nir_clone.c
> > @@ -355,7 +355,6 @@ clone_tex(clone_state *state, const nir_tex_instr
*tex)
> > ntex->is_array = tex->is_array;
> > ntex->is_shadow = tex->is_shadow;
> > ntex->is_new_style_shadow = tex->is_new_style_shadow;
> > -   memcpy(ntex->const_offset, tex->const_offset,
sizeof(ntex->const_offset));
> > ntex->component = tex->component;
> > ntex->sampler_index = tex->sampler_index;
> > ntex->sampler_array_size = tex->sampler_array_size;
> > diff --git a/src/compiler/nir/nir_instr_set.c
b/src/compiler/nir/nir_instr_set.c
> > index d3f939f..880a814 100644
> > --- a/src/compiler/nir/nir_instr_set.c
> > +++ b/src/compiler/nir/nir_instr_set.c
> > @@ -152,7 +152,6 @@ hash_tex(uint32_t hash, const nir_tex_instr *instr)
> > hash = HASH(hash, instr->is_array);
> > hash = HASH(hash, instr->is_shadow);
> > hash = HASH(hash, instr->is_new_style_shadow);
> > -   hash = HASH(hash, instr->const_offset);
> > unsigned component = instr->component;
> > hash = HASH(hash, component);
> > hash = HASH(hash, instr->sampler_index);
> > @@ -302,8 +301,6 @@ nir_instrs_equal(const nir_instr *instr1, const
nir_instr *instr2)
> >tex1->is_array != tex2->is_array ||
> >tex1->is_shadow != tex2->is_shadow ||
> >tex1->is_new_style_shadow != tex2->is_new_style_shadow ||
> > -  memcmp(tex1->const_offset, tex2->const_offset,
> > -

Re: [Mesa-dev] [PATCH] nir: Remove the const_offset from nir_tex_instr

2016-02-09 Thread Rob Clark

On Tue, Feb 9, 2016 at 10:06 PM, Connor Abbott  wrote:
> Did you make sure the other drivers and ttn don't use this? Assming that,

ttn doesn't use it..   freedreno does but only since I already started
pushing some of my backend fixes resulting from the gallium ttn stuff
that I'd been working on.

The freedreno/ir3 bits are r-b, vc4 and ttn should be fine..  I'd
started looking through the rest but didn't finish yet..

BR,
-R

> Reviewed-by: Connor Abbott 
>
> I really should get to your other series, but I've been busy with
> school stuff and whatnot and I've been too lazy -- sorry!
>
> On Tue, Feb 9, 2016 at 8:08 PM, Jason Ekstrand  wrote:
>> When NIR was originally drafted, there was no easy way to determine if
>> something was constant or not.  The result was that we had lots of
>> special-casing for constant values such as this.  Now that load_const
>> instructions are SSA-only, it's really easy to find constants and this
>> isn't really needed anymore.
>> ---
>>  src/compiler/nir/glsl_to_nir.cpp   | 16 +
>>  src/compiler/nir/nir.h |  3 ---
>>  src/compiler/nir/nir_clone.c   |  1 -
>>  src/compiler/nir/nir_instr_set.c   |  3 ---
>>  src/compiler/nir/nir_print.c   | 14 
>>  .../drivers/freedreno/ir3/ir3_compiler_nir.c   | 15 -
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 26 
>> +-
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 21 -
>>  8 files changed, 27 insertions(+), 72 deletions(-)
>>
>> diff --git a/src/compiler/nir/glsl_to_nir.cpp 
>> b/src/compiler/nir/glsl_to_nir.cpp
>> index 3db2775..eaf71d0 100644
>> --- a/src/compiler/nir/glsl_to_nir.cpp
>> +++ b/src/compiler/nir/glsl_to_nir.cpp
>> @@ -1824,7 +1824,7 @@ nir_visitor::visit(ir_texture *ir)
>>num_srcs++;
>> if (ir->shadow_comparitor != NULL)
>>num_srcs++;
>> -   if (ir->offset != NULL && ir->offset->as_constant() == NULL)
>> +   if (ir->offset != NULL)
>>num_srcs++;
>>
>> nir_tex_instr *instr = nir_tex_instr_create(this->shader, num_srcs);
>> @@ -1881,16 +1881,10 @@ nir_visitor::visit(ir_texture *ir)
>>/* we don't support multiple offsets yet */
>>assert(ir->offset->type->is_vector() || 
>> ir->offset->type->is_scalar());
>>
>> -  ir_constant *const_offset = ir->offset->as_constant();
>> -  if (const_offset != NULL) {
>> - for (unsigned i = 0; i < const_offset->type->vector_elements; i++)
>> -instr->const_offset[i] = const_offset->value.i[i];
>> -  } else {
>> - instr->src[src_number].src =
>> -nir_src_for_ssa(evaluate_rvalue(ir->offset));
>> - instr->src[src_number].src_type = nir_tex_src_offset;
>> - src_number++;
>> -  }
>> +  instr->src[src_number].src =
>> + nir_src_for_ssa(evaluate_rvalue(ir->offset));
>> +  instr->src[src_number].src_type = nir_tex_src_offset;
>> +  src_number++;
>> }
>>
>> switch (ir->op) {
>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>> index a4dbfde..294b18e 100644
>> --- a/src/compiler/nir/nir.h
>> +++ b/src/compiler/nir/nir.h
>> @@ -949,9 +949,6 @@ typedef struct {
>>  */
>> bool is_new_style_shadow;
>>
>> -   /* constant offset - must be 0 if the offset source is used */
>> -   int const_offset[4];
>> -
>> /* gather component selector */
>> unsigned component : 2;
>>
>> diff --git a/src/compiler/nir/nir_clone.c b/src/compiler/nir/nir_clone.c
>> index 5eff743..58ba2eb 100644
>> --- a/src/compiler/nir/nir_clone.c
>> +++ b/src/compiler/nir/nir_clone.c
>> @@ -355,7 +355,6 @@ clone_tex(clone_state *state, const nir_tex_instr *tex)
>> ntex->is_array = tex->is_array;
>> ntex->is_shadow = tex->is_shadow;
>> ntex->is_new_style_shadow = tex->is_new_style_shadow;
>> -   memcpy(ntex->const_offset, tex->const_offset, 
>> sizeof(ntex->const_offset));
>> ntex->component = tex->component;
>> ntex->sampler_index = tex->sampler_index;
>> ntex->sampler_array_size = tex->sampler_array_size;
>> diff --git a/src/compiler/nir/nir_instr_set.c 
>> b/src/compiler/nir/nir_instr_set.c
>> index d3f939f..880a814 100644
>> --- a/src/compiler/nir/nir_instr_set.c
>> +++ b/src/compiler/nir/nir_instr_set.c
>> @@ -152,7 +152,6 @@ hash_tex(uint32_t hash, const nir_tex_instr *instr)
>> hash = HASH(hash, instr->is_array);
>> hash = HASH(hash, instr->is_shadow);
>> hash = HASH(hash, instr->is_new_style_shadow);
>> -   hash = HASH(hash, instr->const_offset);
>> unsigned component = instr->component;
>> hash = HASH(hash, component);
>> hash = HASH(hash, instr->sampler_index);
>> @@ -302,8 +301,6 @@ nir_instrs_equal(const nir_instr *instr1, const 
>> nir_instr *instr2)
>>tex1->is_array != tex2->is_array ||
>>tex1->is_shadow != tex2->is_shadow ||
>>

[Mesa-dev] building with -Wshadow

2016-02-09 Thread Dave Airlie

I was just messing with warning flags on virglrenderer and noticed
-Wshadow generated a fair few warnings with gallium, so I did a mesa
build with -Wshadow enabled and it was fairly messy,

do we care? there could be bugs hiding in -Wshadow land.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 94072] error: The command line is too long when building MESA on Windows with MinGW-W64

2016-02-09 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=94072

sav...@ukr.net changed:

   What|Removed |Added

Version|unspecified |git

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] Better explain why we are lowering the num_tile_pipes value for TAHITI

2016-02-09 Thread Alexandre Demers

On Tue, 9 Feb 2016 at 22:38 Michel Dänzer  wrote:

> On 10.02.2016 10:11, Alexandre Demers wrote:
> > Signed-off-by: Alexandre Demers 
> > ---
> >  src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> > index 49c310c..aab81f9 100644
> > --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> > +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> > @@ -405,8 +405,9 @@ static boolean do_winsys_init(struct
> radeon_drm_winsys *ws)
> >  radeon_get_drm_value(ws->fd, RADEON_INFO_NUM_TILE_PIPES,
> NULL,
> >   >info.num_tile_pipes);
> >
> > -/* The kernel returns 12 for some cards for an unknown
> reason.
> > - * I thought this was supposed to be a power of two.
> > +/* Tahiti have a max_tile_pipes of 12 exceptionally.
> However, we
> > + * work with power of two, so let's set it to the nearest
> power of
> > + * two value.
> >   */
>
> Not sure that's better I'm afraid. It doesn't look like 12 is actually a
> possible hardware configuration, it might simply be a kernel driver bug.
>
>
> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer
>

On the other hand, the current comment isn't true either (and probably
farther from the truth): it's not true that we don't know why the kernel
returns 12 or which cards/gpus are affected (only tahiti is concerned). If
someone has to investigate this portion of code later in time, it will be
even harder to understand why the comment was added: it gives no clue about
the reported bug it fixes nor any information we have about the specific
GPU identified and the fact that this value is known to be the one set in
the kernel driver.

While it could be true that this is a driver bug, I don't have any way to
verify it. Also, according to what is available under ci_gpu_init() @
si.c#n3254 (where max_tile_pipes is defined), the value "12" was already
flagged as problematic (there is a comment under tile_config for the
default case). This comment was inserted by Alex Deucher himself in the
initial commit (0ce635d67f8c65f9f804abd77b63a65c08107e79). And finally, if
I understand correctly what Alex Deucher summed up under the thread
"[PATCH] winsys/radeon: fix a wrong NUM_TILE_PIPES value from the kernel",
he doesn't remember clearly where and why the value was defined as 12, most
probably a hardware value, but it needed to be mapped to 8 for software
usage.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] Better explain why we are lowering the num_tile_pipes value for TAHITI

2016-02-09 Thread Michel Dänzer

On 10.02.2016 10:11, Alexandre Demers wrote:
> Signed-off-by: Alexandre Demers 
> ---
>  src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
> b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> index 49c310c..aab81f9 100644
> --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> @@ -405,8 +405,9 @@ static boolean do_winsys_init(struct radeon_drm_winsys 
> *ws)
>  radeon_get_drm_value(ws->fd, RADEON_INFO_NUM_TILE_PIPES, NULL,
>   >info.num_tile_pipes);
>  
> -/* The kernel returns 12 for some cards for an unknown reason.
> - * I thought this was supposed to be a power of two.
> +/* Tahiti have a max_tile_pipes of 12 exceptionally. However, we 
> + * work with power of two, so let's set it to the nearest power 
> of 
> + * two value.
>   */

Not sure that's better I'm afraid. It doesn't look like 12 is actually a
possible hardware configuration, it might simply be a kernel driver bug.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 94072] error: The command line is too long when building MESA on Windows with MinGW-W64

2016-02-09 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=94072

Bug ID: 94072
   Summary: error: The command line is too long when building MESA
on Windows with MinGW-W64
   Product: Mesa
   Version: unspecified
  Hardware: x86-64 (AMD64)
OS: Windows (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: sav...@ukr.net
QA Contact: mesa-dev@lists.freedesktop.org

Hi,

Used stuff:
- Windows 10,
- MinGW-W64 (5.3.0- builds got error:
===
scons.py build=release platform=windows toolchain=mingw machine=x86_64
libgl-gdi

[snip]

  Archiving build\windows-x86\mesa\libmesa.a ...
  Compiling src\gallium\auxiliary\draw\draw_pipe_twoside.c ...
The command line is too long.
scons: *** [build\windows-x86_64\mesa\libmesa.a] Error 1
scons: building terminated because of errors.
===

I found that it's very similar to which was reported four years ago:
"The command line is too long..."
(details at
https://lists.freedesktop.org/archives/mesa-users/2012-October/000496.html ).

Other users also experienced it:
"Compiling Mesa natively on Windows using GCC with SCons results in "the
command line is too long" error during linking..."
(details at https://wiki.qt.io/Cross_compiling_Mesa_for_Windows ).

While looking for solution, found in SCons wiki:
"MingW had some problems during the link phase of the build process. It turned
out our link lines were in excess of 1 characters, and this was causing
some major grief with python calling spawn()..."
(details at https://bitbucket.org/scons/scons/wiki/LongCmdLinesOnWin32 ).

I assume this error last for such long, because testing of MESA builds using
MinGW* was made on Linux with cross-compiling to Windows. Could it be fixed?


Regards,

Alexander

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 12/23] i965: Refactor resolving of auxiliary mode

2016-02-09 Thread Pohjolainen, Topi

On Tue, Feb 09, 2016 at 03:50:04PM -0800, Ben Widawsky wrote:
> On Mon, Feb 08, 2016 at 06:51:32PM +0200, Topi Pohjolainen wrote:
> > Signed-off-by: Topi Pohjolainen 
> > ---
> >  src/mesa/drivers/dri/i965/gen8_surface_state.c | 62 
> > --
> >  1 file changed, 29 insertions(+), 33 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
> > b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > index fc8f701..0a52815 100644
> > --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > @@ -197,6 +197,28 @@ gen8_emit_fast_clear_color(struct brw_context *brw,
> >surf[7] |= mt->fast_clear_color_value;
> >  }
> >  
> > +static uint32_t
> > +gen8_get_aux_mode(const struct brw_context *brw,
> > +  const struct intel_mipmap_tree *mt,
> > +  uint32_t surf_type)
> > +{
> > +   if (mt->mcs_mt == NULL)
> > +  return GEN8_SURFACE_AUX_MODE_NONE;
> 
> I do have some patches that kind of change this for aux_hiz support. I store 
> the
> AUX surface type in the aux_buf data structure. This is mostly a reminder to
> myself. I think that actually works really well for your stuff here, but it
> requires several of my patches before
> aux-buf-v2 branch: i965: Save aux_mode at creation and use it
> 
> > +
> > +   /*
> > +* From the BDW PRM, Volume 2d, page 260 (RENDER_SURFACE_STATE):
> > +* "When MCS is enabled for non-MSRT, HALIGN_16 must be used"
> > +*
> > +* From the hardware spec for GEN9:
> > +* "When Auxiliary Surface Mode is set to AUX_CCS_D or AUX_CCS_E, HALIGN
> > +*  16 must be used."
> > +*/
> > +   if (brw->gen >= 9 || mt->num_samples == 1)
> > +  assert(mt->halign == 16);
> > +
> > +   return GEN8_SURFACE_AUX_MODE_MCS;
> > +}
> > +
> 
> There was a pretty good ascii table in intel_miptree_create_layout() that 
> should
> perhaps be moved here with this comment.
> 
> >  static void
> >  gen8_emit_texture_surface_state(struct brw_context *brw,
> >  struct intel_mipmap_tree *mt,
> > @@ -209,13 +231,13 @@ gen8_emit_texture_surface_state(struct brw_context 
> > *brw,
> >  bool rw, bool for_gather)
> >  {
> > const unsigned depth = max_layer - min_layer;
> > -   struct intel_mipmap_tree *aux_mt = NULL;
> > -   uint32_t aux_mode = GEN8_SURFACE_AUX_MODE_NONE;
> > +   struct intel_mipmap_tree *aux_mt = mt->mcs_mt;
> > uint32_t mocs_wb = brw->gen >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB;
> > int surf_index = surf_offset - >wm.base.surf_offset[0];
> > unsigned tiling_mode, pitch;
> > const unsigned tr_mode = surface_tiling_resource_mode(mt->tr_mode);
> > const uint32_t surf_type = translate_tex_target(target);
> > +   uint32_t aux_mode = gen8_get_aux_mode(brw, mt, surf_type);
> >  
> > if (mt->format == MESA_FORMAT_S_UINT8) {
> >tiling_mode = GEN8_SURFACE_TILING_W;
> > @@ -229,20 +251,9 @@ gen8_emit_texture_surface_state(struct brw_context 
> > *brw,
> >  * buffer should always have been resolved before it is used as a 
> > texture
> >  * so there is no need for it.
> >  */
> > -   if (mt->mcs_mt && mt->num_samples > 1) {
> > -  aux_mt = mt->mcs_mt;
> > -  aux_mode = GEN8_SURFACE_AUX_MODE_MCS;
> > -
> > -  /*
> > -   * From the BDW PRM, Volume 2d, page 260 (RENDER_SURFACE_STATE):
> > -   * "When MCS is enabled for non-MSRT, HALIGN_16 must be used"
> > -   *
> > -   * From the hardware spec for GEN9:
> > -   * "When Auxiliary Surface Mode is set to AUX_CCS_D or AUX_CCS_E, 
> > HALIGN
> > -   *  16 must be used."
> > -   */
> > -  if (brw->gen >= 9 || mt->num_samples == 1)
> > - assert(mt->halign == 16);
> > +   if (mt->num_samples <= 1) {
> > +  aux_mt = NULL;
> > +  aux_mode = GEN8_SURFACE_AUX_MODE_NONE;
> 
> hmm. Shouldn't this consideration just be part of gen8_get_aux_mode()? See my
> aux-buf-v2 branch for where I rework this function a bit (i965/gen8: 
> Consolidate
> MCS logic). I think that patch might make sense before this one, and then 
> expand
> gen8_get_aux_mode() to include num_samples.

While gen8_get_aux_mode() is shared by texture and render buffer setup,
this is a special case for textures only. There is no reason to give the
auxiliary buffer to sampling engine if the egnine doesn't understand it.
(Note that this is existing logic, only now written explicitly. Previous
logic initialised aux_mt to NULL and left it untouched for single sampled.
Here we initialise it, and overwrite it back to NULL for single sampled).

For render target it has to be there as data port uses it.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/4] glsl: apply align layout qualifier rules to block offsets

2016-02-09 Thread Samuel Iglesias Gonsálvez

On Tue, 2016-02-09 at 11:46 -0800, Ian Romanick wrote:
> On 02/09/2016 08:56 AM, Ian Romanick wrote:
> > On 02/09/2016 06:06 AM, Samuel Iglesias Gonsálvez wrote:
> > > 
> > > On Tue, 2016-01-12 at 20:34 +1100, Timothy Arceri wrote:
> > > > From Section 4.4.5 (Uniform and Shader Storage Block Layout
> > > > Qualifiers) of the OpenGL 4.50 spec:
> > > > 
> > > >   "The align qualifier makes the start of each block member
> > > > have a
> > > >   minimum byte alignment.  It does not affect the internal
> > > > layout
> > > >   within each member, which will still follow the std140 or
> > > > std430
> > > >   rules. The specified alignment must be a power of 2, or a
> > > >   compile-time error results.
> > > > 
> > > >   The actual alignment of a member will be the greater of the
> > > >   specified align alignment and the standard (e.g., std140)
> > > > base
> > > >   alignment for the member's type. The actual offset of a
> > > > member is
> > > >   computed as follows: If offset was declared, start with that
> > > >   offset, otherwise start with the next available offset. If
> > > > the
> > > >   resulting offset is not a multiple of the actual alignment,
> > > >   increase it to the first offset that is a multiple of the
> > > > actual
> > > >   alignment. This results in the actual offset the member will
> > > > have.
> > > > 
> > > >   When align is applied to an array, it affects only the start
> > > > of
> > > >   the array, not the array's internal stride. Both an offset
> > > > and an
> > > >   align qualifier can be specified on a declaration.
> > > > 
> > > >   The align qualifier, when used on a block, has the same
> > > > effect as
> > > >   qualifying each member with the same align value as declared
> > > > on
> > > >   the block, and gets the same compile-time results and errors
> > > > as if
> > > >   this had been done. As described in general earlier, an
> > > > individual
> > > >   member can specify its own align, which overrides the block-
> > > > level
> > > >   align, but just for that member.
> > > > ---
> > > >  src/glsl/ast_to_hir.cpp | 51
> > > > ++---
> > > >  1 file changed, 48 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> > > > index f5da771..fd05fd7 100644
> > > > --- a/src/glsl/ast_to_hir.cpp
> > > > +++ b/src/glsl/ast_to_hir.cpp
> > > > @@ -6212,7 +6212,8 @@
> > > > ast_process_struct_or_iface_block_members(exec_list
> > > > *instructions,
> > > >    ir_variable_mode
> > > > var_mode,
> > > >    ast_type_qualifier
> > > > *layout,
> > > >    unsigned
> > > > block_stream,
> > > > -  unsigned
> > > > expl_location)
> > > > +  unsigned
> > > > expl_location,
> > > > +  unsigned expl_align)
> > > >  {
> > > > unsigned decl_count = 0;
> > > > unsigned next_offset = 0;
> > > > @@ -6466,6 +6467,34 @@
> > > > ast_process_struct_or_iface_block_members(exec_list
> > > > *instructions,
> > > >  }
> > > >   } else {
> > > >  fields[i].offset = -1;
> > > > + }
> > > > +
> > > > + if (qual->flags.q.explicit_align || expl_align != 0)
> > > > {
> > > > +unsigned offset = fields[i].offset != -1 ?
> > > > fields[i].offset :
> > > > +   next_offset;
> > > > +if (align == 0 || size == 0) {
> > > > +   _mesa_glsl_error(, state, "align can only
> > > > be used
> > > > with "
> > > > +"std430 and std140 layouts");
> > > > +} else if (qual->flags.q.explicit_align) {
> > > > +   unsigned member_align;
> > > > +   if (process_qualifier_constant(state, ,
> > > > "align",
> > > > +  qual->align,
> > > > _align)) {
> > > > +  if (member_align == 0 ||
> > > 
> > > I have modified ubo-block-align-zero.vert piglit test to set the
> > > align
> > > qualifier only to block members and tested with align = 0. The
> > > shader
> > > compiles on NVIDIA proprietary driver.
> > > 
> > > According to the spec, the specified alignment must be a power of
> > > 2.
> > > However align = 0 could have different interpretations (for
> > > example,
> > > when applied to a block member, it would mean to ignore block's
> > > align
> > > value). As I am not sure about this case...
> > > 
> > > Have you checked if align = 0 is invalid?
> > 
> > I looked at the ARB_enhanced_layouts spec, and it doesn't provide
> > any
> > real guidance.  My gut tells me that align=0 is not valid because
> > the
> > spec doesn't say what it means.  Either way, I have submitted a
> > spec bug:
> > 
> > https://www.khronos.org/bugzilla/show_bug.cgi?id=1461
> > 
> > Does

Re: [Mesa-dev] [PATCH 3/4] glsl: apply align layout qualifier rules to block offsets

2016-02-09 Thread Timothy Arceri

On Tue, 2016-02-09 at 11:46 -0800, Ian Romanick wrote:
> On 02/09/2016 08:56 AM, Ian Romanick wrote:
> > On 02/09/2016 06:06 AM, Samuel Iglesias Gonsálvez wrote:
> > > 
> > > On Tue, 2016-01-12 at 20:34 +1100, Timothy Arceri wrote:
> > > > From Section 4.4.5 (Uniform and Shader Storage Block Layout
> > > > Qualifiers) of the OpenGL 4.50 spec:
> > > > 
> > > >   "The align qualifier makes the start of each block member
> > > > have a
> > > >   minimum byte alignment.  It does not affect the internal
> > > > layout
> > > >   within each member, which will still follow the std140 or
> > > > std430
> > > >   rules. The specified alignment must be a power of 2, or a
> > > >   compile-time error results.
> > > > 
> > > >   The actual alignment of a member will be the greater of the
> > > >   specified align alignment and the standard (e.g., std140)
> > > > base
> > > >   alignment for the member's type. The actual offset of a
> > > > member is
> > > >   computed as follows: If offset was declared, start with that
> > > >   offset, otherwise start with the next available offset. If
> > > > the
> > > >   resulting offset is not a multiple of the actual alignment,
> > > >   increase it to the first offset that is a multiple of the
> > > > actual
> > > >   alignment. This results in the actual offset the member will
> > > > have.
> > > > 
> > > >   When align is applied to an array, it affects only the start
> > > > of
> > > >   the array, not the array's internal stride. Both an offset
> > > > and an
> > > >   align qualifier can be specified on a declaration.
> > > > 
> > > >   The align qualifier, when used on a block, has the same
> > > > effect as
> > > >   qualifying each member with the same align value as declared
> > > > on
> > > >   the block, and gets the same compile-time results and errors
> > > > as if
> > > >   this had been done. As described in general earlier, an
> > > > individual
> > > >   member can specify its own align, which overrides the block-
> > > > level
> > > >   align, but just for that member.
> > > > ---
> > > >  src/glsl/ast_to_hir.cpp | 51
> > > > ++---
> > > >  1 file changed, 48 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> > > > index f5da771..fd05fd7 100644
> > > > --- a/src/glsl/ast_to_hir.cpp
> > > > +++ b/src/glsl/ast_to_hir.cpp
> > > > @@ -6212,7 +6212,8 @@
> > > > ast_process_struct_or_iface_block_members(exec_list
> > > > *instructions,
> > > >    ir_variable_mode
> > > > var_mode,
> > > >    ast_type_qualifier
> > > > *layout,
> > > >    unsigned
> > > > block_stream,
> > > > -  unsigned
> > > > expl_location)
> > > > +  unsigned
> > > > expl_location,
> > > > +  unsigned expl_align)
> > > >  {
> > > > unsigned decl_count = 0;
> > > > unsigned next_offset = 0;
> > > > @@ -6466,6 +6467,34 @@
> > > > ast_process_struct_or_iface_block_members(exec_list
> > > > *instructions,
> > > >  }
> > > >   } else {
> > > >  fields[i].offset = -1;
> > > > + }
> > > > +
> > > > + if (qual->flags.q.explicit_align || expl_align != 0)
> > > > {
> > > > +unsigned offset = fields[i].offset != -1 ?
> > > > fields[i].offset :
> > > > +   next_offset;
> > > > +if (align == 0 || size == 0) {
> > > > +   _mesa_glsl_error(, state, "align can only
> > > > be used
> > > > with "
> > > > +"std430 and std140 layouts");
> > > > +} else if (qual->flags.q.explicit_align) {
> > > > +   unsigned member_align;
> > > > +   if (process_qualifier_constant(state, ,
> > > > "align",
> > > > +  qual->align,
> > > > _align)) {
> > > > +  if (member_align == 0 ||
> > > 
> > > I have modified ubo-block-align-zero.vert piglit test to set the
> > > align
> > > qualifier only to block members and tested with align = 0. The
> > > shader
> > > compiles on NVIDIA proprietary driver.
> > > 
> > > According to the spec, the specified alignment must be a power of
> > > 2.
> > > However align = 0 could have different interpretations (for
> > > example,
> > > when applied to a block member, it would mean to ignore block's
> > > align
> > > value). As I am not sure about this case...
> > > 
> > > Have you checked if align = 0 is invalid?

Hi Sam,

Thanks for taking a look at this.

As well as the information in Ian's bug report I tested this on the AMD
driver and it also had a compile error as expected. I've also pushed a
fix for the member piglit test. It now fails to pass on Nvidia:

https://cgit.freedesktop.org/piglit/commit/?id=fc4fb04a9f9a8f6ba1ef1af9

[Mesa-dev] [Bug 94072] error: The command line is too long when building MESA on Windows with MinGW-W64

2016-02-09 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=94072

sav...@ukr.net changed:

   What|Removed |Added

 CC||sav...@ukr.net

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3] workarounds: Update workaround names and platforms

2016-02-09 Thread Ben Widawsky

On Tue, Feb 09, 2016 at 10:35:45AM -0800, Matt Turner wrote:
> On Tue, Feb 9, 2016 at 9:44 AM, Sameer Kibey  wrote:
> > Update the format in which workarounds are documented
> > in the source code. This allows mesa to be parsed
> > by the list-workarounds utility in intel-gpu-tools.
> 
> I don't know that I find this valuable.
> 
> Ben touched on one concern -- keeping it updated. But I have another,
> and that's whether the information is accurate, or useful at all.
> 
> 

I think the bottom line is it really doesn't hurt the existing situation. The
primary benefit is it gives us internally a way to easily compare the
workarounds for different drivers.

> > Signed-off-by: Sameer Kibey 
> > ---
> >  src/mesa/drivers/dri/i965/brw_binding_tables.c | 3 ++-
> >  src/mesa/drivers/dri/i965/brw_blorp.cpp| 2 ++
> >  src/mesa/drivers/dri/i965/brw_defines.h| 2 ++
> >  src/mesa/drivers/dri/i965/brw_eu_emit.c| 2 ++
> >  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 6 ++
> >  src/mesa/drivers/dri/i965/brw_pipe_control.c   | 2 ++
> >  src/mesa/drivers/dri/i965/gen6_queryobj.c  | 5 +++--
> >  src/mesa/drivers/dri/i965/gen8_depth_state.c   | 7 ---
> >  src/mesa/drivers/dri/i965/intel_batchbuffer.c  | 2 +-
> >  9 files changed, 24 insertions(+), 7 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
> > b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > index f3a0310..6dd35dd 100644
> > --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > @@ -54,13 +54,14 @@ static uint32_t
> >  reserve_hw_bt_space(struct brw_context *brw, unsigned bytes)
> >  {
> > /* From the Broadwell PRM, Volume 16, "Workarounds",
> > -* WaStateBindingTableOverfetch:
> >  * "HW over-fetches two cache lines of binding table indices.  When
> >  *  using the resource streamer, SW needs to pad binding table pointer
> >  *  updates with an additional two cache lines."
> >  *
> >  * Cache lines are 64 bytes, so we subtract 128 bytes from the size of
> >  * the binding table pool buffer.
> > +*
> > +* WaStateBindingTableOverfetch:hsw,bdw,chv,bxt
> >  */
> > if (brw->hw_bt_pool.next_offset + bytes >= brw->hw_bt_pool.bo->size - 
> > 128) {
> >gen7_reset_hw_bt_pool_offsets(brw);
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp 
> > b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> > index 1bc6d15..f798e29 100644
> > --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> > @@ -318,6 +318,8 @@ brw_hiz_op_params::brw_hiz_op_params(struct 
> > intel_mipmap_tree *mt,
> >  * SURFACE_STATE.Surface_Horizontal_Alignment should be 4 for Z24 
> > surfaces,
> >  * not 8. But commit 1f112cc increased the alignment from 4 to 8, which
> >  * prevents the clobbering.
> > +*
> > +* WaHizAmbiguate8x4Aligned:hsw
> >  */
> > depth.width = ALIGN(depth.width, 8);
> > depth.height = ALIGN(depth.height, 4);
> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> > b/src/mesa/drivers/dri/i965/brw_defines.h
> > index 01e0c99..2146172 100644
> > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > @@ -1762,6 +1762,8 @@ enum brw_message_target {
> >   * WaForceEnableNonCoherent, HDC memory access may have been overridden by 
> > the
> >   * kernel to be non-coherent (matching the behavior of the same BTI on
> >   * pre-Gen8 hardware) and BTI 255 may actually be an alias for BTI 253.
> > + *
> > + * WaForceEnableNonCoherent:bdw,chv,skl,kbl
> 
> This comment is just explaining that 255 on Gen8+ doesn't mean the
> same thing as it did before. It passingly mentions
> WaForceEnableNonCoherent, but that's a kernel driver workaround --
> nothing in Mesa.
> 
> Moreover, when I look that up in the workarounds database it says it
> applies to all BDW, CHV, KBL, but only SKL until D0. I'm not sure if
> we even care about D0. Was that publicly released? Mentioning that it
> applies to "skl" doesn't tell the whole story and might lead to us
> applying it when it's not needed.

There is definitely a huge gap regarding what the actual definition of a
workaround is. My definition is actually very different than most workarounds,
which consist of "set this bit". A workaround to me is like, geometry stage is
broken, so we have to do it in software but whatever.

I think your concerns about the accuracy are valid. I believe the kernel does
attempt to capture stepping info, and we do not. We (I, anyway) make an effort
to expunge workarounds for platforms that don't ship, so it's hopefully not even
a thing for us.

> 
> >   */
> >  #define GEN8_BTI_STATELESS_IA_COHERENT   255
> >  #define GEN8_BTI_STATELESS_NON_COHERENT  253
> > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> > b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > index

Re: [Mesa-dev] [PATCH v2] workarounds: Update workaround names and platforms

2016-02-09 Thread Kibey, Sameer


> -Original Message-
> From: Ben Widawsky [mailto:b...@bwidawsk.net]
> Sent: Monday, February 08, 2016 5:41 PM
> To: Kibey, Sameer
> Cc: mesa-dev@lists.freedesktop.org; Sharp, Sarah A; Widawsky, Benjamin
> Subject: Re: [Mesa-dev] [PATCH v2] workarounds: Update workaround
> names and platforms
> 
> On Fri, Feb 05, 2016 at 01:59:23PM -0800, Sameer Kibey wrote:
> > Update the format in which workarounds are documented in the source
> > code. This allows mesa to be parsed by the list-workarounds utility in
> > intel-gpu-tools.
> >
> > Signed-off-by: Sameer Kibey 
> > ---
> > changed byt to vlv for consistency.
> >  src/mesa/drivers/dri/i965/brw_binding_tables.c | 2 +-
> >  src/mesa/drivers/dri/i965/brw_blorp.cpp| 2 ++
> >  src/mesa/drivers/dri/i965/brw_defines.h| 3 ++-
> >  src/mesa/drivers/dri/i965/brw_eu_emit.c| 3 ++-
> >  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 9 ++---
> >  src/mesa/drivers/dri/i965/brw_pipe_control.c   | 4 +++-
> >  src/mesa/drivers/dri/i965/gen6_queryobj.c  | 2 +-
> >  src/mesa/drivers/dri/i965/gen8_depth_state.c   | 3 ++-
> >  src/mesa/drivers/dri/i965/intel_batchbuffer.c  | 2 +-
> >  9 files changed, 20 insertions(+), 10 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > index f3a0310..bcf6422 100644
> > --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > @@ -54,7 +54,7 @@ static uint32_t
> >  reserve_hw_bt_space(struct brw_context *brw, unsigned bytes)  {
> > /* From the Broadwell PRM, Volume 16, "Workarounds",
> > -* WaStateBindingTableOverfetch:
> > +* WaStateBindingTableOverfetch:hsw,bdw,chv,bxt
> >  * "HW over-fetches two cache lines of binding table indices.  When
> >  *  using the resource streamer, SW needs to pad binding table pointer
> >  *  updates with an additional two cache lines."
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> > b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> > index 1bc6d15..dd01ea8 100644
> > --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> > @@ -304,6 +304,8 @@ brw_hiz_op_params::brw_hiz_op_params(struct
> intel_mipmap_tree *mt,
> >  * aligned to an 8x4 pixel block relative to the upper left corner
> >  * of the depth buffer [...]
> >  *
> > +* WaHizAmbiguate8x4Aligned:hsw
> > +*
> >  * For hiz resolves, the rectangle must also be 8x4 aligned. Item
> >  * WaHizAmbiguate8x4Aligned from the Haswell workarounds page and
> the
> >  * Ivybridge simulator require the alignment.
> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
> > b/src/mesa/drivers/dri/i965/brw_defines.h
> > index 01e0c99..5410a1d 100644
> > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > @@ -1756,7 +1756,8 @@ enum brw_message_target {
> >  /* Dataport special binding table indices: */
> >  #define BRW_BTI_STATELESS255
> >  #define GEN7_BTI_SLM 254
> > -/* Note that on Gen8+ BTI 255 was redefined to be IA-coherent
> > according to the
> > +/* WaForceEnableNonCoherent:bdw,chv,skl,kbl
> > + * Note that on Gen8+ BTI 255 was redefined to be IA-coherent
> > +according to the
> >   * hardware spec, however because the DRM sets bit 4 of HDC_CHICKEN0
> on BDW,
> >   * CHV and at least some pre-production steppings of SKL due to
> >   * WaForceEnableNonCoherent, HDC memory access may have been
> > overridden by the diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > index 35d8039..918d69e 100644
> > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > @@ -1885,7 +1885,8 @@ void brw_CMP(struct brw_codegen *p,
> > brw_set_src0(p, insn, src0);
> > brw_set_src1(p, insn, src1);
> >
> > -   /* Item WaCMPInstNullDstForcesThreadSwitch in the Haswell Bspec
> workarounds
> > +   /* WaCMPInstNullDstForcesThreadSwitch:ivb,hsw,vlv
> > +* Item WaCMPInstNullDstForcesThreadSwitch in the Haswell Bspec
> > + workarounds
> >  * page says:
> >  *"Any CMP instruction with a null destination must use a {switch}."
> >  *
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > index 1916a99..24d4a9d 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > @@ -1836,7 +1836,8 @@ fs_generator::generate_code(const cfg_t *cfg,
> int dispatch_width)
> >   brw_F16TO32(p, dst, src[0]);
> >   break;
> >case BRW_OPCODE_CMP:
> > - /* The Ivybridge/BayTrail WaCMPInstFlagDepClearedEarly
> workaround says
> > + /* WaCMPInstFlagDepClearedEarly:ivb,hsw,vlv
> > +  * The Ivybridge/BayTrail

[Mesa-dev] [PATCH v3] workarounds: Update workaround names and platforms

2016-02-09 Thread Sameer Kibey

Update the format in which workarounds are documented
in the source code. This allows mesa to be parsed
by the list-workarounds utility in intel-gpu-tools.

Signed-off-by: Sameer Kibey 
---
 src/mesa/drivers/dri/i965/brw_binding_tables.c | 3 ++-
 src/mesa/drivers/dri/i965/brw_blorp.cpp| 2 ++
 src/mesa/drivers/dri/i965/brw_defines.h| 2 ++
 src/mesa/drivers/dri/i965/brw_eu_emit.c| 2 ++
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 6 ++
 src/mesa/drivers/dri/i965/brw_pipe_control.c   | 2 ++
 src/mesa/drivers/dri/i965/gen6_queryobj.c  | 5 +++--
 src/mesa/drivers/dri/i965/gen8_depth_state.c   | 7 ---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c  | 2 +-
 9 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
b/src/mesa/drivers/dri/i965/brw_binding_tables.c
index f3a0310..6dd35dd 100644
--- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
+++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
@@ -54,13 +54,14 @@ static uint32_t
 reserve_hw_bt_space(struct brw_context *brw, unsigned bytes)
 {
/* From the Broadwell PRM, Volume 16, "Workarounds",
-* WaStateBindingTableOverfetch:
 * "HW over-fetches two cache lines of binding table indices.  When
 *  using the resource streamer, SW needs to pad binding table pointer
 *  updates with an additional two cache lines."
 *
 * Cache lines are 64 bytes, so we subtract 128 bytes from the size of
 * the binding table pool buffer.
+*
+* WaStateBindingTableOverfetch:hsw,bdw,chv,bxt
 */
if (brw->hw_bt_pool.next_offset + bytes >= brw->hw_bt_pool.bo->size - 128) {
   gen7_reset_hw_bt_pool_offsets(brw);
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp.cpp
index 1bc6d15..f798e29 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
@@ -318,6 +318,8 @@ brw_hiz_op_params::brw_hiz_op_params(struct 
intel_mipmap_tree *mt,
 * SURFACE_STATE.Surface_Horizontal_Alignment should be 4 for Z24 surfaces,
 * not 8. But commit 1f112cc increased the alignment from 4 to 8, which
 * prevents the clobbering.
+*
+* WaHizAmbiguate8x4Aligned:hsw
 */
depth.width = ALIGN(depth.width, 8);
depth.height = ALIGN(depth.height, 4);
diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index 01e0c99..2146172 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -1762,6 +1762,8 @@ enum brw_message_target {
  * WaForceEnableNonCoherent, HDC memory access may have been overridden by the
  * kernel to be non-coherent (matching the behavior of the same BTI on
  * pre-Gen8 hardware) and BTI 255 may actually be an alias for BTI 253.
+ *
+ * WaForceEnableNonCoherent:bdw,chv,skl,kbl
  */
 #define GEN8_BTI_STATELESS_IA_COHERENT   255
 #define GEN8_BTI_STATELESS_NON_COHERENT  253
diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 35d8039..7a6179a 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -1891,6 +1891,8 @@ void brw_CMP(struct brw_codegen *p,
 *
 * It also applies to other Gen7 platforms (IVB, BYT) even though it isn't
 * mentioned on their work-arounds pages.
+*
+* WaCMPInstNullDstForcesThreadSwitch:ivb,hsw,vlv
 */
if (devinfo->gen == 7) {
   if (dest.file == BRW_ARCHITECTURE_REGISTER_FILE &&
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 1916a99..3d19605 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1846,6 +1846,8 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
   * We choose to split into CMP(8) instructions since disabling
   * coissuing would affect CMP instructions not otherwise affected by
   * the errata.
+  *
+  * WaCMPInstFlagDepClearedEarly:ivb,hsw,vlv
   */
  if (dispatch_width == 16 && devinfo->gen == 7 && 
!devinfo->is_haswell) {
 if (dst.file == BRW_GENERAL_REGISTER_FILE) {
@@ -1932,6 +1934,8 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
   * should
   *
   *"Force BFI instructions to be executed always in SIMD8."
+  *
+  * WaForceSIMD8ForBFIInstruction:hsw
   */
  if (dispatch_width == 16 && devinfo->is_haswell) {
 brw_set_default_exec_size(p, BRW_EXECUTE_8);
@@ -1954,6 +1958,8 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
   *
   * Otherwise we would be able to emit compressed instructions like we
   * do for the other three-source instructions.
+  *
+  * WaForceSIMD8ForBFIInstruction:hsw
   */

Re: [Mesa-dev] [PATCH] winsys/radeon: fix a wrong NUM_TILE_PIPES value from the kernel

2016-02-09 Thread Marek Olšák

On Tue, Feb 9, 2016 at 6:17 PM, Alexandre Demers
 wrote:
>> +/* The kernel returns 12 for some cards for an unknown
>> reason.
>> + * I thought this was supposed to be a power of two.
>> + */
>> +if (ws->gen == DRV_SI && ws->info.num_tile_pipes == 12)
>> +ws->info.num_tile_pipes = 8;
>> +
>
> I may be late in the conversation, but shouldn't we have a look at why the
> value reported by the kernel is wrong for "some" cards? Which ones and why
> should be identified. It seems to be limited to Southern Islands as far as
> we know for now, which limits the scope for now.
>
> Also, about the patch itself, even if only some cards were reported to be
> problematic, why would we limit it to "ws->gen == DRV_SI"? Any cards
> reporting a wrong value should be treated the same way by mapping its value
> from 12 to 8, no?

No. Only one card is affected (Tahiti or Pitcairn, I don't remember
which one). No other card reports 12.

There is no point in looking into why the value is wrong and I haven't
been able to find where the value had come from. It's part of the
kernel ABI now anyway. Userspace won't use it anymore.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: fix incorrect viewport position when GL_CLIP_ORIGIN = GL_LOWER_LEFT

2016-02-09 Thread Jose Fonseca


On 09/02/16 17:03, Brian Paul wrote:

Ilia Mirkin found/fixed the mistake.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93813
Cc: "11.1" 
---
  src/mesa/main/viewport.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/viewport.c b/src/mesa/main/viewport.c
index 7d891429..681e46b 100644
--- a/src/mesa/main/viewport.c
+++ b/src/mesa/main/viewport.c
@@ -456,11 +456,11 @@ _mesa_get_viewport_xform(struct gl_context *ctx, unsigned 
i,
 translate[0] = half_width + x;
 if (ctx->Transform.ClipOrigin == GL_UPPER_LEFT) {
scale[1] = -half_height;
-  translate[1] = half_height - y;
 } else {
scale[1] = half_height;
-  translate[1] = half_height + y;
 }
+   translate[1] = half_height + y;
+
 if (ctx->Transform.ClipDepthMode == GL_NEGATIVE_ONE_TO_ONE) {
scale[2] = 0.5 * (f - n);
translate[2] = 0.5 * (n + f);



Looks good.

Reviewed-by: Jose Fonseca 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

91 matches

Mail list logo