date:20161006

Re: [Mesa-dev] [PATCH 05/11] nir: Add a LCSAA-pass

2016-10-06 Thread Thomas Helland

2016-10-05 18:59 GMT+02:00 Jason Ekstrand:
> On Tue, Oct 4, 2016 at 6:46 PM, Timothy Arceri wrote:
>>
>> On Tue, 2016-10-04 at 16:47 -0700, Jason Ekstrand wrote:
>> > On Fri, Sep 16, 2016 at 6:24 AM, Timothy Arceri wrote:
>> > > From: Thomas Helland 
>> > >
>> > > V2: Do a "depth first search" to convert to LCSSA
>> > >
>> > > V3: Small comment fixup
>> > >
>> > > V4: Rebase, adapt to removal of function overloads
>> > >
>> > > V5: Rebase, adapt to relocation of nir to compiler/nir
>> > > Still need to adapt to potential if-uses
>> > > Work around nir_validate issue
>> > >
>> > > V6 (Timothy):
>> > >  - tidy lcssa and stop leaking memory
>> > >  - don't rewrite the src for the lcssa phi node
>> > >  - validate lcssa phi srcs to avoid postvalidate assert
>> > >  - don't add new phi if one already exists
>> > >  - more lcssa phi validation fixes
>> > >  - Rather than marking ssa defs inside a loop just mark blocks inside
>> > >    a loop. This is simpler and fixes lcssa for intrinsics which do
>> > >    not have a destination.
>> > >  - don't create LCSSA phis for loops we won't unroll
>> > >  - require loop metadata for lcssa pass
>> > >  - handle case where the ssa def's use outside the loop is already a
>> > >    phi
>> > > V7: (Timothy)
>> > > - pass indirect mask to metadata call
>> > > ---
>> > >  src/compiler/Makefile.sources   |   1 +
>> > >  src/compiler/nir/nir.h  |   6 ++
>> > >  src/compiler/nir/nir_to_lcssa.c | 227
>> > > 
>> > >  src/compiler/nir/nir_validate.c |  11 +-
>> > >  4 files changed, 242 insertions(+), 3 deletions(-)
>> > >  create mode 100644 src/compiler/nir/nir_to_lcssa.c
>> > >
>> > > diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
>> > > index 7ed26a9..8ef6080 100644
>> > > --- a/src/compiler/Makefile.sources
>> > > +++ b/src/compiler/Makefile.sources
>> > > @@ -247,6 +247,7 @@ NIR_FILES = \
>> > > nir/nir_search_helpers.h \
>> > > nir/nir_split_var_copies.c \
>> > > nir/nir_sweep.c \
>> > > +   nir/nir_to_lcssa.c \
>> > > nir/nir_to_ssa.c \
>> > > nir/nir_validate.c \
>> > > nir/nir_vla.h \
>> > > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>> > > index cc8f4b6..29a6f45 100644
>> > > --- a/src/compiler/nir/nir.h
>> > > +++ b/src/compiler/nir/nir.h
>> > > @@ -1387,6 +1387,8 @@ typedef struct {
>> > > struct exec_list srcs; /** < list of nir_phi_src */
>> > >
>> > > nir_dest dest;
>> > > +
>> > > +   bool is_lcssa_phi;
>> > >  } nir_phi_instr;
>> > >
>> > >  typedef struct {
>> > > @@ -2643,6 +2645,10 @@ void nir_convert_to_ssa(nir_shader *shader);
>> > >  bool nir_repair_ssa_impl(nir_function_impl *impl);
>> > >  bool nir_repair_ssa(nir_shader *shader);
>> > >
>> > > +void nir_to_lcssa_impl(nir_function_impl *impl,
>> > > +   nir_variable_mode indirect_mask);
>> > > +void nir_to_lcssa(nir_shader *shader, nir_variable_mode indirect_mask);
>> > > +
>> > >  /* If phi_webs_only is true, only convert SSA values involved in phi nodes to
>> > >   * registers.  If false, convert all values (even those not involved in a phi
>> > >   * node) to registers.
>> > > diff --git a/src/compiler/nir/nir_to_lcssa.c b/src/compiler/nir/nir_to_lcssa.c
>> > > new file mode 100644
>> > > index 000..25d0bdb
>> > > --- /dev/null
>> > > +++ b/src/compiler/nir/nir_to_lcssa.c
>> > > @@ -0,0 +1,227 @@
>> > > +/*
>> > > + * Copyright © 2015 Thomas Helland
>> > > + *
>> > > + * Permission is hereby granted, free of charge, to any person obtaining a
>> > > + * copy of this software and associated documentation files (the "Software"),
>> > > + * to deal in the Software without restriction, including without limitation
>> > > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> > > + * and/or sell copies of the Software, and to permit persons to whom the
>> > > + * Software is furnished to do so, subject to the following conditions:
>> > > + *
>> > > + * The above copyright notice and this permission notice (including the next
>> > > + * paragraph) shall be included in all copies or substantial portions of the
>> > > + * Software.
>> > > + *
>> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>> > > + * IN THE SOFTWARE.
>> > > + */
>> > > +
>> > > +/*
>> > > + * This pass converts the ssa-graph into

Re: [Mesa-dev] [PATCH] gallium/drivers: initialize pipe_resource::next to NULL

2016-10-06 Thread Axel Davy


Hi,

As of writing, there doesn't seem to be a consensus on the fix.
Could one be found for Mesa 13? Gallium nine is apparently broken
except on radeonsi, which zeros out the next field... It'd need either the
patch proposed in this thread merged, or the next field zeroed everywhere.

I guess other state trackers need to be fixed as well before the release.

Axel

On 04/10/2016 02:13, Roland Scheidegger wrote:

The reason I don't like this isn't really the number of callers, rather
that the driver is going actively against what the state tracker told it
to do. But I'm not strongly opposed to this, since effectively
restricting the next field to be only valid if the resource is created
externally might be a good idea on its own...
Albeit zero-initializing in the state tracker has the advantage that if
the resource struct is extended again it would work too.
(Technically, there's no need for the template and the actual resource
struct to be the same; it just makes things easier. With d3d10 in the
driver interface you basically only have the templates, since the drivers
just return pointers to void.)

Roland

On 04.10.2016 at 01:55, Marek Olšák wrote:

BTW, I think fixing this in drivers is better, because the number of
resource_create implementations is limited and they are easy to find.

Marek

On Tue, Oct 4, 2016 at 1:45 AM, Roland Scheidegger  wrote:

Sounds reasonable to me.

Roland
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] glsl: prohibit lowp, mediump precision on atomic_uint

2016-10-06 Thread Tapani Pälli

Fixes the following dEQP tests:

   dEQP-GLES31.functional.debug.negative_coverage.callbacks.atomic_counter.atomic_precision
   dEQP-GLES31.functional.debug.negative_coverage.get_error.atomic_counter.atomic_precision
   dEQP-GLES31.functional.debug.negative_coverage.log.atomic_counter.atomic_precision

Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98131
---
 src/compiler/glsl/ast_to_hir.cpp | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index 8cdb917..c3c8cef 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -2585,6 +2585,20 @@ select_gles_precision(unsigned qual_precision,
   type->name);
   }
}
+
+
+   /* Section 4.1.7.3 (Atomic Counters) of the GLSL ES 3.10 spec says:
+*
+*"The default precision of all atomic types is highp. It is an error to
+*declare an atomic type with a different precision or to specify the
+*default precision for an atomic type to be lowp or mediump."
+*/
+   if (type->base_type == GLSL_TYPE_ATOMIC_UINT &&
+   precision != ast_precision_high) {
+  _mesa_glsl_error(loc, state,
+   "atomic_uint can only have highp precision qualifier");
+   }
+
return precision;
 }
 
-- 
2.7.4
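
The rule this patch enforces can be sketched in isolation. The following is a minimal illustration, not Mesa code: the enums and the function name are hypothetical stand-ins for ast_precision_* and GLSL_TYPE_ATOMIC_UINT, and the check is assumed to run after the default precision has already been resolved, as it does at the end of select_gles_precision().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the compiler's own enums; not Mesa's types. */
enum sketch_precision {
   SKETCH_PRECISION_HIGH,
   SKETCH_PRECISION_MEDIUM,
   SKETCH_PRECISION_LOW
};

enum sketch_base_type {
   SKETCH_TYPE_FLOAT,
   SKETCH_TYPE_INT,
   SKETCH_TYPE_ATOMIC_UINT
};

/* Returns true when the resolved precision is legal for the type.
 * Atomic counters accept only highp; this sketch lets every other
 * type through unchanged.
 */
static bool
sketch_precision_is_valid(enum sketch_base_type type,
                          enum sketch_precision resolved)
{
   if (type == SKETCH_TYPE_ATOMIC_UINT)
      return resolved == SKETCH_PRECISION_HIGH;

   return true;
}
```

Checking the resolved precision rather than the written qualifier means an explicit lowp/mediump on the declaration and a lowp/mediump default precision are both rejected by the same test.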


[Mesa-dev] [PATCH 2/2] mesa: throw error if bufSize negative in GetSynciv on OpenGL ES

2016-10-06 Thread Tapani Pälli

Fixes the following dEQP tests:

   dEQP-GLES31.functional.debug.negative_coverage.callbacks.state.get_synciv
   dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_synciv
   dEQP-GLES31.functional.debug.negative_coverage.log.state.get_synciv

Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98133
---
 src/mesa/main/syncobj.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/main/syncobj.c b/src/mesa/main/syncobj.c
index be758dd..3684c37 100644
--- a/src/mesa/main/syncobj.c
+++ b/src/mesa/main/syncobj.c
@@ -425,6 +425,14 @@ _mesa_GetSynciv(GLsync sync, GLenum pname, GLsizei bufSize, GLsizei *length,
   return;
}
 
+   /* Section 4.1.3 (Sync Object Queries) of the OpenGL ES 3.10 spec says:
+*
+*"An INVALID_VALUE error is generated if bufSize is negative."
+*/
+   if (_mesa_is_gles(ctx) && bufSize < 0) {
+  _mesa_error(ctx, GL_INVALID_VALUE, "glGetSynciv(pname=0x%x)\n", pname);
+   }
+
if (size > 0 && bufSize > 0) {
   const GLsizei copy_count = MIN2(size, bufSize);
 
-- 
2.7.4
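
As a rough illustration of the control flow in this hunk: the error is recorded but the function does not return early, and the existing "bufSize > 0" guard is what keeps a negative size from copying anything. The sketch below uses hypothetical names throughout; only the numeric value 0x0501 (GL_INVALID_VALUE) is taken from the GL headers, and reporting the copied count through "length" is an assumption of the sketch, not a claim about _mesa_GetSynciv.

```c
#include <stddef.h>

#define SKETCH_NO_ERROR      0
#define SKETCH_INVALID_VALUE 0x0501 /* same numeric value as GL_INVALID_VALUE */

/* Hypothetical query helper: copies at most buf_size of the size available
 * values into dst and returns an error code. A negative buf_size is flagged
 * as an error, and the copy guard below then skips the copy entirely.
 */
static int
sketch_query_copy(const int *src, int size, int buf_size,
                  int *dst, int *length)
{
   int err = SKETCH_NO_ERROR;
   int copied = 0;

   if (buf_size < 0)
      err = SKETCH_INVALID_VALUE;

   if (size > 0 && buf_size > 0) {
      copied = size < buf_size ? size : buf_size;
      for (int i = 0; i < copied; i++)
         dst[i] = src[i];
   }

   /* Assumption of this sketch: report how many values were written. */
   if (length != NULL)
      *length = copied;

   return err;
}
```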


[Mesa-dev] [Bug 98133] GetSynciv should raise an error if bufSize < 0

2016-10-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=98133

Tapani Pälli  changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
           Assignee|mesa-dev@lists.freedesktop.org |lem...@gmail.com
             Status|NEW                            |ASSIGNED

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [Bug 98131] Compiler should reject lowp/mediump qualifiers on atomic_uints

2016-10-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=98131

Tapani Pälli  changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
             Status|NEW                            |ASSIGNED
           Assignee|mesa-dev@lists.freedesktop.org |lem...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [RADV] vk_format.h:147:1: unknown type name 'uint' - is my compiler overage?

2016-10-06 Thread Dave Airlie

On 7 October 2016 at 12:05, Dieter Nützel  wrote:
> gcc (SUSE Linux) 4.8.3
>
> make[4]: Entering directory '/opt/mesa/src/amd/vulkan'

Should already be fixed, must be compiler/libc mismatches or something.

Dave.

Re: [Mesa-dev] [PATCH] glsl: Let cache_test build when the shader cache is not enabled

2016-10-06 Thread Aaron Watry

Thanks for this.

This at least lets me build a 32-bit mesa on 64-bit host again by disabling
the cache.

Tested-by: Aaron Watry 

Tim: Just FYI, I get test failures for 32-bit builds on my x86-64 host.
With this patch, it no longer segfaults at least, just fails tests. If you
need more info, I can dig a bit.  I do get a bunch of warnings when
building cache_test.c, so some of those could be responsible... haven't
looked into it yet.

--Aaron

On Wed, Oct 5, 2016 at 3:22 PM, Ian Romanick  wrote:

> From: Ian Romanick 
>
> Signed-off-by: Ian Romanick 
> Cc: Timothy Arceri 
> ---
>  src/compiler/glsl/tests/cache_test.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/compiler/glsl/tests/cache_test.c b/src/compiler/glsl/tests/cache_test.c
> index 1b0403c..724dfcd 100644
> --- a/src/compiler/glsl/tests/cache_test.c
> +++ b/src/compiler/glsl/tests/cache_test.c
> @@ -36,6 +36,7 @@
>
>  bool error = false;
>
> +#ifdef ENABLE_SHADER_CACHE
>  void
>  _mesa_warning(void *ctx, const char *fmt, ...);
>
> @@ -397,10 +398,12 @@ test_put_key_and_get_key(void)
>
> cache_destroy(cache);
>  }
> +#endif /* ENABLE_SHADER_CACHE */
>
>  int
>  main(void)
>  {
> +#ifdef ENABLE_SHADER_CACHE
> int err;
>
> test_cache_create();
> @@ -411,6 +414,7 @@ main(void)
>
> err = rmrf_local(CACHE_TEST_TMP);
> expect_equal(err, 0, "Removing " CACHE_TEST_TMP " again");
> +#endif /* ENABLE_SHADER_CACHE */
>
> return error ? 1 : 0;
>  }
> --
> 2.5.5
>
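
The shape of the quoted change, reduced to a standalone sketch: everything that depends on the cache is compiled only when ENABLE_SHADER_CACHE is defined, while the entry point still builds and reports success when it is not. The helper and function names here are hypothetical; only the macro name and the "error" flag come from the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Set by any failing check; plays the role of the "error" flag in the patch. */
static bool error = false;

#ifdef ENABLE_SHADER_CACHE
/* Hypothetical check helper, only compiled when the feature is built. */
static void
sketch_expect_true(bool cond, const char *what)
{
   if (!cond) {
      fprintf(stderr, "FAIL: %s\n", what);
      error = true;
   }
}
#endif /* ENABLE_SHADER_CACHE */

/* Stand-in for main(): returns the process exit status. */
static int
sketch_run_tests(void)
{
#ifdef ENABLE_SHADER_CACHE
   sketch_expect_true(1 + 1 == 2, "sanity check");
#endif /* ENABLE_SHADER_CACHE */

   /* With the feature disabled no test ran, so nothing can have failed. */
   return error ? 1 : 0;
}
```

Keeping the entry point outside the guard is what lets "make check" still link and run the test binary in builds without the cache.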

[Mesa-dev] [RADV] vk_format.h:147:1: unknown type name 'uint' - is my compiler overage?

2016-10-06 Thread Dieter Nützel


gcc (SUSE Linux) 4.8.3

make[4]: Entering directory '/opt/mesa/src/amd/vulkan'
Updating radv_timestamp.h
  GEN  radv_timestamp.h
  CC   radv_device.lo
  CC   vk_format_table.lo
In file included from vk_format_table.c:31:0:
vk_format.h:147:1: error: unknown type name 'uint'
 vk_format_get_blocksizebits(VkFormat format)
 ^
vk_format.h:163:1: error: unknown type name 'uint'
 vk_format_get_blocksize(VkFormat format)
 ^
vk_format.h: In function 'vk_format_get_blocksize':
vk_format.h:165:2: error: unknown type name 'uint'
  uint bits = vk_format_get_blocksizebits(format);
  ^
vk_format.h:166:2: error: unknown type name 'uint'
  uint bytes = bits / 8;
  ^
vk_format.h: At top level:
vk_format.h:178:1: error: unknown type name 'uint'
 vk_format_get_blockwidth(VkFormat format)
 ^
vk_format.h:191:1: error: unknown type name 'uint'
 vk_format_get_blockheight(VkFormat format)
 ^
vk_format.h:406:1: error: unknown type name 'uint'
 vk_format_get_component_bits(VkFormat format,
 ^
vk_format.h:408:9: error: unknown type name 'uint'
 uint component)
 ^
Makefile:917: recipe for target 'vk_format_table.lo' failed
make[4]: *** [vk_format_table.lo] Error 1
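
One plausible reading of the errors above, offered as an assumption rather than a diagnosis: "uint" is not an ISO C type. glibc provides it in <sys/types.h> only as a compatibility typedef that depends on feature-test macros, so whether it is visible can vary with the compiler flags and libc in use. A header that spells the type "unsigned" (or uint32_t) does not depend on that. The helper names below are hypothetical and the clamp to one byte is only there to make the sketch self-contained.

```c
#include <stdint.h>

/* Hypothetical helper in the spirit of vk_format_get_blocksize(); this is
 * not the code from vk_format.h.
 */
static inline unsigned
sketch_blocksize_from_bits(unsigned bits)
{
   unsigned bytes = bits / 8; /* "unsigned" is ISO C; "uint" is not */

   /* Assumption of this sketch: never report a zero-byte block. */
   return bytes ? bytes : 1;
}

/* uint32_t from <stdint.h> is another portable spelling when the exact
 * width matters.
 */
static inline uint32_t
sketch_blocksize_from_bits_u32(uint32_t bits)
{
   return sketch_blocksize_from_bits(bits);
}
```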

Re: [Mesa-dev] [RFC] ralloc: use jemalloc for faster GLSL compilation

2016-10-06 Thread Michel Dänzer

On 07/10/16 05:44 AM, Eric Anholt wrote:
> Marek Olšák  writes:
> 
>> I'd like to have more feedback on the idea of using jemalloc for ralloc.
>>
>> Right now, I see these options:
>>
>> 1) Use jemalloc for ralloc and make it mandatory for all GL drivers.
>> - Distributions have shown that they are capable of doing anything
>> with the Mesa source code, so they don't need --disable-jemalloc.
>> - Reasonable people should build Mesa as-is.
>>
>> 2) Abandon the idea.
>> - The availability of --disable-jemalloc would send a clear message
>> that "you don't have to enable this", therefore the whole idea of
>> using jemalloc in Mesa would be pointless.
> 
> I'm generally of the opinion that if malloc is taking 10% of compile
> time, we're screwing up and we should just go fix that.  However, this
> is an easy fix and doesn't prevent going and fixing malloc abuse later.

I haven't seen anybody address the concern which was raised about having
multiple allocators independently grabbing heap from the kernel (and
possibly not returning it). Maybe it's not a big deal, but I'd like to
see at least a brief rationale as to why it's not. Marek, have you
compared the maximum heap usage with and without jemalloc, e.g. using
valgrind massif?

> I also don't like configure options -- they're mostly a chance to build
> things wrong.

I think you guys are over-dramatizing this a little. Most distros and
other users are probably using the defaults of most configure options,
so we just have to get the default right.

> I'm concerned that by shared linking against jemalloc we're going to run
> into similar problems to every other time we shared link against things
> and it's going to make our lives harder.  This is probably "we should
> figure out how to stop shared linking against anything" rather than "we
> shouldn't make this change", though.

Distros can't just link everything statically, if we try forcing that on
them they'll just have to revert the damage.

-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

[Mesa-dev] [Bug 98136] dEQP prohibits varying structs of arrays (and vice versa)?

2016-10-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=98136

Bug ID: 98136
   Summary: dEQP prohibits varying structs of arrays (and vice versa)?
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: glsl-compiler
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: kenn...@whitecape.org
QA Contact: intel-3d-b...@lists.freedesktop.org
Blocks: 94448

dEQP-GLES31.functional.tessellation.user_defined_io.negative.per_patch_array_of_structs
dEQP-GLES31.functional.tessellation.user_defined_io.negative.per_patch_structs_containing_arrays

gripe about patch-qualified arrays of structs or structs of arrays.  I think
patch qualification is a red herring, and this is just about disallowing nested
array/struct varyings in ES.

I don't remember the rules here.  Need to sort it out and either fix the tests
and close as NOTOURBUG, or go add extra restrictions.

Maybe we already do and we're just failing to apply them to patch variables...


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=94448
[Bug 94448] double Everything's Quality, Please! (Fix all the dEQP bugs!)
-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [Bug 98135] dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.transform_feedback_varyings wants a different GL error code

2016-10-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=98135

Bug ID: 98135
   Summary: dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.transform_feedback_varyings wants a different GL error code
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: kenn...@whitecape.org
QA Contact: mesa-dev@lists.freedesktop.org
Blocks: 94448

dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.transform_feedback_varyings

insists on INVALID_OPERATION rather than INVALID_VALUE.  No idea which is
right.


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=94448
[Bug 94448] double Everything's Quality, Please! (Fix all the dEQP bugs!)
-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [Bug 98134] dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers wants a different GL error code

2016-10-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=98134

Bug ID: 98134
   Summary: dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers wants a different GL error code
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: kenn...@whitecape.org
QA Contact: mesa-dev@lists.freedesktop.org
Blocks: 94448

dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers

expects INVALID_OPERATION instead of INVALID_ENUM in some cases.  Haven't
looked into which is right.


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=94448
[Bug 94448] double Everything's Quality, Please! (Fix all the dEQP bugs!)
-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [Bug 98133] GetSynciv should raise an error if bufSize < 0

2016-10-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=98133

Bug ID: 98133
   Summary: GetSynciv should raise an error if bufSize < 0
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: kenn...@whitecape.org
QA Contact: mesa-dev@lists.freedesktop.org
Blocks: 94448

dEQP-GLES31.functional.debug.negative_coverage.callbacks.state.get_synciv
dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_synciv
dEQP-GLES31.functional.debug.negative_coverage.log.state.get_synciv

expect GetSynciv to raise an error if bufSize < 0.  We should check if this is
actually an error condition in ES.


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=94448
[Bug 94448] double Everything's Quality, Please! (Fix all the dEQP bugs!)
-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [Bug 98132] #version 300 es compute shaders should not be possible

2016-10-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=98132

Bug ID: 98132
   Summary: #version 300 es compute shaders should not be possible
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: glsl-compiler
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: kenn...@whitecape.org
QA Contact: intel-3d-b...@lists.freedesktop.org
Blocks: 94448

dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader

expect #version 300 es compute shaders to fail to compile.  Presumably compute
shaders only work with #version 310 es.


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=94448
[Bug 94448] double Everything's Quality, Please! (Fix all the dEQP bugs!)
-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [Bug 98131] Compiler should reject lowp/mediump qualifiers on atomic_uints

2016-10-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=98131

Bug ID: 98131
   Summary: Compiler should reject lowp/mediump qualifiers on atomic_uints
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: glsl-compiler
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: kenn...@whitecape.org
QA Contact: intel-3d-b...@lists.freedesktop.org
Blocks: 94448

dEQP-GLES31.functional.debug.negative_coverage.callbacks.atomic_counter.atomic_precision
dEQP-GLES31.functional.debug.negative_coverage.get_error.atomic_counter.atomic_precision
dEQP-GLES31.functional.debug.negative_coverage.log.atomic_counter.atomic_precision

try to compile shaders with mediump/lowp atomic_uint variables.  They expect
compilation to fail, but we let it succeed.  We likely need to raise an error.


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=94448
[Bug 94448] double Everything's Quality, Please! (Fix all the dEQP bugs!)
-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH v3 09/10] anv: Enable fast depth clears

2016-10-06 Thread Jason Ekstrand

Nice and clean!  R-b still applies.

I think I've reviewed everything now.  If there's still something missing,
let me know.  May also want to give Chad a chance.

On Thu, Oct 6, 2016 at 3:21 PM, Nanley Chery  wrote:

> Provides an FPS increase of ~30% on the Sascha triangle and multisampling
> demos.
>
> Signed-off-by: Nanley Chery 
> Reviewed-by: Jason Ekstrand  (v2)
>
> ---
> v3. Emit required clear_params packet (Chad)
> Share clear_params code path IVB+ (Jason)
>
>  src/intel/vulkan/anv_pass.c| 13 +
>  src/intel/vulkan/genX_cmd_buffer.c | 24 ++--
>  2 files changed, 35 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_pass.c b/src/intel/vulkan/anv_pass.c
> index 69c3c7e..595c2ea 100644
> --- a/src/intel/vulkan/anv_pass.c
> +++ b/src/intel/vulkan/anv_pass.c
> @@ -155,5 +155,18 @@ void anv_GetRenderAreaGranularity(
>      VkRenderPass                                renderPass,
>      VkExtent2D*                                 pGranularity)
>  {
> +   ANV_FROM_HANDLE(anv_render_pass, pass, renderPass);
> +
> +   /* This granularity satisfies HiZ fast clear alignment requirements
> +* for all sample counts.
> +*/
> +   for (unsigned i = 0; i < pass->subpass_count; ++i) {
> +  if (pass->subpasses[i].depth_stencil_attachment !=
> +  VK_ATTACHMENT_UNUSED) {
> + *pGranularity = (VkExtent2D) { .width = 8, .height = 4 };
> + return;
> +  }
> +   }
> +
> *pGranularity = (VkExtent2D) { 1, 1 };
>  }
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
> index ed6a109..4089fc7 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -1318,8 +1318,27 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer)
>        anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), sb);
> }
>
> -   /* Clear the clear params. */
> -   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp);
> +   /* From the IVB PRM Vol2P1, 11.5.5.4 3DSTATE_CLEAR_PARAMS:
> +*
> +*3DSTATE_CLEAR_PARAMS must always be programmed in the along with
> +*the other Depth/Stencil state commands(i.e. 3DSTATE_DEPTH_BUFFER,
> +*3DSTATE_STENCIL_BUFFER, or 3DSTATE_HIER_DEPTH_BUFFER)
> +*
> +* Testing also shows that some variant of this restriction may exist HSW+.
> +* On BDW+, it is not possible to emit 2 of these packets consecutively when
> +* both have DepthClearValueValid set. An analysis of such state programming
> +* on SKL showed that the GPU doesn't register the latter packet's clear
> +* value.
> +*/
> +   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp) {
> +  if (has_hiz) {
> + cp.DepthClearValueValid = true;
> + const uint32_t ds =
> +cmd_buffer->state.subpass->depth_stencil_attachment;
> + cp.DepthClearValue =
> +cmd_buffer->state.attachments[ds].clear_value.depthStencil.depth;
> +  }
> +   }
>  }
>
>  static void
> @@ -1332,6 +1351,7 @@ genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
>
> cmd_buffer_emit_depth_stencil(cmd_buffer);
> genX(cmd_buffer_emit_hz_op)(cmd_buffer, BLORP_HIZ_OP_HIZ_RESOLVE);
> +   genX(cmd_buffer_emit_hz_op)(cmd_buffer, BLORP_HIZ_OP_DEPTH_CLEAR);
>
> anv_cmd_buffer_clear_subpass(cmd_buffer);
>  }
> --
> 2.10.0
>
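
The granularity selection in the anv_pass.c hunk can be tried outside the driver with a small sketch. The struct and function names below are hypothetical stand-ins for anv_render_pass, anv_subpass and VkExtent2D; the 8x4 and 1x1 values and the "any subpass with a depth/stencil attachment" condition are taken from the quoted patch, and ~0u matches the value of VK_ATTACHMENT_UNUSED.

```c
#include <stdint.h>

#define SKETCH_ATTACHMENT_UNUSED (~0u) /* same value as VK_ATTACHMENT_UNUSED */

/* Hypothetical stand-ins for VkExtent2D and struct anv_subpass. */
struct sketch_extent {
   uint32_t width;
   uint32_t height;
};

struct sketch_subpass {
   uint32_t depth_stencil_attachment;
};

/* Returns 8x4 if any subpass uses a depth/stencil attachment, since that
 * alignment is what the patch says satisfies HiZ fast clears for all
 * sample counts; otherwise returns the unrestricted 1x1.
 */
static struct sketch_extent
sketch_render_area_granularity(const struct sketch_subpass *subpasses,
                               unsigned subpass_count)
{
   for (unsigned i = 0; i < subpass_count; ++i) {
      if (subpasses[i].depth_stencil_attachment != SKETCH_ATTACHMENT_UNUSED)
         return (struct sketch_extent) { .width = 8, .height = 4 };
   }

   return (struct sketch_extent) { .width = 1, .height = 1 };
}
```

Reporting the coarser granularity for the whole render pass, rather than per subpass, keeps the answer conservative: a render area aligned to 8x4 is also aligned to 1x1.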

Re: [Mesa-dev] [PATCH 1/2] anv/cmd_buffer: Don't call set_subpass in a secondary

2016-10-06 Thread Jason Ekstrand

On Thu, Oct 6, 2016 at 1:33 PM, Nanley Chery  wrote:

> On Thu, Oct 06, 2016 at 12:35:53PM -0700, Jason Ekstrand wrote:
> > On Thu, Oct 6, 2016 at 12:30 PM, Nanley Chery wrote:
> >
> > > On Wed, Oct 05, 2016 at 05:36:43PM -0700, Jason Ekstrand wrote:
> > > > Initially, we had intended set_subpass to be an interesting function that
> > > > did whatever (presumably a lot) setup we needed for a subpass.  In reality,
> > > > it just sets a pointer and a dirty bit and then emits depth and stencil
> > > > state.  When we call BeginCommandBuffer on a secondary, all of the dirty
> > > > bits are already set and there's no point in setting depth and stencil
> > > > state since it will already be set by the primary.  Instead, the only thing
> > > > we need to do at the start of a secondary is set the subpass pointer.
> > > >
> > > > Signed-off-by: Jason Ekstrand 
> > > > ---
> > > >  src/intel/vulkan/anv_cmd_buffer.c  | 39 +-
> > > >  src/intel/vulkan/anv_genX.h|  3 ---
> > > >  src/intel/vulkan/anv_private.h |  3 ---
> > > >  src/intel/vulkan/genX_cmd_buffer.c |  5 +
> > > >  4 files changed, 2 insertions(+), 48 deletions(-)
> > > >
> > > > diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c
> > > > index 9dedde8..ef13dfc 100644
> > > > --- a/src/intel/vulkan/anv_cmd_buffer.c
> > > > +++ b/src/intel/vulkan/anv_cmd_buffer.c
> > > > @@ -407,10 +407,8 @@ VkResult anv_BeginCommandBuffer(
> > > >cmd_buffer->state.pass =
> > > > >   anv_render_pass_from_handle(pBeginInfo->pInheritanceInfo->renderPass);
> > > > >
> > > > > -  struct anv_subpass *subpass =
> > > > > +  cmd_buffer->state.subpass =
> > > > >   &cmd_buffer->state.pass->subpasses[pBeginInfo->pInheritanceInfo->subpass];
> > > > -
> > > > -  anv_cmd_buffer_set_subpass(cmd_buffer, subpass);
> > >
> > > I'm not sure why we always set the fragment descriptor bit in
> > > set_subpass, but it seems like we need to do it here as well to keep
> > > the logic the same. I don't see where we set the dirty bits on a
> > > secondary command buffer at BeginCommandBuffer. Aside from that, this
> > > patch looks good.
> > >
> >
> > Initially, I think we did it to ensure that binding tables got re-emitted.
> > However, we're now also re-emitting binding tables on pipeline changes and
> > you have a pipeline change at the top of every subpass, so it shouldn't be
> > needed either place.
> >
> >
>
> That makes sense. This series is
> Reviewed-by: Nanley Chery 
>

That wasn't 100% true... We do actually need to flag that render targets
are dirty in set_subpass.  It ends up not mattering in secondaries, but I
think it's worth doing there too.  I've pushed a version of the patches
which I think does what we want.


> > > -Nanley
> > >
> > > > }
> > > >
> > > > return VK_SUCCESS;
> > > > @@ -1050,41 +1048,6 @@ anv_cmd_buffer_merge_dynamic(struct
> > > anv_cmd_buffer *cmd_buffer,
> > > > return state;
> > > >  }
> > > >
> > > > -/**
> > > > > - * @brief Setup the command buffer for recording commands inside the given
> > > > > - * subpass.
> > > > > - *
> > > > > - * This does not record all commands needed for starting the subpass.
> > > > > - * Starting the subpass may require additional commands.
> > > > > - *
> > > > > - * Note that vkCmdBeginRenderPass, vkCmdNextSubpass, and vkBeginCommandBuffer
> > > > > - * with VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT, all setup the
> > > > > - * command buffer for recording commands for some subpass.  But only the first
> > > > > - * two, vkCmdBeginRenderPass and vkCmdNextSubpass, can start a subpass.
> > > > - */
> > > > -void
> > > > -anv_cmd_buffer_set_subpass(struct anv_cmd_buffer *cmd_buffer,
> > > > -   struct anv_subpass *subpass)
> > > > -{
> > > > -   switch (cmd_buffer->device->info.gen) {
> > > > -   case 7:
> > > > -  if (cmd_buffer->device->info.is_haswell) {
> > > > - gen75_cmd_buffer_set_subpass(cmd_buffer, subpass);
> > > > -  } else {
> > > > - gen7_cmd_buffer_set_subpass(cmd_buffer, subpass);
> > > > -  }
> > > > -  break;
> > > > -   case 8:
> > > > -  gen8_cmd_buffer_set_subpass(cmd_buffer, subpass);
> > > > -  break;
> > > > -   case 9:
> > > > -  gen9_cmd_buffer_set_subpass(cmd_buffer, subpass);
> > > > -  break;
> > > > -   default:
> > > > -  unreachable("unsupported gen\n");
> > > > -   }
> > > > -}
> > > > -
> > > >  struct anv_state
> > > >  anv_cmd_buffer_push_constants(struct anv_cmd_buffer *cmd_buffer,
> > > >gl_shader_stage stage)
> > > > diff --git a/src/intel/vulkan/anv_genX.h
> b/src/intel/vulkan/anv_genX.h
> > > > index 02e79c2..dc2dd5d 100644
> > > > --- a/src/intel/vulkan/anv_genX.h
> > > > +++ b/src/intel/vulkan/anv_genX.h
> > > > @@ -36,9 +36,6 @@ struct anv_state
> > > >  genX(cmd_buffer_alloc_null_surface_sta

Re: [Mesa-dev] [PATCH] anv/wsi: Advertise UNORM formats as well as sRGB

2016-10-06 Thread Lionel Landwerlin


Thanks, I was wondering why those weren't available.

Reviewed-by: Lionel Landwerlin 

On 05/10/16 21:26, Jason Ekstrand wrote:

Because WSI images are created with VkImageCreateInfo::flags explicitly set
to 0, they don't ever have the VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT set.
This means that you can't create an image view of it with a different
format so applications can't render directly in sRGB (without automatic
encoding) unless we actually advertise UNORM formats.  There are a lot of
applications that want to do their own sRGB conversion, so we should allow
for that.  We do, however, make UNORM come after sRGB in the list so that
the default for dumb apps that just grab the first thing is to render in
linear and let the sRGB conversion happen automatically.
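
A minimal sketch of why the ordering matters, assuming a simplified model of format selection. The SurfaceFormat enum and the helpers advertised_formats(), pick_first() and pick_unorm() are hypothetical stand-ins, not anv or Vulkan API; only the sRGB-before-UNORM order is taken from the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the two VkFormat values the patch advertises.
enum class SurfaceFormat { B8G8R8A8_SRGB, B8G8R8A8_UNORM };

// Same order as the patched formats[] table: sRGB first, UNORM second.
inline std::vector<SurfaceFormat> advertised_formats()
{
   return { SurfaceFormat::B8G8R8A8_SRGB, SurfaceFormat::B8G8R8A8_UNORM };
}

// An application that just grabs the first advertised format.
inline SurfaceFormat pick_first(const std::vector<SurfaceFormat> &formats)
{
   assert(!formats.empty());
   return formats.front();
}

// An application that does its own sRGB conversion: it looks for UNORM
// and falls back to the first entry when UNORM is not advertised.
inline SurfaceFormat pick_unorm(const std::vector<SurfaceFormat> &formats)
{
   assert(!formats.empty());
   for (SurfaceFormat f : formats) {
      if (f == SurfaceFormat::B8G8R8A8_UNORM)
         return f;
   }
   return formats.front();
}
```

With this ordering pick_first() still lands on the sRGB format, while an application that searches the list can now find UNORM.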

Signed-off-by: Jason Ekstrand 
---
  src/intel/vulkan/anv_wsi_wayland.c | 4 
  src/intel/vulkan/anv_wsi_x11.c | 1 +
  2 files changed, 5 insertions(+)

diff --git a/src/intel/vulkan/anv_wsi_wayland.c 
b/src/intel/vulkan/anv_wsi_wayland.c
index d210e79..afce96f 100644
--- a/src/intel/vulkan/anv_wsi_wayland.c
+++ b/src/intel/vulkan/anv_wsi_wayland.c
@@ -106,8 +106,10 @@ wl_drm_format_for_vk_format(VkFormat vk_format, bool alpha)
 case VK_FORMAT_B5G5R5A1_UNORM:
return alpha ? WL_DRM_FORMAT_XRGB1555 : WL_DRM_FORMAT_XRGB1555;
  #endif
+   case VK_FORMAT_B8G8R8_UNORM:
 case VK_FORMAT_B8G8R8_SRGB:
return WL_DRM_FORMAT_BGRX;
+   case VK_FORMAT_B8G8R8A8_UNORM:
 case VK_FORMAT_B8G8R8A8_SRGB:
return alpha ? WL_DRM_FORMAT_ARGB : WL_DRM_FORMAT_XRGB;
  #if 0
@@ -163,9 +165,11 @@ drm_handle_format(void *data, struct wl_drm *drm, uint32_t 
wl_format)
  #endif
 case WL_DRM_FORMAT_XRGB:
wsi_wl_display_add_vk_format(display, VK_FORMAT_B8G8R8_SRGB);
+  wsi_wl_display_add_vk_format(display, VK_FORMAT_B8G8R8_UNORM);
/* fallthrough */
 case WL_DRM_FORMAT_ARGB:
wsi_wl_display_add_vk_format(display, VK_FORMAT_B8G8R8A8_SRGB);
+  wsi_wl_display_add_vk_format(display, VK_FORMAT_B8G8R8A8_UNORM);
break;
  #if 0
 case WL_DRM_FORMAT_ARGB2101010:
diff --git a/src/intel/vulkan/anv_wsi_x11.c b/src/intel/vulkan/anv_wsi_x11.c
index 7c6ef97..25c585f 100644
--- a/src/intel/vulkan/anv_wsi_x11.c
+++ b/src/intel/vulkan/anv_wsi_x11.c
@@ -123,6 +123,7 @@ wsi_x11_get_connection(struct anv_physical_device *device,
  
  static const VkSurfaceFormatKHR formats[] = {

 { .format = VK_FORMAT_B8G8R8A8_SRGB, },
+   { .format = VK_FORMAT_B8G8R8A8_UNORM, },
  };
  
  static const VkPresentModeKHR present_modes[] = {



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nv50/ir: start LocalCSE with getFirst to merge PHI instructions

2016-10-06 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

On Thu, Oct 6, 2016 at 5:33 PM, Karol Herbst  wrote:
> total instructions in shared programs : 2818606 -> 2818227 (-0.01%)
> total gprs used in shared programs: 379273 -> 379238 (-0.01%)
> total local used in shared programs   : 9505 -> 9505 (0.00%)
> total bytes used in shared programs   : 25837192 -> 25833736 (-0.01%)
>
> localgpr   inst  bytes
> helped   0  25 100 100
>   hurt   0   0   0   0
>
> Signed-off-by: Karol Herbst 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 1c71155..168aa05 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -3220,7 +3220,7 @@ LocalCSE::visit(BasicBlock *bb)
>for (ir = bb->getFirst(); ir; ir = ir->next)
>   ir->serial = serial++;
>
> -  for (ir = bb->getEntry(); ir; ir = next) {
> +  for (ir = bb->getFirst(); ir; ir = next) {
>   int s;
>   Value *src = NULL;
>
> --
> 2.10.0
>

Re: [Mesa-dev] [PATCH] vbo: disable primitive restart when GL >= 4.5

2016-10-06 Thread Ian Romanick

On 10/06/2016 06:22 AM, Martina Kollarova wrote:
> The OpenGL 4.5 spec updated the section on primitive restart, and now it
> doesn't have to be performed on drawing commands not taking a parameter,
> regardless of whether PRIMITIVE_RESTART_FIXED_INDEX is enabled or not.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98106
> Signed-off-by: Martina Kollarova 
> ---
>  src/mesa/vbo/vbo_exec_array.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
> index 46543f8..cf1ba13 100644
> --- a/src/mesa/vbo/vbo_exec_array.c
> +++ b/src/mesa/vbo/vbo_exec_array.c
> @@ -423,7 +423,7 @@ vbo_draw_arrays(struct gl_context *ctx, GLenum mode, 
> GLint start,
>  
> /* Implement the primitive restart index */
> if (ctx->Array.PrimitiveRestart && !ctx->Array.PrimitiveRestartFixedIndex 
> &&
> -   ctx->Array.RestartIndex < count) {
> +   ctx->Version < 45 && ctx->Array.RestartIndex < count) {

I'm really unsure about having this version check.  I believe that this
change was intended to be a clarification because several drivers /
hardware combinations never supported it.  I'll have to dig back through
Khronos bugs and notes, but I think this should universally apply.

>GLuint primCount = 0;
>  
>if (ctx->Array.RestartIndex == start) {
> 


Re: [Mesa-dev] [PATCH] reviewers: Throw myself on the GLX grenade

2016-10-06 Thread Ian Romanick

Heh... I'm not there... well done.

Reviewed-by: Ian Romanick 

On 10/06/2016 12:37 PM, Adam Jackson wrote:
> Signed-off-by: Adam Jackson 
> ---
>  REVIEWERS | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/REVIEWERS b/REVIEWERS
> index f7574b3..f822421 100644
> --- a/REVIEWERS
> +++ b/REVIEWERS
> @@ -104,3 +104,7 @@ F: src/egl/drivers/dri2/platform_wayland.c
>  FREEDRENO
>  R:   Rob Clark 
>  F:   src/gallium/drivers/freedreno/
> +
> +GLX
> +R: Adam Jackson 
> +F: src/glx/
> 


Re: [Mesa-dev] [PATCH] nv50/ir: fix wrong check when optimizing MAD to SHLADD

2016-10-06 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

On Thu, Oct 6, 2016 at 7:11 PM, Samuel Pitoiset
 wrote:
> Checking whether MAD is supported is definitely wrong; it is most
> likely a typo I introduced a few days ago, which breaks NV50
> because SHLADD is not supported there.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 1c71155..6efb29e 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -1030,7 +1030,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue 
> &imm0, int s)
>   i->op = OP_ADD;
>} else
>if (s == 1 && !imm0.isNegative() && imm0.isPow2() &&
> -  target->isOpSupported(i->op, i->dType)) {
> +  target->isOpSupported(OP_SHLADD, i->dType)) {
>   i->op = OP_SHLADD;
>   imm0.applyLog2();
>   i->setSrc(1, new_ImmediateValue(prog, imm0.reg.data.u32));
> --
> 2.10.0
>

[Mesa-dev] [PATCH] nv50/ir: fix wrong check when optimizing MAD to SHLADD

2016-10-06 Thread Samuel Pitoiset

Checking whether MAD is supported is definitely wrong; it is most
likely a typo I introduced a few days ago, which breaks NV50
because SHLADD is not supported there.
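
A minimal sketch of the difference between the two checks, assuming a heavily simplified model. The Target struct, its hasShladd flag and the foldBuggy()/foldFixed() helpers are hypothetical and are not the nv50_ir classes; only the idea of asking about the op that is about to be emitted comes from the patch.

```cpp
#include <cassert>

// Hypothetical, simplified opcode set.
enum Op { OP_MAD, OP_ADD, OP_SHLADD };

// Hypothetical target model: hasShladd == false stands for NV50, where
// the commit message says SHLADD is not supported.
struct Target {
   bool hasShladd;
   bool isOpSupported(Op op) const
   {
      return op != OP_SHLADD || hasShladd;
   }
};

// Buggy variant: asks whether the current op (MAD) is supported.
inline Op foldBuggy(const Target &t, Op current, unsigned imm)
{
   assert(current == OP_MAD);
   const bool pow2 = imm != 0 && (imm & (imm - 1)) == 0;
   if (pow2 && t.isOpSupported(current))
      return OP_SHLADD;
   return current;
}

// Fixed variant: asks whether the op it wants to emit is supported.
inline Op foldFixed(const Target &t, Op current, unsigned imm)
{
   assert(current == OP_MAD);
   const bool pow2 = imm != 0 && (imm & (imm - 1)) == 0;
   if (pow2 && t.isOpSupported(OP_SHLADD))
      return OP_SHLADD;
   return current;
}
```

On the target without SHLADD, foldBuggy() still rewrites the instruction, while foldFixed() leaves the MAD alone.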

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 1c71155..6efb29e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1030,7 +1030,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue 
&imm0, int s)
  i->op = OP_ADD;
   } else
   if (s == 1 && !imm0.isNegative() && imm0.isPow2() &&
-  target->isOpSupported(i->op, i->dType)) {
+  target->isOpSupported(OP_SHLADD, i->dType)) {
  i->op = OP_SHLADD;
  imm0.applyLog2();
  i->setSrc(1, new_ImmediateValue(prog, imm0.reg.data.u32));
-- 
2.10.0


Re: [Mesa-dev] [PATCH 08/10] nir: add a loop unrolling pass

2016-10-06 Thread Timothy Arceri

On Thu, 2016-10-06 at 10:33 -0700, Jason Ekstrand wrote:
> > 
> > On Wed, Oct 5, 2016 at 7:25 PM, Timothy Arceri  > abora.com> wrote:
> > > Just 
> > > 
> > > On Wed, 2016-10-05 at 16:23 -0700, Jason Ekstrand wrote:
> > > >
> > > >
> > > > On Thu, Sep 15, 2016 at 12:03 AM, Timothy Arceri  > > i@coll
> > > > abora.com> wrote:
> > > > > V2:
> > > > > - tidy ups suggested by Connor.
> > > > > > - tidy up cloning logic and handle copy propagation
> > > > > >   based on a suggestion by Connor.
> > > > > - use nir_ssa_def_rewrite_uses to fix up lcssa phis
> > > > >   suggested by Connor.
> > > > > - add support for complex loop unrolling (two terminators)
> > > > > > - handle case where the ssa def's use outside the loop is already
> > > > > >   a phi
> > > > > > - support unrolling loops with multiple terminators when trip
> > > > > >   count is known for each terminator
> > > > > ---
> > > > >  src/compiler/Makefile.sources          |   1 +
> > > > >  src/compiler/nir/nir.h                 |   2 +
> > > > >  src/compiler/nir/nir_opt_loop_unroll.c | 820
> > > > > +
> > > > >  3 files changed, 823 insertions(+)
> > > > >  create mode 100644 src/compiler/nir/nir_opt_loop_unroll.c
> > > > >
> > > > > diff --git a/src/compiler/Makefile.sources
> > > > > b/src/compiler/Makefile.sources
> > > > > index 8ef6080..b3512bb 100644
> > > > > --- a/src/compiler/Makefile.sources
> > > > > +++ b/src/compiler/Makefile.sources
> > > > > @@ -233,6 +233,7 @@ NIR_FILES = \
> > > > >         nir/nir_opt_dead_cf.c \
> > > > >         nir/nir_opt_gcm.c \
> > > > >         nir/nir_opt_global_to_local.c \
> > > > > +       nir/nir_opt_loop_unroll.c \
> > > > >         nir/nir_opt_peephole_select.c \
> > > > >         nir/nir_opt_remove_phis.c \
> > > > >         nir/nir_opt_undef.c \
> > > > > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > > > > index 9887432..0513d81 100644
> > > > > --- a/src/compiler/nir/nir.h
> > > > > +++ b/src/compiler/nir/nir.h
> > > > > @@ -2661,6 +2661,8 @@ bool nir_opt_dead_cf(nir_shader
> > > *shader);
> > > > >
> > > > >  bool nir_opt_gcm(nir_shader *shader, bool value_number);
> > > > >
> > > > > +bool nir_opt_loop_unroll(nir_shader *shader,
> > > nir_variable_mode
> > > > > indirect_mask);
> > > > > +
> > > > >  bool nir_opt_peephole_select(nir_shader *shader);
> > > > >
> > > > >  bool nir_opt_remove_phis(nir_shader *shader);
> > > > > diff --git a/src/compiler/nir/nir_opt_loop_unroll.c
> > > > > b/src/compiler/nir/nir_opt_loop_unroll.c
> > > > > new file mode 100644
> > > > > index 000..1de02f6
> > > > > --- /dev/null
> > > > > +++ b/src/compiler/nir/nir_opt_loop_unroll.c
> > > > > @@ -0,0 +1,820 @@
> > > > > +/*
> > > > > + * Copyright © 2016 Intel Corporation
> > > > > + *
> > > > > + * Permission is hereby granted, free of charge, to any
> > > person
> > > > > obtaining a
> > > > > + * copy of this software and associated documentation files
> > > (the
> > > > > "Software"),
> > > > > + * to deal in the Software without restriction, including
> > > without
> > > > > limitation
> > > > > + * the rights to use, copy, modify, merge, publish,
> > > distribute,
> > > > > sublicense,
> > > > > + * and/or sell copies of the Software, and to permit persons
> > > to
> > > > > whom the
> > > > > + * Software is furnished to do so, subject to the following
> > > > > conditions:
> > > > > + *
> > > > > + * The above copyright notice and this permission notice
> > > > > (including the next
> > > > > + * paragraph) shall be included in all copies or substantial
> > > > > portions of the
> > > > > + * Software.
> > > > > + *
> > > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
> > > KIND,
> > > > > EXPRESS OR
> > > > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > > > > MERCHANTABILITY,
> > > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN
> > > NO
> > > > > EVENT SHALL
> > > > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > > > > DAMAGES OR OTHER
> > > > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
> > > OTHERWISE,
> > > > > ARISING
> > > > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
> > > USE OR
> > > > > OTHER
> > > > > + * DEALINGS IN THE SOFTWARE.
> > > > > + */
> > > > > +
> > > > > +#include "nir.h"
> > > > > +#include "nir_builder.h"
> > > > > +#include "nir_control_flow.h"
> > > > > +
> > > > > +static void
> > > > > +extract_loop_body(nir_cf_list *extracted, nir_cf_node *node)
> > > >
> > > > "node" is not particularly descriptive.  Perhaps "start_node"
> > > or
> > > > something like that.
> > > >  
> > > > > +{
> > > > > +   nir_cf_node *end = node;
> > > > > +   while (!nir_cf_node_is_last(end))
> > > > > +      end = nir_cf_node_next(end);
> > > >
> > > > This bit of iteration seems unfortunate.  If you have the loop
> > > > pointer, you can just do
> > > >
> > > > nir_cf_extract(extracted, nir_before_cf_node(node),
> > > > nir_after_cf_node(nir_loop_l

[Mesa-dev] [PATCH 5/5] glsl: simplified ast_type_qualifier::merge_[in|out]_qualifier API

2016-10-06 Thread Andres Gomez

Since we modified the way in which multiple repetitions of the same
layout-qualifier-name in a single declaration collapse into the
ast_type_qualifier class, we can simplify the merge_[in|out]_qualifier
APIs by removing the create_node parameter.

Signed-off-by: Andres Gomez 
---
 src/compiler/glsl/ast.h  |  4 ++--
 src/compiler/glsl/ast_type.cpp   | 18 +++---
 src/compiler/glsl/glsl_parser.yy |  4 ++--
 3 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/src/compiler/glsl/ast.h b/src/compiler/glsl/ast.h
index 804111f..aee4cd8 100644
--- a/src/compiler/glsl/ast.h
+++ b/src/compiler/glsl/ast.h
@@ -752,12 +752,12 @@ struct ast_type_qualifier {
bool merge_out_qualifier(YYLTYPE *loc,
_mesa_glsl_parse_state *state,
const ast_type_qualifier &q,
-   ast_node* &node, bool create_node);
+   ast_node* &node);
 
bool merge_in_qualifier(YYLTYPE *loc,
_mesa_glsl_parse_state *state,
const ast_type_qualifier &q,
-   ast_node* &node, bool create_node);
+   ast_node* &node);
 
/**
 * Push pending layout qualifiers to the global values.
diff --git a/src/compiler/glsl/ast_type.cpp b/src/compiler/glsl/ast_type.cpp
index 4dbec59..2a68a83 100644
--- a/src/compiler/glsl/ast_type.cpp
+++ b/src/compiler/glsl/ast_type.cpp
@@ -359,7 +359,7 @@ bool
 ast_type_qualifier::merge_out_qualifier(YYLTYPE *loc,
 _mesa_glsl_parse_state *state,
 const ast_type_qualifier &q,
-ast_node* &node, bool create_node)
+ast_node* &node)
 {
void *mem_ctx = state;
const bool r = this->merge_qualifier(loc, state, q, false);
@@ -393,9 +393,7 @@ ast_type_qualifier::merge_out_qualifier(YYLTYPE *loc,
   valid_out_mask.flags.q.max_vertices = 1;
   valid_out_mask.flags.q.prim_type = 1;
} else if (state->stage == MESA_SHADER_TESS_CTRL) {
-  if (create_node) {
- node = new(mem_ctx) ast_tcs_output_layout(*loc);
-  }
+  node = new(mem_ctx) ast_tcs_output_layout(*loc);
   valid_out_mask.flags.q.vertices = 1;
   valid_out_mask.flags.q.explicit_xfb_buffer = 1;
   valid_out_mask.flags.q.xfb_buffer = 1;
@@ -432,7 +430,7 @@ bool
 ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,
_mesa_glsl_parse_state *state,
const ast_type_qualifier &q,
-   ast_node* &node, bool create_node)
+   ast_node* &node)
 {
void *mem_ctx = state;
bool create_gs_ast = false;
@@ -571,12 +569,10 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,
   this->point_mode = q.point_mode;
}
 
-   if (create_node) {
-  if (create_gs_ast) {
- node = new(mem_ctx) ast_gs_input_layout(*loc, q.prim_type);
-  } else if (create_cs_ast) {
- node = new(mem_ctx) ast_cs_input_layout(*loc, q.local_size);
-  }
+   if (create_gs_ast) {
+  node = new(mem_ctx) ast_gs_input_layout(*loc, q.prim_type);
+   } else if (create_cs_ast) {
+  node = new(mem_ctx) ast_cs_input_layout(*loc, q.local_size);
}
 
return true;
diff --git a/src/compiler/glsl/glsl_parser.yy b/src/compiler/glsl/glsl_parser.yy
index 225d58b..67516bc 100644
--- a/src/compiler/glsl/glsl_parser.yy
+++ b/src/compiler/glsl/glsl_parser.yy
@@ -2914,7 +2914,7 @@ layout_defaults:
{
   $$ = NULL;
   if (!state->in_qualifier->
- merge_in_qualifier(& @1, state, $1, $$, true)) {
+ merge_in_qualifier(& @1, state, $1, $$)) {
  YYERROR;
   }
   if (!state->in_qualifier->push_to_global(& @1, state)) {
@@ -2925,7 +2925,7 @@ layout_defaults:
{
   $$ = NULL;
   if (!state->out_qualifier->
- merge_out_qualifier(& @1, state, $1, $$, true)) {
+ merge_out_qualifier(& @1, state, $1, $$)) {
  YYERROR;
   }
   if (!state->out_qualifier->push_to_global(& @1, state)) {
-- 
2.9.3


[Mesa-dev] [PATCH 2/5] glsl: last duplicated layout-qualifier-name in multiple layout-qualifiers overrides the former

2016-10-06 Thread Andres Gomez

From page 46 (page 52 of the PDF) of the GLSL 4.20 spec:

  " More than one layout qualifier may appear in a single
declaration. If the same layout-qualifier-name occurs in multiple
layout qualifiers for the same declaration, the last one overrides
the former ones."

Consider this example:

  " #version 150
#extension GL_ARB_shading_language_420pack: enable
#extension GL_ARB_enhanced_layouts: enable

layout(max_vertices=2) layout(max_vertices=3) out;
layout(max_vertices=3) out;"

Different values for the "max_vertices" layout-qualifier-name would
normally end in a compilation failure. However, since only the last
occurrence in a declaration is taken into account, this small piece
of code from a shader is valid.

Hence, when merging qualifiers in an ast_type_qualifier, we now ignore
new appearances of the same layout-qualifier-name if the new
"is_multiple_layouts_merge" parameter is on, since the GLSL parser
works in this case from right to left.

Also, the GLSL parser has been simplified to check for the needed
GL_ARB_shading_language_420pack extension just when merging the
qualifiers in the proper cases.

In addition, any special treatment for the buffer, uniform, in or out
layout defaults has been moved in the GLSL parser to the rule
triggered just after any previous processing/merging on the
layout-qualifiers has happened in a single declaration since it was
run too soon previously.

Finally, the merging of an ast_layout_expression is now done by
prepending instead of appending, since the processing of the qualifier
constant returns the first value in the list and the last appearing
declaration of a variable or default overrides the previous
declarations.
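
A minimal sketch of why prepending gives the later declaration priority, assuming the list is read from the front. LayoutExpr and its members are hypothetical stand-ins for ast_layout_expression and exec_list, not the Mesa types.

```cpp
#include <cassert>
#include <deque>

// Hypothetical stand-in for ast_layout_expression: a list of the values
// seen for one layout-qualifier-name.
struct LayoutExpr {
   std::deque<unsigned> values;

   // Old behaviour: values from a later declaration go to the back.
   void merge_append(const LayoutExpr &later)
   {
      values.insert(values.end(), later.values.begin(), later.values.end());
   }

   // New behaviour: values from a later declaration go to the front.
   void merge_prepend(const LayoutExpr &later)
   {
      values.insert(values.begin(), later.values.begin(),
                    later.values.end());
   }

   // Processing returns the first value in the list.
   unsigned effective() const
   {
      assert(!values.empty());
      return values.front();
   }
};
```

Merging a later value 3 into an earlier value 2 yields 2 with merge_append() and 3 with merge_prepend(), so only the latter lets the last declaration override.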

Fixes GL44-CTS.shading_language_420pack.qualifier_override_layout

Signed-off-by: Andres Gomez 
---
 src/compiler/glsl/ast.h  |   5 +-
 src/compiler/glsl/ast_type.cpp   |  34 +++---
 src/compiler/glsl/glsl_parser.yy | 131 +++
 3 files changed, 77 insertions(+), 93 deletions(-)

diff --git a/src/compiler/glsl/ast.h b/src/compiler/glsl/ast.h
index 4c648d0..73c73b7 100644
--- a/src/compiler/glsl/ast.h
+++ b/src/compiler/glsl/ast.h
@@ -382,7 +382,7 @@ public:
 
void merge_qualifier(ast_layout_expression *l_expr)
{
-  layout_const_expressions.append_list(&l_expr->layout_const_expressions);
+  layout_const_expressions.prepend_list(&l_expr->layout_const_expressions);
}
 
exec_list layout_const_expressions;
@@ -746,7 +746,8 @@ struct ast_type_qualifier {
bool merge_qualifier(YYLTYPE *loc,
_mesa_glsl_parse_state *state,
 const ast_type_qualifier &q,
-bool is_single_layout_merge);
+bool is_single_layout_merge,
+bool is_multiple_layouts_merge = false);
 
bool merge_out_qualifier(YYLTYPE *loc,
_mesa_glsl_parse_state *state,
diff --git a/src/compiler/glsl/ast_type.cpp b/src/compiler/glsl/ast_type.cpp
index 504b533..f02f71b 100644
--- a/src/compiler/glsl/ast_type.cpp
+++ b/src/compiler/glsl/ast_type.cpp
@@ -108,15 +108,21 @@ ast_type_qualifier::has_auxiliary_storage() const
 }
 
 /**
- * This function merges both duplicate identifies within a single layout and
- * multiple layout qualifiers on a single variable declaration. The
- * is_single_layout_merge param is used differentiate between the two.
+ * This function merges duplicate layout identifiers.
+ *
+ * It deals with duplicates within a single layout qualifier, among multiple
+ * layout qualifiers on a single declaration and on several declarations for
+ * the same variable.
+ *
+ * The is_single_layout_merge and is_multiple_layouts_merge parameters are
+ * used to differentiate among them.
  */
 bool
 ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
 _mesa_glsl_parse_state *state,
 const ast_type_qualifier &q,
-bool is_single_layout_merge)
+bool is_single_layout_merge,
+bool is_multiple_layouts_merge)
 {
ast_type_qualifier ubo_mat_mask;
ubo_mat_mask.flags.i = 0;
@@ -186,6 +192,12 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
   return false;
}
 
+   if (is_multiple_layouts_merge && !state->has_420pack_or_es31()) {
+  _mesa_glsl_error(loc, state,
+   "duplicate layout(...) qualifiers");
+  return false;
+   }
+
if (q.flags.q.prim_type) {
   if (this->flags.q.prim_type && this->prim_type != q.prim_type) {
  _mesa_glsl_error(loc, state,
@@ -196,7 +208,8 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
}
 
if (q.flags.q.max_vertices) {
-  if (this->max_vertices && !is_single_layout_merge) {
+  if (this->max_vertices
+  && !is_single_layout_merge && !is_multiple_layouts_merge) {
  this->max_vertices->merge_qualif

[Mesa-dev] [PATCH 4/5] glsl: by default, any ast_layout_expression variable value must match its previous appearances

2016-10-06 Thread Andres Gomez

Recently, we added code to check that any appearance of the
"max_vertices" layout-qualifier-name in a program holds the same
value.

Now, we make this the default behavior for any layout-qualifier-name
represented as an ast_layout_expression since, as it happens, the same
constraint applies to all the current ones: "max_vertices",
"invocations", "vertices", "local_size_[x|y|z]" and "xfb_stride".

From page 44 (page 50 of the PDF) of the GLSL 4.00 spec:

  " If an invocation count is declared, all such declarations must
specify the same count."

From page 47 (page 53 of the PDF) of the GLSL 4.00 spec:

  " All tessellation control shader layout declarations in a program
must specify the same output patch vertex count."

From page 60 (page 66 of the PDF) of the GLSL 4.30 spec:

  " Also, if such a layout qualifier is declared more than once in the
same shader, all those declarations must set the same set of local
work-group sizes and set them to the same values; otherwise a
compile-time error results. If multiple compute shaders attached
to a single program object declare local work-group size, the
declarations must be identical; otherwise a link-time error
results."

From page 73 (page 79 of the PDF) of the GLSL 4.40 spec:

  " While xfb_stride can be declared multiple times for the same
buffer, it is a compile-time or link-time error to have different
values specified for the stride for the same buffer."
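
A minimal sketch of the "must match" rule, assuming the stored values are ordered newest first. process_values() is a hypothetical helper written for illustration; it is not the signature of process_qualifier_constant().

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: 'values' holds every appearance of one
// layout-qualifier-name, newest first. On success the effective value
// is written to *out.
inline bool process_values(const std::vector<unsigned> &values,
                           bool must_match, unsigned *out)
{
   assert(out != nullptr);
   if (values.empty())
      return false;

   for (unsigned v : values) {
      // With must_match set, any appearance that differs from the
      // effective (first) value is an error.
      if (must_match && v != values.front())
         return false;
   }

   *out = values.front();
   return true;
}
```

With must_match defaulting to true, a list such as {3, 2} is rejected unless the caller explicitly opts out.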

Fixes GL44-CTS.enhanced_layouts.xfb_duplicated_stride

Signed-off-by: Andres Gomez 
---
 src/compiler/glsl/ast.h  | 2 +-
 src/compiler/glsl/glsl_parser_extras.cpp | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ast.h b/src/compiler/glsl/ast.h
index c1453a2..804111f 100644
--- a/src/compiler/glsl/ast.h
+++ b/src/compiler/glsl/ast.h
@@ -378,7 +378,7 @@ public:
bool process_qualifier_constant(struct _mesa_glsl_parse_state *state,
const char *qual_indentifier,
unsigned *value, bool can_be_zero,
-   bool must_match = false);
+   bool must_match = true);
 
void merge_qualifier(ast_layout_expression *l_expr)
{
diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index 5f3474e..bcbcb24 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1738,7 +1738,7 @@ set_shader_inout_layout(struct gl_shader *shader,
  unsigned qual_max_vertices;
  if (state->out_qualifier->max_vertices->
process_qualifier_constant(state, "max_vertices",
-  &qual_max_vertices, true, true)) {
+  &qual_max_vertices, true)) {
 
 if (qual_max_vertices > state->Const.MaxGeometryOutputVertices) {
YYLTYPE loc = 
state->out_qualifier->max_vertices->get_location();
-- 
2.9.3


[Mesa-dev] [PATCH 1/5] glsl: last duplicated layout-qualifier-name in a layout-qualifier overrides the former

2016-10-06 Thread Andres Gomez

In a declaration, when a layout qualifier appears and holds a
duplicated layout-qualifier-name, only the last occurrence should be
taken into account.

From page 59 (page 65 of the PDF) of the GLSL 4.40 spec:

  " More than one layout qualifier may appear in a single
declaration. Additionally, the same layout-qualifier-name can
occur multiple times within a layout qualifier or across multiple
layout qualifiers in the same declaration. When the same
layout-qualifier-name occurs multiple times, in a single
declaration, the last occurrence overrides the former
occurrence(s)."

Consider this example:

  " #version 150
#extension GL_ARB_shading_language_420pack: enable
#extension GL_ARB_enhanced_layouts: enable

layout(max_vertices=2, max_vertices=3) out;
layout(max_vertices=3) out;"

Different values for the "max_vertices" layout-qualifier-name would
normally end in a compilation failure. However, since only the last
occurrence in a declaration is taken into account, this small piece
of code from a shader is valid.

Hence, when merging qualifiers in an ast_type_qualifier, we now ignore
new appearances of the same layout-qualifier-name if the
"is_single_layout_merge" parameter is on, since the GLSL parser works
in this case from right to left.
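
A minimal sketch of the branch this patch changes, following the diff below: when is_single_layout_merge is set, the stored value is replaced instead of being added to the list. The Qualifier struct is a hypothetical simplification that models only max_vertices with a plain vector.

```cpp
#include <cassert>
#include <vector>

// Hypothetical simplification of ast_type_qualifier.
struct Qualifier {
   std::vector<unsigned> max_vertices; // empty means "not set"

   void merge(const Qualifier &q, bool is_single_layout_merge)
   {
      if (q.max_vertices.empty())
         return;

      if (!max_vertices.empty() && !is_single_layout_merge) {
         // Different declarations: keep every appearance so the values
         // can be compared later.
         max_vertices.insert(max_vertices.end(), q.max_vertices.begin(),
                             q.max_vertices.end());
      } else {
         // Same layout qualifier: only one value survives.
         max_vertices = q.max_vertices;
      }
   }

   unsigned last() const
   {
      assert(!max_vertices.empty());
      return max_vertices.back();
   }
};
```

Merging the values 2 and 3 with the flag set leaves a single entry, so a later "must match" check never sees the overridden 2.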

Signed-off-by: Andres Gomez 
---
 src/compiler/glsl/ast_type.cpp | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/compiler/glsl/ast_type.cpp b/src/compiler/glsl/ast_type.cpp
index b586f94..504b533 100644
--- a/src/compiler/glsl/ast_type.cpp
+++ b/src/compiler/glsl/ast_type.cpp
@@ -196,7 +196,7 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
}
 
if (q.flags.q.max_vertices) {
-  if (this->max_vertices) {
+  if (this->max_vertices && !is_single_layout_merge) {
  this->max_vertices->merge_qualifier(q.max_vertices);
   } else {
  this->max_vertices = q.max_vertices;
@@ -213,7 +213,7 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
}
 
if (q.flags.q.invocations) {
-  if (this->invocations) {
+  if (this->invocations && !is_single_layout_merge) {
  this->invocations->merge_qualifier(q.invocations);
   } else {
  this->invocations = q.invocations;
@@ -262,7 +262,8 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
  unsigned buff_idx;
  if (process_qualifier_constant(state, loc, "xfb_buffer",
 this->xfb_buffer, &buff_idx)) {
-if (state->out_qualifier->out_xfb_stride[buff_idx]) {
+if (state->out_qualifier->out_xfb_stride[buff_idx]
+&& !is_single_layout_merge) {
state->out_qualifier->out_xfb_stride[buff_idx]->merge_qualifier(
   new(state) ast_layout_expression(*loc, this->xfb_stride));
 } else {
@@ -274,7 +275,7 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
}
 
if (q.flags.q.vertices) {
-  if (this->vertices) {
+  if (this->vertices && !is_single_layout_merge) {
  this->vertices->merge_qualifier(q.vertices);
   } else {
  this->vertices = q.vertices;
@@ -312,7 +313,7 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
 
for (int i = 0; i < 3; i++) {
   if (q.flags.q.local_size & (1 << i)) {
- if (this->local_size[i]) {
+ if (this->local_size[i] && !is_single_layout_merge) {
 this->local_size[i]->merge_qualifier(q.local_size[i]);
  } else {
 this->local_size[i] = q.local_size[i];
-- 
2.9.3


[Mesa-dev] [PATCH 0/5] deal with multiple appearances of the same layout-qualifier-name in a single declaration

2016-10-06 Thread Andres Gomez

For layout-qualifier-names that can appear multiple times in
different declarations of the same shader, or even the same program,
we use the ast_layout_expression class. It holds a list of all the
appearances so that we can later check which value overrides the
others and whether it matches previous appearances of the same
layout-qualifier-name.

Until now, we were also holding inside the ast_layout_expression
values of the same layout-qualifier-name that could appear inside a
single layout-qualifier or across multiple layout-qualifiers in a
single declaration.

This was a problem since, inside a declaration, only the last
appearance should be taken into account. Because we were not doing
this, compilation or linking failed when a single declaration gave
different values to a layout-qualifier-name that is required to hold
the same value across the same shader or program.

Now, we only hold the last appearance of a repeated
layout-qualifier-name inside a single declaration.

The following two examples illustrate the problem:

- " #version 150
#extension GL_ARB_shading_language_420pack: enable
#extension GL_ARB_enhanced_layouts: enable

layout(max_vertices=2, max_vertices=3) out;
layout(max_vertices=3) out;"

- " #version 150
#extension GL_ARB_shading_language_420pack: enable
#extension GL_ARB_enhanced_layouts: enable

layout(max_vertices=2) layout(max_vertices=3) out;
layout(max_vertices=3) out;"

Different values for the "max_vertices" layout-qualifier-name would
normally end in a compilation failure. However, since only the last
occurrence in a declaration is taken into account, these two small
pieces of code from a shader are valid.

Fixes:
- GL44-CTS.shading_language_420pack.qualifier_override_layout
- GL44-CTS.enhanced_layouts.xfb_duplicated_stride

Andres Gomez (5):
  glsl: last duplicated layout-qualifier-name in a layout-qualifier
overrides the former
  glsl: last duplicated layout-qualifier-name in multiple
layout-qualifiers overrides the former
  glsl: push layout-qualifier-name values from variable declarations to
global
  glsl: by default, any ast_layout_expression variable value must match
its previous appearances
  glsl: simplified ast_type_qualifier::merge_[in|out]_qualifier API

 src/compiler/glsl/ast.h  |  17 +++-
 src/compiler/glsl/ast_type.cpp   |  95 +++
 src/compiler/glsl/glsl_parser.yy | 156 +++
 src/compiler/glsl/glsl_parser_extras.cpp |   2 +-
 4 files changed, 144 insertions(+), 126 deletions(-)

-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/5] glsl: push layout-qualifier-name values from variable declarations to global

2016-10-06 Thread Andres Gomez

After the previous modifications in the merging of the
layout-qualifier-name values, we no longer push the final value in a
declaration to the global values.

This regression happens because we don't call for merging on the
right-most layout qualifier of a declaration which is also the
overriding one in case of multiple appearances.

Now, we add a new method to push these values to the global ones and
we call for this just after all the layout-qualifier collapsing has
happened in a declaration.

This simplifies the previous behavior in two ways: pushing to the
global values is now clearly separated from the merging call, in which
it used to be mixed, and it only runs once all the processing of
layout-qualifiers in a declaration has finished.

Signed-off-by: Andres Gomez 
---
 src/compiler/glsl/ast.h  |  6 +
 src/compiler/glsl/ast_type.cpp   | 48 ++--
 src/compiler/glsl/glsl_parser.yy | 27 ++
 3 files changed, 59 insertions(+), 22 deletions(-)

diff --git a/src/compiler/glsl/ast.h b/src/compiler/glsl/ast.h
index 73c73b7..c1453a2 100644
--- a/src/compiler/glsl/ast.h
+++ b/src/compiler/glsl/ast.h
@@ -759,6 +759,12 @@ struct ast_type_qualifier {
const ast_type_qualifier &q,
ast_node* &node, bool create_node);
 
+   /**
+* Push pending layout qualifiers to the global values.
+*/
+   bool push_to_global(YYLTYPE *loc,
+   _mesa_glsl_parse_state *state);
+
bool validate_flags(YYLTYPE *loc,
_mesa_glsl_parse_state *state,
const ast_type_qualifier &allowed_flags,
diff --git a/src/compiler/glsl/ast_type.cpp b/src/compiler/glsl/ast_type.cpp
index f02f71b..4dbec59 100644
--- a/src/compiler/glsl/ast_type.cpp
+++ b/src/compiler/glsl/ast_type.cpp
@@ -262,29 +262,10 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
  }
   }
 
-  if (q.flags.q.explicit_xfb_stride)
+  if (q.flags.q.explicit_xfb_stride) {
+ this->flags.q.xfb_stride = 1;
+ this->flags.q.explicit_xfb_stride = 1;
  this->xfb_stride = q.xfb_stride;
-
-  /* Merge all we xfb_stride qualifiers into the global out */
-  if (q.flags.q.explicit_xfb_stride || this->flags.q.xfb_stride) {
-
- /* Set xfb_stride flag to 0 to avoid adding duplicates every time
-  * there is a merge.
-  */
- this->flags.q.xfb_stride = 0;
-
- unsigned buff_idx;
- if (process_qualifier_constant(state, loc, "xfb_buffer",
-this->xfb_buffer, &buff_idx)) {
-if (state->out_qualifier->out_xfb_stride[buff_idx]
-&& !is_single_layout_merge && !is_multiple_layouts_merge) {
-   state->out_qualifier->out_xfb_stride[buff_idx]->merge_qualifier(
-  new(state) ast_layout_expression(*loc, this->xfb_stride));
-} else {
-   state->out_qualifier->out_xfb_stride[buff_idx] =
-  new(state) ast_layout_expression(*loc, this->xfb_stride);
-}
- }
   }
}
 
@@ -601,6 +582,29 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,
return true;
 }
 
+bool
+ast_type_qualifier::push_to_global(YYLTYPE *loc,
+   _mesa_glsl_parse_state *state)
+{
+   if (this->flags.q.xfb_stride) {
+  this->flags.q.xfb_stride = 0;
+
+  unsigned buff_idx;
+  if (process_qualifier_constant(state, loc, "xfb_buffer",
+ this->xfb_buffer, &buff_idx)) {
+ if (state->out_qualifier->out_xfb_stride[buff_idx]) {
+state->out_qualifier->out_xfb_stride[buff_idx]->merge_qualifier(
+   new(state) ast_layout_expression(*loc, this->xfb_stride));
+ } else {
+state->out_qualifier->out_xfb_stride[buff_idx] =
+   new(state) ast_layout_expression(*loc, this->xfb_stride);
+ }
+  }
+   }
+
+   return true;
+}
+
 /**
  * Check if the current type qualifier has any illegal flags.
  *
diff --git a/src/compiler/glsl/glsl_parser.yy b/src/compiler/glsl/glsl_parser.yy
index f4bf8fc..225d58b 100644
--- a/src/compiler/glsl/glsl_parser.yy
+++ b/src/compiler/glsl/glsl_parser.yy
@@ -838,6 +838,10 @@ declaration:
}
| interface_block
{
+  ast_interface_block *block = (ast_interface_block *) $1;
+  if (!block->layout.push_to_global(& @1, state)) {
+ YYERROR;
+  }
   $$ = $1;
}
;
@@ -913,6 +917,9 @@ parameter_declaration:
{
   $$ = $2;
   $$->type->qualifier = $1;
+  if (!$$->type->qualifier.push_to_global(& @1, state)) {
+ YYERROR;
+  }
}
| parameter_qualifier parameter_type_specifier
{
@@ -922,6 +929,9 @@ parameter_declaration:
   $$->type = new(ctx) ast_fully_specified_type();
   $$->type->set_location_range(@1, @2);
   $$->type->qualifier = $1;
+

[Mesa-dev] [Bug 98128] nir/tests/control_flow_tests.cpp:79:73: error: ‘nir_loop_first_cf_node’ was not declared in this scope

2016-10-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=98128

Jason Ekstrand  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #1 from Jason Ekstrand  ---
Sorry about that.  It's fixed now:

commit 325b3fd668369e2ed0af937843e80e750d0b91ed
Author: Jason Ekstrand 
Date:   Thu Oct 6 15:46:22 2016 -0700

nir: Fix the control flow tests for nir_loop_first_block changes

Commit 2ed17d46de045404042f13c6591895a1cf31b167 changed
nir_loop_first_cf_node and friends to return a nir_block instead of a
nir_cf_node.  This broke one of the NIR control flow tests.

Signed-off-by: Jason Ekstrand 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98128

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [PATCH] gallium: Fix install-gallium-links.mk on non-bash /bin/sh

2016-10-06 Thread Eric Anholt

Debian uses dash by default, which doesn't do '+='.  Fixes servo's
osmesa-based headless testing system, which was looking for libOSMesa in
the lib/ directory.
---
 install-gallium-links.mk | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/install-gallium-links.mk b/install-gallium-links.mk
index ac5a499c48fb..fc2f75db5e83 100644
--- a/install-gallium-links.mk
+++ b/install-gallium-links.mk
@@ -13,8 +13,8 @@ all-local : .install-gallium-links
fi; \
$(MKDIR_P) $$link_dir;  \
file_list="$(dri_LTLIBRARIES:%.la=.libs/%.so)"; \
-   file_list+="$(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)";   \
-   file_list+="$(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)";   \
+   file_list="$$file_list$(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)"; \
+   file_list="$$file_list$(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)"; \
for f in $$file_list; do\
if test -h .libs/$$f; then  \
cp -d $$f $$link_dir;   \
-- 
2.9.3

Re: [Mesa-dev] [PATCH 20/75] st/nine: Initial ProcessVertices support

2016-10-06 Thread Axel Davy


On 06/10/2016 22:28, Ilia Mirkin wrote:

On Thu, Oct 6, 2016 at 4:24 PM, Axel Davy  wrote:

On 05/10/2016 22:08, Axel Davy wrote:

   HRESULT NINE_WINAPI
   NineDevice9_ProcessVertices( struct NineDevice9 *This,
UINT SrcStartIndex,
@@ -3174,33 +3188,69 @@ NineDevice9_ProcessVertices( struct NineDevice9
*This,
IDirect3DVertexDeclaration9 *pVertexDecl,
DWORD Flags )
   {
-struct pipe_screen *screen = This->screen;
+struct pipe_screen *screen_sw = This->screen_sw;
+struct pipe_context *pipe_sw = This->pipe_sw;
   struct NineVertexDeclaration9 *vdecl =
NineVertexDeclaration9(pVertexDecl);
+struct NineVertexBuffer9 *dst = NineVertexBuffer9(pDestBuffer);
   struct NineVertexShader9 *vs;
   struct pipe_resource *resource;
+struct pipe_transfer *transfer = NULL;
+struct pipe_stream_output_info so;
   struct pipe_stream_output_target *target;
   struct pipe_draw_info draw;
+struct pipe_box box;
+unsigned offsets[1] = {0};
   HRESULT hr;
-unsigned buffer_offset, buffer_size;
+unsigned buffer_size;
+void *map;
 DBG("This=%p SrcStartIndex=%u DestIndex=%u VertexCount=%u "
   "pDestBuffer=%p pVertexDecl=%p Flags=%d\n",
   This, SrcStartIndex, DestIndex, VertexCount, pDestBuffer,
   pVertexDecl, Flags);
   -if (!screen->get_param(screen, PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS))
-STUB(D3DERR_INVALIDCALL);
+if (!screen_sw->get_param(screen_sw,
PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS)) {
+DBG("ProcessVertices not supported\n");
+return D3DERR_INVALIDCALL;
+}
   -nine_update_state(This);
   -/* TODO: Create shader with stream output. */
-STUB(D3DERR_INVALIDCALL);
-struct NineVertexBuffer9 *dst = NineVertexBuffer9(pDestBuffer);
+vs = This->state.programmable_vs ? This->state.vs : This->ff.vs;
+/* Note: version is 0 for ff */
+user_assert(vdecl || (vs->byte_code.version < 0x30 && dst->desc.FVF),
+D3DERR_INVALIDCALL);
+if (!vdecl) {
+DWORD FVF = dst->desc.FVF;
+vdecl = util_hash_table_get(This->ff.ht_fvf, &FVF);
+if (!vdecl) {
+hr = NineVertexDeclaration9_new_from_fvf(This, FVF, &vdecl);
+if (FAILED(hr))
+return hr;
+vdecl->fvf = FVF;
+util_hash_table_set(This->ff.ht_fvf, &vdecl->fvf, vdecl);
+NineUnknown_ConvertRefToBind(NineUnknown(vdecl));
+}
+}
   -vs = This->state.vs ? This->state.vs : This->ff.vs;
+/* Flags: Can be 0 or D3DPV_DONOTCOPYDATA, and/or lock flags
+ * D3DPV_DONOTCOPYDATA -> Has effect only for ff. In particular
+ * if not set, everything from src will be used, and dst
+ * must match exactly the ff vs outputs.
+ * TODO: Handle all the checks, etc for ff */
+user_assert(vdecl->position_t || This->state.programmable_vs,
+D3DERR_INVALIDCALL);
+
+/* TODO: Support vs < 3 and ff */
+user_assert(vs->byte_code.version == 0x30,
+D3DERR_INVALIDCALL);
+/* TODO: Not hardcode the constant buffers for swvp */
+user_assert(This->may_swvp,
+D3DERR_INVALIDCALL);
+
+nine_state_prepare_draw_sw(This, vdecl, SrcStartIndex, VertexCount,
&so);
   -buffer_size = VertexCount * vs->so->stride[0];
-if (1) {
+buffer_size = VertexCount * so.stride[0] * 4;
+{
   struct pipe_resource templ;
 templ.target = PIPE_BUFFER;
@@ -3212,49 +3262,50 @@ NineDevice9_ProcessVertices( struct NineDevice9
*This,
   templ.height0 = templ.depth0 = templ.array_size = 1;
   templ.last_level = templ.nr_samples = 0;
   -resource = This->screen->resource_create(This->screen, &templ);
+resource = screen_sw->resource_create(screen_sw, &templ);
   if (!resource)
   return E_OUTOFMEMORY;
-buffer_offset = 0;
-} else {
-/* SO matches vertex declaration */
-resource = NineVertexBuffer9_GetResource(dst);
-buffer_offset = DestIndex * vs->so->stride[0];
   }
-target = This->pipe->create_stream_output_target(This->pipe,
resource,
- buffer_offset,
- buffer_size);
+target = pipe_sw->create_stream_output_target(pipe_sw, resource,
+  0, buffer_size);
   if (!target) {
   pipe_resource_reference(&resource, NULL);
   return D3DERR_DRIVERINTERNALERROR;
   }
   -if (!vdecl) {
-hr = NineVertexDeclaration9_new_from_fvf(This, dst->desc.FVF,
&vdecl);
-if (FAILED(hr))
-goto out;
-}
-
   init_draw_info(&draw, This, D3DPT_POINTLIST, VertexCount);
   draw.instance_count = 1;
   draw.indexed = FALSE;
-draw.start = SrcStartIndex;
+draw.start = 0;
   draw.ind

Re: [Mesa-dev] [PATCH v3 5/5] intel: aubinator: enable loading dumps from standard input

2016-10-06 Thread Gandikota, Sirisha

>-Original Message-
>From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf Of
>Lionel Landwerlin
>Sent: Wednesday, October 05, 2016 8:56 AM
>To: mesa-dev@lists.freedesktop.org
>Cc: Landwerlin, Lionel G 
>Subject: [Mesa-dev] [PATCH v3 5/5] intel: aubinator: enable loading dumps from
>standard input
>
>In conjunction with an intel_aubdump change, you can now look at your
>application's output like this:
>
>$ intel_aubdump -c '/path/to/aubinator --gen=hsw' my_gl_app
>
>v2: Add print_help() comment about standard input handling (Eero)
>Remove shrinked gtt space debug workaround (Eero)
>
>v3: Use realloc rather than memcpy/free (Ben)
>
>Signed-off-by: Lionel Landwerlin 
>---
> src/intel/tools/aubinator.c | 165 ++---
>---
> 1 file changed, 129 insertions(+), 36 deletions(-)
>
>diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c index
>7f6655a..d716a65 100644
>--- a/src/intel/tools/aubinator.c
>+++ b/src/intel/tools/aubinator.c
>@@ -835,48 +835,51 @@ handle_trace_block(struct gen_spec *spec, uint32_t
>*p)  }
>
> struct aub_file {
>-   char *filename;
>-   int fd;
>-   struct stat sb;
>+   FILE *stream;
>+
>uint32_t *map, *end, *cursor;
>+   uint32_t *mem_end;
> };
>
> static struct aub_file *
> aub_file_open(const char *filename)
> {
>struct aub_file *file;
>+   struct stat sb;
>+   int fd;
>
>-   file = malloc(sizeof *file);
>-   file->filename = strdup(filename);
>-   file->fd = open(file->filename, O_RDONLY);
>-   if (file->fd == -1) {
>-  fprintf(stderr, "open %s failed: %s\n", file->filename, 
>strerror(errno));
>+   file = calloc(1, sizeof *file);
>+   fd = open(filename, O_RDONLY);
>+   if (fd == -1) {
>+  fprintf(stderr, "open %s failed: %s\n", filename,
>+ strerror(errno));
>   exit(EXIT_FAILURE);
>}
>
>-   if (fstat(file->fd, &file->sb) == -1) {
>+   if (fstat(fd, &sb) == -1) {
>   fprintf(stderr, "stat failed: %s\n", strerror(errno));
>   exit(EXIT_FAILURE);
>}
>
>-   file->map = mmap(NULL, file->sb.st_size,
>-PROT_READ, MAP_SHARED, file->fd, 0);
>+   file->map = mmap(NULL, sb.st_size,
>+PROT_READ, MAP_SHARED, fd, 0);
>if (file->map == MAP_FAILED) {
>   fprintf(stderr, "mmap failed: %s\n", strerror(errno));
>   exit(EXIT_FAILURE);
>}
>
>file->cursor = file->map;
>-   file->end = file->map + file->sb.st_size / 4;
>+   file->end = file->map + sb.st_size / 4;
>
>-   /* mmap a terabyte for our gtt space. */
>-   gtt_size = 1ul << 40;
>-   gtt = mmap(NULL, gtt_size, PROT_READ | PROT_WRITE,
>-  MAP_PRIVATE | MAP_ANONYMOUS |  MAP_NORESERVE, -1, 0);
>-   if (gtt == MAP_FAILED) {
>-  fprintf(stderr, "failed to alloc gtt space: %s\n", strerror(errno));
>-  exit(1);
>-   }
>+   return file;
>+}
>+
>+static struct aub_file *
>+aub_file_stdin(void)
>+{
>+   struct aub_file *file;
>+
>+   file = calloc(1, sizeof *file);
>+   file->stream = stdin;
>
>return file;
> }
>@@ -926,12 +929,21 @@ struct {
>{ "bxt", MAKE_GEN(9, 0) }
> };
>
>-static void
>+enum {
>+   AUB_ITEM_DECODE_OK,
>+   AUB_ITEM_DECODE_FAILED,
>+   AUB_ITEM_DECODE_NEED_MORE_DATA,
>+};
>+
>+static int
> aub_file_decode_batch(struct aub_file *file, struct gen_spec *spec)  {
>-   uint32_t *p, h, device, data_type;
>+   uint32_t *p, h, device, data_type, *new_cursor;
>int header_length, payload_size, bias;
>
>+   if (file->end - file->cursor < 12)
>+  return AUB_ITEM_DECODE_NEED_MORE_DATA;
>+
>p = file->cursor;
>h = *p;
>header_length = h & 0x;
>@@ -947,8 +959,7 @@ aub_file_decode_batch(struct aub_file *file, struct
>gen_spec *spec)
>   printf("unknown opcode %d at %td/%td\n",
>  OPCODE(h), file->cursor - file->map,
>  file->end - file->map);
>-  file->cursor = file->end;
>-  return;
>+  return AUB_ITEM_DECODE_FAILED;
>}
>
>payload_size = 0;
>@@ -960,9 +971,22 @@ aub_file_decode_batch(struct aub_file *file, struct
>gen_spec *spec)
>   payload_size = p[4];
>   handle_trace_block(spec, p);
>   break;
>-   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_BMP):
>+   default:
>   break;
>+   }
>
>+   new_cursor = p + header_length + bias + payload_size / 4;
>+   if (new_cursor > file->end)
>+  return AUB_ITEM_DECODE_NEED_MORE_DATA;
>+
>+   switch (h & 0x) {
>+   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_HEADER):
>+  break;
>+   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_BLOCK):
>+  handle_trace_block(spec, p);
>+  break;
>+   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_BMP):
>+  break;
>case MAKE_HEADER(TYPE_AUB, OPCODE_NEW_AUB, SUBOPCODE_VERSION):
>   printf("version block: dw1 %08x\n", p[1]);
>   device = (p[1] >> 8) & 0xff;
>@@ -988,13 +1012,57 @@ aub_file_decode_batch(struct aub_file *file, struct
>gen_spec *spec)
>  "subopcode=0x%x (%08x)\n", TYPE(h), OPCODE(h), SUBOPCODE(h), h);
>   break;
>
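
The quoted patch makes the decoder report when it needs more data
instead of assuming the whole dump is already mapped, and the v3 note
mentions growing the buffer with realloc. A minimal sketch of those
two ideas follows; it is not the aubinator code. enum decode_status,
try_consume(), grow_buffer() and the 1024-dword starting capacity are
all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical status codes, loosely mirroring the AUB_ITEM_DECODE_*
 * values in the quoted patch. */
enum decode_status {
   DECODE_OK,
   DECODE_NEED_MORE_DATA,
};

/* Consume one item of `item_dwords` dwords if it is fully buffered.
 * On success, `*next` points just past the item. */
static enum decode_status
try_consume(const uint32_t *cursor, const uint32_t *end,
            size_t item_dwords, const uint32_t **next)
{
   assert(cursor <= end);
   if ((size_t)(end - cursor) < item_dwords)
      return DECODE_NEED_MORE_DATA;

   *next = cursor + item_dwords;
   return DECODE_OK;
}

/* Grow `buf` so it can hold at least `needed_dwords`, doubling the
 * capacity each step. Returns NULL on allocation failure, in which
 * case the caller still owns the old buffer. */
static uint32_t *
grow_buffer(uint32_t *buf, size_t *capacity_dwords, size_t needed_dwords)
{
   size_t new_cap = *capacity_dwords ? *capacity_dwords : 1024;

   while (new_cap < needed_dwords)
      new_cap *= 2;
   if (new_cap == *capacity_dwords)
      return buf;   /* already large enough */

   uint32_t *grown = realloc(buf, new_cap * sizeof *grown);
   if (grown == NULL)
      return NULL;

   *capacity_dwords = new_cap;
   return grown;
}
```

A reader loop built on this would call grow_buffer() and read more
input whenever try_consume() returns DECODE_NEED_MORE_DATA, then retry
from the same cursor.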

Re: [Mesa-dev] [PATCH 4/5] intel: aubinator: enable loading xml files from a given directory

2016-10-06 Thread Gandikota, Sirisha

>-Original Message-
>From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf Of
>Lionel Landwerlin
>Sent: Wednesday, October 05, 2016 8:55 AM
>To: mesa-dev@lists.freedesktop.org
>Cc: Landwerlin, Lionel G 
>Subject: [Mesa-dev] [PATCH 4/5] intel: aubinator: enable loading xml files 
>from a
>given directory
>
>This might be useful for people who debug with out of tree descriptions.
>
>Signed-off-by: Lionel Landwerlin 
>---
> src/intel/tools/aubinator.c | 18 ++---
> src/intel/tools/decoder.c   | 64
>+
> src/intel/tools/decoder.h   |  2 ++
> 3 files changed, 81 insertions(+), 3 deletions(-)
>
>diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c index
>8be7580..7f6655a 100644
>--- a/src/intel/tools/aubinator.c
>+++ b/src/intel/tools/aubinator.c
>@@ -1037,7 +1037,8 @@ print_help(const char *progname, FILE *file)
>"  --color[=WHEN]  colorize the output; WHEN can be 'auto' 
> (default\n"
>"if omitted), 'always', or 'never'\n"
>"  --no-pager  don't launch pager\n"
>-   "  --no-offsetsdon't print instruction offsets\n",
>+   "  --no-offsetsdon't print instruction offsets\n"
>+   "  --xml=DIR   load hardware xml description from 
>directory DIR\n",
>progname);
> }
>
>@@ -1047,7 +1048,7 @@ int main(int argc, char *argv[])
>struct aub_file *file;
>int c, i;
>bool help = false, pager = true;
>-   const char *input_file = NULL;
>+   char *input_file = NULL, *xml_path = NULL;
>char gen_val[24];
>const struct {
>   const char *name;
>@@ -1069,6 +1070,7 @@ int main(int argc, char *argv[])
>   { "gen",required_argument, NULL,  'g' },
>   { "headers",no_argument,   (int *) &option_full_decode,   false 
> },
>   { "color",  required_argument, NULL,  'c' },
>+  { "xml",required_argument, NULL,  'x' },
>   { NULL, 0, NULL,  0 }
>};
>struct gen_device_info devinfo;
>@@ -1091,6 +1093,9 @@ int main(int argc, char *argv[])
> exit(EXIT_FAILURE);
>  }
>  break;
>+  case 'x':
>+ xml_path = strdup(optarg);
>+ break;
>   default:
>  break;
>   }
>@@ -1131,9 +1136,15 @@ int main(int argc, char *argv[])
>if (isatty(1) && pager)
>   setup_pager();
>
>-   spec = gen_spec_load(&devinfo);
>+   if (xml_path == NULL)
>+  spec = gen_spec_load(&devinfo);
>+   else
>+  spec = gen_spec_load_from_path(&devinfo, xml_path);
>disasm = gen_disasm_create(gen->pci_id);
>
>+   if (spec == NULL || disasm == NULL)
>+  exit(EXIT_FAILURE);
>+
>if (input_file == NULL) {
>print_help(input_file, stderr);
>exit(EXIT_FAILURE);
>@@ -1147,6 +1158,7 @@ int main(int argc, char *argv[])
>fflush(stdout);
>/* close the stdout which is opened to write the output */
>close(1);
>+   free(xml_path);
>
>wait(NULL);
>
>diff --git a/src/intel/tools/decoder.c b/src/intel/tools/decoder.c index
>76c237c..d294040 100644
>--- a/src/intel/tools/decoder.c
>+++ b/src/intel/tools/decoder.c
>@@ -482,6 +482,70 @@ gen_spec_load(const struct gen_device_info *devinfo)
>return ctx.spec;
> }
>
>+struct gen_spec *
>+gen_spec_load_from_path(const struct gen_device_info *devinfo,
>+const char *path) {
>+   struct parser_context ctx;
>+   size_t len, filename_len = strlen(path) + 20;
>+   char *filename = malloc(filename_len);
>+   void *buf;
>+   FILE *input;
>+
>+   len = snprintf(filename, filename_len, "%s/gen%i.xml",
>+  path, devinfo_to_gen(devinfo));
>+   assert(len < filename_len);
>+
>+   input = fopen(filename, "r");
>+   if (input == NULL) {
>+  fprintf(stderr, "failed to open xml description\n");
>+  free(filename);
>+  return NULL;
>+   }
>+
>+   memset(&ctx, 0, sizeof ctx);
>+   ctx.parser = XML_ParserCreate(NULL);
>+   XML_SetUserData(ctx.parser, &ctx);
>+   if (ctx.parser == NULL) {
>+  fprintf(stderr, "failed to create parser\n");
>+  free(filename);
>+  return NULL;
>+   }
>+
>+   XML_SetElementHandler(ctx.parser, start_element, end_element);
>+   XML_SetCharacterDataHandler(ctx.parser, character_data);
>+   ctx.loc.filename = filename;
>+   ctx.spec = xzalloc(sizeof(*ctx.spec));
>+
>+   do {
>+  buf = XML_GetBuffer(ctx.parser, XML_BUFFER_SIZE);
>+  len = fread(buf, 1, XML_BUFFER_SIZE, input);
>+  if (len < 0) {
>+ fprintf(stderr, "fread: %m\n");
>+ fclose(input);
>+ free(filename);
>+ return NULL;
>+  }
>+  if (XML_ParseBuffer(ctx.parser, len, len == 0) == 0) {
>+ fprintf(stderr,
>+ "Error parsing XML at line %ld col %ld: %s\n",
>+ XML_GetCurrentLineNumber(ctx.parser),
>+

[Mesa-dev] [PATCH v3 01/10] isl: Correct a comment in the isl_format enum

2016-10-06 Thread Nanley Chery

HiZ is not a color surface, but an auxiliary depth surface.

Signed-off-by: Nanley Chery 
Reviewed-by: Chad Versace 
Reviewed-by: Jason Ekstrand 
---
 src/intel/isl/isl.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index 29fb3d0..967bcb2 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -358,7 +358,7 @@ enum isl_format {
 * actual hardware formats *must* come before these in the list.
 */
 
-   /* Formats for color compression surfaces */
+   /* Formats for auxiliary surfaces */
ISL_FORMAT_HIZ,
ISL_FORMAT_MCS_2X,
ISL_FORMAT_MCS_4X,
-- 
2.10.0

[Mesa-dev] [PATCH v3 08/10] anv/cmd_buffer: Enable rendering to HiZ

2016-10-06 Thread Nanley Chery

From: Chad Versace 

Nanley Chery:
(rebase)
 - Resolve conflicts with new anv_batch_emit macro
(amend)
 - Handle a QPitch TODO
 - Emit 3DSTATE_HIER_DEPTH_BUFFER on pre-BDW systems
 - Only use HiZ for single-subpass renderpasses
 - Emit the HiZ instruction before the stencil instruction to follow the
   optimized clear sequence specified in the PRMs
 - Don't modify clear params
 - Enable resolves when a HiZ buffer is used to ensure depth buffer validity

Provides an FPS increase of ~15% on the Sascha triangle and multisampling
demos.

Signed-off-by: Nanley Chery 
Reviewed-by: Chad Versace 
Reviewed-by: Jason Ekstrand 

---
v3. Replace FIXME with FINISHME (Chad)
Check the HiZ dimension when determining the QPitch

 src/intel/vulkan/gen8_cmd_buffer.c |  4 
 src/intel/vulkan/genX_cmd_buffer.c | 40 ++
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/src/intel/vulkan/gen8_cmd_buffer.c 
b/src/intel/vulkan/gen8_cmd_buffer.c
index e50f1a5..e6a3c3d 100644
--- a/src/intel/vulkan/gen8_cmd_buffer.c
+++ b/src/intel/vulkan/gen8_cmd_buffer.c
@@ -417,6 +417,10 @@ genX(cmd_buffer_emit_hz_op)(struct anv_cmd_buffer 
*cmd_buffer,
if (iview == NULL || !anv_image_has_hiz(iview->image))
   return;
 
+   /* FINISHME: Implement multi-subpass HiZ */
+   if (cmd_buffer->state.pass->subpass_count > 1)
+  return;
+
const uint32_t ds = cmd_state->subpass->depth_stencil_attachment;
 
/* Section 7.4. of the Vulkan 1.0.27 spec states:
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 9466601..ed6a109 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -1199,6 +1199,7 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer 
*cmd_buffer)
   anv_cmd_buffer_get_depth_stencil_view(cmd_buffer);
const struct anv_image *image = iview ? iview->image : NULL;
const bool has_depth = image && (image->aspects & 
VK_IMAGE_ASPECT_DEPTH_BIT);
+   const bool has_hiz = image != NULL && anv_image_has_hiz(image);
const bool has_stencil =
   image && (image->aspects & VK_IMAGE_ASPECT_STENCIL_BIT);
 
@@ -1211,7 +1212,12 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer 
*cmd_buffer)
  db.SurfaceType   = SURFTYPE_2D;
  db.DepthWriteEnable  = true;
  db.StencilWriteEnable= has_stencil;
- db.HierarchicalDepthBufferEnable = false;
+
+ if (cmd_buffer->state.pass->subpass_count == 1) {
+db.HierarchicalDepthBufferEnable = has_hiz;
+ } else {
+anv_finishme("Multiple-subpass HiZ not implemented");
+ }
 
  db.SurfaceFormat = isl_surf_get_depth_format(&device->isl_dev,
   
&image->depth_surface.isl);
@@ -1263,6 +1269,33 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer 
*cmd_buffer)
   }
}
 
+   if (has_hiz) {
+  anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_HIER_DEPTH_BUFFER), hdb) 
{
+ hdb.HierarchicalDepthBufferObjectControlState = GENX(MOCS);
+ hdb.SurfacePitch = image->hiz_surface.isl.row_pitch - 1;
+ hdb.SurfaceBaseAddress = (struct anv_address) {
+.bo = image->bo,
+.offset = image->offset + image->hiz_surface.offset,
+ };
+#if GEN_GEN >= 8
+ /* From the SKL PRM Vol2a:
+  *
+  *The interpretation of this field is dependent on Surface Type
+  *as follows:
+  *- SURFTYPE_1D: distance in pixels between array slices
+  *- SURFTYPE_2D/CUBE: distance in rows between array slices
+  *- SURFTYPE_3D: distance in rows between R - slices
+  */
+ hdb.SurfaceQPitch =
+image->hiz_surface.isl.dim == ISL_SURF_DIM_1D ?
+   isl_surf_get_array_pitch_el(&image->hiz_surface.isl) >> 2 :
+   isl_surf_get_array_pitch_el_rows(&image->hiz_surface.isl) >> 2;
+#endif
+  }
+   } else {
+  anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_HIER_DEPTH_BUFFER), hdb);
+   }
+
/* Emit 3DSTATE_STENCIL_BUFFER */
if (has_stencil) {
   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), sb) {
@@ -1285,9 +1318,6 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer 
*cmd_buffer)
   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), sb);
}
 
-   /* Disable hierarchial depth buffers. */
-   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_HIER_DEPTH_BUFFER), hz);
-
/* Clear the clear params. */
anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp);
 }
@@ -1301,6 +1331,7 @@ genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer 
*cmd_buffer,
cmd_buffer->state.descriptors_dirty |= VK_SHADER_STAGE_FRAGMENT_BIT;
 
cmd_buffer_emit_depth_stencil(cmd_buffer);
+   genX(cmd_buffer_emit_hz_op)(cmd_buffer, BLORP_HIZ_OP_HIZ_RESOLVE);
 
anv_cmd_buffer_clear_subpass(cmd_buffer);
 }

[Mesa-dev] [PATCH v3 03/10] anv: Add func anv_image_has_hiz()

2016-10-06 Thread Nanley Chery

From: Chad Versace 

Signed-off-by: Nanley Chery 
Reviewed-by: Jason Ekstrand 
---
 src/intel/vulkan/anv_private.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index dfcedd1..fd886bf 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1759,6 +1759,16 @@ const struct anv_surface *
 anv_image_get_surface_for_aspect_mask(const struct anv_image *image,
   VkImageAspectFlags aspect_mask);
 
+static inline bool
+anv_image_has_hiz(const struct anv_image *image)
+{
+   /* We must check the aspect because anv_image::hiz_surface belongs to
+* a union.
+*/
+   return (image->aspects & VK_IMAGE_ASPECT_DEPTH_BIT) &&
+  image->hiz_surface.isl.size > 0;
+}
+
 void anv_image_view_init(struct anv_image_view *view,
  struct anv_device *device,
  const VkImageViewCreateInfo* pCreateInfo,
-- 
2.10.0

[Mesa-dev] [PATCH v3 10/10] anv/TODO: Update the HiZ task

2016-10-06 Thread Nanley Chery

Signed-off-by: Nanley Chery 
Reviewed-by: Jason Ekstrand 
---
 src/intel/vulkan/TODO | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/TODO b/src/intel/vulkan/TODO
index 8fac370..9ac63eb 100644
--- a/src/intel/vulkan/TODO
+++ b/src/intel/vulkan/TODO
@@ -19,7 +19,7 @@ Code sharing with GL:
  - Generalize blorp to use ISL and be sharable between the two drivers
 
 Performance:
- - HiZ (Nanley)
+ - Multi-{sampled/gen8,LOD,subpass} HiZ
  - Fast color clears (after HiZ?)
  - Compressed multisample support
  - Renderbuffer compression (SKL+)
-- 
2.10.0

[Mesa-dev] [PATCH v3 02/10] anv: Add anv_image::hiz_surface

2016-10-06 Thread Nanley Chery

From: Chad Versace 

Unused.

Signed-off-by: Nanley Chery 
Reviewed-by: Jason Ekstrand 
---
 src/intel/vulkan/anv_private.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 4fa403f..dfcedd1 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1648,6 +1648,7 @@ anv_pipeline_setup_l3_config(struct anv_pipeline 
*pipeline, bool needs_slm);
  * Subsurface of an anv_image.
  */
 struct anv_surface {
+   /** Valid only if isl_surf::size > 0. */
struct isl_surf isl;
 
/**
@@ -1694,6 +1695,7 @@ struct anv_image {
 
   struct {
  struct anv_surface depth_surface;
+ struct anv_surface hiz_surface;
  struct anv_surface stencil_surface;
   };
};
-- 
2.10.0

[Mesa-dev] [PATCH v3 06/10] anv/image: Memset hiz surfaces to 0 when binding memory

2016-10-06 Thread Nanley Chery

From: Jason Ekstrand 

Nanley Chery (amend):
 - Change memset value from 0xff to 0 (a defined value for HiZ).

Signed-off-by: Nanley Chery 
Reviewed-by: Chad Versace 
Reviewed-by: Jason Ekstrand 
---
 src/intel/vulkan/anv_image.c | 31 ++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 7dada66..f125aa6 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -317,11 +317,12 @@ anv_DestroyImage(VkDevice _device, VkImage _image,
 }
 
 VkResult anv_BindImageMemory(
-VkDevicedevice,
+VkDevice_device,
 VkImage _image,
 VkDeviceMemory  _memory,
 VkDeviceSizememoryOffset)
 {
+   ANV_FROM_HANDLE(anv_device, device, _device);
ANV_FROM_HANDLE(anv_device_memory, mem, _memory);
ANV_FROM_HANDLE(anv_image, image, _image);
 
@@ -333,6 +334,34 @@ VkResult anv_BindImageMemory(
   image->offset = 0;
}
 
+   if (anv_image_has_hiz(image)) {
+
+  /* The offset and size must be a multiple of 4K or else the
+   * anv_gem_mmap call below will return NULL.
+   */
+  assert((image->offset + image->hiz_surface.offset) % 4096 == 0);
+  assert(image->hiz_surface.isl.size % 4096 == 0);
+
+  /* HiZ surfaces need to have their memory cleared to 0 before they
+   * can be used.  If we let it have garbage data, it can cause GPU
+   * hangs on some hardware.
+   */
+  void *map = anv_gem_mmap(device, image->bo->gem_handle,
+   image->offset + image->hiz_surface.offset,
+   image->hiz_surface.isl.size,
+   device->info.has_llc ? 0 : I915_MMAP_WC);
+
+  /* If anv_gem_mmap returns NULL, it's likely that the kernel was
+   * not able to find space on the host to create a proper mapping.
+   */
+  if (map == NULL)
+ return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+
+  memset(map, 0, image->hiz_surface.isl.size);
+
+  anv_gem_munmap(map, image->hiz_surface.isl.size);
+   }
+
return VK_SUCCESS;
 }
 
-- 
2.10.0
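
The two asserts in the patch above encode the precondition that both
the mapping offset and the mapping size are multiples of 4K. The same
check can be sketched as a standalone predicate; MAP_GRANULE,
is_4k_multiple() and hiz_range_is_mappable() are hypothetical names
used only for this example, not driver functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAP_GRANULE 4096u   /* 4K, as stated in the patch comment */

static bool
is_4k_multiple(uint64_t value)
{
   return value % MAP_GRANULE == 0;
}

/* True if the HiZ sub-range of an image could be mapped on its own:
 * both its absolute offset and its size must be 4K multiples. */
static bool
hiz_range_is_mappable(uint64_t image_offset, uint64_t hiz_offset,
                      uint64_t hiz_size)
{
   assert(hiz_size > 0);
   return is_4k_multiple(image_offset + hiz_offset) &&
          is_4k_multiple(hiz_size);
}
```

Returning a boolean rather than asserting lets a caller decide whether
a misaligned range is a programming error or something to report.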

[Mesa-dev] [PATCH v3 09/10] anv: Enable fast depth clears

2016-10-06 Thread Nanley Chery

Provides an FPS increase of ~30% on the Sascha triangle and multisampling
demos.

Signed-off-by: Nanley Chery 
Reviewed-by: Jason Ekstrand  (v2)

---
v3. Emit required clear_params packet (Chad)
Share clear_params code path IVB+ (Jason)

 src/intel/vulkan/anv_pass.c| 13 +
 src/intel/vulkan/genX_cmd_buffer.c | 24 ++--
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_pass.c b/src/intel/vulkan/anv_pass.c
index 69c3c7e..595c2ea 100644
--- a/src/intel/vulkan/anv_pass.c
+++ b/src/intel/vulkan/anv_pass.c
@@ -155,5 +155,18 @@ void anv_GetRenderAreaGranularity(
 VkRenderPassrenderPass,
 VkExtent2D* pGranularity)
 {
+   ANV_FROM_HANDLE(anv_render_pass, pass, renderPass);
+
+   /* This granularity satisfies HiZ fast clear alignment requirements
+* for all sample counts.
+*/
+   for (unsigned i = 0; i < pass->subpass_count; ++i) {
+  if (pass->subpasses[i].depth_stencil_attachment !=
+  VK_ATTACHMENT_UNUSED) {
+ *pGranularity = (VkExtent2D) { .width = 8, .height = 4 };
+ return;
+  }
+   }
+
*pGranularity = (VkExtent2D) { 1, 1 };
 }
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index ed6a109..4089fc7 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -1318,8 +1318,27 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer)
   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), sb);
}
 
-   /* Clear the clear params. */
-   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp);
+   /* From the IVB PRM Vol2P1, 11.5.5.4 3DSTATE_CLEAR_PARAMS:
+*
+*3DSTATE_CLEAR_PARAMS must always be programmed in the along with
+*the other Depth/Stencil state commands(i.e. 3DSTATE_DEPTH_BUFFER,
+*3DSTATE_STENCIL_BUFFER, or 3DSTATE_HIER_DEPTH_BUFFER)
+*
+* Testing also shows that some variant of this restriction may exist HSW+.
+* On BDW+, it is not possible to emit 2 of these packets consecutively when
+* both have DepthClearValueValid set. An analysis of such state programming
+* on SKL showed that the GPU doesn't register the latter packet's clear
+* value.
+*/
+   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp) {
+  if (has_hiz) {
+ cp.DepthClearValueValid = true;
+ const uint32_t ds =
+cmd_buffer->state.subpass->depth_stencil_attachment;
+ cp.DepthClearValue =
+cmd_buffer->state.attachments[ds].clear_value.depthStencil.depth;
+  }
+   }
 }
 
 static void
@@ -1332,6 +1351,7 @@ genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
 
cmd_buffer_emit_depth_stencil(cmd_buffer);
genX(cmd_buffer_emit_hz_op)(cmd_buffer, BLORP_HIZ_OP_HIZ_RESOLVE);
+   genX(cmd_buffer_emit_hz_op)(cmd_buffer, BLORP_HIZ_OP_DEPTH_CLEAR);
 
anv_cmd_buffer_clear_subpass(cmd_buffer);
 }
-- 
2.10.0
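
For readers following the anv_GetRenderAreaGranularity hunk above, here is a minimal sketch of the selection logic in isolation. The struct and macro names (extent2d, subpass, ATTACHMENT_UNUSED) are stand-ins invented for this example, not the driver's types; only the 8x4 and 1x1 values and the "any subpass with a depth/stencil attachment" rule come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VK_ATTACHMENT_UNUSED, which Vulkan defines as (~0U). */
#define ATTACHMENT_UNUSED UINT32_MAX

/* Hypothetical, simplified versions of the driver structures. */
struct extent2d { uint32_t width, height; };
struct subpass  { uint32_t depth_stencil_attachment; };

/* Report 8x4 as soon as any subpass uses a depth/stencil attachment,
 * otherwise fall back to 1x1, mirroring the loop in the patch. */
static struct extent2d
render_area_granularity(const struct subpass *subpasses, unsigned count)
{
   assert(subpasses != NULL || count == 0);

   for (unsigned i = 0; i < count; ++i) {
      if (subpasses[i].depth_stencil_attachment != ATTACHMENT_UNUSED)
         return (struct extent2d) { .width = 8, .height = 4 };
   }

   return (struct extent2d) { .width = 1, .height = 1 };
}
```

A render pass with only color attachments keeps the 1x1 granularity; a single depth-using subpass is enough to switch the whole pass to 8x4.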


[Mesa-dev] [PATCH v3 05/10] anv: Move BindImageMemory to anv_image.c

2016-10-06 Thread Nanley Chery

From: Jason Ekstrand 

Signed-off-by: Nanley Chery 
Reviewed-by: Chad Versace 
Reviewed-by: Jason Ekstrand 
---
 src/intel/vulkan/anv_device.c | 20 
 src/intel/vulkan/anv_image.c  | 20 
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index c7b9979..9f8fa33 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1421,26 +1421,6 @@ VkResult anv_BindBufferMemory(
return VK_SUCCESS;
 }
 
-VkResult anv_BindImageMemory(
-VkDevicedevice,
-VkImage _image,
-VkDeviceMemory  _memory,
-VkDeviceSizememoryOffset)
-{
-   ANV_FROM_HANDLE(anv_device_memory, mem, _memory);
-   ANV_FROM_HANDLE(anv_image, image, _image);
-
-   if (mem) {
-  image->bo = &mem->bo;
-  image->offset = memoryOffset;
-   } else {
-  image->bo = NULL;
-  image->offset = 0;
-   }
-
-   return VK_SUCCESS;
-}
-
 VkResult anv_QueueBindSparse(
 VkQueue queue,
 uint32_tbindInfoCount,
diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 77dcd46..7dada66 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -316,6 +316,26 @@ anv_DestroyImage(VkDevice _device, VkImage _image,
anv_free2(&device->alloc, pAllocator, anv_image_from_handle(_image));
 }
 
+VkResult anv_BindImageMemory(
+VkDevicedevice,
+VkImage _image,
+VkDeviceMemory  _memory,
+VkDeviceSizememoryOffset)
+{
+   ANV_FROM_HANDLE(anv_device_memory, mem, _memory);
+   ANV_FROM_HANDLE(anv_image, image, _image);
+
+   if (mem) {
+  image->bo = &mem->bo;
+  image->offset = memoryOffset;
+   } else {
+  image->bo = NULL;
+  image->offset = 0;
+   }
+
+   return VK_SUCCESS;
+}
+
 static void
 anv_surface_get_subresource_layout(struct anv_image *image,
struct anv_surface *surface,
-- 
2.10.0


[Mesa-dev] [PATCH v3 04/10] anv: Allocate hiz surface

2016-10-06 Thread Nanley Chery

From: Chad Versace 

Nanley Chery:
(rebase)
 - Use isl_surf_get_hiz_surf()
(amend)
 - Only add a HiZ surface onto a depth/stencil attachment
 - Add comment above HiZ surface addition
 - Hide HiZ behind INTEL_VK_HIZ prior to BDW
 - Disable HiZ for untested cases
 - Remove DISABLE_AUX_BIT instead of preventing it from being added

Signed-off-by: Nanley Chery 
Reviewed-by: Jason Ekstrand  (v2)
Reviewed-by: Chad Versace  (v2)

---
v3. Avoid conflicts with future aux surfaces (Jason)

 src/intel/vulkan/anv_image.c | 37 ++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index f6e8672..77dcd46 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -28,6 +28,7 @@
 #include 
 
 #include "anv_private.h"
+#include "util/debug.h"
 
 #include "vk_format_info.h"
 
@@ -60,6 +61,7 @@ choose_isl_surf_usage(VkImageUsageFlags vk_usage,
   default:
  unreachable("bad VkImageAspect");
   case VK_IMAGE_ASPECT_DEPTH_BIT:
+ isl_usage &= ~ISL_SURF_USAGE_DISABLE_AUX_BIT;
  isl_usage |= ISL_SURF_USAGE_DEPTH_BIT;
  break;
   case VK_IMAGE_ASPECT_STENCIL_BIT:
@@ -99,6 +101,16 @@ get_surface(struct anv_image *image, VkImageAspectFlags aspect)
}
 }
 
+static void
+add_surface(struct anv_image *image, struct anv_surface *surf)
+{
+   assert(surf->isl.size > 0); /* isl surface must be initialized */
+
+   surf->offset = align_u32(image->size, surf->isl.alignment);
+   image->size = surf->offset + surf->isl.size;
+   image->alignment = MAX(image->alignment, surf->isl.alignment);
+}
+
 /**
  * Initialize the anv_image::*_surface selected by \a aspect. Then update the
  * image's memory requirements (that is, the image's size and alignment).
@@ -160,9 +172,28 @@ make_surface(const struct anv_device *dev,
 */
assert(ok);
 
-   anv_surf->offset = align_u32(image->size, anv_surf->isl.alignment);
-   image->size = anv_surf->offset + anv_surf->isl.size;
-   image->alignment = MAX(image->alignment, anv_surf->isl.alignment);
+   add_surface(image, anv_surf);
+
+   /* Add a HiZ surface to a depth buffer that will be used for rendering.
+*/
+   if (aspect == VK_IMAGE_ASPECT_DEPTH_BIT &&
+   (image->usage & VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT)) {
+
+  /* Allow the user to control HiZ enabling. Disable by default on gen7
+   * because resolves are not currently implemented pre-BDW.
+   */
+  if (!env_var_as_boolean("INTEL_VK_HIZ", dev->info.gen >= 8)) {
+ anv_finishme("Implement gen7 HiZ");
+  } else if (vk_info->mipLevels > 1) {
+ anv_finishme("Test multi-LOD HiZ");
+  } else if (dev->info.gen == 8 && vk_info->samples > 1) {
+ anv_finishme("Test gen8 multisampled HiZ");
+  } else {
+ isl_surf_get_hiz_surf(&dev->isl_dev, &image->depth_surface.isl,
+   &image->hiz_surface.isl);
+ add_surface(image, &image->hiz_surface);
+  }
+   }
 
return VK_SUCCESS;
 }
-- 
2.10.0
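
The add_surface() helper introduced above packs each surface at the end of the image and bumps the image's size and alignment. A minimal standalone sketch of that bookkeeping follows; the struct layouts are simplified stand-ins made up for this example (the real code uses anv_image, anv_surface and isl fields), and align_u32 is reimplemented here under the assumption that alignments are powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, flattened versions of anv_surface / anv_image. */
struct surface { uint32_t offset, size, alignment; };
struct image   { uint32_t size, alignment; };

/* Round v up to the next multiple of a (a must be a power of two). */
static uint32_t
align_u32(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Append surf to image: place it at the next aligned offset, grow the
 * image to cover it, and keep the strictest alignment seen so far. */
static void
add_surface(struct image *image, struct surface *surf)
{
   assert(surf->size > 0); /* surface must be initialized */

   surf->offset = align_u32(image->size, surf->alignment);
   image->size = surf->offset + surf->size;
   if (surf->alignment > image->alignment)
      image->alignment = surf->alignment;
}
```

For example, adding a 100-byte surface with 64-byte alignment and then a 32-byte surface with 128-byte alignment places the second one at offset 128 and leaves the image at 160 bytes with 128-byte alignment.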


[Mesa-dev] [PATCH v3 07/10] anv/cmd_buffer: Add code for performing HZ operations

2016-10-06 Thread Nanley Chery

Create a function that performs one of three HiZ operations -
depth/stencil clears, HiZ resolve, and depth resolves.

Signed-off-by: Nanley Chery 
Reviewed-by: Jason Ekstrand  (v2)

---
v3. Change do_hz to emit_hz (Chad)
Always set XMin YMin explicitly (Chad)
Add anv_finishme for gen7 emit_hz function (Jason)
Add full_surface render_area offset check (Jason)
Enable a fast-depth clear case (Jason)
Clarify #else case for px_dim calculation (Jason)
Remove experimental Depth Pipecontrol TODO
Comment on variable naming

 src/intel/vulkan/anv_genX.h|   3 +
 src/intel/vulkan/gen7_cmd_buffer.c |   7 ++
 src/intel/vulkan/gen8_cmd_buffer.c | 187 +
 3 files changed, 197 insertions(+)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index dc2dd5d..81ebbaa 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -55,6 +55,9 @@ genX(emit_urb_setup)(struct anv_device *device, struct anv_batch *batch,
  unsigned vs_entry_size, unsigned gs_entry_size,
  const struct gen_l3_config *l3_config);
 
+void genX(cmd_buffer_emit_hz_op)(struct anv_cmd_buffer *cmd_buffer,
+   enum blorp_hiz_op op);
+
 VkResult
 genX(graphics_pipeline_create)(VkDevice _device,
struct anv_pipeline_cache *cache,
diff --git a/src/intel/vulkan/gen7_cmd_buffer.c b/src/intel/vulkan/gen7_cmd_buffer.c
index b627ef0..225533c 100644
--- a/src/intel/vulkan/gen7_cmd_buffer.c
+++ b/src/intel/vulkan/gen7_cmd_buffer.c
@@ -323,6 +323,13 @@ genX(cmd_buffer_flush_dynamic_state)(struct anv_cmd_buffer *cmd_buffer)
cmd_buffer->state.dirty = 0;
 }
 
+void
+genX(cmd_buffer_emit_hz_op)(struct anv_cmd_buffer *cmd_buffer,
+  enum blorp_hiz_op op)
+{
+   anv_finishme("Implement Gen7 HZ ops");
+}
+
 void genX(CmdSetEvent)(
 VkCommandBuffer commandBuffer,
 VkEvent event,
diff --git a/src/intel/vulkan/gen8_cmd_buffer.c b/src/intel/vulkan/gen8_cmd_buffer.c
index 7058608..e50f1a5 100644
--- a/src/intel/vulkan/gen8_cmd_buffer.c
+++ b/src/intel/vulkan/gen8_cmd_buffer.c
@@ -399,6 +399,193 @@ genX(cmd_buffer_flush_compute_state)(struct anv_cmd_buffer *cmd_buffer)
genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer);
 }
 
+
+/**
+ * Emit the HZ_OP packet in the sequence specified by the BDW PRM section
+ * entitled: "Optimized Depth Buffer Clear and/or Stencil Buffer Clear."
+ *
+ * \todo Enable Stencil Buffer-only clears
+ */
+void
+genX(cmd_buffer_emit_hz_op)(struct anv_cmd_buffer *cmd_buffer,
+  enum blorp_hiz_op op)
+{
+   struct anv_cmd_state *cmd_state = &cmd_buffer->state;
+   const struct anv_image_view *iview =
+  anv_cmd_buffer_get_depth_stencil_view(cmd_buffer);
+
+   if (iview == NULL || !anv_image_has_hiz(iview->image))
+  return;
+
+   const uint32_t ds = cmd_state->subpass->depth_stencil_attachment;
+
+   /* Section 7.4. of the Vulkan 1.0.27 spec states:
+*
+*   "The render area must be contained within the framebuffer dimensions."
+*
+* Therefore, the only way the extent of the render area can match that of
+* the image view is if the render area offset equals (0, 0).
+*/
+   const bool full_surface_op =
+ cmd_state->render_area.extent.width == iview->extent.width &&
+ cmd_state->render_area.extent.height == iview->extent.height;
+   if (full_surface_op)
+  assert(cmd_state->render_area.offset.x == 0 &&
+ cmd_state->render_area.offset.y == 0);
+
+   /* This variable corresponds to the Pixel Dim column in the table below */
+   struct isl_extent2d px_dim;
+
+   /* Validate that we can perform the HZ operation and that it's necessary. */
+   switch (op) {
+   case BLORP_HIZ_OP_DEPTH_CLEAR:
+  if (cmd_buffer->state.pass->attachments[ds].load_op !=
+  VK_ATTACHMENT_LOAD_OP_CLEAR)
+ return;
+
+  /* Apply alignment restrictions. Despite the BDW PRM mentioning this is
+   * only needed for a depth buffer surface type of D16_UNORM, testing
+   * showed it to be necessary for other depth formats as well
+   * (e.g., D32_FLOAT).
+   */
+#if GEN_GEN == 8
+  /* Pre-SKL, HiZ has an 8x4 sample block. As the number of samples
+   * increases, the number of pixels representable by this block
+   * decreases by a factor of the sample dimensions. Sample dimensions
+   * scale following the MSAA interleaved pattern.
+   *
+   * Sample|Sample|Pixel
+   * Count |Dim   |Dim
+   * ===
+   *1  | 1x1  | 8x4
+   *2  | 2x1  | 4x4
+   *4  | 2x2  | 4x2
+   *8  | 4x2  | 2x2
+   *   16  | 4x4  | 2x1
+   *
+   * Table: Pixel Dimensions in a HiZ Sample Block Pre-SKL
+   */
+  /* This variable corresponds to the Sample Dim column in the table
+   * above.

[Mesa-dev] [PATCH v3 00/10] anv: Enable HiZ for basic cases

2016-10-06 Thread Nanley Chery

This work is a revision of this series:
https://lists.freedesktop.org/archives/mesa-dev/2016-September/129845.html

And is dependent on this series:
https://patchwork.freedesktop.org/series/13360/

Cc: Chad Versace 
Cc: Jason Ekstrand 

Chad Versace (4):
  anv: Add anv_image::hiz_surface
  anv: Add func anv_image_has_hiz()
  anv: Allocate hiz surface
  anv/cmd_buffer: Enable rendering to HiZ

Jason Ekstrand (2):
  anv: Move BindImageMemory to anv_image.c
  anv/image: Memset hiz surfaces to 0 when binding memory

Nanley Chery (4):
  isl: Correct a comment in the isl_format enum
  anv/cmd_buffer: Add code for performing HZ operations
  anv: Enable fast depth clears
  anv/TODO: Update the HiZ task

 src/intel/isl/isl.h|   2 +-
 src/intel/vulkan/TODO  |   2 +-
 src/intel/vulkan/anv_device.c  |  20 
 src/intel/vulkan/anv_genX.h|   3 +
 src/intel/vulkan/anv_image.c   |  86 -
 src/intel/vulkan/anv_pass.c|  13 +++
 src/intel/vulkan/anv_private.h |  12 +++
 src/intel/vulkan/gen7_cmd_buffer.c |   7 ++
 src/intel/vulkan/gen8_cmd_buffer.c | 191 +
 src/intel/vulkan/genX_cmd_buffer.c |  64 +++--
 10 files changed, 369 insertions(+), 31 deletions(-)

-- 
2.10.0


Re: [Mesa-dev] [PATCH 2/5] intel: aubinator: use getopt to parse arguments

2016-10-06 Thread Gandikota, Sirisha

>-Original Message-
>From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf Of
>Lionel Landwerlin
>Sent: Wednesday, October 05, 2016 8:55 AM
>To: mesa-dev@lists.freedesktop.org
>Cc: Landwerlin, Lionel G 
>Subject: [Mesa-dev] [PATCH 2/5] intel: aubinator: use getopt to parse arguments
>
>Signed-off-by: Lionel Landwerlin 
>---
> src/intel/tools/aubinator.c | 90 +
> 1 file changed, 33 insertions(+), 57 deletions(-)
>
>diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c index
>b74885a..44a6bb2 100644
>--- a/src/intel/tools/aubinator.c
>+++ b/src/intel/tools/aubinator.c
>@@ -24,6 +24,7 @@
> #include 
> #include 
> #include 
>+#include 
>
> #include 
> #include 
>@@ -1040,30 +1041,13 @@ print_help(const char *progname, FILE *file)
>progname);
> }
>
>-static bool
>-is_prefix(const char *arg, const char *prefix, const char **value) -{
>-   int l = strlen(prefix);
>-
>-   if (strncmp(arg, prefix, l) == 0 && (arg[l] == '\0' || arg[l] == '=')) {
>-  if (arg[l] == '=')
>- *value = arg + l + 1;
>-  else
>- *value = NULL;
>-
>-  return true;
>-   }
>-
>-   return false;
>-}
>-
> int main(int argc, char *argv[])
> {
>struct gen_spec *spec;
>struct aub_file *file;
>-   int i;
>-   bool found_arg_gen = false, pager = true;
>-   const char *value, *input_file;
>+   int c, i;
>+   bool help = false, pager = true;
>+   const char *input_file = NULL;
>char gen_file[256], gen_val[24];
>const struct {
>   const char *name;
>@@ -1080,55 +1064,47 @@ int main(int argc, char *argv[])
>   { "kbl", 0x591D, 9, 0 }, /* Intel(R) Kabylake GT2 */
>   { "bxt", 0x0A84, 9, 0 }  /* Intel(R) HD Graphics (Broxton) */
>}, *gen = NULL;
>-
>-   if (argc == 1) {
>-  print_help(argv[0], stderr);
>-  exit(EXIT_FAILURE);
>-   }
>-
>-   for (i = 1; i < argc; ++i) {
>-  if (strcmp(argv[i], "--no-pager") == 0) {
>- pager = false;
>-  } else if (strcmp(argv[i], "--no-offsets") == 0) {
>- option_print_offsets = false;
>-  } else if (is_prefix(argv[i], "--gen", &value)) {
>- if (value == NULL) {
>-fprintf(stderr, "option '--gen' requires an argument\n");
>-exit(EXIT_FAILURE);
>- }
>- found_arg_gen = true;
>- snprintf(gen_val, sizeof(gen_val), "%s", value);
>-  } else if (strcmp(argv[i], "--headers") == 0) {
>- option_full_decode = false;
>-  } else if (is_prefix(argv[i], "--color", &value)) {
>- if (value == NULL || strcmp(value, "always") == 0)
>+   const struct option aubinator_opts[] = {
>+  { "help",       no_argument,       (int *) &help,                 true },
>+  { "no-pager",   no_argument,       (int *) &pager,                false },
>+  { "no-offsets", no_argument,       (int *) &option_print_offsets, false },
>+  { "gen",        required_argument, NULL,                          'g' },
>+  { "headers",    no_argument,       (int *) &option_full_decode,   false },
>+  { "color",      required_argument, NULL,                          'c' },
>+  { NULL,         0,                 NULL,                          0 }
>+   };
>+
>+   i = 0;
>+   while ((c = getopt_long(argc, argv, "", aubinator_opts, &i)) != -1) {
>+  switch (c) {
>+  case 'g':
>+ snprintf(gen_val, sizeof(gen_val), "%s", optarg);
>+ break;
>+  case 'c':
>+ if (optarg == NULL || strcmp(optarg, "always") == 0)
> option_color = COLOR_ALWAYS;
>- else if (strcmp(value, "never") == 0)
>+ else if (strcmp(optarg, "never") == 0)
> option_color = COLOR_NEVER;
>- else if (strcmp(value, "auto") == 0)
>+ else if (strcmp(optarg, "auto") == 0)
> option_color = COLOR_AUTO;
>  else {
>-fprintf(stderr, "invalid value for --color: %s", value);
>+fprintf(stderr, "invalid value for --color: %s", optarg);
> exit(EXIT_FAILURE);
>  }
>-  } else if (strcmp(argv[i], "--help") == 0) {
>- print_help(argv[0], stdout);
>- exit(EXIT_SUCCESS);
>-  } else {
>- if (argv[i][0] == '-') {
>-fprintf(stderr, "unknown option %s\n", argv[i]);
>-exit(EXIT_FAILURE);
>- }
>- input_file = argv[i];
>+ break;
>+  default:
>  break;
>   }
>}
>
>-   if (!found_arg_gen) {
>-  fprintf(stderr, "argument --gen is required\n");
>-  exit(EXIT_FAILURE);
>+   if (help || argc == 1) {
>+  print_help(argv[0], stderr);
>+  exit(0);
>}
>
>+   if (optind < argc)
>+  input_file = argv[optind];
>+
>for (i = 0; i < ARRAY_SIZE(gens); i++) {
>   if (!strcmp(gen_val, gens[i].name)) {
>  gen = &gens[i];
>--
>2.9.3

[SG] Works for me. Thanks for cleaning this up.
Reviewed-by: Sirisha Gandikota
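
For anyone unfamiliar with the getopt_long() pattern the patch switches to, here is a small self-contained sketch. It is not the aubinator code: the options struct and parse_args() are invented for illustration, only three of the options are shown, the flag target is a plain int rather than a bool cast to int *, and --color is declared with optional_argument so that a bare --color can still mean "always" as it did before the patch.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result structure for this sketch. */
struct options {
   int pager;              /* 1 unless --no-pager was given */
   const char *gen;        /* value of --gen, or NULL */
   const char *color;      /* value of --color ("always" if bare), or NULL */
   const char *input_file; /* first positional argument, or NULL */
};

/* Returns 0 on success, -1 if getopt_long() reported an error. */
static int
parse_args(int argc, char *argv[], struct options *out)
{
   int pager = 1;
   const struct option long_opts[] = {
      { "no-pager", no_argument,       &pager, 0   }, /* stores 0 in pager */
      { "gen",      required_argument, NULL,   'g' },
      { "color",    optional_argument, NULL,   'c' },
      { NULL,       0,                 NULL,   0   }
   };
   int c;

   assert(out != NULL);
   out->gen = NULL;
   out->color = NULL;
   out->input_file = NULL;

   optind = 1; /* restart scanning so the function can be called again */
   while ((c = getopt_long(argc, argv, "", long_opts, NULL)) != -1) {
      switch (c) {
      case 0:   /* a flag-style option; getopt_long already stored it */
         break;
      case 'g':
         out->gen = optarg;
         break;
      case 'c':
         out->color = optarg ? optarg : "always";
         break;
      default:  /* unknown option or missing argument */
         return -1;
      }
   }

   out->pager = pager;
   if (optind < argc)
      out->input_file = argv[optind];
   return 0;
}
```

Note that with required_argument, optarg is never NULL inside the case, so a NULL check only has an effect when the option is declared optional_argument as in this sketch.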

Re: [Mesa-dev] [PATCH v4 04/14] glsl: process local_size_variable input qualifier

2016-10-06 Thread Samuel Pitoiset




On 10/06/2016 09:27 AM, Nicolai Hähnle wrote:

On 05.10.2016 20:48, Samuel Pitoiset wrote:

This is the new layout qualifier introduced by
ARB_compute_variable_group_size, which allows the use of a variable work
group size.

v4: - add missing '%s' in the monster format string

Signed-off-by: Samuel Pitoiset 
Reviewed-by: Ian Romanick 


Reviewed-by: Nicolai Hähnle 

With the small comment on patch #2 fixed, that means I'm happy with the
(non-nouveau parts of) this series. Great work! :)


Thanks for taking care of that series. :)



Cheers,
Nicolai


Re: [Mesa-dev] [PATCH v4 02/14] mesa/main: add support for ARB_compute_variable_group_size

2016-10-06 Thread Samuel Pitoiset




On 10/06/2016 09:25 AM, Nicolai Hähnle wrote:

On 05.10.2016 20:48, Samuel Pitoiset wrote:

v4: - slightly indent spec quotes (Nicolai)
- drop useless _mesa_has_compute_shaders() check (Nicolai)
- move the fixed local size outside of the loop (Nicolai)
- add missing check for invalid use of work group count
v2: - update formatting spec quotations (Ian)
- move the total_invocations check outside of the loop (Ian)

Signed-off-by: Samuel Pitoiset 

fix patch 2
---
 src/mesa/main/api_validate.c | 111
+++
 src/mesa/main/api_validate.h |   4 ++
 src/mesa/main/compute.c  |  17 ++
 src/mesa/main/context.c  |   6 +++
 src/mesa/main/dd.h   |   9 
 src/mesa/main/extensions_table.h |   1 +
 src/mesa/main/get.c  |  10 
 src/mesa/main/get_hash_params.py |   3 ++
 src/mesa/main/mtypes.h   |  24 -
 src/mesa/main/shaderapi.c|   1 +
 src/mesa/main/shaderobj.c|   2 +
 11 files changed, 187 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index 05691d2..341c40e 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -1285,6 +1285,7 @@ GLboolean
 _mesa_validate_DispatchCompute(struct gl_context *ctx,
const GLuint *num_groups)
 {
+   struct gl_shader_program *prog;
int i;
FLUSH_CURRENT(ctx, 0);

@@ -1317,6 +1318,103 @@ _mesa_validate_DispatchCompute(struct
gl_context *ctx,
   }
}

+   /* The ARB_compute_variable_group_size spec says:
+*
+* "An INVALID_OPERATION error is generated by DispatchCompute if
the active
+*  program for the compute shader stage has a variable work group
size."
+*/
+   prog = ctx->_Shader->CurrentProgram[MESA_SHADER_COMPUTE];
+   if (prog->Comp.LocalSizeVariable) {
+  _mesa_error(ctx, GL_INVALID_OPERATION,
+  "glDispatchCompute(variable work group size
forbidden)");
+  return GL_FALSE;
+   }
+
+   return GL_TRUE;
+}
+
+GLboolean
+_mesa_validate_DispatchComputeGroupSizeARB(struct gl_context *ctx,
+   const GLuint *num_groups,
+   const GLuint *group_size)
+{
+   struct gl_shader_program *prog;
+   GLuint total_invocations = 1;
+   GLuint fixed_local_size = 0;
+   int i;
+
+   FLUSH_CURRENT(ctx, 0);
+
+   if (!check_valid_to_compute(ctx, "glDispatchComputeGroupSizeARB"))
+  return GL_FALSE;
+
+   prog = ctx->_Shader->CurrentProgram[MESA_SHADER_COMPUTE];
+   for (i = 0; i < 3; i++) {
+  /* The ARB_compute_variable_group_size spec says:
+   *
+   * "An INVALID_VALUE error is generated if any of num_groups_x,
+   *  num_groups_y and num_groups_z are greater than or equal to the
+   *  maximum work group count for the corresponding dimension."
+   */
+  if (num_groups[i] > ctx->Const.MaxComputeWorkGroupCount[i]) {
+ _mesa_error(ctx, GL_INVALID_VALUE,
+ "glDispatchComputeGroupSizeARB(num_groups_%c)",
'x' + i);
+ return GL_FALSE;
+  }
+
+  /* The ARB_compute_variable_group_size spec says:
+   *
+   * "An INVALID_VALUE error is generated by
DispatchComputeGroupSizeARB if
+   *  any of <group_size_x>, <group_size_y>, or <group_size_z> is
less than
+   *  or equal to zero or greater than the maximum local work
group size
+   *  for compute shaders with variable group size
+   *  (MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB) in the corresponding
+   *  dimension."
+   *
+   * However, the "less than" is a spec bug because they are
declared as
+   * unsigned integers.
+   */
+  if (group_size[i] == 0 ||
+  group_size[i] > ctx->Const.MaxComputeVariableGroupSize[i]) {
+ _mesa_error(ctx, GL_INVALID_VALUE,
+ "glDispatchComputeGroupSizeARB(group_size_%c)",
'x' + i);
+ return GL_FALSE;
+  }
+
+  fixed_local_size  += prog->Comp.LocalSize[i];
+  total_invocations *= group_size[i];
+   }
+
+   /* The ARB_compute_variable_group_size spec says:
+*
+* "An INVALID_OPERATION error is generated by
+*  DispatchComputeGroupSizeARB if the active program for the compute
+*  shader stage has a fixed work group size."
+*/
+   if (fixed_local_size > 0) {
+  _mesa_error(ctx, GL_INVALID_OPERATION,
+  "glDispatchComputeGroupSizeARB(fixed work group size "
+  "forbidden)");
+  return GL_FALSE;
+   }


I think this test should just be if (!prog->Comp.LocalSizeVariable), and
it should go above the loop (as a consequence, fixed_local_size can be
removed).


Actually both tests are correct, but your solution requires less code 
than mine, so I agree. :)




I don't know whether the spec really cares about the order in which
errors are checked, but the error messages we generate will be less
confusing that way.


Okay, fixed locally.
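
To make the agreed ordering concrete, here is a minimal sketch with the fixed-size check hoisted above the per-dimension loop. Everything in it (the enum, the limits struct, the function name) is hypothetical and only mirrors the checks visible in the quoted hunk; it is not the Mesa implementation and it leaves out the total-invocations check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical error codes standing in for the GL errors. */
enum dispatch_error {
   DISPATCH_OK,
   DISPATCH_INVALID_OPERATION,
   DISPATCH_INVALID_VALUE
};

/* Hypothetical subset of the context limits used by the checks. */
struct limits {
   unsigned max_group_count[3];
   unsigned max_variable_group_size[3];
};

static enum dispatch_error
validate_dispatch_group_size(bool local_size_variable,
                             const unsigned num_groups[3],
                             const unsigned group_size[3],
                             const struct limits *lim)
{
   assert(lim != NULL);

   /* Program-level check first: a fixed work group size is an
    * INVALID_OPERATION regardless of the arguments passed in. */
   if (!local_size_variable)
      return DISPATCH_INVALID_OPERATION;

   for (int i = 0; i < 3; i++) {
      if (num_groups[i] > lim->max_group_count[i])
         return DISPATCH_INVALID_VALUE;

      /* group_size is unsigned, so only zero needs rejecting here. */
      if (group_size[i] == 0 ||
          group_size[i] > lim->max_variable_group_size[i])
         return DISPATCH_INVALID_VALUE;
   }

   return DISPATCH_OK;
}
```

With this order, a program with a fixed local size always reports the program-level error, even when the group sizes passed in are also out of range.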



With that changed, the patch is

Review

Re: [Mesa-dev] [PATCH 1/5] intel: aubinator: add missing return characters

2016-10-06 Thread Gandikota, Sirisha

>-Original Message-
>From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf Of
>Lionel Landwerlin
>Sent: Wednesday, October 05, 2016 8:55 AM
>To: mesa-dev@lists.freedesktop.org
>Cc: Landwerlin, Lionel G 
>Subject: [Mesa-dev] [PATCH 1/5] intel: aubinator: add missing return characters
>
>Signed-off-by: Lionel Landwerlin 
>---
> src/intel/tools/aubinator.c | 10 +-
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
>diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c index
>6fc0208..b74885a 100644
>--- a/src/intel/tools/aubinator.c
>+++ b/src/intel/tools/aubinator.c
>@@ -807,7 +807,7 @@ handle_trace_block(struct gen_spec *spec, uint32_t *p)
>   if (address_space != AUB_TRACE_MEMTYPE_GTT)
>  break;
>   if (gtt_size < offset + size) {
>- fprintf(stderr, "overflow gtt space: %s", strerror(errno));
>+ fprintf(stderr, "overflow gtt space: %s\n", strerror(errno));
>  exit(EXIT_FAILURE);
>   }
>   memcpy((char *) gtt + offset, data, size); @@ -849,19 +849,19 @@
>aub_file_open(const char *filename)
>file->filename = strdup(filename);
>file->fd = open(file->filename, O_RDONLY);
>if (file->fd == -1) {
>-  fprintf(stderr, "open %s failed: %s", file->filename, strerror(errno));
>+  fprintf(stderr, "open %s failed: %s\n", file->filename,
>+ strerror(errno));
>   exit(EXIT_FAILURE);
>}
>
>if (fstat(file->fd, &file->sb) == -1) {
>-  fprintf(stderr, "stat failed: %s", strerror(errno));
>+  fprintf(stderr, "stat failed: %s\n", strerror(errno));
>   exit(EXIT_FAILURE);
>}
>
>file->map = mmap(NULL, file->sb.st_size,
> PROT_READ, MAP_SHARED, file->fd, 0);
>if (file->map == MAP_FAILED) {
>-  fprintf(stderr, "mmap failed: %s", strerror(errno));
>+  fprintf(stderr, "mmap failed: %s\n", strerror(errno));
>   exit(EXIT_FAILURE);
>}
>
>@@ -873,7 +873,7 @@ aub_file_open(const char *filename)
>gtt = mmap(NULL, gtt_size, PROT_READ | PROT_WRITE,
>   MAP_PRIVATE | MAP_ANONYMOUS |  MAP_NORESERVE, -1, 0);
>if (gtt == MAP_FAILED) {
>-  fprintf(stderr, "failed to alloc gtt space: %s", strerror(errno));
>+  fprintf(stderr, "failed to alloc gtt space: %s\n",
>+ strerror(errno));
>   exit(1);
>}
>
>--
>2.9.3

[SG] Works for me
Reviewed-by: Sirisha Gandikota

[Mesa-dev] [Bug 98128] nir/tests/control_flow_tests.cpp:79:73: error: ‘nir_loop_first_cf_node’ was not declared in this scope

2016-10-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=98128

Bug ID: 98128
   Summary: nir/tests/control_flow_tests.cpp:79:73: error:
‘nir_loop_first_cf_node’ was not declared in this
scope
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: All
Status: NEW
  Keywords: bisected, regression
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org
CC: cwabbo...@gmail.com, ja...@jlekstrand.net

mesa: f96945c5b5c3a52685e76795f03f75c75fb62fc7 (master 12.1.0-devel)

  CXX  nir/tests/nir_tests_control_flow_tests-control_flow_tests.o
nir/tests/control_flow_tests.cpp: In member function ‘virtual void
nir_cf_test_delete_break_in_loop_Test::TestBody()’:
nir/tests/control_flow_tests.cpp:79:73: error: ‘nir_loop_first_cf_node’ was not
declared in this scope
nir_block *block_1 = nir_cf_node_as_block(nir_loop_first_cf_node(loop));
 ^

2ed17d46de045404042f13c6591895a1cf31b167 is the first bad commit
commit 2ed17d46de045404042f13c6591895a1cf31b167
Author: Jason Ekstrand 
Date:   Wed Oct 5 19:08:57 2016 -0700

nir: Make nir_foo_first/last_cf_node return a block instead

One of NIR's invariants is that control flow lists always start and end
with blocks.  There's no good reason why we should return a cf_node from
these functions since we know that it's always a block.  Making it a block
lets us remove a bunch of code.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Connor Abbott 

:04 04 0ecfd382f1e2c6fafb103838f5da4f711b8eeebd
1ed04a96ce72361430d8e7fbfcf62557745a5ecf M  src
bisect run success

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [PATCH] nv50/ir: start LocalCSE with getFirst to merge PHI instructions

2016-10-06 Thread Karol Herbst

total instructions in shared programs : 2818606 -> 2818227 (-0.01%)
total gprs used in shared programs: 379273 -> 379238 (-0.01%)
total local used in shared programs   : 9505 -> 9505 (0.00%)
total bytes used in shared programs   : 25837192 -> 25833736 (-0.01%)

          local        gpr       inst      bytes
helped        0         25        100        100
  hurt        0          0          0          0
Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 1c71155..168aa05 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -3220,7 +3220,7 @@ LocalCSE::visit(BasicBlock *bb)
   for (ir = bb->getFirst(); ir; ir = ir->next)
  ir->serial = serial++;
 
-  for (ir = bb->getEntry(); ir; ir = next) {
+  for (ir = bb->getFirst(); ir; ir = next) {
  int s;
  Value *src = NULL;
 
-- 
2.10.0


Re: [Mesa-dev] [PATCH] anv: Return correct result in EnumeratePhysicalDevices

2016-10-06 Thread Anuj Phogat

On Thu, Oct 6, 2016 at 12:21 PM, Nicolas Koch  wrote:
> If pPhysicalDevices is too small for all physical devices,
> the driver must return VK_INCOMPLETE.
> Since only a single physical device is supported, this is only the case
> when *pPhysicalDeviceCount == 0 && pPhysicalDevices != NULL.
> ---
>  src/intel/vulkan/anv_device.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index c7b9979..76cbb69 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -385,6 +385,8 @@ VkResult anv_EnumeratePhysicalDevices(
> } else if (*pPhysicalDeviceCount >= 1) {
>pPhysicalDevices[0] = 
> anv_physical_device_to_handle(&instance->physicalDevice);
>*pPhysicalDeviceCount = 1;
> +   } else if (*pPhysicalDeviceCount < instance->physicalDeviceCount) {
> +  return VK_INCOMPLETE;
> } else {
>*pPhysicalDeviceCount = 0;
> }
> --
> 2.10.0
>

Reviewed-by: Anuj Phogat 
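
For reference, a minimal sketch of the enumeration contract being discussed, for a driver that exposes exactly one device. The enum, handle type and function name are stand-ins made up for this example; the behaviour (query the count with a NULL array, write at most *count handles, return INCOMPLETE when the array is too small, and report the number of handles actually written) follows the usual Vulkan two-call idiom.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for VkResult values and a device handle. */
enum result { RESULT_SUCCESS, RESULT_INCOMPLETE };
typedef int device_handle;
#define THE_ONLY_DEVICE 42

static enum result
enumerate_devices(uint32_t *count, device_handle *devices)
{
   assert(count != NULL);

   if (devices == NULL) {
      /* First call of the idiom: only report how many devices exist. */
      *count = 1;
      return RESULT_SUCCESS;
   }

   if (*count >= 1) {
      devices[0] = THE_ONLY_DEVICE;
      *count = 1;
      return RESULT_SUCCESS;
   }

   /* A non-NULL array with room for zero handles: nothing was written,
    * and the caller is told the list is incomplete. */
   *count = 0;
   return RESULT_INCOMPLETE;
}
```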

Re: [Mesa-dev] [RFC] ralloc: use jemalloc for faster GLSL compilation

2016-10-06 Thread Eric Anholt

Marek Olšák  writes:

> I'd like to have more feedback on the idea of using jemalloc for ralloc.
>
> Right now, I see these options:
>
> 1) Use jemalloc for ralloc and make it mandatory for all GL drivers.
> - Distributions have shown that they are capable of doing anything
> with the Mesa source code, so they don't need --disable-jemalloc.
> - Reasonable people should build Mesa as-is.
>
> 2) Abandon the idea.
> - The availability of --disable-jemalloc would send a clear message
> that "you don't have to enable this", therefore the whole idea of
> using jemalloc in Mesa would be pointless.

I'm generally of the opinion that if malloc is taking 10% of compile
time, we're screwing up and we should just go fix that.  However, this
is an easy fix and doesn't prevent going and fixing malloc abuse later.
I also don't like configure options -- they're mostly a chance to build
things wrong.

I'm concerned that by shared linking against jemalloc we're going to run
into similar problems to every other time we shared link against things
and it's going to make our lives harder.  This is probably "we should
figure out how to stop shared linking against anything" rather than "we
shouldn't make this change", though.

signature.asc
Description: PGP signature

Re: [Mesa-dev] [PATCH 1/2] anv/cmd_buffer: Don't call set_subpass in a secondary

2016-10-06 Thread Nanley Chery

On Thu, Oct 06, 2016 at 12:35:53PM -0700, Jason Ekstrand wrote:
> On Thu, Oct 6, 2016 at 12:30 PM, Nanley Chery  wrote:
> 
> > On Wed, Oct 05, 2016 at 05:36:43PM -0700, Jason Ekstrand wrote:
> > > Initially, we had intended set_subpass to be an interesting function that
> > > did whatever (presumably a lot) setup we needed for a subpass.  In
> > reality,
> > > it just sets a pointer and a dirty bit and then emits depth and stencil
> > > state.  When we call BeginCommandBuffer on a secondary, all of the dirty
> > > bits are already set and there's no point in setting depth and stencil
> > > state since it will already be set by the primary.  Instead, the only
> > thing
> > > we need to do at the start of a secondary is set the subpass pointer.
> > >
> > > Signed-off-by: Jason Ekstrand 
> > > ---
> > >  src/intel/vulkan/anv_cmd_buffer.c  | 39 +-
> > 
> > >  src/intel/vulkan/anv_genX.h|  3 ---
> > >  src/intel/vulkan/anv_private.h |  3 ---
> > >  src/intel/vulkan/genX_cmd_buffer.c |  5 +
> > >  4 files changed, 2 insertions(+), 48 deletions(-)
> > >
> > > diff --git a/src/intel/vulkan/anv_cmd_buffer.c
> > b/src/intel/vulkan/anv_cmd_buffer.c
> > > index 9dedde8..ef13dfc 100644
> > > --- a/src/intel/vulkan/anv_cmd_buffer.c
> > > +++ b/src/intel/vulkan/anv_cmd_buffer.c
> > > @@ -407,10 +407,8 @@ VkResult anv_BeginCommandBuffer(
> > >cmd_buffer->state.pass =
> > >   anv_render_pass_from_handle(pBeginInfo->pInheritanceInfo->
> > renderPass);
> > >
> > > -  struct anv_subpass *subpass =
> > > +  cmd_buffer->state.subpass =
> > >   &cmd_buffer->state.pass->subpasses[pBeginInfo->
> > pInheritanceInfo->subpass];
> > > -
> > > -  anv_cmd_buffer_set_subpass(cmd_buffer, subpass);
> >
> > I'm not sure why we always set the fragment descriptor bit in
> > set_subpass, but it seems like we need to do it here as well to keep
> > the logic the same. I don't see where we set the dirty bits on a
> > secondary command buffer at BeginCommandBuffer. Aside from that, this
> > patch looks good.
> >
> 
> Initially, I think we did it to ensure that binding tables got re-emitted.
> However, we're now also re-emitting binding tables on pipeline changes and
> you have a pipeline change at the top of every subpass, so it shouldn't be
> needed in either place.
> 
> 

That makes sense. This series is
Reviewed-by: Nanley Chery 

> > -Nanley
> >
> > > }
> > >
> > > return VK_SUCCESS;
> > > @@ -1050,41 +1048,6 @@ anv_cmd_buffer_merge_dynamic(struct
> > anv_cmd_buffer *cmd_buffer,
> > > return state;
> > >  }
> > >
> > > -/**
> > > - * @brief Setup the command buffer for recording commands inside the
> > given
> > > - * subpass.
> > > - *
> > > - * This does not record all commands needed for starting the subpass.
> > > - * Starting the subpass may require additional commands.
> > > - *
> > > - * Note that vkCmdBeginRenderPass, vkCmdNextSubpass, and
> > vkBeginCommandBuffer
> > > - * with VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT, all setup the
> > > - * command buffer for recording commands for some subpass.  But only
> > the first
> > > - * two, vkCmdBeginRenderPass and vkCmdNextSubpass, can start a subpass.
> > > - */
> > > -void
> > > -anv_cmd_buffer_set_subpass(struct anv_cmd_buffer *cmd_buffer,
> > > -   struct anv_subpass *subpass)
> > > -{
> > > -   switch (cmd_buffer->device->info.gen) {
> > > -   case 7:
> > > -  if (cmd_buffer->device->info.is_haswell) {
> > > - gen75_cmd_buffer_set_subpass(cmd_buffer, subpass);
> > > -  } else {
> > > - gen7_cmd_buffer_set_subpass(cmd_buffer, subpass);
> > > -  }
> > > -  break;
> > > -   case 8:
> > > -  gen8_cmd_buffer_set_subpass(cmd_buffer, subpass);
> > > -  break;
> > > -   case 9:
> > > -  gen9_cmd_buffer_set_subpass(cmd_buffer, subpass);
> > > -  break;
> > > -   default:
> > > -  unreachable("unsupported gen\n");
> > > -   }
> > > -}
> > > -
> > >  struct anv_state
> > >  anv_cmd_buffer_push_constants(struct anv_cmd_buffer *cmd_buffer,
> > >gl_shader_stage stage)
> > > diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
> > > index 02e79c2..dc2dd5d 100644
> > > --- a/src/intel/vulkan/anv_genX.h
> > > +++ b/src/intel/vulkan/anv_genX.h
> > > @@ -36,9 +36,6 @@ struct anv_state
> > >  genX(cmd_buffer_alloc_null_surface_state)(struct anv_cmd_buffer
> > *cmd_buffer,
> > >struct anv_framebuffer *fb);
> > >
> > > -void genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
> > > -  struct anv_subpass *subpass);
> > > -
> > >  void genX(cmd_buffer_apply_pipe_flushes)(struct anv_cmd_buffer
> > *cmd_buffer);
> > >
> > >  void genX(flush_pipeline_select_3d)(struct anv_cmd_buffer *cmd_buffer);
> > > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
> > private.h
> > > index 44

Re: [Mesa-dev] [PATCH 20/75] st/nine: Initial ProcessVertices support

2016-10-06 Thread Ilia Mirkin

On Thu, Oct 6, 2016 at 4:24 PM, Axel Davy  wrote:
> On 05/10/2016 22:08, Axel Davy wrote:
>>
>>   HRESULT NINE_WINAPI
>>   NineDevice9_ProcessVertices( struct NineDevice9 *This,
>>UINT SrcStartIndex,
>> @@ -3174,33 +3188,69 @@ NineDevice9_ProcessVertices( struct NineDevice9
>> *This,
>>IDirect3DVertexDeclaration9 *pVertexDecl,
>>DWORD Flags )
>>   {
>> -struct pipe_screen *screen = This->screen;
>> +struct pipe_screen *screen_sw = This->screen_sw;
>> +struct pipe_context *pipe_sw = This->pipe_sw;
>>   struct NineVertexDeclaration9 *vdecl =
>> NineVertexDeclaration9(pVertexDecl);
>> +struct NineVertexBuffer9 *dst = NineVertexBuffer9(pDestBuffer);
>>   struct NineVertexShader9 *vs;
>>   struct pipe_resource *resource;
>> +struct pipe_transfer *transfer = NULL;
>> +struct pipe_stream_output_info so;
>>   struct pipe_stream_output_target *target;
>>   struct pipe_draw_info draw;
>> +struct pipe_box box;
>> +unsigned offsets[1] = {0};
>>   HRESULT hr;
>> -unsigned buffer_offset, buffer_size;
>> +unsigned buffer_size;
>> +void *map;
>> DBG("This=%p SrcStartIndex=%u DestIndex=%u VertexCount=%u "
>>   "pDestBuffer=%p pVertexDecl=%p Flags=%d\n",
>>   This, SrcStartIndex, DestIndex, VertexCount, pDestBuffer,
>>   pVertexDecl, Flags);
>>   -if (!screen->get_param(screen, PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS))
>> -STUB(D3DERR_INVALIDCALL);
>> +if (!screen_sw->get_param(screen_sw,
>> PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS)) {
>> +DBG("ProcessVertices not supported\n");
>> +return D3DERR_INVALIDCALL;
>> +}
>>   -nine_update_state(This);
>>   -/* TODO: Create shader with stream output. */
>> -STUB(D3DERR_INVALIDCALL);
>> -struct NineVertexBuffer9 *dst = NineVertexBuffer9(pDestBuffer);
>> +vs = This->state.programmable_vs ? This->state.vs : This->ff.vs;
>> +/* Note: version is 0 for ff */
>> +user_assert(vdecl || (vs->byte_code.version < 0x30 && dst->desc.FVF),
>> +D3DERR_INVALIDCALL);
>> +if (!vdecl) {
>> +DWORD FVF = dst->desc.FVF;
>> +vdecl = util_hash_table_get(This->ff.ht_fvf, &FVF);
>> +if (!vdecl) {
>> +hr = NineVertexDeclaration9_new_from_fvf(This, FVF, &vdecl);
>> +if (FAILED(hr))
>> +return hr;
>> +vdecl->fvf = FVF;
>> +util_hash_table_set(This->ff.ht_fvf, &vdecl->fvf, vdecl);
>> +NineUnknown_ConvertRefToBind(NineUnknown(vdecl));
>> +}
>> +}
>>   -vs = This->state.vs ? This->state.vs : This->ff.vs;
>> +/* Flags: Can be 0 or D3DPV_DONOTCOPYDATA, and/or lock flags
>> + * D3DPV_DONOTCOPYDATA -> Has effect only for ff. In particular
>> + * if not set, everything from src will be used, and dst
>> + * must match exactly the ff vs outputs.
>> + * TODO: Handle all the checks, etc for ff */
>> +user_assert(vdecl->position_t || This->state.programmable_vs,
>> +D3DERR_INVALIDCALL);
>> +
>> +/* TODO: Support vs < 3 and ff */
>> +user_assert(vs->byte_code.version == 0x30,
>> +D3DERR_INVALIDCALL);
>> +/* TODO: Not hardcode the constant buffers for swvp */
>> +user_assert(This->may_swvp,
>> +D3DERR_INVALIDCALL);
>> +
>> +nine_state_prepare_draw_sw(This, vdecl, SrcStartIndex, VertexCount,
>> &so);
>>   -buffer_size = VertexCount * vs->so->stride[0];
>> -if (1) {
>> +buffer_size = VertexCount * so.stride[0] * 4;
>> +{
>>   struct pipe_resource templ;
>> templ.target = PIPE_BUFFER;
>> @@ -3212,49 +3262,50 @@ NineDevice9_ProcessVertices( struct NineDevice9
>> *This,
>>   templ.height0 = templ.depth0 = templ.array_size = 1;
>>   templ.last_level = templ.nr_samples = 0;
>>   -resource = This->screen->resource_create(This->screen, &templ);
>> +resource = screen_sw->resource_create(screen_sw, &templ);
>>   if (!resource)
>>   return E_OUTOFMEMORY;
>> -buffer_offset = 0;
>> -} else {
>> -/* SO matches vertex declaration */
>> -resource = NineVertexBuffer9_GetResource(dst);
>> -buffer_offset = DestIndex * vs->so->stride[0];
>>   }
>> -target = This->pipe->create_stream_output_target(This->pipe,
>> resource,
>> - buffer_offset,
>> - buffer_size);
>> +target = pipe_sw->create_stream_output_target(pipe_sw, resource,
>> +  0, buffer_size);
>>   if (!target) {
>>   pipe_resource_reference(&resource, NULL);
>>   return D3DERR_DRIVERINTERNALERROR;
>>   }
>>   -if (!vdecl) {
>> -hr = NineVertexDeclaration9_new_from_fvf(This, dst->desc.FVF,
>> &vdecl);
>>

Re: [Mesa-dev] [RFC] ralloc: use jemalloc for faster GLSL compilation

2016-10-06 Thread Marek Olšák

I'd like to have more feedback on the idea of using jemalloc for ralloc.

Right now, I see these options:

1) Use jemalloc for ralloc and make it mandatory for all GL drivers.
- Distributions have shown that they are capable of doing anything
with the Mesa source code, so they don't need --disable-jemalloc.
- Reasonable people should build Mesa as-is.

2) Abandon the idea.
- The availability of --disable-jemalloc would send a clear message
that "you don't have to enable this", therefore the whole idea of
using jemalloc in Mesa would be pointless.

Marek


On Thu, Sep 29, 2016 at 9:10 PM, Marek Olšák  wrote:
> On Thu, Sep 29, 2016 at 8:19 PM, Emil Velikov  
> wrote:
>> On 29 September 2016 at 18:48, Marek Olšák  wrote:
>>> On Thu, Sep 29, 2016 at 4:56 PM, Emil Velikov  
>>> wrote:
>>>> On 29 September 2016 at 11:48, Marek Olšák wrote:
>>>>> On Thu, Sep 29, 2016 at 11:20 AM, Nicolai Hähnle wrote:
>>>>>> On 28.09.2016 18:49, Marek Olšák wrote:
>>>>>>>
>>>>>>> From: Marek Olšák
>>>>>>>
>>>>>>> More info about jemalloc:
>>>>>>>    https://github.com/jemalloc/jemalloc/wiki/History
>>>>>>>
>>>>>>> Average from 3 takes compiling Alien Isolation shaders from GLSL to GCN
>>>>>>> bytecode:
>>>>>>>    glibc:    17.183s
>>>>>>>    jemalloc: 15.558s
>>>>>>>    diff:     -9.5%
>>>>>>>
>>>>>>> The diff is -10.5% for a full shader-db run.
>>>>>>> ---
>>>>>>>
>>>>>>> TODO: The jemalloc dependency should be added to configure.ac before
>>>>>>> this.
>>>>>>>
>>>>>>> We can probably redirect all malloc/calloc/realloc/free calls in Mesa to
>>>>>>> jemalloc. We can either use _mesa_jemalloc, etc. everywhere or we can
>>>>>>> redirect calls to jemalloc using #define malloc _mesa_jemalloc, etc.
>>>>>>>
>>>>>>> Right now, I just use: export LDFLAGS=-ljemalloc
>>>>>>
>>>>>>
>>>>>> Sounds good to me. It should probably be a configurable option,
>>>>>> defaulting to jemalloc and failing if not available unless explicitly
>>>>>> disabled.
>>>>>
>>>>> If it was a configurable option, almost nobody would use it. Let's
>>>>> make it mandatory.
>>>>>
>>>> This combined with ...
>>>>
>>>>>>
>>>>>> On the Gallium side of things, switching to jemalloc could be pretty
>>>>>> straightforward via the macros in u_memory.h, once we know that they're
>>>>>> actually used consistently (which we currently don't -- it would be nice
>>>>>> to know how jemalloc and glibc malloc react when the calls are mixed).
>>>>>
>>>>> Redefining malloc/calloc/realloc/free/posix_memalign for all Mesa code
>>>>> would be more robust.
>>>>>
>>>> ... this is not a wise move.
>>>>
>>>> Don't force jemalloc onto everyone without having an explicit ACK from
>>>> a wide audience, please? Considering the static/shared link (or w/o
>>>> jemalloc altogether), distributions will have their
>>>> preferences/policies which won't align with my/your view.
>>>
>>> I guess we can have an option to disable jemalloc, but only if most
>>> users won't use that option. The real problem is that the GLSL
>>> compiler is alloc-bound and anybody wanting to use the GLSL compiler
>>> should stay away from glibc's allocator.
>>>
>>> The GLSL compiler can be slowed down significantly by keeping 5x
>>> LLVMContext in memory between compilations in radeonsi. The fact that
>>> radeonsi can indirectly slow down the GLSL compiler (but not LLVM) is
>>> a strong indication that we have a problem.
>>>
>> If the issue is present in only one driver, one might be looking at
>> the wrong end. Alternatively others will also be in favour of this and
>> things will flow naturally.
>
> The improvement is even better with the Gallium noop driver. When I
> tested noop, the compile time was reduced by 15%. I think radeonsi is
> the driver that will benefit the least from it, not the most.
>
> Marek
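
For reference, the macro redirection floated earlier in the thread
(#define malloc _mesa_jemalloc, etc.) could look roughly like the sketch
below. This is only an illustration under stated assumptions: sketch_malloc
and sketch_free are hypothetical wrappers that forward to libc and count
calls; they are not Mesa or jemalloc functions, and nothing here is taken
from the RFC patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Counts how many allocations were routed through the wrapper. */
static size_t sketch_alloc_calls;

/* Hypothetical stand-in for a _mesa_jemalloc-style wrapper. It forwards to
 * libc here; a real build would call the replacement allocator instead. */
static void *
sketch_malloc(size_t size)
{
   assert(size > 0);
   sketch_alloc_calls++;
   return malloc(size);
}

/* Matching release function: memory must go back to the allocator it
 * came from, so the pair is always redirected together. */
static void
sketch_free(void *ptr)
{
   free(ptr);
}

/* From here on, every malloc/free in this translation unit is rewritten
 * by the preprocessor to go through the wrappers above. */
#define malloc(size) sketch_malloc(size)
#define free(ptr)    sketch_free(ptr)

/* Ordinary-looking code that now allocates through the wrapper. */
static int
sketch_use_allocator(void)
{
   int *values = malloc(4 * sizeof *values);
   if (values == NULL)
      return -1;
   values[0] = 7;
   int first = values[0];
   free(values);
   return first;
}
```

The redirection only covers code compiled after the #define lines. A pointer
obtained from one allocator and released through the other is the mixing
case Nicolai asks about above, which is why the sketch redirects malloc and
free as a pair.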

Re: [Mesa-dev] [PATCH 20/75] st/nine: Initial ProcessVertices support

2016-10-06 Thread Axel Davy


On 05/10/2016 22:08, Axel Davy wrote:

  HRESULT NINE_WINAPI
  NineDevice9_ProcessVertices( struct NineDevice9 *This,
   UINT SrcStartIndex,
@@ -3174,33 +3188,69 @@ NineDevice9_ProcessVertices( struct NineDevice9 *This,
   IDirect3DVertexDeclaration9 *pVertexDecl,
   DWORD Flags )
  {
-struct pipe_screen *screen = This->screen;
+struct pipe_screen *screen_sw = This->screen_sw;
+struct pipe_context *pipe_sw = This->pipe_sw;
  struct NineVertexDeclaration9 *vdecl = 
NineVertexDeclaration9(pVertexDecl);
+struct NineVertexBuffer9 *dst = NineVertexBuffer9(pDestBuffer);
  struct NineVertexShader9 *vs;
  struct pipe_resource *resource;
+struct pipe_transfer *transfer = NULL;
+struct pipe_stream_output_info so;
  struct pipe_stream_output_target *target;
  struct pipe_draw_info draw;
+struct pipe_box box;
+unsigned offsets[1] = {0};
  HRESULT hr;
-unsigned buffer_offset, buffer_size;
+unsigned buffer_size;
+void *map;
  
  DBG("This=%p SrcStartIndex=%u DestIndex=%u VertexCount=%u "

  "pDestBuffer=%p pVertexDecl=%p Flags=%d\n",
  This, SrcStartIndex, DestIndex, VertexCount, pDestBuffer,
  pVertexDecl, Flags);
  
-if (!screen->get_param(screen, PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS))
-STUB(D3DERR_INVALIDCALL);
+if (!screen_sw->get_param(screen_sw, PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS)) {
+DBG("ProcessVertices not supported\n");
+return D3DERR_INVALIDCALL;
+}
  
-nine_update_state(This);
  
-/* TODO: Create shader with stream output. */
-STUB(D3DERR_INVALIDCALL);
-struct NineVertexBuffer9 *dst = NineVertexBuffer9(pDestBuffer);
+vs = This->state.programmable_vs ? This->state.vs : This->ff.vs;
+/* Note: version is 0 for ff */
+user_assert(vdecl || (vs->byte_code.version < 0x30 && dst->desc.FVF),
+D3DERR_INVALIDCALL);
+if (!vdecl) {
+DWORD FVF = dst->desc.FVF;
+vdecl = util_hash_table_get(This->ff.ht_fvf, &FVF);
+if (!vdecl) {
+hr = NineVertexDeclaration9_new_from_fvf(This, FVF, &vdecl);
+if (FAILED(hr))
+return hr;
+vdecl->fvf = FVF;
+util_hash_table_set(This->ff.ht_fvf, &vdecl->fvf, vdecl);
+NineUnknown_ConvertRefToBind(NineUnknown(vdecl));
+}
+}
  
-vs = This->state.vs ? This->state.vs : This->ff.vs;
+/* Flags: Can be 0 or D3DPV_DONOTCOPYDATA, and/or lock flags
+ * D3DPV_DONOTCOPYDATA -> Has effect only for ff. In particular
+ * if not set, everything from src will be used, and dst
+ * must match exactly the ff vs outputs.
+ * TODO: Handle all the checks, etc for ff */
+user_assert(vdecl->position_t || This->state.programmable_vs,
+D3DERR_INVALIDCALL);
+
+/* TODO: Support vs < 3 and ff */
+user_assert(vs->byte_code.version == 0x30,
+D3DERR_INVALIDCALL);
+/* TODO: Not hardcode the constant buffers for swvp */
+user_assert(This->may_swvp,
+D3DERR_INVALIDCALL);
+
+nine_state_prepare_draw_sw(This, vdecl, SrcStartIndex, VertexCount, &so);
  
-buffer_size = VertexCount * vs->so->stride[0];
-if (1) {
+buffer_size = VertexCount * so.stride[0] * 4;
+{
  struct pipe_resource templ;
  
  templ.target = PIPE_BUFFER;

@@ -3212,49 +3262,50 @@ NineDevice9_ProcessVertices( struct NineDevice9 *This,
  templ.height0 = templ.depth0 = templ.array_size = 1;
  templ.last_level = templ.nr_samples = 0;
  
-resource = This->screen->resource_create(This->screen, &templ);
+resource = screen_sw->resource_create(screen_sw, &templ);
  if (!resource)
  return E_OUTOFMEMORY;
-buffer_offset = 0;
-} else {
-/* SO matches vertex declaration */
-resource = NineVertexBuffer9_GetResource(dst);
-buffer_offset = DestIndex * vs->so->stride[0];
  }
-target = This->pipe->create_stream_output_target(This->pipe, resource,
- buffer_offset,
- buffer_size);
+target = pipe_sw->create_stream_output_target(pipe_sw, resource,
+  0, buffer_size);
  if (!target) {
  pipe_resource_reference(&resource, NULL);
  return D3DERR_DRIVERINTERNALERROR;
  }
  
-if (!vdecl) {
-hr = NineVertexDeclaration9_new_from_fvf(This, dst->desc.FVF, &vdecl);
-if (FAILED(hr))
-goto out;
-}
-
  init_draw_info(&draw, This, D3DPT_POINTLIST, VertexCount);
  draw.instance_count = 1;
  draw.indexed = FALSE;
-draw.start = SrcStartIndex;
+draw.start = 0;
  draw.index_bias = 0;
-draw.min_index = SrcStartIndex;
-draw.max_index = SrcStartIndex + VertexCount - 1;
+dr

[Mesa-dev] [PATCH] reviewers: Throw myself on the GLX grenade

2016-10-06 Thread Adam Jackson

Signed-off-by: Adam Jackson 
---
 REVIEWERS | 4 
 1 file changed, 4 insertions(+)

diff --git a/REVIEWERS b/REVIEWERS
index f7574b3..f822421 100644
--- a/REVIEWERS
+++ b/REVIEWERS
@@ -104,3 +104,7 @@ F: src/egl/drivers/dri2/platform_wayland.c
 FREEDRENO
 R: Rob Clark 
 F: src/gallium/drivers/freedreno/
+
+GLX
+R: Adam Jackson 
+F: src/glx/
-- 
2.9.3


Re: [Mesa-dev] [PATCH 1/2] anv/cmd_buffer: Don't call set_subpass in a secondary

2016-10-06 Thread Jason Ekstrand

On Thu, Oct 6, 2016 at 12:30 PM, Nanley Chery  wrote:

> On Wed, Oct 05, 2016 at 05:36:43PM -0700, Jason Ekstrand wrote:
> > Initially, we had intended set_subpass to be an interesting function that
> > did whatever (presumably a lot) setup we needed for a subpass.  In
> reality,
> > it just sets a pointer and a dirty bit and then emits depth and stencil
> > state.  When we call BeginCommandBuffer on a secondary, all of the dirty
> > bits are already set and there's no point in setting depth and stencil
> > state since it will already be set by the primary.  Instead, the only
> thing
> > we need to do at the start of a secondary is set the subpass pointer.
> >
> > Signed-off-by: Jason Ekstrand 
> > ---
> >  src/intel/vulkan/anv_cmd_buffer.c  | 39 +-
> 
> >  src/intel/vulkan/anv_genX.h|  3 ---
> >  src/intel/vulkan/anv_private.h |  3 ---
> >  src/intel/vulkan/genX_cmd_buffer.c |  5 +
> >  4 files changed, 2 insertions(+), 48 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_cmd_buffer.c
> b/src/intel/vulkan/anv_cmd_buffer.c
> > index 9dedde8..ef13dfc 100644
> > --- a/src/intel/vulkan/anv_cmd_buffer.c
> > +++ b/src/intel/vulkan/anv_cmd_buffer.c
> > @@ -407,10 +407,8 @@ VkResult anv_BeginCommandBuffer(
> >cmd_buffer->state.pass =
> >   anv_render_pass_from_handle(pBeginInfo->pInheritanceInfo->
> renderPass);
> >
> > -  struct anv_subpass *subpass =
> > +  cmd_buffer->state.subpass =
> >   &cmd_buffer->state.pass->subpasses[pBeginInfo->
> pInheritanceInfo->subpass];
> > -
> > -  anv_cmd_buffer_set_subpass(cmd_buffer, subpass);
>
> I'm not sure why we always set the fragment descriptor bit in
> set_subpass, but it seems like we need to do it here as well to keep
> the logic the same. I don't see where we set the dirty bits on a
> secondary command buffer at BeginCommandBuffer. Aside from that, this
> patch looks good.
>

Initially, I think we did it to ensure that binding tables got re-emitted.
However, we're now also re-emitting binding tables on pipeline changes and
you have a pipeline change at the top of every subpass, so it shouldn't be
needed in either place.


> -Nanley
>
> > }
> >
> > return VK_SUCCESS;
> > @@ -1050,41 +1048,6 @@ anv_cmd_buffer_merge_dynamic(struct
> anv_cmd_buffer *cmd_buffer,
> > return state;
> >  }
> >
> > -/**
> > - * @brief Setup the command buffer for recording commands inside the
> given
> > - * subpass.
> > - *
> > - * This does not record all commands needed for starting the subpass.
> > - * Starting the subpass may require additional commands.
> > - *
> > - * Note that vkCmdBeginRenderPass, vkCmdNextSubpass, and
> vkBeginCommandBuffer
> > - * with VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT, all setup the
> > - * command buffer for recording commands for some subpass.  But only
> the first
> > - * two, vkCmdBeginRenderPass and vkCmdNextSubpass, can start a subpass.
> > - */
> > -void
> > -anv_cmd_buffer_set_subpass(struct anv_cmd_buffer *cmd_buffer,
> > -   struct anv_subpass *subpass)
> > -{
> > -   switch (cmd_buffer->device->info.gen) {
> > -   case 7:
> > -  if (cmd_buffer->device->info.is_haswell) {
> > - gen75_cmd_buffer_set_subpass(cmd_buffer, subpass);
> > -  } else {
> > - gen7_cmd_buffer_set_subpass(cmd_buffer, subpass);
> > -  }
> > -  break;
> > -   case 8:
> > -  gen8_cmd_buffer_set_subpass(cmd_buffer, subpass);
> > -  break;
> > -   case 9:
> > -  gen9_cmd_buffer_set_subpass(cmd_buffer, subpass);
> > -  break;
> > -   default:
> > -  unreachable("unsupported gen\n");
> > -   }
> > -}
> > -
> >  struct anv_state
> >  anv_cmd_buffer_push_constants(struct anv_cmd_buffer *cmd_buffer,
> >gl_shader_stage stage)
> > diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
> > index 02e79c2..dc2dd5d 100644
> > --- a/src/intel/vulkan/anv_genX.h
> > +++ b/src/intel/vulkan/anv_genX.h
> > @@ -36,9 +36,6 @@ struct anv_state
> >  genX(cmd_buffer_alloc_null_surface_state)(struct anv_cmd_buffer
> *cmd_buffer,
> >struct anv_framebuffer *fb);
> >
> > -void genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
> > -  struct anv_subpass *subpass);
> > -
> >  void genX(cmd_buffer_apply_pipe_flushes)(struct anv_cmd_buffer
> *cmd_buffer);
> >
> >  void genX(flush_pipeline_select_3d)(struct anv_cmd_buffer *cmd_buffer);
> > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
> private.h
> > index 443c31f..4fa403f 100644
> > --- a/src/intel/vulkan/anv_private.h
> > +++ b/src/intel/vulkan/anv_private.h
> > @@ -1394,9 +1394,6 @@ void anv_cmd_buffer_emit_state_base_address(struct
> anv_cmd_buffer *cmd_buffer);
> >  void anv_cmd_state_setup_attachments(struct anv_cmd_buffer *cmd_buffer,
> >   const VkRenderPassBeginInfo *info);
> >

Re: [Mesa-dev] [PATCH 1/2] anv/cmd_buffer: Don't call set_subpass in a secondary

2016-10-06 Thread Nanley Chery

On Wed, Oct 05, 2016 at 05:36:43PM -0700, Jason Ekstrand wrote:
> Initially, we had intended set_subpass to be an interesting function that
> did whatever (presumably a lot) setup we needed for a subpass.  In reality,
> it just sets a pointer and a dirty bit and then emits depth and stencil
> state.  When we call BeginCommandBuffer on a secondary, all of the dirty
> bits are already set and there's no point in setting depth and stencil
> state since it will already be set by the primary.  Instead, the only thing
> we need to do at the start of a secondary is set the subpass pointer.
> 
> Signed-off-by: Jason Ekstrand 
> ---
>  src/intel/vulkan/anv_cmd_buffer.c  | 39 
> +-
>  src/intel/vulkan/anv_genX.h|  3 ---
>  src/intel/vulkan/anv_private.h |  3 ---
>  src/intel/vulkan/genX_cmd_buffer.c |  5 +
>  4 files changed, 2 insertions(+), 48 deletions(-)
> 
> diff --git a/src/intel/vulkan/anv_cmd_buffer.c 
> b/src/intel/vulkan/anv_cmd_buffer.c
> index 9dedde8..ef13dfc 100644
> --- a/src/intel/vulkan/anv_cmd_buffer.c
> +++ b/src/intel/vulkan/anv_cmd_buffer.c
> @@ -407,10 +407,8 @@ VkResult anv_BeginCommandBuffer(
>cmd_buffer->state.pass =
>   
> anv_render_pass_from_handle(pBeginInfo->pInheritanceInfo->renderPass);
>  
> -  struct anv_subpass *subpass =
> +  cmd_buffer->state.subpass =
>   
> &cmd_buffer->state.pass->subpasses[pBeginInfo->pInheritanceInfo->subpass];
> -
> -  anv_cmd_buffer_set_subpass(cmd_buffer, subpass);

I'm not sure why we always set the fragment descriptor bit in
set_subpass, but it seems like we need to do it here as well to keep
the logic the same. I don't see where we set the dirty bits on a
secondary command buffer at BeginCommandBuffer. Aside from that, this
patch looks good.

-Nanley

> }
>  
> return VK_SUCCESS;
> @@ -1050,41 +1048,6 @@ anv_cmd_buffer_merge_dynamic(struct anv_cmd_buffer 
> *cmd_buffer,
> return state;
>  }
>  
> -/**
> - * @brief Setup the command buffer for recording commands inside the given
> - * subpass.
> - *
> - * This does not record all commands needed for starting the subpass.
> - * Starting the subpass may require additional commands.
> - *
> - * Note that vkCmdBeginRenderPass, vkCmdNextSubpass, and vkBeginCommandBuffer
> - * with VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT, all setup the
> - * command buffer for recording commands for some subpass.  But only the 
> first
> - * two, vkCmdBeginRenderPass and vkCmdNextSubpass, can start a subpass.
> - */
> -void
> -anv_cmd_buffer_set_subpass(struct anv_cmd_buffer *cmd_buffer,
> -   struct anv_subpass *subpass)
> -{
> -   switch (cmd_buffer->device->info.gen) {
> -   case 7:
> -  if (cmd_buffer->device->info.is_haswell) {
> - gen75_cmd_buffer_set_subpass(cmd_buffer, subpass);
> -  } else {
> - gen7_cmd_buffer_set_subpass(cmd_buffer, subpass);
> -  }
> -  break;
> -   case 8:
> -  gen8_cmd_buffer_set_subpass(cmd_buffer, subpass);
> -  break;
> -   case 9:
> -  gen9_cmd_buffer_set_subpass(cmd_buffer, subpass);
> -  break;
> -   default:
> -  unreachable("unsupported gen\n");
> -   }
> -}
> -
>  struct anv_state
>  anv_cmd_buffer_push_constants(struct anv_cmd_buffer *cmd_buffer,
>gl_shader_stage stage)
> diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
> index 02e79c2..dc2dd5d 100644
> --- a/src/intel/vulkan/anv_genX.h
> +++ b/src/intel/vulkan/anv_genX.h
> @@ -36,9 +36,6 @@ struct anv_state
>  genX(cmd_buffer_alloc_null_surface_state)(struct anv_cmd_buffer *cmd_buffer,
>struct anv_framebuffer *fb);
>  
> -void genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
> -  struct anv_subpass *subpass);
> -
>  void genX(cmd_buffer_apply_pipe_flushes)(struct anv_cmd_buffer *cmd_buffer);
>  
>  void genX(flush_pipeline_select_3d)(struct anv_cmd_buffer *cmd_buffer);
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> index 443c31f..4fa403f 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -1394,9 +1394,6 @@ void anv_cmd_buffer_emit_state_base_address(struct 
> anv_cmd_buffer *cmd_buffer);
>  void anv_cmd_state_setup_attachments(struct anv_cmd_buffer *cmd_buffer,
>   const VkRenderPassBeginInfo *info);
>  
> -void anv_cmd_buffer_set_subpass(struct anv_cmd_buffer *cmd_buffer,
> -  struct anv_subpass *subpass);
> -
>  struct anv_state
>  anv_cmd_buffer_push_constants(struct anv_cmd_buffer *cmd_buffer,
>gl_shader_stage stage);
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
> b/src/intel/vulkan/genX_cmd_buffer.c
> index 6a84383..1dff6a1 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -1292,10 +

[Mesa-dev] [PATCH] anv: Return correct result in EnumeratePhysicalDevices

2016-10-06 Thread Nicolas Koch

If pPhysicalDevices is too small for all physical devices,
the driver must return VK_INCOMPLETE.
Since only a single physical device is supported, this is only the case
when *pPhysicalDeviceCount == 0 && pPhysicalDevices != NULL.
---
 src/intel/vulkan/anv_device.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index c7b9979..76cbb69 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -385,6 +385,8 @@ VkResult anv_EnumeratePhysicalDevices(
} else if (*pPhysicalDeviceCount >= 1) {
   pPhysicalDevices[0] = 
anv_physical_device_to_handle(&instance->physicalDevice);
   *pPhysicalDeviceCount = 1;
+   } else if (*pPhysicalDeviceCount < instance->physicalDeviceCount) {
+  return VK_INCOMPLETE;
} else {
   *pPhysicalDeviceCount = 0;
}
-- 
2.10.0

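The count/array contract described in the commit message can be sketched
outside the driver. This is a generic illustration of the Vulkan enumeration
idiom, not anv code: sketch_enumerate, the sketch_result enum and the
int-valued "devices" are all hypothetical names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes standing in for VK_SUCCESS / VK_INCOMPLETE. */
enum sketch_result { SKETCH_SUCCESS, SKETCH_INCOMPLETE };

/* Two-call enumeration:
 *  - out == NULL: only report how many items exist.
 *  - out != NULL: copy at most *count items, store the number written in
 *    *count, and report INCOMPLETE if the array was too small. */
static enum sketch_result
sketch_enumerate(const int *available, uint32_t available_count,
                 uint32_t *count, int *out)
{
   assert(count != NULL);

   if (out == NULL) {
      *count = available_count;
      return SKETCH_SUCCESS;
   }

   uint32_t n = *count < available_count ? *count : available_count;
   for (uint32_t i = 0; i < n; i++)
      out[i] = available[i];
   *count = n;

   return n < available_count ? SKETCH_INCOMPLETE : SKETCH_SUCCESS;
}
```

With a single available device, passing a non-NULL array and a count of 0
is the only way to get the INCOMPLETE result, which is the case the patch
description singles out.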

Re: [Mesa-dev] [PATCH 04/75] st/nine: Fix some check flags

2016-10-06 Thread Axel Davy


On 06/10/2016 11:34, Emil Velikov wrote:

On 5 October 2016 at 21:08, Axel Davy  wrote:

Uses the new defines introduced in previous commit.


Please describe why the newly introduced flags are used over the
present ones. Worth copying some of the in-source comment or
referencing it ?

-Emil


Do you mean write for example:

"Uses the new defines introduced in previous commit.
See comment in the commit for more explanation."

?

Axel

Re: [Mesa-dev] [PATCH] clover: Allow OpenCL version override

2016-10-06 Thread Vedran Miletić

On 10/06/2016 07:15 PM, Jan Vesely wrote:
> On Thu, 2016-10-06 at 16:26 +0200, Vedran Miletić wrote:
>> CLOVER_CL_VERSION_OVERRIDE allows overriding default OpenCL version
>> supported by Clover, analogous to MESA_GL_VERSION_OVERRIDE for
>> OpenGL.
>> CLOVER_CL_C_VERSION_OVERRIDE allows overriding the default OpenCL C
>> version.
> 
> What's the use of CL_C_VERSION_OVERRIDE? As implemented, it only
> modifies the behaviour of the device API query. The specs say that it's
> also the default value of -cl-std used by the compiler. Does it make
> sense to add a "-cl-std=" option if CLOVER_CL_C_VERSION_OVERRIDE is
> present?
> 
> Jan
> 

You are right, I will look into it.

Thanks,
Vedran

-- 
Vedran Miletić
vedran.miletic.net

Re: [Mesa-dev] [PATCH 3/3] softpipe: Cap to 2 GB on 32 bits

2016-10-06 Thread Roland Scheidegger

Am 06.10.2016 um 19:51 schrieb Axel Davy:
> On 32-bit systems, application memory is quite limited.
> softpipe uses application memory. To help prevent memory
> exhaustion, limit the reported memory availability to 2 GB.
> 
> Some gallium nine apps do check reported memory by allocating
> resources until memory is full. Gallium nine refuses allocations
> when 80% of the reported memory limit is used. This change
> helps some apps to start.
> 
> Signed-off-by: Axel Davy 
> ---
>  src/gallium/drivers/softpipe/sp_screen.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/src/gallium/drivers/softpipe/sp_screen.c 
> b/src/gallium/drivers/softpipe/sp_screen.c
> index cd4269f..b2c3c7a 100644
> --- a/src/gallium/drivers/softpipe/sp_screen.c
> +++ b/src/gallium/drivers/softpipe/sp_screen.c
> @@ -230,6 +230,12 @@ softpipe_get_param(struct pipe_screen *screen, enum 
> pipe_cap param)
>if (!os_get_total_physical_memory(&system_memory))
>   return 0;
>  
> +  if (sizeof(void *) == 4)
> + /* Cap to 2 GB on 32 bits system. We do this because softpipe does
> +  * eat application memory, which is quite limited on 32 bits. App
> +  * shouldn't expect too much available memory. */
> + system_memory = MIN2(system_memory, 2048 << 20);
> +
>return (int)(system_memory >> 20);
> }
> case PIPE_CAP_UMA:
> 

For 1-3/3 (and the new 2/75 for the aliasing issue)
Reviewed-by: Roland Scheidegger 


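As a side note on the cap itself, the idea can be sketched as a standalone
helper. This is an illustration only: sketch_video_memory_mib and its
pointer_size parameter are hypothetical, not softpipe code. The sketch builds
the 2 GB limit as a 64-bit constant, since 2048 << 20 equals 2^31 and does
not fit in a signed int on platforms where int is 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: given the total physical memory in bytes and the
 * pointer size of the process, return the amount to report, in MiB. */
static int
sketch_video_memory_mib(uint64_t system_memory_bytes, size_t pointer_size)
{
   /* 2 GiB, computed in 64-bit arithmetic to avoid overflowing int. */
   const uint64_t cap_bytes = UINT64_C(2048) << 20;

   assert(pointer_size == 4 || pointer_size == 8);

   /* Only 32-bit processes are capped; their address space is the limit,
    * not the amount of RAM installed. */
   if (pointer_size == 4 && system_memory_bytes > cap_bytes)
      system_memory_bytes = cap_bytes;

   return (int)(system_memory_bytes >> 20);
}
```

A caller in the spirit of the patch would pass sizeof(void *) as
pointer_size, so an 8 GiB machine reports 2048 MiB to a 32-bit process and
8192 MiB to a 64-bit one.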

Re: [Mesa-dev] [PATCH 02/75] gallium/util: Really allow aliasing for u_box_union_*

2016-10-06 Thread Axel Davy


Sorry, ignore this, I sent a new one with the corrected title.

On 06/10/2016 19:57, Axel Davy wrote:

Gallium nine relies on aliasing to work with this function.
Without this patch, dirty region tracking was incorrect, which
could lead to incorrect textures or vertex buffers.
Fixes several game bugs with nine.
Fixes https://github.com/iXit/Mesa-3D/issues/234

Signed-off-by: Axel Davy 
Reviewed-by: Patrick Rudolph 

Cc: "12.0" 
---
  src/gallium/auxiliary/util/u_box.h | 31 ---
  1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_box.h 
b/src/gallium/auxiliary/util/u_box.h
index eb41f8a..b3f478e 100644
--- a/src/gallium/auxiliary/util/u_box.h
+++ b/src/gallium/auxiliary/util/u_box.h
@@ -124,11 +124,15 @@ static inline void
  u_box_union_2d(struct pipe_box *dst,
 const struct pipe_box *a, const struct pipe_box *b)
  {
-   dst->x = MIN2(a->x, b->x);
-   dst->y = MIN2(a->y, b->y);
+   int x, y;
  
-   dst->width = MAX2(a->x + a->width, b->x + b->width) - dst->x;
-   dst->height = MAX2(a->y + a->height, b->y + b->height) - dst->y;
+   x = MIN2(a->x, b->x);
+   y = MIN2(a->y, b->y);
+
+   dst->width = MAX2(a->x + a->width, b->x + b->width) - x;
+   dst->height = MAX2(a->y + a->height, b->y + b->height) - y;
+   dst->x = x;
+   dst->y = y;
  }
  
  /* Aliasing of @dst permitted. */

@@ -136,13 +140,18 @@ static inline void
  u_box_union_3d(struct pipe_box *dst,
 const struct pipe_box *a, const struct pipe_box *b)
  {
-   dst->x = MIN2(a->x, b->x);
-   dst->y = MIN2(a->y, b->y);
-   dst->z = MIN2(a->z, b->z);
-
-   dst->width = MAX2(a->x + a->width, b->x + b->width) - dst->x;
-   dst->height = MAX2(a->y + a->height, b->y + b->height) - dst->y;
-   dst->depth = MAX2(a->z + a->depth, b->z + b->depth) - dst->z;
+   int x, y, z;
+
+   x = MIN2(a->x, b->x);
+   y = MIN2(a->y, b->y);
+   z = MIN2(a->z, b->z);
+
+   dst->width = MAX2(a->x + a->width, b->x + b->width) - x;
+   dst->height = MAX2(a->y + a->height, b->y + b->height) - y;
+   dst->depth = MAX2(a->z + a->depth, b->z + b->depth) - z;
+   dst->x = x;
+   dst->y = y;
+   dst->z = z;
  }
  
  static inline boolean




[Mesa-dev] [PATCH 02/75] gallium/util: Really allow aliasing of dst for u_box_union_*

2016-10-06 Thread Axel Davy

Gallium nine relies on aliasing to work with this function.
Without this patch, dirty region tracking was incorrect, which
could lead to incorrect textures or vertex buffers.
Fixes several game bugs with nine.
Fixes https://github.com/iXit/Mesa-3D/issues/234

Signed-off-by: Axel Davy 
Reviewed-by: Patrick Rudolph 

Cc: "12.0" 
---
 src/gallium/auxiliary/util/u_box.h | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_box.h 
b/src/gallium/auxiliary/util/u_box.h
index eb41f8a..b3f478e 100644
--- a/src/gallium/auxiliary/util/u_box.h
+++ b/src/gallium/auxiliary/util/u_box.h
@@ -124,11 +124,15 @@ static inline void
 u_box_union_2d(struct pipe_box *dst,
const struct pipe_box *a, const struct pipe_box *b)
 {
-   dst->x = MIN2(a->x, b->x);
-   dst->y = MIN2(a->y, b->y);
+   int x, y;
 
-   dst->width = MAX2(a->x + a->width, b->x + b->width) - dst->x;
-   dst->height = MAX2(a->y + a->height, b->y + b->height) - dst->y;
+   x = MIN2(a->x, b->x);
+   y = MIN2(a->y, b->y);
+
+   dst->width = MAX2(a->x + a->width, b->x + b->width) - x;
+   dst->height = MAX2(a->y + a->height, b->y + b->height) - y;
+   dst->x = x;
+   dst->y = y;
 }
 
 /* Aliasing of @dst permitted. */
@@ -136,13 +140,18 @@ static inline void
 u_box_union_3d(struct pipe_box *dst,
const struct pipe_box *a, const struct pipe_box *b)
 {
-   dst->x = MIN2(a->x, b->x);
-   dst->y = MIN2(a->y, b->y);
-   dst->z = MIN2(a->z, b->z);
-
-   dst->width = MAX2(a->x + a->width, b->x + b->width) - dst->x;
-   dst->height = MAX2(a->y + a->height, b->y + b->height) - dst->y;
-   dst->depth = MAX2(a->z + a->depth, b->z + b->depth) - dst->z;
+   int x, y, z;
+
+   x = MIN2(a->x, b->x);
+   y = MIN2(a->y, b->y);
+   z = MIN2(a->z, b->z);
+
+   dst->width = MAX2(a->x + a->width, b->x + b->width) - x;
+   dst->height = MAX2(a->y + a->height, b->y + b->height) - y;
+   dst->depth = MAX2(a->z + a->depth, b->z + b->depth) - z;
+   dst->x = x;
+   dst->y = y;
+   dst->z = z;
 }
 
 static inline boolean
-- 
2.10.0
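
The aliasing problem described in this patch can be reproduced with a small
stand-alone sketch. The struct box2d type and the two function names below are
hypothetical stand-ins for struct pipe_box and u_box_union_2d, reduced to two
dimensions; only the order of the assignments is taken from the diff.

```c
#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical 2D stand-in for struct pipe_box. */
struct box2d {
   int x, y;
   int width, height;
};

/* Pre-patch ordering: dst->x and dst->y are written first, so when dst
 * aliases a, the later reads of a->x and a->y see the new values. */
static void
box_union_2d_old(struct box2d *dst,
                 const struct box2d *a, const struct box2d *b)
{
   dst->x = MIN2(a->x, b->x);
   dst->y = MIN2(a->y, b->y);
   dst->width = MAX2(a->x + a->width, b->x + b->width) - dst->x;
   dst->height = MAX2(a->y + a->height, b->y + b->height) - dst->y;
}

/* Patched ordering: the origin is kept in locals and dst->x / dst->y are
 * written last, after every read of a and b has happened. */
static void
box_union_2d_new(struct box2d *dst,
                 const struct box2d *a, const struct box2d *b)
{
   int x = MIN2(a->x, b->x);
   int y = MIN2(a->y, b->y);

   dst->width = MAX2(a->x + a->width, b->x + b->width) - x;
   dst->height = MAX2(a->y + a->height, b->y + b->height) - y;
   dst->x = x;
   dst->y = y;
}
```

With a = {10, 10, 10, 10} and b = {0, 0, 5, 5}, calling the old version with
dst == &a yields a 10x10 box at the origin instead of the expected 20x20 one,
because a->x has already been overwritten with 0 when the width is computed.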


[Mesa-dev] [PATCH 02/75] gallium/util: Really allow aliasing for u_box_union_*

2016-10-06 Thread Axel Davy

Gallium nine relies on aliasing to work with this function.
Without this patch, dirty region tracking was incorrect, which
could lead to incorrect textures or vertex buffers.
Fixes several game bugs with nine.
Fixes https://github.com/iXit/Mesa-3D/issues/234

Signed-off-by: Axel Davy 
Reviewed-by: Patrick Rudolph 

Cc: "12.0" 
---
 src/gallium/auxiliary/util/u_box.h | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_box.h 
b/src/gallium/auxiliary/util/u_box.h
index eb41f8a..b3f478e 100644
--- a/src/gallium/auxiliary/util/u_box.h
+++ b/src/gallium/auxiliary/util/u_box.h
@@ -124,11 +124,15 @@ static inline void
 u_box_union_2d(struct pipe_box *dst,
const struct pipe_box *a, const struct pipe_box *b)
 {
-   dst->x = MIN2(a->x, b->x);
-   dst->y = MIN2(a->y, b->y);
+   int x, y;
 
-   dst->width = MAX2(a->x + a->width, b->x + b->width) - dst->x;
-   dst->height = MAX2(a->y + a->height, b->y + b->height) - dst->y;
+   x = MIN2(a->x, b->x);
+   y = MIN2(a->y, b->y);
+
+   dst->width = MAX2(a->x + a->width, b->x + b->width) - x;
+   dst->height = MAX2(a->y + a->height, b->y + b->height) - y;
+   dst->x = x;
+   dst->y = y;
 }
 
 /* Aliasing of @dst permitted. */
@@ -136,13 +140,18 @@ static inline void
 u_box_union_3d(struct pipe_box *dst,
const struct pipe_box *a, const struct pipe_box *b)
 {
-   dst->x = MIN2(a->x, b->x);
-   dst->y = MIN2(a->y, b->y);
-   dst->z = MIN2(a->z, b->z);
-
-   dst->width = MAX2(a->x + a->width, b->x + b->width) - dst->x;
-   dst->height = MAX2(a->y + a->height, b->y + b->height) - dst->y;
-   dst->depth = MAX2(a->z + a->depth, b->z + b->depth) - dst->z;
+   int x, y, z;
+
+   x = MIN2(a->x, b->x);
+   y = MIN2(a->y, b->y);
+   z = MIN2(a->z, b->z);
+
+   dst->width = MAX2(a->x + a->width, b->x + b->width) - x;
+   dst->height = MAX2(a->y + a->height, b->y + b->height) - y;
+   dst->depth = MAX2(a->z + a->depth, b->z + b->depth) - z;
+   dst->x = x;
+   dst->y = y;
+   dst->z = z;
 }
 
 static inline boolean
-- 
2.10.0


[Mesa-dev] [PATCH 3/3] softpipe: Cap to 2 GB on 32 bits

2016-10-06 Thread Axel Davy

On 32-bit systems, application memory is quite limited.
softpipe uses application memory. To help prevent memory
exhaustion, limit the reported memory availability to 2 GB.

Some gallium nine apps check the reported memory by allocating
resources until memory is full. Gallium nine refuses allocations
once 80% of the reported memory limit is in use. This change
helps some apps start.

Signed-off-by: Axel Davy 
---
 src/gallium/drivers/softpipe/sp_screen.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/softpipe/sp_screen.c 
b/src/gallium/drivers/softpipe/sp_screen.c
index cd4269f..b2c3c7a 100644
--- a/src/gallium/drivers/softpipe/sp_screen.c
+++ b/src/gallium/drivers/softpipe/sp_screen.c
@@ -230,6 +230,12 @@ softpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
   if (!os_get_total_physical_memory(&system_memory))
  return 0;
 
+  if (sizeof(void *) == 4)
+ /* Cap to 2 GB on 32 bits system. We do this because softpipe does
+  * eat application memory, which is quite limited on 32 bits. App
+  * shouldn't expect too much available memory. */
+ system_memory = MIN2(system_memory, 2048 << 20);
+
   return (int)(system_memory >> 20);
}
case PIPE_CAP_UMA:
-- 
2.10.0
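
A minimal sketch of the cap described above, under some assumptions: the
helper name reported_memory_mib and its is_32bit parameter are hypothetical
(the patch tests sizeof(void *) == 4 inline), and the result is taken to be in
MiB, as the ">> 20" in the patch suggests. One detail worth noting is that the
literal 2048 << 20 is evaluated in int, and 2^31 does not fit in a 32-bit int,
so this sketch spells the limit as a 64-bit constant instead.

```c
#include <stdint.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper: converts the physical memory size in bytes into
 * the value reported to the application, in MiB. */
static int
reported_memory_mib(uint64_t system_memory, int is_32bit)
{
   if (is_32bit) {
      /* 2 GiB, written as a 64-bit constant so the shift cannot
       * overflow a 32-bit int. */
      system_memory = MIN2(system_memory, UINT64_C(2048) << 20);
   }

   return (int)(system_memory >> 20);
}
```

On a 32-bit build with 8 GiB of RAM this reports 2048, so the 80% threshold
mentioned in the commit message would sit at roughly 1638 MiB.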


[Mesa-dev] [PATCH 1/3] gallium/os: Fix overflow on 32 bits

2016-10-06 Thread Axel Davy

On systems with more than 4 GB of RAM,
os_get_total_physical_memory triggered an integer
overflow in the Linux and Haiku paths when built for
32 bits.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94561

Signed-off-by: Axel Davy 
---
 src/gallium/auxiliary/os/os_misc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/os/os_misc.c 
b/src/gallium/auxiliary/os/os_misc.c
index 82e4957..5e2bedc 100644
--- a/src/gallium/auxiliary/os/os_misc.c
+++ b/src/gallium/auxiliary/os/os_misc.c
@@ -128,7 +128,7 @@ os_get_total_physical_memory(uint64_t *size)
const long phys_pages = sysconf(_SC_PHYS_PAGES);
const long page_size = sysconf(_SC_PAGE_SIZE);
 
-   *size = phys_pages * page_size;
+   *size = (int64_t)phys_pages * (int64_t)page_size;
return (phys_pages > 0 && page_size > 0);
 #elif defined(PIPE_OS_APPLE) || defined(PIPE_OS_BSD)
size_t len = sizeof(*size);
@@ -153,7 +153,7 @@ os_get_total_physical_memory(uint64_t *size)
status_t ret;
 
ret = get_system_info(&info);
-   *size = info.max_pages * B_PAGE_SIZE;
+   *size = (int64_t)info.max_pages * (int64_t)B_PAGE_SIZE;
return (ret == B_OK);
 #elif defined(PIPE_OS_WINDOWS)
MEMORYSTATUSEX status;
-- 
2.10.0
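
The overflow can be illustrated with fixed-width types standing in for a
32-bit long. This is only a sketch: the function names are hypothetical, and
unsigned types are used so that the wrap-around is well defined here (with a
signed long, as in the original code, the overflow is undefined behavior). The
sketch also widens to uint64_t, whereas the patch casts to int64_t.

```c
#include <stdint.h>

/* Models the pre-patch arithmetic on a target where long is 32 bits:
 * the product is computed in 32 bits and only then widened. */
static uint64_t
total_bytes_32bit_math(uint32_t phys_pages, uint32_t page_size)
{
   uint32_t product = phys_pages * page_size; /* wraps modulo 2^32 */

   return product;
}

/* Models the patched arithmetic: both operands are widened first, so the
 * multiplication itself is carried out in 64 bits. */
static uint64_t
total_bytes_64bit_math(uint32_t phys_pages, uint32_t page_size)
{
   return (uint64_t)phys_pages * (uint64_t)page_size;
}
```

For 8 GiB of RAM with 4 KiB pages (2097152 pages), the 32-bit product is 2^33
truncated to 32 bits, which is 0, while the widened version gives 8589934592.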


[Mesa-dev] [PATCH 2/3] llvmpipe: Cap to 2 GB on 32 bits

2016-10-06 Thread Axel Davy

On 32-bit systems, application memory is quite limited.
llvmpipe uses application memory. To help prevent memory
exhaustion, limit the reported memory availability to 2 GB.

Some gallium nine apps check the reported memory by allocating
resources until memory is full. Gallium nine refuses allocations
once 80% of the reported memory limit is in use. This change
helps some apps start.

Signed-off-by: Axel Davy 
---
 src/gallium/drivers/llvmpipe/lp_screen.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index 18837a2..9a0a1a2 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -280,6 +280,12 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
   if (!os_get_total_physical_memory(&system_memory))
  return 0;
 
+  if (sizeof(void *) == 4)
+ /* Cap to 2 GB on 32 bits system. We do this because llvmpipe does
+  * eat application memory, which is quite limited on 32 bits. App
+  * shouldn't expect too much available memory. */
+ system_memory = MIN2(system_memory, 2048 << 20);
+
   return (int)(system_memory >> 20);
}
case PIPE_CAP_UMA:
-- 
2.10.0


[Mesa-dev] [PATCH 2/2] radv/winsys: Fix radv_amdgpu_cs_grow min_size argument. (v2)

2016-10-06 Thread Gustaw Smolarczyk

The min_size argument is supposed to be the minimum amount by which we
want to grow the cs, not the minimum size of the cs after growth.

v2: Unbreak use_ib_bos.
Don't mask the ib_size when !use_ib_bos, since it's not needed.
---
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c 
b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
index dedc778..c07c092 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
@@ -178,10 +178,6 @@ radv_amdgpu_cs_create(struct radeon_winsys *ws,
 static void radv_amdgpu_cs_grow(struct radeon_winsys_cs *_cs, size_t min_size)
 {
struct radv_amdgpu_cs *cs = radv_amdgpu_cs(_cs);
-   uint64_t ib_size = MAX2(min_size * 4 + 16, cs->base.max_dw * 4 * 2);
-
-   /* max that fits in the chain size field. */
-   ib_size = MIN2(ib_size, 0xf);
 
if (cs->failed) {
cs->base.cdw = 0;
@@ -189,6 +185,8 @@ static void radv_amdgpu_cs_grow(struct radeon_winsys_cs 
*_cs, size_t min_size)
}
 
if (!cs->ws->use_ib_bos) {
+   uint64_t ib_size = MAX2((cs->base.cdw + min_size) * 4 + 16,
+   cs->base.max_dw * 4 * 2);
uint32_t *new_buf = realloc(cs->base.buf, ib_size);
if (new_buf) {
cs->base.buf = new_buf;
@@ -200,6 +198,11 @@ static void radv_amdgpu_cs_grow(struct radeon_winsys_cs 
*_cs, size_t min_size)
return;
}
 
+   uint64_t ib_size = MAX2(min_size * 4 + 16, cs->base.max_dw * 4 * 2);
+
+   /* max that fits in the chain size field. */
+   ib_size = MIN2(ib_size, 0xf);
+
while (!cs->base.cdw || (cs->base.cdw & 7) != 4)
cs->base.buf[cs->base.cdw++] = 0x1000;
 
-- 
2.10.0
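
The difference between the two size computations in the realloc path
(!use_ib_bos) can be sketched as follows. The function names are hypothetical,
and the parameters stand in for cs->base.cdw (dwords already written),
cs->base.max_dw (current capacity in dwords) and min_size (additional dwords
requested); the formulas themselves are taken from the diff.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Pre-patch size in bytes: treats min_size as the total size wanted and
 * ignores the dwords already written to the buffer. */
static uint64_t
grown_size_old(uint64_t max_dw, size_t min_size)
{
   return MAX2((uint64_t)min_size * 4 + 16, max_dw * 4 * 2);
}

/* Patched size in bytes: the reallocated buffer keeps its cdw dwords of
 * content, so room for cdw + min_size dwords is requested. */
static uint64_t
grown_size_new(uint64_t cdw, uint64_t max_dw, size_t min_size)
{
   return MAX2((cdw + (uint64_t)min_size) * 4 + 16, max_dw * 4 * 2);
}
```

With max_dw = 1024, cdw = 1000 and min_size = 3000, the old formula yields
12016 bytes, which is less than the 16000 bytes needed to hold 4000 dwords;
the new formula yields 16016 bytes.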


Re: [Mesa-dev] [PATCH 08/10] nir: add a loop unrolling pass

2016-10-06 Thread Jason Ekstrand

On Thu, Oct 6, 2016 at 10:22 AM, Jason Ekstrand 
wrote:
>
>
> On Wed, Oct 5, 2016 at 7:25 PM, Timothy Arceri <
> timothy.arc...@collabora.com> wrote:
>
>> Just
>>
>> On Wed, 2016-10-05 at 16:23 -0700, Jason Ekstrand wrote:
>> >
>> >
>> > On Thu, Sep 15, 2016 at 12:03 AM, Timothy Arceri > > abora.com> wrote:
>> > > V2:
>> > > - tidy ups suggested by Connor.
>> > > - tidy up cloning logic and handle copy propagation
>> > >  based on a suggestion by Connor.
>> > > - use nir_ssa_def_rewrite_uses to fix up lcssa phis
>> > >   suggested by Connor.
>> > > - add support for complex loop unrolling (two terminators)
>> > > - handle case where the ssa def's use outside the loop is already a
>> > > phi
>> > > - support unrolling loops with multiple terminators when trip count
>> > >   is known for each terminator
>> > > ---
>> > >  src/compiler/Makefile.sources  |   1 +
>> > >  src/compiler/nir/nir.h |   2 +
>> > >  src/compiler/nir/nir_opt_loop_unroll.c | 820
>> > > +
>> > >  3 files changed, 823 insertions(+)
>> > >  create mode 100644 src/compiler/nir/nir_opt_loop_unroll.c
>> > >
>> > > diff --git a/src/compiler/Makefile.sources
>> > > b/src/compiler/Makefile.sources
>> > > index 8ef6080..b3512bb 100644
>> > > --- a/src/compiler/Makefile.sources
>> > > +++ b/src/compiler/Makefile.sources
>> > > @@ -233,6 +233,7 @@ NIR_FILES = \
>> > > nir/nir_opt_dead_cf.c \
>> > > nir/nir_opt_gcm.c \
>> > > nir/nir_opt_global_to_local.c \
>> > > +   nir/nir_opt_loop_unroll.c \
>> > > nir/nir_opt_peephole_select.c \
>> > > nir/nir_opt_remove_phis.c \
>> > > nir/nir_opt_undef.c \
>> > > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>> > > index 9887432..0513d81 100644
>> > > --- a/src/compiler/nir/nir.h
>> > > +++ b/src/compiler/nir/nir.h
>> > > @@ -2661,6 +2661,8 @@ bool nir_opt_dead_cf(nir_shader *shader);
>> > >
>> > >  bool nir_opt_gcm(nir_shader *shader, bool value_number);
>> > >
>> > > +bool nir_opt_loop_unroll(nir_shader *shader, nir_variable_mode
>> > > indirect_mask);
>> > > +
>> > >  bool nir_opt_peephole_select(nir_shader *shader);
>> > >
>> > >  bool nir_opt_remove_phis(nir_shader *shader);
>> > > diff --git a/src/compiler/nir/nir_opt_loop_unroll.c
>> > > b/src/compiler/nir/nir_opt_loop_unroll.c
>> > > new file mode 100644
>> > > index 000..1de02f6
>> > > --- /dev/null
>> > > +++ b/src/compiler/nir/nir_opt_loop_unroll.c
>> > > @@ -0,0 +1,820 @@
>> > > +/*
>> > > + * Copyright © 2016 Intel Corporation
>> > > + *
>> > > + * Permission is hereby granted, free of charge, to any person
>> > > obtaining a
>> > > + * copy of this software and associated documentation files (the
>> > > "Software"),
>> > > + * to deal in the Software without restriction, including without
>> > > limitation
>> > > + * the rights to use, copy, modify, merge, publish, distribute,
>> > > sublicense,
>> > > + * and/or sell copies of the Software, and to permit persons to
>> > > whom the
>> > > + * Software is furnished to do so, subject to the following
>> > > conditions:
>> > > + *
>> > > + * The above copyright notice and this permission notice
>> > > (including the next
>> > > + * paragraph) shall be included in all copies or substantial
>> > > portions of the
>> > > + * Software.
>> > > + *
>> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> > > EXPRESS OR
>> > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> > > MERCHANTABILITY,
>> > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
>> > > EVENT SHALL
>> > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
>> > > DAMAGES OR OTHER
>> > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> > > ARISING
>> > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
>> > > OTHER
>> > > + * DEALINGS IN THE SOFTWARE.
>> > > + */
>> > > +
>> > > +#include "nir.h"
>> > > +#include "nir_builder.h"
>> > > +#include "nir_control_flow.h"
>> > > +
>> > > +static void
>> > > +extract_loop_body(nir_cf_list *extracted, nir_cf_node *node)
>> >
>> > "node" is not particularly descriptive.  Perhaps "start_node" or
>> > something like that.
>> >
>> > > +{
>> > > +   nir_cf_node *end = node;
>> > > +   while (!nir_cf_node_is_last(end))
>> > > +  end = nir_cf_node_next(end);
>> >
>> > This bit of iteration seems unfortunate.  If you have the loop
>> > pointer, you can just do
>> >
>> > nir_cf_extract(extracted, nir_before_cf_node(node),
>> > nir_after_cf_node(nir_loop_last_cf_node(loop)));
>> >
>> > For that matter, is the helper even needed?  If you don't want to
>> > type that much and want to keep the helper, you could easily get the
>> > loop from node->parent.
>>
>> Sure, this was one of the first bits I wrote before finding the nir
>> helpers. Will see if I can just remove it.
>>
>> >
>> > > +
>> > > +   nir_cf_extract(extracted, nir_before_cf_node(node),
>> > > +  nir_after_c

[Mesa-dev] [Bug 97952] /usr/include/string.h:518:12: error: exception specification in declaration does not match previous declaration

2016-10-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97952

--- Comment #4 from Rob Clark  ---
(In reply to Timothy Arceri from comment #3)
> I believe it's because the ffs param in string.h is an int while the one in
> bitscan.h is unsigned.

oh, yeah, probably should be int in bitscan.h then

(still not entirely sure why this issue starts showing up now, but I guess it
comes down to #include order and luck before)


Re: [Mesa-dev] [PATCH 01/75] llvmpipe: Fix overflow for 32 bits available memory computation

2016-10-06 Thread Axel Davy


On 06/10/2016 11:44, Emil Velikov wrote:
> Hi Axel,
>
> You seem to have forgotten/ignored virtually every suggestion for this
> patch from last time around.
>
> Did you send the wrong patch or ?
> Emil

Well, it was sitting in our repo, and I had convinced myself that I had
addressed the issues.

Apparently not.


I'll send a new version with past and present comments taken into account.


Axel


Re: [Mesa-dev] [PATCH 1/6] st/mesa: simplify some code in get_texture_format_swizzle()

2016-10-06 Thread Marek Olšák

For the series:

Reviewed-by: Marek Olšák 

Marek

On Thu, Oct 6, 2016 at 2:42 AM, Brian Paul  wrote:
> There's no need to cast to st_texture_image.  Just use gl_texture_image.
>
> Reviewed-by: Edward O'Callaghan 
> ---
>  src/mesa/state_tracker/st_atom_texture.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_atom_texture.c 
> b/src/mesa/state_tracker/st_atom_texture.c
> index efc8c90..113c0ed 100644
> --- a/src/mesa/state_tracker/st_atom_texture.c
> +++ b/src/mesa/state_tracker/st_atom_texture.c
> @@ -211,11 +211,11 @@ get_texture_format_swizzle(const struct st_context *st,
> */
>if (_mesa_is_gles3(st->ctx) &&
>util_format_is_depth_or_stencil(stObj->pt->format)) {
> - const struct st_texture_image *firstImage =
> -st_texture_image_const(_mesa_base_tex_image(&stObj->base));
> - if (firstImage->base.InternalFormat != GL_DEPTH_COMPONENT &&
> - firstImage->base.InternalFormat != GL_DEPTH_STENCIL &&
> - firstImage->base.InternalFormat != GL_STENCIL_INDEX)
> + const struct gl_texture_image *firstImage =
> +_mesa_base_tex_image(&stObj->base);
> + if (firstImage->InternalFormat != GL_DEPTH_COMPONENT &&
> + firstImage->InternalFormat != GL_DEPTH_STENCIL &&
> + firstImage->InternalFormat != GL_STENCIL_INDEX)
>  depth_mode = GL_RED;
>}
>tex_swizzle = compute_texture_format_swizzle(baseFormat,
> --
> 1.9.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 08/10] nir: add a loop unrolling pass

2016-10-06 Thread Jason Ekstrand

On Wed, Oct 5, 2016 at 7:25 PM, Timothy Arceri  wrote:

> Just
>
> On Wed, 2016-10-05 at 16:23 -0700, Jason Ekstrand wrote:
> >
> >
> > On Thu, Sep 15, 2016 at 12:03 AM, Timothy Arceri  > abora.com> wrote:
> > > V2:
> > > - tidy ups suggested by Connor.
> > > - tidy up cloning logic and handle copy propagation
> > >  based on a suggestion by Connor.
> > > - use nir_ssa_def_rewrite_uses to fix up lcssa phis
> > >   suggested by Connor.
> > > - add support for complex loop unrolling (two terminators)
> > > - handle case where the ssa def's use outside the loop is already a
> > > phi
> > > - support unrolling loops with multiple terminators when trip count
> > >   is known for each terminator
> > > ---
> > >  src/compiler/Makefile.sources  |   1 +
> > >  src/compiler/nir/nir.h |   2 +
> > >  src/compiler/nir/nir_opt_loop_unroll.c | 820
> > > +
> > >  3 files changed, 823 insertions(+)
> > >  create mode 100644 src/compiler/nir/nir_opt_loop_unroll.c
> > >
> > > diff --git a/src/compiler/Makefile.sources
> > > b/src/compiler/Makefile.sources
> > > index 8ef6080..b3512bb 100644
> > > --- a/src/compiler/Makefile.sources
> > > +++ b/src/compiler/Makefile.sources
> > > @@ -233,6 +233,7 @@ NIR_FILES = \
> > > nir/nir_opt_dead_cf.c \
> > > nir/nir_opt_gcm.c \
> > > nir/nir_opt_global_to_local.c \
> > > +   nir/nir_opt_loop_unroll.c \
> > > nir/nir_opt_peephole_select.c \
> > > nir/nir_opt_remove_phis.c \
> > > nir/nir_opt_undef.c \
> > > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > > index 9887432..0513d81 100644
> > > --- a/src/compiler/nir/nir.h
> > > +++ b/src/compiler/nir/nir.h
> > > @@ -2661,6 +2661,8 @@ bool nir_opt_dead_cf(nir_shader *shader);
> > >
> > >  bool nir_opt_gcm(nir_shader *shader, bool value_number);
> > >
> > > +bool nir_opt_loop_unroll(nir_shader *shader, nir_variable_mode
> > > indirect_mask);
> > > +
> > >  bool nir_opt_peephole_select(nir_shader *shader);
> > >
> > >  bool nir_opt_remove_phis(nir_shader *shader);
> > > diff --git a/src/compiler/nir/nir_opt_loop_unroll.c
> > > b/src/compiler/nir/nir_opt_loop_unroll.c
> > > new file mode 100644
> > > index 000..1de02f6
> > > --- /dev/null
> > > +++ b/src/compiler/nir/nir_opt_loop_unroll.c
> > > @@ -0,0 +1,820 @@
> > > +/*
> > > + * Copyright © 2016 Intel Corporation
> > > + *
> > > + * Permission is hereby granted, free of charge, to any person
> > > obtaining a
> > > + * copy of this software and associated documentation files (the
> > > "Software"),
> > > + * to deal in the Software without restriction, including without
> > > limitation
> > > + * the rights to use, copy, modify, merge, publish, distribute,
> > > sublicense,
> > > + * and/or sell copies of the Software, and to permit persons to
> > > whom the
> > > + * Software is furnished to do so, subject to the following
> > > conditions:
> > > + *
> > > + * The above copyright notice and this permission notice
> > > (including the next
> > > + * paragraph) shall be included in all copies or substantial
> > > portions of the
> > > + * Software.
> > > + *
> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> > > EXPRESS OR
> > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > > MERCHANTABILITY,
> > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
> > > EVENT SHALL
> > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > > DAMAGES OR OTHER
> > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > > ARISING
> > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > > OTHER
> > > + * DEALINGS IN THE SOFTWARE.
> > > + */
> > > +
> > > +#include "nir.h"
> > > +#include "nir_builder.h"
> > > +#include "nir_control_flow.h"
> > > +
> > > +static void
> > > +extract_loop_body(nir_cf_list *extracted, nir_cf_node *node)
> >
> > "node" is not particularly descriptive.  Perhaps "start_node" or
> > something like that.
> >
> > > +{
> > > +   nir_cf_node *end = node;
> > > +   while (!nir_cf_node_is_last(end))
> > > +  end = nir_cf_node_next(end);
> >
> > This bit of iteration seems unfortunate.  If you have the loop
> > pointer, you can just do
> >
> > nir_cf_extract(extracted, nir_before_cf_node(node),
> > nir_after_cf_node(nir_loop_last_cf_node(loop)));
> >
> > For that matter, is the helper even needed?  If you don't want to
> > type that much and want to keep the helper, you could easily get the
> > loop from node->parent.
>
> Sure, this was one of the first bits I wrote before finding the nir
> helpers. Will see if I can just remove it.
>
> >
> > > +
> > > +   nir_cf_extract(extracted, nir_before_cf_node(node),
> > > +  nir_after_cf_node(end));
> > > +}
> > > +
> > > +static void
> > > +clone_list(nir_shader *ns, nir_loop *loop, nir_cf_list
> > > *src_cf_list,
> > > +   nir_cf_list *cloned_cf_list, struct hash_table
> > > *remap_table)
> > > +{

Re: [Mesa-dev] [PATCH 1/2] clover: support CL_PROGRAM_BINARY_TYPE (CL1.2)

2016-10-06 Thread Jan Vesely

On Fri, 2014-12-19 at 16:42 +0100, EdB wrote:
> CL_PROGRAM_BINARY_TYPE has been added to clGetProgramBuildInfo in
> CL1.2
> ---
>  src/gallium/state_trackers/clover/api/program.cpp  |  4 +++
>  src/gallium/state_trackers/clover/core/program.cpp | 31
> +-
>  src/gallium/state_trackers/clover/core/program.hpp |  1 +
>  3 files changed, 29 insertions(+), 7 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/clover/api/program.cpp
> b/src/gallium/state_trackers/clover/api/program.cpp
> index be97ae5..9373fac 100644
> --- a/src/gallium/state_trackers/clover/api/program.cpp
> +++ b/src/gallium/state_trackers/clover/api/program.cpp
> @@ -363,6 +363,10 @@ clGetProgramBuildInfo(cl_program d_prog,
> cl_device_id d_dev,
>    buf.as_string() = prog.build_log(dev);
>    break;
>  
> +   case CL_PROGRAM_BINARY_TYPE:
> +  buf.as_scalar() =
> prog.binary_type(dev);
> +  break;
> +
> default:
>    throw error(CL_INVALID_VALUE);
> }
> diff --git a/src/gallium/state_trackers/clover/core/program.cpp
> b/src/gallium/state_trackers/clover/core/program.cpp
> index 8bece05..5fcde2c 100644
> --- a/src/gallium/state_trackers/clover/core/program.cpp
> +++ b/src/gallium/state_trackers/clover/core/program.cpp
> @@ -147,14 +147,11 @@ program::has_executable() const {
>  
>  bool
>  program::has_linkable(const device &dev) const {
> -   const auto bin = _binaries.find(&dev);
> +   cl_program_binary_type type = binary_type(dev);
>  
> -   if (bin != _binaries.end()) {
> -  const auto &secs = bin->second.secs;
> -  if (any_of(type_equals(module::section::text_compiled), secs)
> ||
> -  any_of(type_equals(module::section::text_library), secs))
> - return true;
> -   }
> +   if (type == CL_PROGRAM_BINARY_TYPE_COMPILED_OBJECT ||
> +   type == CL_PROGRAM_BINARY_TYPE_LIBRARY)
> +  return true;
>  
> return false;
>  }
> @@ -192,6 +189,26 @@ program::build_log(const device &dev) const {
> return _logs.count(&dev) ? _logs.find(&dev)->second : "";
>  }
>  
> +cl_program_binary_type
> +program::binary_type(const device &dev) const {
> +   const auto bin = _binaries.find(&dev);
> +
> +   if (bin != _binaries.end()) {
> +  const auto &secs = bin->second.secs;
> +
> +  if (any_of(type_equals(module::section::text_compiled), secs))
> + return CL_PROGRAM_BINARY_TYPE_COMPILED_OBJECT;
> +
> +  if (any_of(type_equals(module::section::text_library), secs))
> + return CL_PROGRAM_BINARY_TYPE_LIBRARY;
> +
> +  if (any_of(type_equals(module::section::text_executable),
> secs))
> + return CL_PROGRAM_BINARY_TYPE_EXECUTABLE;

Can you add a short comment on the ordering of the queries here? Does it
matter? Can you have sections of different types at the same time?

thanks,
Jan

> +   }
> +
> +   return CL_PROGRAM_BINARY_TYPE_NONE;
> +}
> +
>  const compat::vector &
>  program::symbols() const {
> if (_binaries.empty())
> diff --git a/src/gallium/state_trackers/clover/core/program.hpp
> b/src/gallium/state_trackers/clover/core/program.hpp
> index 19c4420..13abc21 100644
> --- a/src/gallium/state_trackers/clover/core/program.hpp
> +++ b/src/gallium/state_trackers/clover/core/program.hpp
> @@ -63,6 +63,7 @@ namespace clover {
>    cl_build_status build_status(const device &dev) const;
>    std::string build_opts(const device &dev) const;
>    std::string build_log(const device &dev) const;
> +  cl_program_binary_type binary_type(const device &dev) const;
>  
>    const compat::vector &symbols() const;
>  


Re: [Mesa-dev] [PATCH] clover: Allow OpenCL version override

2016-10-06 Thread Jan Vesely

On Thu, 2016-10-06 at 16:26 +0200, Vedran Miletić wrote:
> CLOVER_CL_VERSION_OVERRIDE allows overriding default OpenCL version
> supported by Clover, analogous to MESA_GL_VERSION_OVERRIDE for
> OpenGL.
> CLOVER_CL_C_VERSION_OVERRIDE allows overriding default OpenCL C
> version.

What's the use of CL_C_VERSION_OVERRIDE? As implemented, it only
modifies the behaviour of the device API query. The spec says that it's
also the default value of -cl-std used by the compiler. Does it make
sense to add a "-cl-std=" option if CLOVER_CL_C_VERSION_OVERRIDE is
present?

Jan
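
For illustration, the general shape of such an environment override might look
like the sketch below. It is written in plain C with getenv, and the helper
name version_with_override is hypothetical; the patch itself may read the
variable through a different helper. Only the variable name
CLOVER_CL_VERSION_OVERRIDE and the default "1.1" come from the patch.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the value of the environment variable
 * env_name if it is set and non-empty, otherwise default_version. */
static const char *
version_with_override(const char *env_name, const char *default_version)
{
   const char *value = getenv(env_name);

   return (value != NULL && value[0] != '\0') ? value : default_version;
}
```

A caller could then build the reported string from
version_with_override("CLOVER_CL_VERSION_OVERRIDE", "1.1"); the same value
would also have to reach the compiler options for the question above to be
addressed.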

> ---
>  docs/envvars.html | 12
> 
>  src/gallium/state_trackers/clover/api/device.cpp  |  4 ++--
>  src/gallium/state_trackers/clover/api/platform.cpp|  4 ++--
>  src/gallium/state_trackers/clover/core/device.cpp | 19
> +++
>  src/gallium/state_trackers/clover/core/device.hpp |  4 
>  src/gallium/state_trackers/clover/core/platform.cpp   |  9 +
>  src/gallium/state_trackers/clover/core/platform.hpp   |  3 +++
>  src/gallium/state_trackers/clover/core/program.cpp|  4 +++-
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 18
> ++
>  src/gallium/state_trackers/clover/llvm/invocation.hpp |  1 +
>  src/gallium/state_trackers/clover/llvm/util.hpp   |  4 ++--
>  11 files changed, 71 insertions(+), 11 deletions(-)
> 
> diff --git a/docs/envvars.html b/docs/envvars.html
> index cf57ca5..f76827b 100644
> --- a/docs/envvars.html
> +++ b/docs/envvars.html
> @@ -235,6 +235,18 @@ Setting to "tgsi", for example, will print all
> the TGSI shaders.
>  See src/mesa/state_tracker/st_debug.c for other options.
>  
>  
> +Clover state tracker environment variables
> +
> +
> +CLOVER_CL_VERSION_OVERRIDE - allows overriding OpenCL version
> returned by
> +clGetPlatformInfo(CL_PLATFORM_VERSION) and
> +clGetDeviceInfo(CL_DEVICE_VERSION). Note that the setting sets
> the version
> +of the platform and all the devices to the same value.
> +CLOVER_CL_C_VERSION_OVERRIDE - allows overriding OpenCL C
> version reported
> +by clGetDeviceInfo(CL_DEVICE_OPENCL_C_VERSION) and the value of
> the
> +__OPENCL_VERSION__ macro in the OpenCL compiler.
> +
> +
>  Softpipe driver environment variables
>  
>  SOFTPIPE_DUMP_FS - if set, the softpipe driver will print
> fragment shaders
> diff --git a/src/gallium/state_trackers/clover/api/device.cpp
> b/src/gallium/state_trackers/clover/api/device.cpp
> index f7bd61b..e23de7a 100644
> --- a/src/gallium/state_trackers/clover/api/device.cpp
> +++ b/src/gallium/state_trackers/clover/api/device.cpp
> @@ -301,7 +301,7 @@ clGetDeviceInfo(cl_device_id d_dev,
> cl_device_info param,
>    break;
>  
> case CL_DEVICE_VERSION:
> -  buf.as_string() = "OpenCL 1.1 Mesa " PACKAGE_VERSION
> +  buf.as_string() = "OpenCL " + dev.opencl_version() + " Mesa "
> PACKAGE_VERSION
>  #ifdef MESA_GIT_SHA1
>  " (" MESA_GIT_SHA1 ")"
>  #endif
> @@ -355,7 +355,7 @@ clGetDeviceInfo(cl_device_id d_dev,
> cl_device_info param,
>    break;
>  
> case CL_DEVICE_OPENCL_C_VERSION:
> -  buf.as_string() = "OpenCL C 1.1 ";
> +  buf.as_string() = "OpenCL C " + dev.opencl_c_version() + " ";
>    break;
>  
> case CL_DEVICE_PARENT_DEVICE:
> diff --git a/src/gallium/state_trackers/clover/api/platform.cpp
> b/src/gallium/state_trackers/clover/api/platform.cpp
> index b1b1fdf..f344ec8 100644
> --- a/src/gallium/state_trackers/clover/api/platform.cpp
> +++ b/src/gallium/state_trackers/clover/api/platform.cpp
> @@ -50,7 +50,7 @@ clover::GetPlatformInfo(cl_platform_id d_platform,
> cl_platform_info param,
>  size_t size, void *r_buf, size_t *r_size)
> try {
> property_buffer buf { r_buf, size, r_size };
>  
> -   obj(d_platform);
> +   auto &platform = obj(d_platform);
>  
> switch (param) {
> case CL_PLATFORM_PROFILE:
> @@ -58,7 +58,7 @@ clover::GetPlatformInfo(cl_platform_id d_platform,
> cl_platform_info param,
>    break;
>  
> case CL_PLATFORM_VERSION:
> -  buf.as_string() = "OpenCL 1.1 Mesa " PACKAGE_VERSION
> +  buf.as_string() = "OpenCL " + platform.opencl_version() + "
> Mesa " PACKAGE_VERSION
>  #ifdef MESA_GIT_SHA1
>  " (" MESA_GIT_SHA1 ")"
>  #endif
> diff --git a/src/gallium/state_trackers/clover/core/device.cpp
> b/src/gallium/state_trackers/clover/core/device.cpp
> index 8825f99..fce6fb3 100644
> --- a/src/gallium/state_trackers/clover/core/device.cpp
> +++ b/src/gallium/state_trackers/clover/core/device.cpp
> @@ -24,6 +24,7 @@
>  #include "core/platform.hpp"
>  #include "pipe/p_screen.h"
>  #include "pipe/p_state.h"
> +#include "util/u_debug.h"
>  
>  using namespace clover;
>  
> @@ -48,6 +49,14 @@ device::device(clover::platform &platform,
> pipe_loader_device *ldev) :
>   pipe->destroy(pipe);
>    throw error(CL_INVALID_DEVICE);
> }
> +
> +   const std::string cl_version_override =
>

Re: [Mesa-dev] [PATCH v4] intel: aubinator: generate a standalone binary

2016-10-06 Thread Jason Ekstrand

On Wed, Oct 5, 2016 at 3:56 PM, Lionel Landwerlin 
wrote:

> Embed the xml files into the binary, so aubinator can be used from any
> location.
>
> v2: Split generation packing into another patch (Jason)
> Check for xxd (Jason)
>
> v3: Fix out of tree builds (Jason)
> Generate custom variable name rather than names generated by xxd
> (Lionel)
>
> v4: Move generated _xml.h files to genxml/ (Sirisha)
>
> Signed-off-by: Lionel Landwerlin 
> Cc: Sirisha Gandikota 
> ---
>  configure.ac |  1 +
>  src/intel/Makefile.genxml.am | 10 -
>  src/intel/Makefile.sources   |  8 +++-
>  src/intel/genxml/.gitignore  |  1 +
>  src/intel/tools/aubinator.c  | 37 +
>  src/intel/tools/decoder.c| 94 +-
> --
>  src/intel/tools/decoder.h|  4 +-
>  7 files changed, 102 insertions(+), 53 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index 421f4f3..6b600f5 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -110,6 +110,7 @@ LT_PREREQ([2.2])
>  LT_INIT([disable-static])
>
>  AC_CHECK_PROG(RM, rm, [rm -f])
> +AC_CHECK_PROG(XXD, xxd, [xxd])
>
>  AX_PROG_BISON([],
>AS_IF([test ! -f "$srcdir/src/compiler/glsl/
> glcpp/glcpp-parse.c"],
> diff --git a/src/intel/Makefile.genxml.am b/src/intel/Makefile.genxml.am
> index f80e2fd..0ac6b70 100644
> --- a/src/intel/Makefile.genxml.am
> +++ b/src/intel/Makefile.genxml.am
> @@ -21,7 +21,7 @@
>
>  BUILT_SOURCES += $(GENXML_GENERATED_FILES)
>
> -SUFFIXES = _pack.h .xml
> +SUFFIXES = _pack.h _xml.h .xml
>
>  $(GENXML_GENERATED_FILES): genxml/gen_pack_header.py
>
> @@ -29,6 +29,14 @@ $(GENXML_GENERATED_FILES): genxml/gen_pack_header.py
> $(MKDIR_GEN)
> $(PYTHON_GEN) $(srcdir)/genxml/gen_pack_header.py $< > $@
>
> +%_xml.h:  %.xml
> +   $(MKDIR_GEN)
> +   $(AM_V_GEN) echo -n "static const uint8_t " > $@; \
> +   echo -n `basename $@` | sed -e 's,_xml.h,,' >> $@; \
> +   echo "_xml[] = {" >> $@; \
> +   cat $< | $(XXD) -i >> $@; \
> +   echo "};" >> $@
>

What's the purpose of this?  Does xxd not generate a consistent name in
out-of-tree builds?  If that's the case, a comment to that effect would be
nice.

Pumping the regular result of xxd through sed -e
's/\w*\(gen[0-9]*_xml\)/\1/' should work equally well.
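
For context, here is a rough sketch of what the rule above seems to be aiming for; it is not the actual generated file. As far as I know, xxd -i fed from stdin prints only the byte list with no declaration, which is why the rule wraps it by hand, while xxd -i on a file derives the array name from the path. The bytes, the embedded_xml struct and find_embedded_xml() are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Shape of a generated genxml/gen9_xml.h. The bytes here just spell
 * "<genxml/>" and are illustrative, not real genxml content.
 */
static const uint8_t gen9_xml[] = {
  0x3c, 0x67, 0x65, 0x6e, 0x78, 0x6d, 0x6c, 0x2f, 0x3e
};

/* Hypothetical lookup table a decoder could use instead of opening
 * the xml files from disk.
 */
struct embedded_xml {
   int gen;
   const uint8_t *data;
   size_t length;
};

static const struct embedded_xml embedded_files[] = {
   { 9, gen9_xml, sizeof(gen9_xml) },
};

/* Returns the embedded description for a gen, or NULL if none is built in. */
static const struct embedded_xml *
find_embedded_xml(int gen)
{
   size_t count = sizeof(embedded_files) / sizeof(embedded_files[0]);

   for (size_t i = 0; i < count; i++) {
      if (embedded_files[i].gen == gen)
         return &embedded_files[i];
   }
   return NULL;
}
```

A predictable array name is what lets a table like this be written once, whatever directory the build runs in.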


> +
>  EXTRA_DIST += \
> genxml/gen4.xml \
> genxml/gen45.xml \
> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> index 94073d2..315f127 100644
> --- a/src/intel/Makefile.sources
> +++ b/src/intel/Makefile.sources
> @@ -20,7 +20,13 @@ GENXML_GENERATED_FILES = \
> genxml/gen7_pack.h \
> genxml/gen75_pack.h \
> genxml/gen8_pack.h \
> -   genxml/gen9_pack.h
> +   genxml/gen9_pack.h \
> +   \
>

I don't think the weird newline that isn't a newline is actually doing
anything for us here.


> +   genxml/gen6_xml.h \
> +   genxml/gen7_xml.h \
> +   genxml/gen75_xml.h \
> +   genxml/gen8_xml.h \
> +   genxml/gen9_xml.h
>
>  ISL_FILES = \
> isl/isl.c \
> diff --git a/src/intel/genxml/.gitignore b/src/intel/genxml/.gitignore
> index dd11495..c5672b5 100644
> --- a/src/intel/genxml/.gitignore
> +++ b/src/intel/genxml/.gitignore
> @@ -1 +1,2 @@
>  gen*_pack.h
> +gen*_xml.h
> diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
> index 44a6bb2..8be7580 100644
> --- a/src/intel/tools/aubinator.c
> +++ b/src/intel/tools/aubinator.c
> @@ -1048,21 +1048,19 @@ int main(int argc, char *argv[])
> int c, i;
> bool help = false, pager = true;
> const char *input_file = NULL;
> -   char gen_file[256], gen_val[24];
> +   char gen_val[24];
> const struct {
>const char *name;
>int pci_id;
> -  int major;
> -  int minor;
> } gens[] = {
> -  { "ivb", 0x0166, 7, 0 }, /* Intel(R) Ivybridge Mobile GT2 */
> -  { "hsw", 0x0416, 7, 5 }, /* Intel(R) Haswell Mobile GT2 */
> -  { "byt", 0x0155, 7, 5 }, /* Intel(R) Bay Trail */
> -  { "bdw", 0x1616, 8, 0 }, /* Intel(R) HD Graphics 5500 (Broadwell
> GT2) */
> -  { "chv", 0x22B3, 8, 0 }, /* Intel(R) HD Graphics (Cherryview) */
> -  { "skl", 0x1912, 9, 0 }, /* Intel(R) HD Graphics 530 (Skylake GT2)
> */
> -  { "kbl", 0x591D, 9, 0 }, /* Intel(R) Kabylake GT2 */
> -  { "bxt", 0x0A84, 9, 0 }  /* Intel(R) HD Graphics (Broxton) */
> +  { "ivb", 0x0166 }, /* Intel(R) Ivybridge Mobile GT2 */
> +  { "hsw", 0x0416 }, /* Intel(R) Haswell Mobile GT2 */
> +  { "byt", 0x0155 }, /* Intel(R) Bay Trail */
> +  { "bdw", 0x1616 }, /* Intel(R) HD Graphics 5500 (Broadwell GT2) */
> +  { "chv", 0x22B3 }, /* Intel(R) HD Graphics (Cherryview) */
> +  { "skl", 0x1912 }, /* Intel(R) HD Graphics 530 (Skylake GT2) */
> +  { "kbl", 0x591D }, /* Intel(R) Kabylake GT2 */
> +  { "bxt", 0x0A84 }  /* Intel(R) HD Graphics (Broxton) */
> }, *gen = NULL;
> const struct option aubinator_opts[] = {
>{ "help",   no_ar

Re: [Mesa-dev] [PATCH 1/3] nir: Add asserts to the casting functions

2016-10-06 Thread Jason Ekstrand

Pushed!

Tim, this is going to cause you a bit of rebase trouble on loop unrolling
but it should actually make your code simpler.
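
For anyone skimming, the checked-cast pattern from the quoted patch below boils down to roughly the following. This is a standalone sketch, not the NIR code: the node, alu_node and phi_node types are invented, and plain offsetof arithmetic stands in for exec_node_data.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
   node_type_alu,
   node_type_phi,
} node_type;

/* Common base embedded in every concrete node (hypothetical types). */
typedef struct {
   node_type type;
} node;

typedef struct {
   int op;
   node base;   /* deliberately not the first member */
} alu_node;

typedef struct {
   node base;
   int num_srcs;
} phi_node;

/* Generates a downcast helper that asserts on the type tag before
 * stepping back from the embedded base to the containing struct.
 * Like the original macro, it drops constness.
 */
#define DEFINE_CHECKED_CAST(name, in_type, out_type, field,            \
                            type_field, type_value)                    \
static inline out_type *                                               \
name(const in_type *parent)                                            \
{                                                                      \
   assert(parent && parent->type_field == type_value);                 \
   return (out_type *)((char *)parent - offsetof(out_type, field));    \
}

DEFINE_CHECKED_CAST(node_as_alu, node, alu_node, base,
                    type, node_type_alu)
DEFINE_CHECKED_CAST(node_as_phi, node, phi_node, base,
                    type, node_type_phi)
```

With this, calling node_as_phi() on the base of an alu_node trips the assert in a debug build instead of quietly handing back a pointer to the wrong struct.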

On Wed, Oct 5, 2016 at 9:56 PM, Jason Ekstrand  wrote:

> On Oct 5, 2016 21:42, "Connor Abbott"  wrote:
> >
> > Thanks for doing this! This has always bugged me. For the series,
>
> Yeah, nir_loop_last_cf_node and friends in particular have been bugging me
> for a long time.
>
> > Reviewed-by: Connor Abbott 
>
> Thanks!
>
> > On Wed, Oct 5, 2016 at 11:37 PM, Jason Ekstrand 
> wrote:
> > > This makes calling nir_foo_as_bar a bit safer because we're no longer
> 100%
> > > trusting in the caller to ensure that it's safe.  The caller still
> needs to
> > > do the right thing but this ensures that we catch invalid casts with an
> > > assert rather than by reading garbage data.  The one downside is that
> we do
> > > use the casts a bit in nir_validate and it's not a validate_assert.
> > >
> > > Signed-off-by: Jason Ekstrand 
> > > ---
> > >  src/compiler/nir/nir.h| 60 --
> -
> > >  src/compiler/nir/nir_search.h |  9 ---
> > >  2 files changed, 45 insertions(+), 24 deletions(-)
> > >
> > > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > > index 8d1afb9..a122f12 100644
> > > --- a/src/compiler/nir/nir.h
> > > +++ b/src/compiler/nir/nir.h
> > > @@ -59,11 +59,13 @@ struct gl_shader_program;
> > >   * Note that you have to be a bit careful as the generated cast
> function
> > >   * destroys constness.
> > >   */
> > > -#define NIR_DEFINE_CAST(name, in_type, out_type, field)  \
> > > -static inline out_type * \
> > > -name(const in_type *parent)  \
> > > -{\
> > > -   return exec_node_data(out_type, parent, field);   \
> > > +#define NIR_DEFINE_CAST(name, in_type, out_type, field, \
> > > +type_field, type_value) \
> > > +static inline out_type *\
> > > +name(const in_type *parent) \
> > > +{   \
> > > +   assert(parent && parent->type_field == type_value);  \
> > > +   return exec_node_data(out_type, parent, field);  \
> > >  }
> > >
> > >  struct nir_function;
> > > @@ -841,9 +843,12 @@ typedef struct {
> > > unsigned index;
> > >  } nir_deref_struct;
> > >
> > > -NIR_DEFINE_CAST(nir_deref_as_var, nir_deref, nir_deref_var, deref)
> > > -NIR_DEFINE_CAST(nir_deref_as_array, nir_deref, nir_deref_array,
> deref)
> > > -NIR_DEFINE_CAST(nir_deref_as_struct, nir_deref, nir_deref_struct,
> deref)
> > > +NIR_DEFINE_CAST(nir_deref_as_var, nir_deref, nir_deref_var, deref,
> > > +deref_type, nir_deref_type_var)
> > > +NIR_DEFINE_CAST(nir_deref_as_array, nir_deref, nir_deref_array,
> deref,
> > > +deref_type, nir_deref_type_array)
> > > +NIR_DEFINE_CAST(nir_deref_as_struct, nir_deref, nir_deref_struct,
> deref,
> > > +deref_type, nir_deref_type_struct)
> > >
> > >  /* Returns the last deref in the chain. */
> > >  static inline nir_deref *
> > > @@ -1409,16 +1414,25 @@ typedef struct {
> > > struct exec_list entries;
> > >  } nir_parallel_copy_instr;
> > >
> > > -NIR_DEFINE_CAST(nir_instr_as_alu, nir_instr, nir_alu_instr, instr)
> > > -NIR_DEFINE_CAST(nir_instr_as_call, nir_instr, nir_call_instr, instr)
> > > -NIR_DEFINE_CAST(nir_instr_as_jump, nir_instr, nir_jump_instr, instr)
> > > -NIR_DEFINE_CAST(nir_instr_as_tex, nir_instr, nir_tex_instr, instr)
> > > -NIR_DEFINE_CAST(nir_instr_as_intrinsic, nir_instr,
> nir_intrinsic_instr, instr)
> > > -NIR_DEFINE_CAST(nir_instr_as_load_const, nir_instr,
> nir_load_const_instr, instr)
> > > -NIR_DEFINE_CAST(nir_instr_as_ssa_undef, nir_instr,
> nir_ssa_undef_instr, instr)
> > > -NIR_DEFINE_CAST(nir_instr_as_phi, nir_instr, nir_phi_instr, instr)
> > > +NIR_DEFINE_CAST(nir_instr_as_alu, nir_instr, nir_alu_instr, instr,
> > > +type, nir_instr_type_alu)
> > > +NIR_DEFINE_CAST(nir_instr_as_call, nir_instr, nir_call_instr, instr,
> > > +type, nir_instr_type_call)
> > > +NIR_DEFINE_CAST(nir_instr_as_jump, nir_instr, nir_jump_instr, instr,
> > > +type, nir_instr_type_jump)
> > > +NIR_DEFINE_CAST(nir_instr_as_tex, nir_instr, nir_tex_instr, instr,
> > > +type, nir_instr_type_tex)
> > > +NIR_DEFINE_CAST(nir_instr_as_intrinsic, nir_instr,
> nir_intrinsic_instr, instr,
> > > +type, nir_instr_type_intrinsic)
> > > +NIR_DEFINE_CAST(nir_instr_as_load_const, nir_instr,
> nir_load_const_instr, instr,
> > > +type, nir_instr_type_load_const)
> > > +NIR_DEFINE_CAST(nir_instr_as_ssa_undef, nir_instr,
> nir_ssa_undef_instr, instr,
> > > +type, nir_instr_type_ssa_undef)
> > > +NIR_DEFINE_CAST(nir_instr_as_phi, nir_instr, nir_phi_instr, instr,
> > > +type, nir_instr_type

Re: [Mesa-dev] [PATCH 1/3] i965: solve cubemap negative x/y/z faces buffer offset issue in dEQP.

2016-10-06 Thread Jason Ekstrand

On Wed, Oct 5, 2016 at 9:11 PM, Jason Ekstrand  wrote:

> On Wed, Oct 5, 2016 at 7:05 PM, Xu, Randy  wrote:
>
>> Hi, Jason
>>
>>
>>
>> Do you want to add this assert in the patch? I did some test, no issue
>> found, but I don’t see the case where we need to override the texture target in
>> brw_emit_surface_state, i.e. surf.dim_layout != dim_layout
>>
>> How can we create this case?  And we may need another patch to solve the
>> issue as it’s a new corner case.
>>
>
> I believe we can only hit that case if we render to it and use a render
> target read.  You probably can hit that case but it'll be a bit tricky to
> trigger.  On second thought, I don't think an assert is right.  Instead, I
> think we probably need to get the tile_x/y from
> intel_miptree_get_tile_offsets and then add that to tile_x and tile_y.  I
> don't think we can ever end up in the case where we have tile offsets
> coming in from EGL *and* we have a non-zero base_level or
> base-array_layer.  In fact, we should probably assert as much.
>

Never mind... Now that I think about it, I don't think that case is
possible.  I think the only time we'll have a tile offset coming in from
outside via an EGL image is if the texture is 2D.  In that case, we won't
hit the surf.dim_layout != dim_layout case and we should be fine.  I think
just the assert that you have below will do.
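
For reference, the offset split Randy walks through in the quoted mail further down (64x64 cubemap, qpitch of 144 rows, mask_y of 31) can be sketched as below. This is a simplification under my own assumptions: it only looks at the y direction, ignores tile_x, and the struct and function names are invented.

```c
#include <assert.h>
#include <stdint.h>

struct y_offset_split {
   uint32_t aligned_offset; /* bytes up to the last tile-aligned row */
   uint32_t tile_y;         /* leftover rows inside the tile */
};

/* Splits a row index into a tile-aligned byte offset plus an intra-tile
 * row count. mask_y must be of the form 2^n - 1 (31 for Y tiling in the
 * quoted mail).
 */
static inline struct y_offset_split
split_y_offset(uint32_t y, uint32_t pitch_bytes, uint32_t mask_y)
{
   struct y_offset_split s;

   assert((mask_y & (mask_y + 1)) == 0);

   s.tile_y = y & mask_y;
   s.aligned_offset = (y & ~mask_y) * pitch_bytes;
   return s;
}
```

With y = 144 and a pitch of 64 * 4 = 256 bytes this gives an aligned offset of 128 * 256 = 32768 and tile_y = 16, and 32768 + 16 * 256 = 36864 matches the total in Randy's mail.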


> --Jason
>
>
>> Thanks,
>>
>> Randy
>>
>>
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>>
>> index 3a5c573..d727526 100644
>>
>> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>>
>> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>>
>> @@ -109,6 +109,7 @@ brw_emit_surface_state(struct brw_context *brw,
>>
>> */
>>
>>assert(brw->has_surface_tile_offset);
>>
>>assert(view.levels == 1 && view.array_len == 1);
>>
>> +  assert(tile_x == 0 && tile_y == 0);
>>
>>
>>
>>offset += intel_miptree_get_tile_offsets(mt, view.base_level,
>>
>> view.base_array_layer,
>>
>>
>>
>>
>>
>>
>>
>> *From:* Jason Ekstrand [mailto:ja...@jlekstrand.net]
>> *Sent:* Thursday, October 6, 2016 1:58 AM
>> *To:* Xu, Randy 
>> *Cc:* Palli, Tapani ;
>> mesa-dev@lists.freedesktop.org
>>
>> *Subject:* Re: [Mesa-dev] [PATCH 1/3] i965: solve cubemap negative x/y/z
>> faces buffer offset issue in dEQP.
>>
>>
>>
>> Randy,
>>
>> I hadn't realized that we could get images in from EGL where we have a
>> non-zero tile_x and tile_y offset for layer 0 mip 0.  That explains
>> things.  In that case, I believe this is the correct patch.  That said, I
>> would like to see an "assert(tile_x == 0 && tile_y == 0)" right before we
>> do the intel_miptree_get_tile_offset() in the case below.  I don't think
>> those can ever happen at the same time, but if they do, I want to know.
>>
>> --Jason
>>
>>
>>
>> On Tue, Oct 4, 2016 at 5:13 PM, Xu, Randy  wrote:
>>
>> Hi, Jason & Tapani
>>
>>
>>
>> Thanks for your review, let me introduce the dEQP failure first.
>>
>>
>>
>> In dEQP-EGL.functional.image.create.gles2_cubemap_negative_*_texture, 2D
>> textures are generated from all 6 faces of a Cubemap texture (64x64), and
>> then rendered through glDrawXXX.
>>
>> In brw_miptree_get_vertical_slice_pitch, the mt->qpitch is counted as
>> 144.
>>
>>   return h0 + h1 + (brw->gen >= 7 ? 12 : 11) * mt->valign;
>> // 64+32+12*4 = 144
>>
>>
>>
>> Take the face negative_x for example, the total offset in bo is
>> 144(y)*64(x)*4(bpp) = 36864.
>>
>> It’s TILING_Y buffer, as the y (144) is not 32 aligned (mask_y = 31 from
>> intel_region_get_tile_masks), the total bo offset is divided into two
>> parts: 36864 =  32768 (offset 128*64*4) + 16(tile_y)*64*4
>>
>>case I915_TILING_Y:
>>
>>   *mask_x = 128 / cpp - 1;
>>
>>   *mask_y = 31;
>>
>>
>>
>>
>>
>> Both the tile_y and offset are passed to texture2D in
>> create_mt_for_dri_image, while the tile_y is not used to count the total
>> offset in rendering path, that’s why I add this patch.
>>
>> Please check and comment more.
>>
>>
>>
>> Thanks,
>>
>> Randy
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> *From:* Jason Ekstrand [mailto:ja...@jlekstrand.net]
>> *Sent:* Tuesday, October 4, 2016 11:59 PM
>> *To:* Palli, Tapani 
>> *Cc:* Xu, Randy ; mesa-dev@lists.freedesktop.org;
>> x...@freedesktop.org
>> *Subject:* Re: [Mesa-dev] [PATCH 1/3] i965: solve cubemap negative x/y/z
>> faces buffer offset issue in dEQP.
>>
>>
>>
>> On Tue, Oct 4, 2016 at 8:55 AM, Tapani Pälli 
>> wrote:
>>
>> On 10/04/2016 06:09 PM, Jason Ekstrand wrote:
>>
>> On Thu, Sep 29, 2016 at 11:27 PM, Xu,Randy  wrote:
>>
>> Add the miptree level/slice x/y_offset when computing the surface offset
>> in brw_emit_surface_state. The surface offset has two parts, one is
>> from mt->offset, which should be 32 aligned in width/height for tiled
>> buffer; another is from mt->level[current_level].slice[current_slice].
>> x/y_offset.
>>
>> This fix will so

Re: [Mesa-dev] [PATCH 1/2] radv: Skip already signalled fences.

2016-10-06 Thread Bas Nieuwenhuizen

On Thu, Oct 6, 2016 at 1:09 AM, Gustaw Smolarczyk  wrote:
> If the user created a fence with VK_FENCE_CREATE_SIGNALED_BIT set, we
> shouldn't fail to wait for a fence if it was not submitted since that is
> not necessary.
> ---
>  src/amd/vulkan/radv_device.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 1894b10..ed72109 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -,12 +,12 @@ VkResult radv_WaitForFences(
> RADV_FROM_HANDLE(radv_fence, fence, pFences[i]);
> bool expired = false;
>
> -   if (!fence->submitted)
> -   return VK_TIMEOUT;
> -
> if (fence->signalled)
> continue;
>
> +   if (!fence->submitted)
> +   return VK_TIMEOUT;
> +

Reviewed-by: Bas Nieuwenhuizen 
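
To spell out why the order of the two checks matters, here is a minimal model of the loop after the change. It is only a sketch: fence_model, its gpu_done flag (standing in for whatever the winsys fence_wait reports) and marking the fence signalled after a successful wait are assumptions, not the radv code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum wait_status { WAIT_SUCCESS, WAIT_TIMEOUT };

struct fence_model {
   bool submitted;
   bool signalled;
   bool gpu_done;   /* stand-in for the result of a real fence wait */
};

static enum wait_status
wait_for_fences(struct fence_model *fences, size_t count)
{
   assert(fences || count == 0);

   for (size_t i = 0; i < count; i++) {
      struct fence_model *fence = &fences[i];

      /* Checked first: a fence created in the signalled state was
       * never submitted, but there is nothing to wait for.
       */
      if (fence->signalled)
         continue;

      if (!fence->submitted)
         return WAIT_TIMEOUT;

      if (!fence->gpu_done)
         return WAIT_TIMEOUT;

      fence->signalled = true;
   }

   return WAIT_SUCCESS;
}
```

With the checks in the old order, a fence created with VK_FENCE_CREATE_SIGNALED_BIT and never submitted would hit the !submitted test and report a timeout.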

> expired = device->ws->fence_wait(device->ws, fence->fence, 
> true, timeout);
> if (!expired)
> return VK_TIMEOUT;
> --
> 2.10.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] radv/winsys: Fix radv_amdgpu_cs_grow min_size argument.

2016-10-06 Thread Bas Nieuwenhuizen

On Thu, Oct 6, 2016 at 1:09 AM, Gustaw Smolarczyk  wrote:
> It's supposed to be how much at least we want to grow the cs, not the
> minimum size of the cs after growth.
> ---
>  src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c 
> b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
> index dedc778..205b598 100644
> --- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
> +++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
> @@ -178,7 +178,8 @@ radv_amdgpu_cs_create(struct radeon_winsys *ws,
>  static void radv_amdgpu_cs_grow(struct radeon_winsys_cs *_cs, size_t 
> min_size)
>  {
> struct radv_amdgpu_cs *cs = radv_amdgpu_cs(_cs);
> -   uint64_t ib_size = MAX2(min_size * 4 + 16, cs->base.max_dw * 4 * 2);
> +   uint64_t ib_size = MAX2((cs->base.cdw + min_size) * 4 + 16,
> +   cs->base.max_dw * 4 * 2);

The old code is correct when cs->ws->use_ib_bos is set, as we don't
resize the IB but allocate a new one and link, so cdw gets reset to 0.
In the other case there is indeed a bug though.
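
Roughly, the two cases could be handled like this. It is a sketch with invented names: chained stands for the use_ib_bos case, and the clamp to the chain size field is left out because it is unchanged by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns the new IB size in bytes.
 *
 * cdw      - dwords already written to the current IB
 * max_dw   - current capacity in dwords
 * min_size - how many more dwords the caller wants room for
 * chained  - true if growing allocates a fresh IB and links to it, so
 *            nothing already written has to fit in the new buffer
 */
static inline uint64_t
grow_ib_size(uint32_t cdw, uint32_t max_dw, uint32_t min_size, bool chained)
{
   uint64_t carried_dw = chained ? 0 : cdw;
   uint64_t needed = (carried_dw + min_size) * 4 + 16;
   uint64_t doubled = (uint64_t)max_dw * 4 * 2;

   return needed > doubled ? needed : doubled;
}
```

For example, with cdw = 1000, max_dw = 1024 and min_size = 2000, the resize path needs (1000 + 2000) * 4 + 16 = 12016 bytes, while the chained path is fine with the doubled size of 8192.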
>
> /* max that fits in the chain size field. */
> ib_size = MIN2(ib_size, 0xf);
> --
> 2.10.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 98048] Mesa CANNOT use libpthread-stubs

2016-10-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=98048

Emil Velikov  changed:

   What|Removed |Added

   Assignee|mesa-dev@lists.freedesktop. |x...@lists.freedesktop.org
   |org |
 CC||psyc...@znc.in
Product|Mesa|XCB
  Component|Mesa core   |Library
 QA Contact|mesa-dev@lists.freedesktop. |x...@lists.freedesktop.org
   |org |
Version|git |unspecified

--- Comment #2 from Emil Velikov  ---
Actually seems like libpthread-stubs has decided to pick the hacky patch,
despite my suggestion not to :-\

Rob, Ben, you recall when I said it is not a good idea and it will cause grief
;-)
Well here it is (albeit not sure what/how exactly libpthread-stubs.so ends up
in Ian's link chain)

Gents, can we revert fa6db2f9c018c54a47e94c0175450303d700aa92 ?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [Bug 98048] Mesa CANNOT use libpthread-stubs

2016-10-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=98048

Emil Velikov  changed:

   What|Removed |Added

 CC||b...@bwidawsk.net,
   ||robcl...@freedesktop.org

--- Comment #1 from Emil Velikov  ---
I'm slightly confused by the whole thing.

On Linux platforms (and modern Solaris) building libpthread-stubs results in a
plain .pc file, and no DSO being built/installed. Regardless, the library is
used only by libgbm, which based on your waffle/mixed_glx_egl will/should never
get loaded.

That said, IMHO it would be great if glibc provides a native C11 threads
implementation (like musl) or alternatively we could drop the single recursive
mutex from mesa.

Regardless, can you share a bit more on the topic - do you use
LD_PRELOAD/LD_LIBRARY_PATH, is this specific to local builds and/or
distribution ones ?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [PATCH] gallium/hud: Remove superfluous debug

2016-10-06 Thread Emil Velikov

On 6 October 2016 at 16:26, Steven Toth  wrote:
> No longer required.
>
> Signed-off-by: Steven Toth 
> ---
>  src/gallium/auxiliary/hud/hud_cpufreq.c  |  9 -
>  src/gallium/auxiliary/hud/hud_diskstat.c |  8 
>  src/gallium/auxiliary/hud/hud_nic.c  | 16 
>  src/gallium/auxiliary/hud/hud_sensors_temp.c | 19 ---
>  4 files changed, 52 deletions(-)
>
Rb and pushed to master.

To ssh://git.freedesktop.org/git/mesa/mesa
  03350c9..e00fdd6  master -> master

Thanks !
Emil

Re: [Mesa-dev] [PATCH 5/6] st/mesa: optimize pipe_sampler_view validation

2016-10-06 Thread Brian Paul


On 10/06/2016 01:45 AM, Nicolai Hähnle wrote:

On 06.10.2016 02:42, Brian Paul wrote:

Before, st_get_texture_sampler_view_from_stobj() did a lot of work to
check if the texture parameters matched the sampler view (format,
swizzle, min/max lod, first/last layer, etc).  We did this every time
we validated the texture state.

Now, we use a ctx->Driver.TexParameter() callback and a couple other
checks to proactively release texture views when we know that
view-related parameters have changed.  Then, the validation step is
simplified:
- Search the texture's list of sampler views (just match the context).
- If found, we're done.
- Else, create a new sampler view.

There will never be old, out-of-date sampler views attached to texture
objects that we have to test.

Most apps create textures and set the texture parameters once.  This
makes sampler view validation much cheaper for that case.

Note that the old texture/sampler comparison code has been converted
into a set of assertions to verify that the sampler view is in fact
consistent with the texture parameters.  This should help to spot any
potential regressions.
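
The scheme described above can be modelled in a few lines. This is a toy sketch, not the st/mesa code: texture_model, view_model, the single swizzle parameter and the fixed-size view array are all simplifications made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VIEWS 4

struct view_model {
   const void *ctx;   /* owning context, NULL if the slot is free */
   int swizzle;       /* parameter value the view was created with */
};

struct texture_model {
   int swizzle;
   struct view_model views[MAX_VIEWS];
   unsigned views_created;
};

static void
release_all_views(struct texture_model *tex)
{
   for (size_t i = 0; i < MAX_VIEWS; i++)
      tex->views[i].ctx = NULL;
}

/* Plays the role of the TexParameter callback: a view-related change
 * proactively drops the views, so stale ones never reach validation.
 */
static void
set_swizzle(struct texture_model *tex, int swizzle)
{
   if (tex->swizzle == swizzle)
      return;

   tex->swizzle = swizzle;
   release_all_views(tex);
}

/* Validation: match on the context only, create a view if none exists. */
static struct view_model *
get_view(struct texture_model *tex, const void *ctx)
{
   struct view_model *free_slot = NULL;

   assert(ctx != NULL);

   for (size_t i = 0; i < MAX_VIEWS; i++) {
      struct view_model *view = &tex->views[i];

      if (view->ctx == ctx) {
         /* Debug check only: the view must already be consistent. */
         assert(view->swizzle == tex->swizzle);
         return view;
      }
      if (!view->ctx && !free_slot)
         free_slot = view;
   }

   if (!free_slot)
      return NULL;

   free_slot->ctx = ctx;
   free_slot->swizzle = tex->swizzle;
   tex->views_created++;
   return free_slot;
}
```

In the common case where parameters are set once, every get_view() call after the first is just the context search.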

Reviewed-by: Edward O'Callaghan 
---
 src/mesa/state_tracker/st_atom_texture.c | 58

 src/mesa/state_tracker/st_cb_texture.c   | 52

 src/mesa/state_tracker/st_texture.c  | 16 -
 src/mesa/state_tracker/st_texture.h  |  5 +++
 4 files changed, 101 insertions(+), 30 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_texture.c
b/src/mesa/state_tracker/st_atom_texture.c
index bfa16dc..45f1f6b 100644
--- a/src/mesa/state_tracker/st_atom_texture.c
+++ b/src/mesa/state_tracker/st_atom_texture.c
@@ -370,7 +370,7 @@ st_create_texture_sampler_view_from_stobj(struct
st_context *st,
 static struct pipe_sampler_view *
 st_get_texture_sampler_view_from_stobj(struct st_context *st,
struct st_texture_object *stObj,
-   enum pipe_format format,
+   const struct gl_sampler_object
*samp,
unsigned glsl_version)
 {
struct pipe_sampler_view **sv;
@@ -381,34 +381,42 @@ st_get_texture_sampler_view_from_stobj(struct
st_context *st,

sv = st_texture_get_sampler_view(st, stObj);

-   /* if sampler view has changed dereference it */
if (*sv) {
-  if (check_sampler_swizzle(st, stObj, *sv, glsl_version) ||
-  (format != (*sv)->format) ||
-  gl_target_to_pipe(stObj->base.Target) != (*sv)->target ||
-  stObj->base.MinLevel + stObj->base.BaseLevel !=
(*sv)->u.tex.first_level ||
-  last_level(stObj) != (*sv)->u.tex.last_level ||
-  stObj->base.MinLayer != (*sv)->u.tex.first_layer ||
-  last_layer(stObj) != (*sv)->u.tex.last_layer) {
- pipe_sampler_view_reference(sv, NULL);
+  /* Debug check: make sure that the sampler view's parameters are
+   * what they're supposed to be.
+   */
+  struct pipe_sampler_view *view = *sv;
+  assert(!check_sampler_swizzle(st, stObj, view, glsl_version));
+  assert(get_sampler_view_format(st, stObj, samp) == view->format);
+  assert(gl_target_to_pipe(stObj->base.Target) == view->target);
+  if (stObj->base.Target == GL_TEXTURE_BUFFER) {
+ unsigned base = stObj->base.BufferOffset;
+ unsigned size = MIN2(stObj->pt->width0 - base,
+  (unsigned) stObj->base.BufferSize);
+ assert(view->u.buf.offset == base);
+ assert(view->u.buf.size == size);
+  }
+  else {


Huh, so is this else-style intentional, or a spill-over from another
code base? I wasn't aware of this style in st/mesa.


I've been doing it that way for about 25 years.  The } else { style is 
relatively new to Mesa and I guess I haven't adopted that habit yet.


'git grep else' shows a mix of styles.



Also, the whole sampler lookup code gives me the race condition chills,
but it looks like it's always been that way, so... patches 4 & 5:

Reviewed-by: Nicolai Hähnle 

Patch 6 is Acked-by: Nicolai Hähnle 


Thanks for reviewing.

-Brian



Re: [Mesa-dev] [Mesa-stable] [PATCH 1/4] i965/sync: Fix uninitalized usage and leak of mutex

2016-10-06 Thread Emil Velikov

Hi Chad,

On 4 October 2016 at 23:37, Chad Versace  wrote:
> We locked an uninitialized mutex in the callstack
> glClientWaitSync
> intel_gl_client_wait_sync
> brw_fence_client_wait_sync
> because we forgot to initialize it in intel_gl_fence_sync.
> (The EGLSync codepath didn't have this bug. It initialized the mutex in
> intel_dri_create_sync).
>
> We also forgot to tear down (mtx_destroy) the mutex when destroying
> the sync object.
>
> Cc: mesa-sta...@lists.freedesktop.org
Do you have a few minutes to roll a similar fix for i915? It should
be identical (barring naming fixes) to this.
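
For whoever picks this up, the pattern is simply to pair the init with object creation and the destroy with object teardown. A minimal sketch follows, using pthreads rather than the mtx_* wrappers so it stands alone; sync_object and its functions are invented names, not the i915/i965 ones.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct sync_object {
   pthread_mutex_t mutex;
   bool signalled;
};

/* Every creation path must initialize the mutex before anyone locks it. */
static struct sync_object *
sync_object_create(void)
{
   struct sync_object *sync = calloc(1, sizeof(*sync));

   if (!sync)
      return NULL;

   if (pthread_mutex_init(&sync->mutex, NULL) != 0) {
      free(sync);
      return NULL;
   }
   return sync;
}

static bool
sync_object_client_wait(struct sync_object *sync)
{
   bool signalled;

   pthread_mutex_lock(&sync->mutex);
   signalled = sync->signalled;
   pthread_mutex_unlock(&sync->mutex);
   return signalled;
}

/* The matching teardown, so the mutex is not leaked. */
static void
sync_object_destroy(struct sync_object *sync)
{
   pthread_mutex_destroy(&sync->mutex);
   free(sync);
}
```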

Thanks
Emil

[Mesa-dev] [PATCH] gallium/hud: Remove superfluous debug

2016-10-06 Thread Steven Toth

No longer required.

Signed-off-by: Steven Toth 
---
 src/gallium/auxiliary/hud/hud_cpufreq.c  |  9 -
 src/gallium/auxiliary/hud/hud_diskstat.c |  8 
 src/gallium/auxiliary/hud/hud_nic.c  | 16 
 src/gallium/auxiliary/hud/hud_sensors_temp.c | 19 ---
 4 files changed, 52 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_cpufreq.c 
b/src/gallium/auxiliary/hud/hud_cpufreq.c
index 1296ece..4501bbb 100644
--- a/src/gallium/auxiliary/hud/hud_cpufreq.c
+++ b/src/gallium/auxiliary/hud/hud_cpufreq.c
@@ -46,8 +46,6 @@
 #include 
 #include 
 
-#define LOCAL_DEBUG 0
-
 struct cpufreq_info
 {
struct list_head list;
@@ -139,13 +137,6 @@ hud_cpufreq_graph_install(struct hud_pane *pane, int 
cpu_index,
if (num_cpus <= 0)
   return;
 
-#if LOCAL_DEBUG
-   printf("%s(%d, %s) - Creating HUD object\n", __func__, cpu_index,
-  mode == CPUFREQ_MINIMUM ? "MIN" :
-  mode == CPUFREQ_CURRENT ? "CUR" :
-  mode == CPUFREQ_MAXIMUM ? "MAX" : "UNDEFINED");
-#endif
-
cfi = find_cfi_by_index(cpu_index, mode);
if (!cfi)
   return;
diff --git a/src/gallium/auxiliary/hud/hud_diskstat.c 
b/src/gallium/auxiliary/hud/hud_diskstat.c
index a2290cc..b248baf 100644
--- a/src/gallium/auxiliary/hud/hud_diskstat.c
+++ b/src/gallium/auxiliary/hud/hud_diskstat.c
@@ -46,8 +46,6 @@
 #include 
 #include 
 
-#define LOCAL_DEBUG 0
-
 struct stat_s
 {
/* Read */
@@ -189,12 +187,6 @@ hud_diskstat_graph_install(struct hud_pane *pane, const 
char *dev_name,
if (num_devs <= 0)
   return;
 
-#if LOCAL_DEBUG
-   printf("%s(%s, %s) - Creating HUD object\n", __func__, dev_name,
-  mode == DISKSTAT_RD ? "RD" :
-  mode == DISKSTAT_WR ? "WR" : "UNDEFINED");
-#endif
-
dsi = find_dsi_by_name(dev_name, mode);
if (!dsi)
   return;
diff --git a/src/gallium/auxiliary/hud/hud_nic.c 
b/src/gallium/auxiliary/hud/hud_nic.c
index 36088a0..fb6b8c0 100644
--- a/src/gallium/auxiliary/hud/hud_nic.c
+++ b/src/gallium/auxiliary/hud/hud_nic.c
@@ -48,8 +48,6 @@
 #include 
 #include 
 
-#define LOCAL_DEBUG 0
-
 struct nic_info
 {
struct list_head list;
@@ -168,13 +166,6 @@ query_nic_rssi(const struct nic_info *nic, uint64_t 
*leveldBm)
*leveldBm = ((char) stats.qual.level * -1);
 
close(sockfd);
-
-#if LOCAL_DEBUG
-   printf("NIC signal level%s is %d%s.\n",
-  (stats.qual.updated & IW_QUAL_DBM ? " (in dBm)" : ""),
-  (char) stats.qual.level,
-  (stats.qual.updated & IW_QUAL_LEVEL_UPDATED ? " (updated)" : ""));
-#endif
 }
 
 static void
@@ -268,13 +259,6 @@ hud_nic_graph_install(struct hud_pane *pane, const char 
*nic_name,
if (num_nics <= 0)
   return;
 
-#if LOCAL_DEBUG
-   printf("%s(%s, %s) - Creating HUD object\n", __func__, nic_name,
-  mode == NIC_DIRECTION_RX ? "RX" :
-  mode == NIC_DIRECTION_TX ? "TX" :
-  mode == NIC_RSSI_DBM ? "RSSI" : "UNDEFINED");
-#endif
-
nic = find_nic_by_name(nic_name, mode);
if (!nic)
   return;
diff --git a/src/gallium/auxiliary/hud/hud_sensors_temp.c 
b/src/gallium/auxiliary/hud/hud_sensors_temp.c
index 7d1398a..e41b847 100644
--- a/src/gallium/auxiliary/hud/hud_sensors_temp.c
+++ b/src/gallium/auxiliary/hud/hud_sensors_temp.c
@@ -44,8 +44,6 @@
 #include 
 #include 
 
-#define LOCAL_DEBUG 0
-
 /* TODO: We don't handle dynamic sensor discovery / arrival or removal.
  * Static globals specific to this HUD category.
  */
@@ -139,12 +137,6 @@ get_sensor_values(struct sensors_temp_info *sti)
SENSORS_SUBFEATURE_TEMP_MAX);
if (sf)
   sti->max = get_value(sti->chip, sf);
-#if LOCAL_DEBUG
-   printf("%s.%s.current = %.1f\n", sti->chipname, sti->featurename,
-  sti->current);
-   printf("%s.%s.critical = %.1f\n", sti->chipname, sti->featurename,
-  sti->critical);
-#endif
 }
 
 static struct sensors_temp_info *
@@ -224,14 +216,6 @@ hud_sensors_temp_graph_install(struct hud_pane *pane, 
const char *dev_name,
int num_devs = hud_get_num_sensors(0);
if (num_devs <= 0)
   return;
-#if LOCAL_DEBUG
-   printf("%s(%s, %s) - Creating HUD object\n", __func__, dev_name,
-  mode == SENSORS_VOLTAGE_CURRENT ? "VOLTS" :
-  mode == SENSORS_CURRENT_CURRENT ? "AMPS" :
-  mode == SENSORS_TEMP_CURRENT ? "CU" :
-  mode == SENSORS_POWER_CURRENT ? "POWER" :
-  mode == SENSORS_TEMP_CRITICAL ? "CR" : "UNDEFINED");
-#endif
 
sti = find_sti_by_name(dev_name, mode);
if (!sti)
@@ -281,9 +265,6 @@ create_object(const char *chipname, const char *featurename,
  const sensors_chip_name *chip, const sensors_feature *feature,
  int mode)
 {
-#if LOCAL_DEBUG
-   printf("%03d: %s.%s\n", gsensors_temp_count, chipname, featurename);
-#endif
struct sensors_temp_info *sti = CALLOC_STRUCT(sensors_temp_info);
 
sti->mode = mode;
-- 
2.7.4


Re: [Mesa-dev] [PATCH] autoconf: Make header install distinct for various APIs (v2)

2016-10-06 Thread Emil Velikov

On 4 October 2016 at 16:05, Chuck Atkins  wrote:
> This fixes a problem where GL headers would only get installed if
> glx was enabled.  So if osmesa was enabled but not glx, then the
> GL headers required by osmesa would be missing from the install.
>
> v2: Dropped unneeded mesa_glinterop.h redundant osmesa.h install
>
> CC: Emil Velikov 
> Signed-off-by: Chuck Atkins 
Rb and merged to master.

You're a star, thank you !
Emil

Re: [Mesa-dev] [PATCH 1/2] loader/dri3: add get_dri_screen() to the vtable

2016-10-06 Thread Emil Velikov

On 6 October 2016 at 15:13, Martin Peres  wrote:
> This allows querying the current active screen from the
> loader's common code.
>
> Signed-off-by: Martin Peres 
> ---
>  src/egl/drivers/dri2/platform_x11_dri3.c | 12 
>  src/glx/dri3_glx.c   | 11 +++
>  src/loader/loader_dri3_helper.h  |  1 +
>  3 files changed, 24 insertions(+)
>
> diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c 
> b/src/egl/drivers/dri2/platform_x11_dri3.c
> index 31649fe..d93f5bc 100644
> --- a/src/egl/drivers/dri2/platform_x11_dri3.c
> +++ b/src/egl/drivers/dri2/platform_x11_dri3.c
> @@ -103,6 +103,17 @@ egl_dri3_get_dri_context(struct loader_dri3_drawable 
> *draw)
> return dri2_ctx->dri_context;
>  }
>
> +static __DRIscreen *
> +egl_dri3_get_dri_screen(struct loader_dri3_drawable *draw)
> +{
> +   _EGLContext *ctx = _eglGetCurrentContext();
> +   struct dri2_egl_context *dri2_ctx;
> +   if (!ctx)
> +  return NULL;


Only the loader_dri3 code seems to do this NULL check. I'm wondering
how likely it is to hit.
At the same time, many places could/should check if we have a dummyctx
(via _eglIsCurrentThreadDummy) yet they don't bother.

We had a similar (bug) hunt on the GLX side recently and we should
audit the EGL codepaths one of these days.


> +   dri2_ctx = dri2_egl_context(ctx);
> +   return dri2_egl_display(dri2_ctx->base.Resource.Display)->dri_screen;
> +}
> +
>  static void
>  egl_dri3_flush_drawable(struct loader_dri3_drawable *draw, unsigned flags)
>  {
> @@ -119,6 +130,7 @@ static struct loader_dri3_vtable egl_dri3_vtable = {
> .set_drawable_size = egl_dri3_set_drawable_size,
> .in_current_context = egl_dri3_in_current_context,
> .get_dri_context = egl_dri3_get_dri_context,
> +   .get_dri_screen = egl_dri3_get_dri_screen,
> .flush_drawable = egl_dri3_flush_drawable,
> .show_fps = NULL,
>  };
> diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
> index 90d7bba..3bc2e1b 100644
> --- a/src/glx/dri3_glx.c
> +++ b/src/glx/dri3_glx.c
> @@ -132,6 +132,16 @@ glx_dri3_get_dri_context(struct loader_dri3_drawable 
> *draw)
> return (gc != &dummyContext) ? dri3Ctx->driContext : NULL;
>  }
>
> +static __DRIscreen *
> +glx_dri3_get_dri_screen(struct loader_dri3_drawable *draw)
> +{
> +   struct glx_context *gc = __glXGetCurrentContext();
> +   struct dri3_context *pcp = (struct dri3_context *) 
> __glXGetCurrentContext();
s/__glXGetCurrentContext()/gc/

With this small fix the series is:
Cc: mesa-sta...@lists.freedesktop.org
Reviewed-by: Emil Velikov 
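
For illustration, the substitution suggested above amounts to querying the current context once and reusing the result. A minimal sketch, assuming simplified stand-in types (every mock_* name is hypothetical; the real structs live in src/glx and carry many more fields):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for __DRIscreen, struct dri3_screen and
 * struct glx_context; only the fields the helper touches are kept. */
struct mock_dri_screen { int id; };
struct mock_dri3_screen { struct mock_dri_screen *driScreen; };
struct mock_glx_context { struct mock_dri3_screen *psc; };

/* Plays the role of dummyContext: what is "current" when nothing is bound. */
struct mock_glx_context mock_dummy_context = { NULL };
struct mock_glx_context *mock_current = &mock_dummy_context;

/* Stand-in for __glXGetCurrentContext(); assumed to hand back the dummy
 * context rather than NULL when no context is bound. */
struct mock_glx_context *
mock_get_current_context(void)
{
   return mock_current;
}

struct mock_dri_screen *
mock_get_dri_screen(void)
{
   /* Look the context up once and derive everything else from gc. */
   struct mock_glx_context *gc = mock_get_current_context();
   struct mock_dri3_screen *psc;

   assert(gc != NULL);
   psc = gc->psc;

   return (gc != &mock_dummy_context && psc) ? psc->driScreen : NULL;
}
```

With nothing bound the helper returns NULL; once mock_current points at a context with a screen attached, it returns that screen.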

Thanks,
Emil

Re: [Mesa-dev] [PATCH] clover: Allow OpenCL version override

2016-10-06 Thread Vedran Miletić

On 10/06/2016 04:26 PM, Vedran Miletić wrote:
> CLOVER_CL_VERSION_OVERRIDE allows overriding default OpenCL version
> supported by Clover, analogous to MESA_GL_VERSION_OVERRIDE for OpenGL.
> CLOVER_CL_C_VERSION_OVERRIDE allows overriding default OpenCL C version.
> ---
>  docs/envvars.html | 12 
>  src/gallium/state_trackers/clover/api/device.cpp  |  4 ++--
>  src/gallium/state_trackers/clover/api/platform.cpp|  4 ++--
>  src/gallium/state_trackers/clover/core/device.cpp | 19 
> +++
>  src/gallium/state_trackers/clover/core/device.hpp |  4 
>  src/gallium/state_trackers/clover/core/platform.cpp   |  9 +
>  src/gallium/state_trackers/clover/core/platform.hpp   |  3 +++
>  src/gallium/state_trackers/clover/core/program.cpp|  4 +++-
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 18 ++
>  src/gallium/state_trackers/clover/llvm/invocation.hpp |  1 +
>  src/gallium/state_trackers/clover/llvm/util.hpp   |  4 ++--
>  11 files changed, 71 insertions(+), 11 deletions(-)
> 

This will conflict with [1] due to both modifying tokenize in an
incompatible way. I would prefer if we can merge [1] first, then I will
rebase this one.

Regards,
Vedran

[1]
https://lists.freedesktop.org/archives/mesa-dev/2016-September/130001.html

-- 
Vedran Miletić
vedran.miletic.net

[Mesa-dev] [PATCH v2] clover: assert struct argument is compiled usably

2016-10-06 Thread Vedran Miletić

Make sure that a struct argument did not get compiled into a pointer
type with the byval attribute. If we try to handle the pointer with
byval, we end up with the pointer size instead of the struct size.
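
To make the size mismatch concrete, here is a small plain-C sketch (not Clover or LLVM code). The struct sample_kernel_arg and both helper names are made up for illustration; the point is only that a pointer to a struct and the struct itself generally have different sizes, so sizing the argument from the pointer type gives the wrong number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel argument: sixteen floats passed by value. */
struct sample_kernel_arg {
   float values[16];
};

/* A struct wrapping a single float array is expected to have no padding. */
static_assert(sizeof(struct sample_kernel_arg) == 16 * sizeof(float),
              "unexpected padding in sample_kernel_arg");

/* Size you get if the argument is treated as a pointer to the struct. */
size_t
pointer_arg_size(void)
{
   return sizeof(struct sample_kernel_arg *);
}

/* Size the runtime actually has to copy for a by-value struct argument. */
size_t
struct_arg_size(void)
{
   return sizeof(struct sample_kernel_arg);
}
```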

Struct arguments on AMD GPUs will work correctly on Clang versions
containing https://reviews.llvm.org/D20168 or an equivalent patch.

Signed-off-by: Vedran Miletić 
---
 src/gallium/state_trackers/clover/llvm/codegen/common.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/state_trackers/clover/llvm/codegen/common.cpp 
b/src/gallium/state_trackers/clover/llvm/codegen/common.cpp
index 834b06a..2d1759e 100644
--- a/src/gallium/state_trackers/clover/llvm/codegen/common.cpp
+++ b/src/gallium/state_trackers/clover/llvm/codegen/common.cpp
@@ -78,6 +78,9 @@ namespace {
 
   for (const auto &arg : f.args()) {
  const auto arg_type = arg.getType();
+ assert(!(arg_type->isPointerTy() && arg.hasByValAttr() &&
+arg_type->getPointerElementType()->isStructTy()) &&
+"Unable to handle struct compiled as pointer with byval.");
 
  // OpenCL 1.2 specification, Ch. 6.1.5: "A built-in data
  // type that is not a power of two bytes in size must be
-- 
2.7.4


[Mesa-dev] [PATCH] clover: Allow OpenCL version override

2016-10-06 Thread Vedran Miletić

CLOVER_CL_VERSION_OVERRIDE allows overriding default OpenCL version
supported by Clover, analogous to MESA_GL_VERSION_OVERRIDE for OpenGL.
CLOVER_CL_C_VERSION_OVERRIDE allows overriding default OpenCL C version.
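
The override logic boils down to "use the environment variable if it is set and non-empty, otherwise fall back to the built-in default". A rough sketch in plain C, assuming getenv() as a stand-in for the debug_get_option() helper the patch uses; pick_version() and version_from_env() are hypothetical names, not part of Clover:

```c
#include <assert.h>
#include <stdlib.h>

/* Returns `override` when it is a non-empty string, else `fallback`. */
const char *
pick_version(const char *override, const char *fallback)
{
   assert(fallback != NULL);
   return (override && override[0] != '\0') ? override : fallback;
}

/* Reads the override from the environment; getenv() returns NULL for
 * unset variables, which pick_version() maps to the fallback. */
const char *
version_from_env(const char *name, const char *fallback)
{
   return pick_version(getenv(name), fallback);
}
```

For example, version_from_env("CLOVER_CL_VERSION_OVERRIDE", "1.1") yields "1.1" unless the variable is set to a non-empty value.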
---
 docs/envvars.html | 12 
 src/gallium/state_trackers/clover/api/device.cpp  |  4 ++--
 src/gallium/state_trackers/clover/api/platform.cpp|  4 ++--
 src/gallium/state_trackers/clover/core/device.cpp | 19 +++
 src/gallium/state_trackers/clover/core/device.hpp |  4 
 src/gallium/state_trackers/clover/core/platform.cpp   |  9 +
 src/gallium/state_trackers/clover/core/platform.hpp   |  3 +++
 src/gallium/state_trackers/clover/core/program.cpp|  4 +++-
 src/gallium/state_trackers/clover/llvm/invocation.cpp | 18 ++
 src/gallium/state_trackers/clover/llvm/invocation.hpp |  1 +
 src/gallium/state_trackers/clover/llvm/util.hpp   |  4 ++--
 11 files changed, 71 insertions(+), 11 deletions(-)

diff --git a/docs/envvars.html b/docs/envvars.html
index cf57ca5..f76827b 100644
--- a/docs/envvars.html
+++ b/docs/envvars.html
@@ -235,6 +235,18 @@ Setting to "tgsi", for example, will print all the TGSI 
shaders.
 See src/mesa/state_tracker/st_debug.c for other options.
 
 
+Clover state tracker environment variables
+
+
+CLOVER_CL_VERSION_OVERRIDE - allows overriding OpenCL version returned by
+clGetPlatformInfo(CL_PLATFORM_VERSION) and
+clGetDeviceInfo(CL_DEVICE_VERSION). Note that the setting sets the version
+of the platform and all the devices to the same value.
+CLOVER_CL_C_VERSION_OVERRIDE - allows overriding OpenCL C version reported
+by clGetDeviceInfo(CL_DEVICE_OPENCL_C_VERSION) and the value of the
+__OPENCL_VERSION__ macro in the OpenCL compiler.
+
+
 Softpipe driver environment variables
 
 SOFTPIPE_DUMP_FS - if set, the softpipe driver will print fragment shaders
diff --git a/src/gallium/state_trackers/clover/api/device.cpp 
b/src/gallium/state_trackers/clover/api/device.cpp
index f7bd61b..e23de7a 100644
--- a/src/gallium/state_trackers/clover/api/device.cpp
+++ b/src/gallium/state_trackers/clover/api/device.cpp
@@ -301,7 +301,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
   break;
 
case CL_DEVICE_VERSION:
-  buf.as_string() = "OpenCL 1.1 Mesa " PACKAGE_VERSION
+  buf.as_string() = "OpenCL " + dev.opencl_version() + " Mesa " 
PACKAGE_VERSION
 #ifdef MESA_GIT_SHA1
 " (" MESA_GIT_SHA1 ")"
 #endif
@@ -355,7 +355,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
   break;
 
case CL_DEVICE_OPENCL_C_VERSION:
-  buf.as_string() = "OpenCL C 1.1 ";
+  buf.as_string() = "OpenCL C " + dev.opencl_c_version() + " ";
   break;
 
case CL_DEVICE_PARENT_DEVICE:
diff --git a/src/gallium/state_trackers/clover/api/platform.cpp 
b/src/gallium/state_trackers/clover/api/platform.cpp
index b1b1fdf..f344ec8 100644
--- a/src/gallium/state_trackers/clover/api/platform.cpp
+++ b/src/gallium/state_trackers/clover/api/platform.cpp
@@ -50,7 +50,7 @@ clover::GetPlatformInfo(cl_platform_id d_platform, 
cl_platform_info param,
 size_t size, void *r_buf, size_t *r_size) try {
property_buffer buf { r_buf, size, r_size };
 
-   obj(d_platform);
+   auto &platform = obj(d_platform);
 
switch (param) {
case CL_PLATFORM_PROFILE:
@@ -58,7 +58,7 @@ clover::GetPlatformInfo(cl_platform_id d_platform, 
cl_platform_info param,
   break;
 
case CL_PLATFORM_VERSION:
-  buf.as_string() = "OpenCL 1.1 Mesa " PACKAGE_VERSION
+  buf.as_string() = "OpenCL " + platform.opencl_version() + " Mesa " 
PACKAGE_VERSION
 #ifdef MESA_GIT_SHA1
 " (" MESA_GIT_SHA1 ")"
 #endif
diff --git a/src/gallium/state_trackers/clover/core/device.cpp 
b/src/gallium/state_trackers/clover/core/device.cpp
index 8825f99..fce6fb3 100644
--- a/src/gallium/state_trackers/clover/core/device.cpp
+++ b/src/gallium/state_trackers/clover/core/device.cpp
@@ -24,6 +24,7 @@
 #include "core/platform.hpp"
 #include "pipe/p_screen.h"
 #include "pipe/p_state.h"
+#include "util/u_debug.h"
 
 using namespace clover;
 
@@ -48,6 +49,14 @@ device::device(clover::platform &platform, 
pipe_loader_device *ldev) :
  pipe->destroy(pipe);
   throw error(CL_INVALID_DEVICE);
}
+
+   const std::string cl_version_override =
+ debug_get_option("CLOVER_CL_VERSION_OVERRIDE", 
"");
+   ocl_version = !cl_version_override.empty() ? cl_version_override : "1.1";
+
+   const std::string clc_version_override =
+debug_get_option("CLOVER_CL_C_VERSION_OVERRIDE", 
"");
+   oclc_version = !clc_version_override.empty() ? clc_version_override : "1.1";
 }
 
 device::~device() {
@@ -209,6 +218,16 @@ device::vendor_name() const {
return pipe->get_device_vendor(pipe);
 }
 
+std::string
+device::opencl_version() const {
+return ocl_version;
+}
+
+std::st

Re: [Mesa-dev] [PATCH] vbo: disable primitive restart when GL >= 4.5

2016-10-06 Thread Daniel Scharrer

On 2016-10-06 15:22, Martina Kollarova wrote:
> The OpenGL 4.5 spec updated the section on primitive restart, and now it
> doesn't have to be performed on drawing commands not taking a parameter,
> regardless of whether PRIMITIVE_RESTART_FIXED_INDEX is enabled or not.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98106
> Signed-off-by: Martina Kollarova 
> ---
>  src/mesa/vbo/vbo_exec_array.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
> index 46543f8..cf1ba13 100644
> --- a/src/mesa/vbo/vbo_exec_array.c
> +++ b/src/mesa/vbo/vbo_exec_array.c
> @@ -423,7 +423,7 @@ vbo_draw_arrays(struct gl_context *ctx, GLenum mode, 
> GLint start,
>  
> /* Implement the primitive restart index */
> if (ctx->Array.PrimitiveRestart && !ctx->Array.PrimitiveRestartFixedIndex 
> &&
> -   ctx->Array.RestartIndex < count) {
> +   ctx->Version < 45 && ctx->Array.RestartIndex < count) {
>GLuint primCount = 0;
>  
>if (ctx->Array.RestartIndex == start) {
> 

Mesa currently sets ctx->Version to the highest version the driver supports for
the selected profile, which may be higher than the GL version requested in
glXCreateContextAttribsARB. I'm afraid that this change could theoretically
break conformant applications written against older OpenGL versions.

Maybe Mesa should limit the context version to the one requested by the
application. The blob drivers seem to do this (at least the AMD one) and the
current behaviour has already caused problems before:
 https://bugs.freedesktop.org/show_bug.cgi?id=95374
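
To illustrate the concern, here is a stripped-down sketch of the patched condition. The mock_* structs are hypothetical and keep only the fields the check reads; Version uses the major * 10 + minor encoding implied by the "< 45" comparison in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the relevant gl_context fields. */
struct mock_array_state {
   bool PrimitiveRestart;
   bool PrimitiveRestartFixedIndex;
   unsigned RestartIndex;
};

struct mock_gl_context {
   unsigned Version;               /* e.g. 33 for GL 3.3, 45 for GL 4.5 */
   struct mock_array_state Array;
};

/* Mirrors the condition from the patch: the draw is only split at the
 * restart index when the context version is below 4.5. */
bool
needs_restart_split(const struct mock_gl_context *ctx, unsigned count)
{
   assert(ctx != NULL);
   return ctx->Array.PrimitiveRestart &&
          !ctx->Array.PrimitiveRestartFixedIndex &&
          ctx->Version < 45 &&
          ctx->Array.RestartIndex < count;
}
```

With identical array state, a context whose Version is 33 takes the restart path while one whose Version is 45 does not, so an application that asked for 3.3 but was handed a 4.5 context would see different behaviour.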

--
Daniel Scharrer
http://constexpr.org/

Re: [Mesa-dev] [PATCH] vbo: disable primitive restart when GL >= 4.5

2016-10-06 Thread Marek Olšák

That's funny, because Polaris is our first hardware that supports
primitive restart with DrawArrays (including draw indirect), while
OpenGL removed the support at the same time.

Reviewed-by: Marek Olšák 

Marek

On Thu, Oct 6, 2016 at 3:22 PM, Martina Kollarova
 wrote:
> The OpenGL 4.5 spec updated the section on primitive restart, and now it
> doesn't have to be performed on drawing commands not taking a parameter,
> regardless of whether PRIMITIVE_RESTART_FIXED_INDEX is enabled or not.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98106
> Signed-off-by: Martina Kollarova 
> ---
>  src/mesa/vbo/vbo_exec_array.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
> index 46543f8..cf1ba13 100644
> --- a/src/mesa/vbo/vbo_exec_array.c
> +++ b/src/mesa/vbo/vbo_exec_array.c
> @@ -423,7 +423,7 @@ vbo_draw_arrays(struct gl_context *ctx, GLenum mode, 
> GLint start,
>
> /* Implement the primitive restart index */
> if (ctx->Array.PrimitiveRestart && !ctx->Array.PrimitiveRestartFixedIndex 
> &&
> -   ctx->Array.RestartIndex < count) {
> +   ctx->Version < 45 && ctx->Array.RestartIndex < count) {
>GLuint primCount = 0;
>
>if (ctx->Array.RestartIndex == start) {
> --
> 1.9.1
>

[Mesa-dev] [PATCH 1/2] loader/dri3: add get_dri_screen() to the vtable

2016-10-06 Thread Martin Peres

This allows querying the current active screen from the
loader's common code.

Signed-off-by: Martin Peres 
---
 src/egl/drivers/dri2/platform_x11_dri3.c | 12 
 src/glx/dri3_glx.c   | 11 +++
 src/loader/loader_dri3_helper.h  |  1 +
 3 files changed, 24 insertions(+)

diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c 
b/src/egl/drivers/dri2/platform_x11_dri3.c
index 31649fe..d93f5bc 100644
--- a/src/egl/drivers/dri2/platform_x11_dri3.c
+++ b/src/egl/drivers/dri2/platform_x11_dri3.c
@@ -103,6 +103,17 @@ egl_dri3_get_dri_context(struct loader_dri3_drawable *draw)
return dri2_ctx->dri_context;
 }
 
+static __DRIscreen *
+egl_dri3_get_dri_screen(struct loader_dri3_drawable *draw)
+{
+   _EGLContext *ctx = _eglGetCurrentContext();
+   struct dri2_egl_context *dri2_ctx;
+   if (!ctx)
+  return NULL;
+   dri2_ctx = dri2_egl_context(ctx);
+   return dri2_egl_display(dri2_ctx->base.Resource.Display)->dri_screen;
+}
+
 static void
 egl_dri3_flush_drawable(struct loader_dri3_drawable *draw, unsigned flags)
 {
@@ -119,6 +130,7 @@ static struct loader_dri3_vtable egl_dri3_vtable = {
.set_drawable_size = egl_dri3_set_drawable_size,
.in_current_context = egl_dri3_in_current_context,
.get_dri_context = egl_dri3_get_dri_context,
+   .get_dri_screen = egl_dri3_get_dri_screen,
.flush_drawable = egl_dri3_flush_drawable,
.show_fps = NULL,
 };
diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 90d7bba..3bc2e1b 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -132,6 +132,16 @@ glx_dri3_get_dri_context(struct loader_dri3_drawable *draw)
return (gc != &dummyContext) ? dri3Ctx->driContext : NULL;
 }
 
+static __DRIscreen *
+glx_dri3_get_dri_screen(struct loader_dri3_drawable *draw)
+{
+   struct glx_context *gc = __glXGetCurrentContext();
+   struct dri3_context *pcp = (struct dri3_context *) __glXGetCurrentContext();
+   struct dri3_screen *psc = (struct dri3_screen *) pcp->base.psc;
+
+   return (gc != &dummyContext && psc) ? psc->driScreen : NULL;
+}
+
 static void
 glx_dri3_flush_drawable(struct loader_dri3_drawable *draw, unsigned flags)
 {
@@ -169,6 +179,7 @@ static struct loader_dri3_vtable glx_dri3_vtable = {
.set_drawable_size = glx_dri3_set_drawable_size,
.in_current_context = glx_dri3_in_current_context,
.get_dri_context = glx_dri3_get_dri_context,
+   .get_dri_screen = glx_dri3_get_dri_screen,
.flush_drawable = glx_dri3_flush_drawable,
.show_fps = glx_dri3_show_fps,
 };
diff --git a/src/loader/loader_dri3_helper.h b/src/loader/loader_dri3_helper.h
index 5b8fd1d..658e190 100644
--- a/src/loader/loader_dri3_helper.h
+++ b/src/loader/loader_dri3_helper.h
@@ -103,6 +103,7 @@ struct loader_dri3_vtable {
void (*set_drawable_size)(struct loader_dri3_drawable *, int, int);
bool (*in_current_context)(struct loader_dri3_drawable *);
__DRIcontext *(*get_dri_context)(struct loader_dri3_drawable *);
+   __DRIscreen *(*get_dri_screen)(struct loader_dri3_drawable *);
void (*flush_drawable)(struct loader_dri3_drawable *, unsigned);
void (*show_fps)(struct loader_dri3_drawable *, uint64_t);
 };
-- 
2.10.0


[Mesa-dev] [PATCH 2/2] loader/dri3: import prime buffers in the currently-bound screen

2016-10-06 Thread Martin Peres

This tries to mirror the codepath taken by DRI2 in IntelSetTexBuffer2()
and fixes many applications when using DRI3:
 - Totem with libva on hw-accelerated decoding
 - obs-studio, using Window Capture (Xcomposite) as a Source
 - gstreamer with VAAPI

v2:
 - introduce get_dri_screen() in the dri3 loader's vtable (krh)

Tested-by: Timo Aaltonen 
Tested-by: Ionut Biru 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71759
Signed-off-by: Martin Peres 
---
 src/loader/loader_dri3_helper.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
index 3ce0352..8179297 100644
--- a/src/loader/loader_dri3_helper.c
+++ b/src/loader/loader_dri3_helper.c
@@ -1117,6 +1117,7 @@ dri3_get_pixmap_buffer(__DRIdrawable *driDrawable, 
unsigned int format,
xcb_sync_fence_t sync_fence;
struct xshmfence *shm_fence;
int  fence_fd;
+   __DRIscreen  *cur_screen;
 
if (buffer)
   return buffer;
@@ -1147,8 +1148,17 @@ dri3_get_pixmap_buffer(__DRIdrawable *driDrawable, 
unsigned int format,
if (!bp_reply)
   goto no_image;
 
+   /* Get the currently-bound screen or revert to using the drawable's screen 
if
+* no contexts are currently bound. The latter case is at least necessary 
for
+* obs-studio, when using Window Capture (Xcomposite) as a Source.
+*/
+   cur_screen = draw->vtable->get_dri_screen(draw);
+   if (!cur_screen) {
+   cur_screen = draw->dri_screen;
+   }
+
buffer->image = loader_dri3_create_image(draw->conn, bp_reply, format,
-draw->dri_screen, draw->ext->image,
+cur_screen, draw->ext->image,
 buffer);
if (!buffer->image)
   goto no_image;
-- 
2.10.0

