Re: [Mesa-dev] [PATCH 10/12] i965/fs: Reimplement emit_texture() in terms of logical send messages.

2015-07-22 Thread Jason Ekstrand
On Wed, Jul 22, 2015 at 12:41 AM, Francisco Jerez curroje...@riseup.net wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Sat, Jul 18, 2015 at 7:34 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 ---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 66 
 +---
  1 file changed, 49 insertions(+), 17 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 89fcc49..4011639 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -861,6 +861,14 @@ fs_visitor::emit_texture(ir_texture_opcode op,
}
 }

 +   if (op == ir_query_levels) {
 +  /* textureQueryLevels() is implemented in terms of TXS so we need to
 +   * pass a valid LOD argument.
 +   */
 +  assert(lod.file == BAD_FILE);
 +  lod = fs_reg(0u);

 Why can't this go as part of the switch below?  It seems kind of odd
 to split the logic up like this.

 What I had in mind originally was to factor out the switch statement
 below into a separate function -- Later on I noticed that you wanted to
 rework emit_texture() and friends to stop using ir_texture_opcode as
 representation for the texturing operation, so you would likely want to
 get rid of the whole switch-case statement below or replace it with
 something else, so I didn't bother.  What are your current plans for
 emit_texture()?

At this point, we may just roll it into nir_emit_texture or we may
leave it alone.  In any case, we will want to get rid of the
ir_texture_opcode eventually.  Feel free to leave this patch as is.  I
can merge things or whatever as needed whenever we do that refactor.

 +   }
 +
 if (coordinate.file != BAD_FILE) {
/* FINISHME: Texture coordinate rescaling doesn't work with 
 non-constant
 * samplers.  This should only be a problem with GL_CLAMP on Gen7.
 @@ -873,26 +881,50 @@ fs_visitor::emit_texture(ir_texture_opcode op,
  * samples, so don't worry about them.
  */
 fs_reg dst = vgrf(glsl_type::get_instance(dest_type-base_type, 4, 1));
 +   const fs_reg srcs[] = {
 +  coordinate, shadow_c, lod, lod2,
 +  sample_index, mcs, sampler_reg, offset_value,
 +  fs_reg(coord_components), fs_reg(grad_components)
 +   };
 +   enum opcode opcode;

 -   if (devinfo-gen = 7) {
 -  inst = emit_texture_gen7(op, dst, coordinate, coord_components,
 -   shadow_c, lod, lod2, grad_components,
 -   sample_index, mcs, sampler_reg,
 -   offset_value);
 -   } else if (devinfo-gen = 5) {
 -  inst = emit_texture_gen5(op, dst, coordinate, coord_components,
 -   shadow_c, lod, lod2, grad_components,
 -   sample_index, sampler,
 -   offset_value.file != BAD_FILE);
 -   } else if (dispatch_width == 16) {
 -  inst = emit_texture_gen4_simd16(op, dst, coordinate, 
 coord_components,
 -  shadow_c, lod, sampler);
 -   } else {
 -  inst = emit_texture_gen4(op, dst, coordinate, coord_components,
 -   shadow_c, lod, lod2, grad_components,
 -   sampler);
 +   switch (op) {
 +   case ir_tex:
 +  opcode = SHADER_OPCODE_TEX_LOGICAL;
 +  break;
 +   case ir_txb:
 +  opcode = FS_OPCODE_TXB_LOGICAL;
 +  break;
 +   case ir_txl:
 +  opcode = SHADER_OPCODE_TXL_LOGICAL;
 +  break;
 +   case ir_txd:
 +  opcode = SHADER_OPCODE_TXD_LOGICAL;
 +  break;
 +   case ir_txf:
 +  opcode = SHADER_OPCODE_TXF_LOGICAL;
 +  break;
 +   case ir_txf_ms:
 +  opcode = SHADER_OPCODE_TXF_CMS_LOGICAL;
 +  break;
 +   case ir_txs:
 +   case ir_query_levels:
 +  opcode = SHADER_OPCODE_TXS_LOGICAL;
 +  break;
 +   case ir_lod:
 +  opcode = SHADER_OPCODE_LOD_LOGICAL;
 +  break;
 +   case ir_tg4:
 +  opcode = (offset_value.file != BAD_FILE  offset_value.file != IMM ?
 +SHADER_OPCODE_TG4_OFFSET_LOGICAL : 
 SHADER_OPCODE_TG4_LOGICAL);
 +  break;
 +   default:
 +  unreachable(not reached);

 Please choose something more descriptive than not reached.  How
 about invalid texture opcode

 Sure, I'll fix that, thanks.


 }

 +   inst = bld.emit(opcode, dst, srcs, ARRAY_SIZE(srcs));
 +   inst-regs_written = 4 * dispatch_width / 8;
 +
 if (shadow_c.file != BAD_FILE)
inst-shadow_compare = true;

 --
 2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gallium/util: Stop bundling our snprintf implementation.

2015-07-22 Thread Jose Fonseca
Use MSVCRT functions instead.  Their semantics are slightly
different but they can be made to work as expected.

Also, use the same code paths for both MSVCRT and MinGW.

No testing yet.  Just built.

https://bugs.freedesktop.org/show_bug.cgi?id=91418
---
 src/gallium/auxiliary/Makefile.sources  |1 -
 src/gallium/auxiliary/util/u_snprintf.c | 1480 ---
 src/gallium/auxiliary/util/u_string.h   |   33 +-
 3 files changed, 30 insertions(+), 1484 deletions(-)
 delete mode 100644 src/gallium/auxiliary/util/u_snprintf.c

diff --git a/src/gallium/auxiliary/Makefile.sources 
b/src/gallium/auxiliary/Makefile.sources
index 62e6b94..3616d88 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -274,7 +274,6 @@ C_SOURCES := \
util/u_simple_shaders.h \
util/u_slab.c \
util/u_slab.h \
-   util/u_snprintf.c \
util/u_split_prim.h \
util/u_sse.h \
util/u_staging.c \
diff --git a/src/gallium/auxiliary/util/u_snprintf.c 
b/src/gallium/auxiliary/util/u_snprintf.c
deleted file mode 100644
index 39e9b70..000
--- a/src/gallium/auxiliary/util/u_snprintf.c
+++ /dev/null
@@ -1,1480 +0,0 @@
-/*
- * Copyright (c) 1995 Patrick Powell.
- *
- * This code is based on code written by Patrick Powell papow...@astart.com.
- * It may be used for any purpose as long as this notice remains intact on all
- * source code distributions.
- */
-
-/*
- * Copyright (c) 2008 Holger Weiss.
- *
- * This version of the code is maintained by Holger Weiss hol...@jhweiss.de.
- * My changes to the code may freely be used, modified and/or redistributed for
- * any purpose.  It would be nice if additions and fixes to this file 
(including
- * trivial code cleanups) would be sent back in order to let me include them in
- * the version available at http://www.jhweiss.de/software/snprintf.html.
- * However, this is not a requirement for using or redistributing (possibly
- * modified) versions of this file, nor is leaving this notice intact 
mandatory.
- */
-
-/*
- * History
- *
- * 2008-01-20 Holger Weiss hol...@jhweiss.de for C99-snprintf 1.1:
- *
- * Fixed the detection of infinite floating point values on IRIX (and
- * possibly other systems) and applied another few minor cleanups.
- *
- * 2008-01-06 Holger Weiss hol...@jhweiss.de for C99-snprintf 1.0:
- *
- * Added a lot of new features, fixed many bugs, and incorporated various
- * improvements done by Andrew Tridgell tri...@samba.org, Russ Allbery
- * r...@stanford.edu, Hrvoje Niksic hnik...@xemacs.org, Damien Miller
- * d...@mindrot.org, and others for the Samba, INN, Wget, and OpenSSH
- * projects.  The additions include: support the e, E, g, G, and
- * F conversion specifiers (and use conversion style f or F for the
- * still unsupported a and A specifiers); support the hh, ll, j,
- * t, and z length modifiers; support the # flag and the (non-C99)
- * ' flag; use localeconv(3) (if available) to get both the current
- * locale's decimal point character and the separator between groups of
- * digits; fix the handling of various corner cases of field width and
- * precision specifications; fix various floating point conversion bugs;
- * handle infinite and NaN floating point values; don't attempt to write to
- * the output buffer (which may be NULL) if a size of zero was specified;
- * check for integer overflow of the field width, precision, and return
- * values and during the floating point conversion; use the OUTCHAR() macro
- * instead of a function for better performance; provide asprintf(3) and
- * vasprintf(3) functions; add new test cases.  The replacement functions
- * have been renamed to use an rpl_ prefix, the function calls in the
- * main project (and in this file) must be redefined accordingly for each
- * replacement function which is needed (by using Autoconf or other means).
- * Various other minor improvements have been applied and the coding style
- * was cleaned up for consistency.
- *
- * 2007-07-23 Holger Weiss hol...@jhweiss.de for Mutt 1.5.13:
- *
- * C99 compliant snprintf(3) and vsnprintf(3) functions return the number
- * of characters that would have been written to a sufficiently sized
- * buffer (excluding the '\0').  The original code simply returned the
- * length of the resulting output string, so that's been fixed.
- *
- * 1998-03-05 Michael Elkins m...@mutt.org for Mutt 0.90.8:
- *
- * The original code assumed that both snprintf(3) and vsnprintf(3) were
- * missing.  Some systems only have snprintf(3) but not vsnprintf(3), so
- * the code is now broken down under HAVE_SNPRINTF and HAVE_VSNPRINTF.
- *
- * 1998-01-27 Thomas Roessler roess...@does-not-exist.org for Mutt 0.89i:
- *
- * The PGP code was using unsigned hexadecimal formats.  Unfortunately,
- * unsigned formats simply didn't work.
- *
- * 

Re: [Mesa-dev] [PATCH] targets/dri: scons: add missing link against libdrm

2015-07-22 Thread Jose Fonseca

On 22/07/15 16:04, Emil Velikov wrote:

Otherwise the final dri module will have (additional) unresolved
symbols.

Cc: Brian Paul bri...@vmware.com
Cc: Jose Fonseca jfons...@vmware.com
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---

We can only fix the remaining unresolved symbols (_glapi_foo), as we
remove the non-shared glapi when building with DRI.

With this we at least match the autotools build.

-Emil

  src/gallium/targets/dri/SConscript | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/src/gallium/targets/dri/SConscript 
b/src/gallium/targets/dri/SConscript
index 8d29f3b..2fb0da0 100644
--- a/src/gallium/targets/dri/SConscript
+++ b/src/gallium/targets/dri/SConscript
@@ -25,6 +25,8 @@ if env['llvm']:
  env.Append(CPPDEFINES = 'GALLIUM_LLVMPIPE')
  env.Prepend(LIBS = [llvmpipe])

+env.PkgUseModules('DRM')
+
  env.Append(CPPDEFINES = [
  'GALLIUM_VMWGFX',
  'GALLIUM_SOFTPIPE',



Reviwed-by: Jose Fonseca jfons...@vmware.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] i965/fs: Implement pass to lower instructions of unsupported SIMD width.

2015-07-22 Thread Jason Ekstrand
On Wed, Jul 22, 2015 at 12:31 AM, Francisco Jerez curroje...@riseup.net wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 A few comments below.  Mostly just asking for explanation.

 1-3 are

 Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

 Obviously, don't merge 4/4 until it actually has users.
 --Jason

 Thanks.


 On Thu, Jul 16, 2015 at 8:35 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 This lowering pass implements an algorithm to expand SIMDN
 instructions into a sequence of SIMDM instructions in cases where the
 hardware doesn't support the original execution size natively for some
 particular instruction.  The most important use-cases are:

  - Lowering send message instructions that don't support SIMD16
natively into SIMD8 (several texturing, framebuffer write and typed
surface operations).

  - Lowering messages that don't support SIMD8 natively into SIMD16
(*cough*gen4*cough*).

  - 64-bit precision operations (e.g. FP64 and 64-bit integer
multiplication).

  - SIMD32.

 The algorithm works by splitting the sources of the original
 instruction into chunks of width appropriate for the lowered
 instructions, and then interleaving the results component-wise into
 the destination of the original instruction.  The pass is controlled
 by the get_lowered_simd_width() function that currently just returns
 the original execution size making the whole pass a no-op for the
 moment until some user is introduced.
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 142 
 +++
  src/mesa/drivers/dri/i965/brw_fs.h   |   1 +
  2 files changed, 143 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index d031352..eeb6938 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3204,6 +3204,147 @@ fs_visitor::lower_logical_sends()
 return progress;
  }

 +/**
 + * Get the closest native SIMD width supported by the hardware for 
 instruction
 + * \p inst.  The instruction will be left untouched by
 + * fs_visitor::lower_simd_width() if the returned value is equal to the
 + * original execution size.
 + */
 +static unsigned
 +get_lowered_simd_width(const struct brw_device_info *devinfo,
 +   const fs_inst *inst)
 +{
 +   switch (inst-opcode) {
 +   default:
 +  return inst-exec_size;
 +   }
 +}
 +
 +/**
 + * The \p rows array of registers represents a \p num_rows by \p 
 num_columns
 + * matrix in row-major order, write it in column-major order into the 
 register
 + * passed as destination.  \p stride gives the separation between matrix
 + * elements in the input in fs_builder::dispatch_width() units.
 + */
 +static void
 +emit_transpose(const fs_builder bld,
 +   const fs_reg dst, const fs_reg *rows,
 +   unsigned num_rows, unsigned num_columns, unsigned stride)

 I'm not sure what I think about calling this emit_transpose.  I
 guess it is kind of a transpose, but maybe it's more of a gather.
 I'm not going to quibble about it though.

 *Shrug*, it doesn't only gather the vectors of the rows array (that
 would have been emit_collect :P), it copies them out in vertical, just
 like a matrix transpose -- assuming you're not horrified by the idea of
 considering the argument a matrix.

That's fine.  Feel free to ignore that suggestion.


 +{
 +   fs_reg *const components = new fs_reg[num_rows * num_columns];
 +
 +   for (unsigned i = 0; i  num_columns; ++i) {
 +  for (unsigned j = 0; j  num_rows; ++j)
 + components[num_rows * i + j] = offset(rows[j], bld, stride * i);
 +   }
 +
 +   bld.LOAD_PAYLOAD(dst, components, num_rows * num_columns, 0);
 +
 +   delete[] components;
 +}
 +
 +bool
 +fs_visitor::lower_simd_width()
 +{
 +   bool progress = false;
 +
 +   foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
 +  const unsigned lower_width = get_lowered_simd_width(devinfo, inst);
 +
 +  if (lower_width != inst-exec_size) {
 + /* Builder matching the original instruction. */
 + const fs_builder ibld = bld.at(block, inst)
 +.exec_all(inst-force_writemask_all)
 +.group(inst-exec_size, 
 inst-force_sechalf);
 +
 + /* Split the copies in chunks of the execution width of either the
 +  * original or the lowered instruction, whichever is lower.
 +  */
 + const unsigned copy_width = MIN2(lower_width, inst-exec_size);
 + const unsigned n = inst-exec_size / copy_width;
 + const unsigned dst_size = inst-regs_written * REG_SIZE /
 +inst-dst.component_size(inst-exec_size);
 + fs_reg dsts[4];
 +
 + assert(n  0  n = ARRAY_SIZE(dsts) 
 +!inst-writes_accumulator  !inst-mlen);
 +
 + for (unsigned i = 0; i  n; i++) {
 +/* Emit a copy of the original instruction with the lowered 
 width.
 + * If the 

Re: [Mesa-dev] [PATCH] glapi: fix argument parsing in glX_proto_recv.py

2015-07-22 Thread Emil Velikov
It should be in master now.

-Emil

On 21 July 2015 at 23:34, Dylan Baker baker.dyla...@gmail.com wrote:
 Cool, I was under that impression too, but having it confirmed is great.

 Since these do matter and I don't have push access, could you make sure this
 lands?

 On Jul 21, 2015 15:33, Emil Velikov emil.l.veli...@gmail.com wrote:

 It seems that I was under the wrong impression on the hole thing.

 Some of the files generated from these scripts are used in/by xserver.
 Although they are imported/tracked into version control of the latter in
 order to prevent the xserver build directly diving into mesa source tree.

 The XORG_GLAPI_OUTPUTS files on the other hand, are the ones that were
 removed from xserver with commit be6680967a4(glx: convert to direct GL
 dispatch (v2)). Unfortunatelly they only cover ~20 lines of makefile
 rules, thus none of the python scripts can be removed/simplified.

 Apologies if I mislead anyone.

 -Emil


 On 04/07/15 10:42, Emil Velikov wrote:
  Seems like there is a misunderstanding somewhere - perhaps some on my
  end. Yet I do recall Adam Jackson stating on numerous occasions, on
  IRC, that we don't want/need any sources from mesa (outside
  dri_interface.h and the gl* headers) for xserver.
 
  Adam can we truly nuke (some of) the generated sources in mesa, or
  does X/xserver relies on them ?
 
  Cheers,
  Emil
 
  On 3 July 2015 at 21:13, Dylan Baker baker.dyla...@gmail.com wrote:
  I asked about dumping them and was told that they would leave mesa only
  when
  x itself died. If be more than happy to see them go if they're not
  useful
 
  On Jul 3, 2015 06:28, Emil Velikov emil.l.veli...@gmail.com wrote:
 
  On 02/07/15 18:25, Dylan Baker wrote:
  One of the plugins I use with vim helpfully added an underscore to
  the
  front of mode for kicks.
 
  Obviously this isn't a feature used very often because it's been
  broken
  since d986cb7c70db (since May 20th), and no one has noticed.
 
  Iirc these have a long and boring history with xserver. The short
  version is that they (and many other files reciding in
  XORG_GL{API,X}_DIR) have been dead/unused for a long while.
 
  If you want to avoid the rewriting them a few times, I would suggest
  that we nuke them up first. That's why I've asked about your plans
  earlier :)
 
  -Emil
 
 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: Detect and provide macros for function attributes pure and const.

2015-07-22 Thread Jose Fonseca

On 21/07/15 15:57, Emil Velikov wrote:

On 18 July 2015 at 08:13, Jose Fonseca jfons...@vmware.com wrote:

On 18/07/15 01:38, Eric Anholt wrote:


Emil Velikov emil.l.veli...@gmail.com writes:


On 14/07/15 19:45, Eric Anholt wrote:


These are really useful hints to the compiler in the absence of
link-time
optimization, and I'm going to use them in VC4.

I've made the const attribute be ATTRIBUTE_CONST unlike other function
attributes, because we have other things in the tree #defining CONST for
their own unrelated purposes.


Mindly related: how people feel about making these macros less screamy,
by following the approach used in the kernel: PURE - __pure and so on ?



I'd love it.



Less screamy is fine, but beware prefixing double underscore: the C standard
stipulates that its use is reserved for for C/C++ runtime. [1]


I though about it before posting although I've seen others define
those, even do so in their public headers.
Now that I have some examples from my current /usr/include

Searching for __pure
dwarves/dutil.h:#define __pure __attribute__ ((pure))

Searching for __attribute_const__
sys/cdefs.h:# define __attribute_const__ __attribute__ ((__const__))
sys/cdefs.h:# define __attribute_const__ /* Ignore */

Searching for __printf

Searching for __always_unused

Searching for __noreturn

Searching for __packed
libvisual-0.4/libvisual/lv_defines.h:# define __packed __attribute__ ((packed))
libvisual-0.4/libvisual/lv_defines.h:# define __packed /* no packed */
bsd/sys/cdefs.h:#  define __packed __attribute__((__packed__))
bsd/sys/cdefs.h:#  define __packed

Searching for __deprecated
pciaccess.h:#define __deprecated __attribute__((deprecated))
pciaccess.h:#define __deprecated

Searching for __weak

Searching for __alias

With a handful of other headers defining more double underscore prefixed macros.


Look at stdlibc++ implementation: every internal variable has a double
underscore prefix.


Unless we're talking about STL/other template library we don't care
what library foo uses in it's internal implementation do we ? After
all these will be resolved at compile time.


Maybe kernel gets away on GLIBC (and because it doesn't use C++), but
there's no guarantee it will work on other C runtimes, and even if it does,
it could start failing anytime.


True, it's not the best of ideas. Just worth pointing out that the
cat is already out, for other projects.



  There are already more than 12K #define __foo cases on my system.

These defines are reserved for system headers, so it's natural to be 
lots of them in /usr/include.



MacOSX also defines some of these on its sys/cdefs.h:

  http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/bsd/sys/cdefs.h

The question is: can we expect that most systems will define these 
__foo, or at least not use them for other purposes.


I don't know the answer.  At a glance MSVC doesn't seem to rely on them 
for anything.  So it might work.  I don't oppose if you want to give it 
a shot.



Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 04/12] i965/fs: Pass a BAD_FILE header source to LOAD_PAYLOAD in emit_texture_gen7().

2015-07-22 Thread Jason Ekstrand
On Wed, Jul 22, 2015 at 12:43 AM, Francisco Jerez curroje...@riseup.net wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Sat, Jul 18, 2015 at 7:34 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 So that it's left uninitialized by LOAD_PAYLOAD, we only need to
 reserve space for it in the message since it will be initialized
 implicitly by the generator.
 ---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 522e13e..89fcc49 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -473,8 +473,9 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, 
 fs_reg dst,
 if (op == ir_tg4 || offset_value.file != BAD_FILE ||
 is_high_sampler(devinfo, sampler)) {
/* For general texture offsets (no txf workaround), we need a header 
 to
 -   * put them in.  Note that for SIMD16 we're making space for two 
 actual
 -   * hardware registers here, so the emit will have to fix up for this.
 +   * put them in.  Note that we're only reserving space for it in the
 +   * message payload as it will be initialized implicitly by the
 +   * generator.

 Mind splitting this in two?  Fixing the comment above is really
 unrelated to passing BAD_FILE below.

 It seemed like the sentence I changed was trying to explain what the
 line of code I changed was doing.  It may sound unrelated because the
 comment was completely misleading (or just outdated), but it can only
 refer to the three lines of code below...

It does refer to the three lines of code below, but the fact that it's
bogus has nothing to do with the change below.  It's been wrong for
probably 8 months and I've just never noticed.  Register allocation
used to, in SIMD16 mode, work in terms of pairs of physical registers.
In the texturing code, we needed a 1-register header so in SIMD16, we
got 2 registers and then the generator knew to put the header in the
second half.  About 8 months ago, I changed RA to work in terms of
single hardware registers even on SIMD16 and this comment has been
stale ever since.

 --Jason

 *
 * * ir4_tg4 needs to place its channel select in the header,
 * for interaction with ARB_texture_swizzle
 @@ -483,7 +484,7 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, 
 fs_reg dst,
 * need to offset the Sampler State Pointer in the header.
 */
header_size = 1;
 -  sources[0] = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD);
 +  sources[0] = fs_reg();
length++;
 }

 --
 2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/12] i965/fs: Reimplement emit_single_fb_write() in terms of logical framebuffer writes.

2015-07-22 Thread Jason Ekstrand
On Wed, Jul 22, 2015 at 12:54 AM, Francisco Jerez curroje...@riseup.net wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Thu, Jul 16, 2015 at 8:41 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 The only non-trivial thing it still has to do is figure out where to
 take the src/dst depth values from and predicate the instruction if
 discard is in use.  The manual SIMD unrolling logic in the dual-source
 case goes away because this is now handled transparently by the SIMD
 lowering pass.
 ---
  src/mesa/drivers/dri/i965/brw_fs.h   |   5 +-
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 203 
 +++
  2 files changed, 20 insertions(+), 188 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
 b/src/mesa/drivers/dri/i965/brw_fs.h
 index 1ae79a9..64f89d4 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.h
 +++ b/src/mesa/drivers/dri/i965/brw_fs.h
 @@ -290,13 +290,10 @@ public:
 bool optimize_frontfacing_ternary(nir_alu_instr *instr,
   const fs_reg result);

 -   void setup_color_payload(fs_reg *dst, fs_reg color, unsigned components,
 -unsigned exec_size, bool use_2nd_half);
 void emit_alpha_test();
 fs_inst *emit_single_fb_write(const brw::fs_builder bld,
   fs_reg color1, fs_reg color2,
 - fs_reg src0_alpha, unsigned components,
 - unsigned exec_size, bool use_2nd_half = 
 false);
 + fs_reg src0_alpha, unsigned components);
 void emit_fb_writes();
 void emit_urb_writes();
 void emit_cs_terminate();
 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index ba4b177..bcfeaa0 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -1409,33 +1409,6 @@ fs_visitor::emit_interpolation_setup_gen6()
 }
  }

 -void
 -fs_visitor::setup_color_payload(fs_reg *dst, fs_reg color, unsigned 
 components,
 -unsigned exec_size, bool use_2nd_half)
 -{
 -   brw_wm_prog_key *key = (brw_wm_prog_key*) this-key;
 -   fs_inst *inst;
 -
 -   if (key-clamp_fragment_color) {
 -  fs_reg tmp = vgrf(glsl_type::vec4_type);
 -  assert(color.type == BRW_REGISTER_TYPE_F);
 -  for (unsigned i = 0; i  components; i++) {
 - inst = bld.MOV(offset(tmp, bld, i), offset(color, bld, i));
 - inst-saturate = true;
 -  }
 -  color = tmp;
 -   }
 -
 -   if (exec_size  dispatch_width) {
 -  unsigned half_idx = use_2nd_half ? 1 : 0;
 -  for (unsigned i = 0; i  components; i++)
 - dst[i] = half(offset(color, bld, i), half_idx);
 -   } else {
 -  for (unsigned i = 0; i  components; i++)
 - dst[i] = offset(color, bld, i);
 -   }
 -}
 -
  static enum brw_conditional_mod
  cond_for_alpha_func(GLenum func)
  {
 @@ -1493,146 +1466,34 @@ fs_visitor::emit_alpha_test()
  fs_inst *
  fs_visitor::emit_single_fb_write(const fs_builder bld,
   fs_reg color0, fs_reg color1,
 - fs_reg src0_alpha, unsigned components,
 - unsigned exec_size, bool use_2nd_half)
 + fs_reg src0_alpha, unsigned components)
  {
 assert(stage == MESA_SHADER_FRAGMENT);
 brw_wm_prog_data *prog_data = (brw_wm_prog_data*) this-prog_data;
 -   brw_wm_prog_key *key = (brw_wm_prog_key*) this-key;
 -   const fs_builder ubld = bld.group(exec_size, use_2nd_half);
 -   int header_size = 2, payload_header_size;
 -
 -   /* We can potentially have a message length of up to 15, so we have to 
 set
 -* base_mrf to either 0 or 1 in order to fit in m0..m15.
 -*/
 -   fs_reg *sources = ralloc_array(mem_ctx, fs_reg, 15);
 -   int length = 0;
 -
 -   /* From the Sandy Bridge PRM, volume 4, page 198:
 -*
 -* Dispatched Pixel Enables. One bit per pixel indicating
 -*  which pixels were originally enabled when the thread was
 -*  dispatched. This field is only required for the end-of-
 -*  thread message and on all dual-source messages.
 -*/
 -   if (devinfo-gen = 6 
 -   (devinfo-is_haswell || devinfo-gen = 8 || !prog_data-uses_kill) 
 
 -   color1.file == BAD_FILE 
 -   key-nr_color_regions == 1) {
 -  header_size = 0;
 -   }
 -
 -   if (header_size != 0) {
 -  assert(header_size == 2);
 -  /* Allocate 2 registers for a header */
 -  length += 2;
 -   }
 -
 -   if (payload.aa_dest_stencil_reg) {
 -  sources[length] = fs_reg(GRF, alloc.allocate(1));
 -  bld.group(8, 0).exec_all().annotate(FB write stencil/AA alpha)
 - .MOV(sources[length],
 -  fs_reg(brw_vec8_grf(payload.aa_dest_stencil_reg, 0)));
 -  length++;
 -   }
 -
 -   if (prog_data-uses_omask) {
 -  sources[length] = fs_reg(GRF, 

Re: [Mesa-dev] [PATCH 09/12] i965/fs: Hook up SIMD lowering to handle texturing opcodes of unsupported width.

2015-07-22 Thread Jason Ekstrand
On Wed, Jul 22, 2015 at 12:55 AM, Francisco Jerez curroje...@riseup.net wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Sat, Jul 18, 2015 at 7:34 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 This should match the set of cases in which we currently call fail()
 or no16() from the emit_texture_*() methods and the ones in which
 emit_texture_gen4() enables the SIMD16 workaround.

 Hint for reviewers: It's not a big deal if I happen to have missed
 some case here, it will just lead to an assertion failure down the
 road which is easily fixable, however being stricter than necessary
 won't cause any visible breakage, it would just decrease performance
 silently due to the unnecessary message splitting, so feel free to
 double-check that all cases listed here already cause a SIMD8/16
 fall-back with the current texturing code -- You may want to skip over
 the Gen5-6 cases though if you don't have pencil and paper at hand.

 I'll believe you.  Even if you got it wrong, the worst that happens is
 we hit the fail case when we try and lower.

 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 27 +++
  1 file changed, 27 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 043d9e9..f291202 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3918,6 +3918,33 @@ get_lowered_simd_width(const struct brw_device_info 
 *devinfo,
/* Dual-source FB writes are unsupported in SIMD16 mode. */
return (inst-src[1].file != BAD_FILE ? 8 : inst-exec_size);

 +   case SHADER_OPCODE_TXD_LOGICAL:
 +  /* TXD is unsupported in SIMD16 mode. */
 +  return 8;
 +
 +   case SHADER_OPCODE_TG4_OFFSET_LOGICAL: {
 +  /* gather4_po_c is unsupported in SIMD16 mode. */
 +  const fs_reg shadow_c = inst-src[1];
 +  return (shadow_c.file != BAD_FILE ? 8 : inst-exec_size);
 +   }
 +   case SHADER_OPCODE_TXL_LOGICAL:
 +   case FS_OPCODE_TXB_LOGICAL: {
 +  /* Gen4 doesn't have SIMD8 non-shadow-compare bias/LOD instructions, 
 and
 +   * Gen4-6 don't support TXL and TXB with shadow comparison in SIMD16
 +   * mode.
 +   */
 +  const fs_reg shadow_c = inst-src[1];
 +  return (devinfo-gen == 4  shadow_c.file == BAD_FILE ? 16 :
 +  devinfo-gen  7  shadow_c.file != BAD_FILE ? 8 :
 +  inst-exec_size);

 Could we please use an if-ladder here.  Nested ternaries are hard to
 read and kind of pointless given that we can return multiple places in
 the if.

 Heh, I always have the very opposite style itch -- If you can compute
 something succinctly with a handful of pure expressions why venture
 into the dangerous realm of statements?

Heh... The mathematician in me is tempted to agree with you.  However,
the programmer in me likes his control-flow to have obvious nesting
and indentation. :-)

 I think John Backus' article Can programming be liberated from the von
 Neumann style? although slightly outdated elaborates the point pretty
 well.

 But sure, if you'd like it better with an if-ladder why not.

 +   }
 +   case SHADER_OPCODE_TXF_LOGICAL:
 +   case SHADER_OPCODE_TXS_LOGICAL:
 +  /* Gen4 doesn't have SIMD8 variants for the RESINFO and LD-with-LOD
 +   * messages.  Use SIMD16 instead.
 +   */
 +  return (devinfo-gen == 4 ? 16 : inst-exec_size);

 I'd kind of also like this ternary to go, but I'm ok with it if you
 want to keep it.

 +
 default:
return inst-exec_size;
 }
 --
 2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91425] [regression, bisected] Piglit spec/ext_packed_float/ getteximage-invalid-format-for-packed-type fails

2015-07-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91425

--- Comment #3 from Ilia Mirkin imir...@alum.mit.edu ---
(In reply to Brian Paul from comment #1)
 Hmm, I can't reproduce that.  The test passes completely for me.
 
 What driver are you using?  I'm testing llvmpipe/softpipe.

Odd. I repro with llvmpipe. I guess the issue is the

  if (depth != 1) {
 _mesa_error(ctx, GL_INVALID_VALUE,
 %s(depth = %d), caller, depth);
 return true;
  }

check for 1d/2d targets. The piglit test does:

glGetTexImage(GL_TEXTURE_2D, 0, format, type, pxBuffer);

Since it's checking for, roughly speaking, invalid things, I assume that the
texImage isn't there, and so get_texture_image_dims will end up setting a depth
= 0. Should that be depth = 1 instead?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 18/78] i965: Take is_scalar_shader_stage() method out to allow reuse

2015-07-22 Thread Jason Ekstrand
On Wed, Jul 22, 2015 at 4:26 AM, Eduardo Lima Mitev el...@igalia.com wrote:
 On 07/13/2015 01:38 PM, Jason Ekstrand wrote:
 On Fri, Jul 10, 2015 at 8:53 AM, Eduardo Lima Mitev el...@igalia.com wrote:
 On 06/30/2015 06:58 PM, Jason Ekstrand wrote:
 On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev el...@igalia.com 
 wrote:
 This patch makes public the is_scalar_shader_stage() method in 
 brw_shader, and
 renames it to brw_compiler_is_scalar_shader_stage(). The plan is to later 
 reuse it
 in brw_nir, to enable/disable optimization passes depending on the type
 of shader stage.

 I'm not so sure that this is a good plan.  It assumes that whether we
 are doing a scalar or vec4 compile is based entirely on the shader
 stage and some static information (possibly based on environment
 variables).  Ken and I were talking around the office and we may want
 to use both SIMD4x2 and SIMD8 mode for geometry shaders depending on
 the number of inputs, etc.  This won't work in the given paradigm.


 If I understand correctly, what you propose is having a function to
 dynamically choose the type of shader (scalar vs. vec4) when compiling
 the shader, using not only gen and stage, but also actual application
 data. I think this is a good idea and will allow experimenting with
 different combinations of shaders with real input data.

 However, I wonder if we this should be added later after more elaborated
 thoughts on what exactly do we need and where to plug it. I have been
 experimenting with a function following the use case you mentioned,
 choosing shader backend based on inputs to a GS. But honestly it feels
 like a blind guess from my side to what actually you and Ken have in mind.

 Current patch basically reuses the function we already have to select
 the shader, so what I propose is to move forward with it for the moment
 (adding the missing MESA_SHADER_COMPUTE) and discuss how to extend it to
 factor in dynamic data; or perhaps you can explain us your proposal with
 a bit more detail.

 WDYT?

 My primary issue was with having it based on a is_scalar_stage
 function at all.  It seems like a better long-term plan would be to
 simply pass an is_scalar boolean into those lowering functions.  That
 said, I haven't taken a real hard look as to what calls those
 functions and whether or not that's actually what we want either.
 Make sense?  If there's no better place to make that determination
 further up the call chain, then this is fine for now.


 Yes, I like the idea of passing the is_scalar boolean to
 brw_create_nir() directly, so that this decision is moved up to a higher
 context. By doing this we also leave the is_scalar_shader_stage()
 function private to brw_shader.cpp, meaning this patch is not relevant
 anymore and we leave that function untouched.

 A patch illustrating these changes is at:
 https://github.com/Igalia/mesa/commit/9e7bc62d43eaa30de005c40b91551ad113c4065c


 In the future, if we want to select shader type (scalar vs. vector)
 based on dynamic info, the logic that needs changing is contained in
 brw_shader. But this patch doesn't address that yet, lacking a more
 defined use-case. My (wild) idea would be to have a
 preferred_shader_type variable in brw_context for example, that can be
 modifiable dynamically at any point, and will instruct the link phase to
 use scalar or vec4 if possible (depending on gen, stage, etc).

 What do you think?

Looks fine to me.

 The new method accepts a brw_compiler instead of a brw_context. This is 
 done
 for consistency, since the actual info we need (scalar_vs) is in 
 brw_compiler,
 and fetching in through brw_content-intelScreen-compiler seems like too
 much indirection.

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
 ---
  src/mesa/drivers/dri/i965/brw_shader.cpp | 22 ++
  src/mesa/drivers/dri/i965/brw_shader.h   | 13 +
  2 files changed, 19 insertions(+), 16 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
 b/src/mesa/drivers/dri/i965/brw_shader.cpp
 index 0b53647..3b99046 100644
 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
 @@ -182,19 +182,6 @@ brw_shader_precompile(struct gl_context *ctx,
 return true;
  }

 -static inline bool
 -is_scalar_shader_stage(struct brw_context *brw, int stage)
 -{
 -   switch (stage) {
 -   case MESA_SHADER_FRAGMENT:
 -  return true;
 -   case MESA_SHADER_VERTEX:
 -  return brw-intelScreen-compiler-scalar_vs;
 -   default:
 -  return false;
 -   }
 -}
 -
  static void
  brw_lower_packing_builtins(struct brw_context *brw,
 gl_shader_stage shader_type,
 @@ -205,7 +192,8 @@ brw_lower_packing_builtins(struct brw_context *brw,
 | LOWER_PACK_UNORM_2x16
 | LOWER_UNPACK_UNORM_2x16;

 -   if (is_scalar_shader_stage(brw, shader_type)) {
 +   if (brw_compiler_is_scalar_shader_stage(brw-intelScreen-compiler,
 + 

Re: [Mesa-dev] [PATCH 03/20] i965/fs: Implement lowering of logical surface instructions.

2015-07-22 Thread Jason Ekstrand
On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote:
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 63 
 +++-
  1 file changed, 55 insertions(+), 8 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index a996676..0b0c5e1 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3881,6 +3881,20 @@ lower_sampler_logical_send(const fs_builder bld, 
 fs_inst *inst, opcode op)
 }
  }

 +/**
 + * Initialize the header present in some typed and untyped surface
 + * messages.
 + */
 +static fs_reg
 +emit_surface_header(const fs_builder bld, const fs_reg sample_mask)
 +{
 +   fs_builder ubld = bld.exec_all().group(8, 0);
 +   const fs_reg dst = ubld.vgrf(BRW_REGISTER_TYPE_UD);
 +   ubld.MOV(dst, fs_reg(0));
 +   ubld.MOV(component(dst, 7), sample_mask);
 +   return dst;
 +}
 +
  static void
  lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode op,
 const fs_reg sample_mask)
 @@ -3889,10 +3903,43 @@ lower_surface_logical_send(const fs_builder bld, 
 fs_inst *inst, opcode op,
 const fs_reg addr = inst-src[0];
 const fs_reg src = inst-src[1];
 const fs_reg surface = inst-src[2];
 -   const fs_reg dims = inst-src[3];
 +   const UNUSED fs_reg dims = inst-src[3];
 const fs_reg arg = inst-src[4];

 -   assert(!Not implemented);
 +   /* Calculate the total number of components of the payload. */
 +   const unsigned addr_sz = inst-components_read(0);
 +   const unsigned src_sz = inst-components_read(1);
 +   const unsigned header_sz = (sample_mask.file == BAD_FILE ? 0 : 1);
 +   const unsigned sz = header_sz + addr_sz + src_sz;
 +
 +   /* Allocate space for the payload. */
 +   fs_reg *const components = new fs_reg[sz];
 +   const fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, sz);
 +   unsigned n = 0;
 +
 +   /* Construct the payload. */
 +   if (header_sz)
 +  components[n++] = emit_surface_header(bld, sample_mask);
 +
 +   for (unsigned i = 0; i  addr_sz; i++)
 +  components[n++] = offset(addr, bld, i);
 +
 +   for (unsigned i = 0; i  src_sz; i++)
 +  components[n++] = offset(src, bld, i);
 +
 +   bld.LOAD_PAYLOAD(payload, components, sz, header_sz);
 +
 +   /* Update the original instruction. */
 +   inst-opcode = op;
 +   inst-mlen = header_sz + (addr_sz + src_sz) * inst-exec_size / 8;
 +   inst-header_size = header_sz;
 +
 +   inst-src[0] = payload;
 +   inst-src[1] = surface;
 +   inst-src[2] = arg;
 +   inst-resize_sources(3);
 +
 +   delete[] components;

I don't think this is correct.  Unless something has changed, the
fs_inst constructor that takes an array doesn't make a copy so you
have to ralloc the array and *not* delete it.

  }

  bool
 @@ -3965,37 +4012,37 @@ fs_visitor::lower_logical_sends()
case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_SURFACE_READ,
 -fs_reg());
 +fs_reg(0x));
   break;

case SHADER_OPCODE_UNTYPED_SURFACE_WRITE_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_SURFACE_WRITE,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

case SHADER_OPCODE_UNTYPED_ATOMIC_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_ATOMIC,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_TYPED_SURFACE_READ,
 -fs_reg());
 +fs_reg(0x));
   break;

case SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_TYPED_SURFACE_WRITE,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

case SHADER_OPCODE_TYPED_ATOMIC_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_TYPED_ATOMIC,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

default:
 --
 2.4.3

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org

Re: [Mesa-dev] [PATCH 2/2] mesa: rearrange texture error checking order

2015-07-22 Thread Brian Paul

On 07/22/2015 11:02 AM, Ilia Mirkin wrote:

This moves the width/height/depth == 0 check to the front and avoids
doing any other checking when that is the case.

Also moves the dimensions check after the format/type checks so that we
don't bail out with success on a width/height/depth == 0 request when
the format/type don't match.

Bugzilla: 
https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.freedesktop.org_show-5Fbug.cgi-3Fid-3D91425d=BQIBAgc=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEsr=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8m=Y1TotaJmX1E7sCopvmMJc8PQKFXlHu6QxfcYOydKuIIs=L0lmmuk31G6M4hZEiwnX-IC4CPhW7rsrSCgEYtDgXjwe=
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu


I suspect this isn't really needed in light of my patch to
getteximage-invalid-format-for-packed-type.c

I believe the test was in error, or at least sloppy, in that it was 
calling glGetTexImage to test format/type validation when there wasn't 
even a texture image to return.  The OpenGL spec doesn't specify an 
order for error checking multiple things, so if there's multiple errors, 
you can't be sure which one will be reported first.  My patch to the 
test removes that ambiguity.


-Brian



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] i965/fs: Implement pass to lower instructions of unsupported SIMD width.

2015-07-22 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Wed, Jul 22, 2015 at 12:31 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 A few comments below.  Mostly just asking for explanation.

 1-3 are

 Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

 Obviously, don't merge 4/4 until it actually has users.
 --Jason

 Thanks.


 On Thu, Jul 16, 2015 at 8:35 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 This lowering pass implements an algorithm to expand SIMDN
 instructions into a sequence of SIMDM instructions in cases where the
 hardware doesn't support the original execution size natively for some
 particular instruction.  The most important use-cases are:

  - Lowering send message instructions that don't support SIMD16
natively into SIMD8 (several texturing, framebuffer write and typed
surface operations).

  - Lowering messages that don't support SIMD8 natively into SIMD16
(*cough*gen4*cough*).

  - 64-bit precision operations (e.g. FP64 and 64-bit integer
multiplication).

  - SIMD32.

 The algorithm works by splitting the sources of the original
 instruction into chunks of width appropriate for the lowered
 instructions, and then interleaving the results component-wise into
 the destination of the original instruction.  The pass is controlled
 by the get_lowered_simd_width() function that currently just returns
 the original execution size making the whole pass a no-op for the
 moment until some user is introduced.
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 142 
 +++
  src/mesa/drivers/dri/i965/brw_fs.h   |   1 +
  2 files changed, 143 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index d031352..eeb6938 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3204,6 +3204,147 @@ fs_visitor::lower_logical_sends()
 return progress;
  }

 +/**
 + * Get the closest native SIMD width supported by the hardware for 
 instruction
 + * \p inst.  The instruction will be left untouched by
 + * fs_visitor::lower_simd_width() if the returned value is equal to the
 + * original execution size.
 + */
 +static unsigned
 +get_lowered_simd_width(const struct brw_device_info *devinfo,
 +   const fs_inst *inst)
 +{
 +   switch (inst-opcode) {
 +   default:
 +  return inst-exec_size;
 +   }
 +}
 +
 +/**
 + * The \p rows array of registers represents a \p num_rows by \p 
 num_columns
 + * matrix in row-major order, write it in column-major order into the 
 register
 + * passed as destination.  \p stride gives the separation between matrix
 + * elements in the input in fs_builder::dispatch_width() units.
 + */
 +static void
 +emit_transpose(const fs_builder bld,
 +   const fs_reg dst, const fs_reg *rows,
 +   unsigned num_rows, unsigned num_columns, unsigned stride)

 I'm not sure what I think about calling this emit_transpose.  I
 guess it is kind of a transpose, but maybe it's more of a gather.
 I'm not going to quibble about it though.

 *Shrug*, it doesn't only gather the vectors of the rows array (that
 would have been emit_collect :P), it copies them out in vertical, just
 like a matrix transpose -- assuming you're not horrified by the idea of
 considering the argument a matrix.

 That's fine.  Feel free to ignore that suggestion.


 +{
 +   fs_reg *const components = new fs_reg[num_rows * num_columns];
 +
 +   for (unsigned i = 0; i  num_columns; ++i) {
 +  for (unsigned j = 0; j  num_rows; ++j)
 + components[num_rows * i + j] = offset(rows[j], bld, stride * i);
 +   }
 +
 +   bld.LOAD_PAYLOAD(dst, components, num_rows * num_columns, 0);
 +
 +   delete[] components;
 +}
 +
 +bool
 +fs_visitor::lower_simd_width()
 +{
 +   bool progress = false;
 +
 +   foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
 +  const unsigned lower_width = get_lowered_simd_width(devinfo, inst);
 +
 +  if (lower_width != inst-exec_size) {
 + /* Builder matching the original instruction. */
 + const fs_builder ibld = bld.at(block, inst)
 +.exec_all(inst-force_writemask_all)
 +.group(inst-exec_size, 
 inst-force_sechalf);
 +
 + /* Split the copies in chunks of the execution width of either 
 the
 +  * original or the lowered instruction, whichever is lower.
 +  */
 + const unsigned copy_width = MIN2(lower_width, inst-exec_size);
 + const unsigned n = inst-exec_size / copy_width;
 + const unsigned dst_size = inst-regs_written * REG_SIZE /
 +inst-dst.component_size(inst-exec_size);
 + fs_reg dsts[4];
 +
 + assert(n  0  n = ARRAY_SIZE(dsts) 
 +!inst-writes_accumulator  !inst-mlen);
 +
 + for (unsigned i = 0; i  n; i++) {
 +/* Emit a copy of the original 

Re: [Mesa-dev] [PATCH 03/20] i965/fs: Implement lowering of logical surface instructions.

2015-07-22 Thread Jason Ekstrand
On Wed, Jul 22, 2015 at 10:02 AM, Francisco Jerez curroje...@riseup.net wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 63 
 +++-
  1 file changed, 55 insertions(+), 8 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index a996676..0b0c5e1 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3881,6 +3881,20 @@ lower_sampler_logical_send(const fs_builder bld, 
 fs_inst *inst, opcode op)
 }
  }

 +/**
 + * Initialize the header present in some typed and untyped surface
 + * messages.
 + */
 +static fs_reg
 +emit_surface_header(const fs_builder bld, const fs_reg sample_mask)
 +{
 +   fs_builder ubld = bld.exec_all().group(8, 0);
 +   const fs_reg dst = ubld.vgrf(BRW_REGISTER_TYPE_UD);
 +   ubld.MOV(dst, fs_reg(0));
 +   ubld.MOV(component(dst, 7), sample_mask);
 +   return dst;
 +}
 +
  static void
  lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode op,
 const fs_reg sample_mask)
 @@ -3889,10 +3903,43 @@ lower_surface_logical_send(const fs_builder bld, 
 fs_inst *inst, opcode op,
 const fs_reg addr = inst-src[0];
 const fs_reg src = inst-src[1];
 const fs_reg surface = inst-src[2];
 -   const fs_reg dims = inst-src[3];
 +   const UNUSED fs_reg dims = inst-src[3];
 const fs_reg arg = inst-src[4];

 -   assert(!Not implemented);
 +   /* Calculate the total number of components of the payload. */
 +   const unsigned addr_sz = inst-components_read(0);
 +   const unsigned src_sz = inst-components_read(1);
 +   const unsigned header_sz = (sample_mask.file == BAD_FILE ? 0 : 1);
 +   const unsigned sz = header_sz + addr_sz + src_sz;
 +
 +   /* Allocate space for the payload. */
 +   fs_reg *const components = new fs_reg[sz];
 +   const fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, sz);
 +   unsigned n = 0;
 +
 +   /* Construct the payload. */
 +   if (header_sz)
 +  components[n++] = emit_surface_header(bld, sample_mask);
 +
 +   for (unsigned i = 0; i  addr_sz; i++)
 +  components[n++] = offset(addr, bld, i);
 +
 +   for (unsigned i = 0; i  src_sz; i++)
 +  components[n++] = offset(src, bld, i);
 +
 +   bld.LOAD_PAYLOAD(payload, components, sz, header_sz);
 +
 +   /* Update the original instruction. */
 +   inst-opcode = op;
 +   inst-mlen = header_sz + (addr_sz + src_sz) * inst-exec_size / 8;
 +   inst-header_size = header_sz;
 +
 +   inst-src[0] = payload;
 +   inst-src[1] = surface;
 +   inst-src[2] = arg;
 +   inst-resize_sources(3);
 +
 +   delete[] components;

 I don't think this is correct.  Unless something has changed, the
 fs_inst constructor that takes an array doesn't make a copy so you
 have to ralloc the array and *not* delete it.

 It is, I fixed the fs_inst constructor some time ago to take the array
 by value (e.g. so allocation on the stack and re-use of the same array
 is possible for the caller).

Probably a good idea.  In that case, would you mind using a vla
instead of new/delete?

  }

  bool
 @@ -3965,37 +4012,37 @@ fs_visitor::lower_logical_sends()
case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_SURFACE_READ,
 -fs_reg());
 +fs_reg(0x));
   break;

case SHADER_OPCODE_UNTYPED_SURFACE_WRITE_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_SURFACE_WRITE,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

case SHADER_OPCODE_UNTYPED_ATOMIC_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_ATOMIC,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_TYPED_SURFACE_READ,
 -fs_reg());
 +fs_reg(0x));
   break;

case SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_TYPED_SURFACE_WRITE,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

case SHADER_OPCODE_TYPED_ATOMIC_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_TYPED_ATOMIC,
 -   

Re: [Mesa-dev] [PATCH 04/12] i965/fs: Pass a BAD_FILE header source to LOAD_PAYLOAD in emit_texture_gen7().

2015-07-22 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Wed, Jul 22, 2015 at 12:43 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Sat, Jul 18, 2015 at 7:34 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 So that it's left uninitialized by LOAD_PAYLOAD, we only need to
 reserve space for it in the message since it will be initialized
 implicitly by the generator.
 ---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 522e13e..89fcc49 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -473,8 +473,9 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, 
 fs_reg dst,
 if (op == ir_tg4 || offset_value.file != BAD_FILE ||
 is_high_sampler(devinfo, sampler)) {
/* For general texture offsets (no txf workaround), we need a 
 header to
 -   * put them in.  Note that for SIMD16 we're making space for two 
 actual
 -   * hardware registers here, so the emit will have to fix up for 
 this.
 +   * put them in.  Note that we're only reserving space for it in the
 +   * message payload as it will be initialized implicitly by the
 +   * generator.

 Mind splitting this in two?  Fixing the comment above is really
 unrelated to passing BAD_FILE below.

 It seemed like the sentence I changed was trying to explain what the
 line of code I changed was doing.  It may sound unrelated because the
 comment was completely misleading (or just outdated), but it can only
 refer to the three lines of code below...

 It does refer to the three lines of code below, but the fact that it's
 bogus has nothing to do with the change below.  It's been wrong for
 probably 8 months and I've just never noticed.  Register allocation
 used to, in SIMD16 mode, work in terms of pairs of physical registers.
 In the texturing code, we needed a 1-register header so in SIMD16, we
 got 2 registers and then the generator knew to put the header in the
 second half.  About 8 months ago, I changed RA to work in terms of
 single hardware registers even on SIMD16 and this comment has been
 stale ever since.

Alright...  I'll drop the comment from this patch and change it as a
follow-up.


 --Jason

 *
 * * ir4_tg4 needs to place its channel select in the header,
 * for interaction with ARB_texture_swizzle
 @@ -483,7 +484,7 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, 
 fs_reg dst,
 * need to offset the Sampler State Pointer in the header.
 */
header_size = 1;
 -  sources[0] = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD);
 +  sources[0] = fs_reg();
length++;
 }

 --
 2.4.3



signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium/util: Stop bundling our snprintf implementation.

2015-07-22 Thread Brian Paul

On 07/22/2015 10:15 AM, Jose Fonseca wrote:

Use MSVCRT functions instead.  Their semantics are slightly
different but they can be made to work as expected.

Also, use the same code paths for both MSVCRT and MinGW.

No testing yet.  Just built.

https://bugs.freedesktop.org/show_bug.cgi?id=91418
---
  src/gallium/auxiliary/Makefile.sources  |1 -
  src/gallium/auxiliary/util/u_snprintf.c | 1480 ---
  src/gallium/auxiliary/util/u_string.h   |   33 +-
  3 files changed, 30 insertions(+), 1484 deletions(-)
  delete mode 100644 src/gallium/auxiliary/util/u_snprintf.c



Looks OK to me.

Reviewed-by: Brian Paul bri...@vmware.com

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: adjust error message when there's a missing teximage

2015-07-22 Thread Brian Paul

On 07/22/2015 11:02 AM, Ilia Mirkin wrote:

The current message makes it seem like the zoffset is invalid.

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---
  src/mesa/main/texgetimage.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index 59ec091..2f35ac6 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -1013,8 +1013,7 @@ dimensions_error_check(struct gl_context *ctx,
 texImage = select_tex_image(texObj, target, level, zoffset);
 if (!texImage) {
/* missing texture image */
-  _mesa_error(ctx, GL_INVALID_OPERATION,
-  %s(zoffset = %d), caller, zoffset);
+  _mesa_error(ctx, GL_INVALID_OPERATION, %s(missing image), caller);
return true;
 }




Reviewed-by: Brian Paul bri...@vmware.com

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91425] [regression, bisected] Piglit spec/ext_packed_float/ getteximage-invalid-format-for-packed-type fails

2015-07-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91425

--- Comment #4 from Brian Paul bri...@vmware.com ---
OK, the problem is my local copy of
getteximage-invalid-format-for-packed-type.c has the patch ext_packed_float:
fix getteximage-invalid-format-for-packed-type test which I posted for review
(and was R-b'd by Jose) but not yet pushed to master.

I believe the piglit test was defective as-is, per my check-in comment:

The GL spec doesn't explicitly say that glGetTexImage should generate
GL_INVALID_OPERATION when attempting to retrieve a non-existant texture
image, but that's what NVIDIA's driver does.

The purpose of this test is to check the format/type parameters, so let's
define a packed float texture to avoid the undefined texture situation.

Test now passes with NVIDIA.

I'll push that patch.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 07/20] i965/fs: Import image memory offset calculation code.

2015-07-22 Thread Jason Ekstrand
This needs a *lot* more commentary.  These calculations are extremely
tricky and there are almost no comments.  For instance, you are
turning a 2D offset on a tiled surface into a new 2D address into the
raw view of the surface.  Nowhere do you explain what the raw
surface looks like and how its width/height map to real tiled
version.  Somewhere, you need to write all this down in detail and
explain why your calculations are correct.  If you don't write it down
here, then write it down somewhere else and put a comment here
explaining where to go look.

I tried to dig through it, but I'm having a lot of trouble getting any
further splitting the x/y into major/minor.

Cc'ing chad.  He knows miptree layouts far better than I do.  Maybe
this is all obvious to someone familiar with the calculations.

--Jason

On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote:
 Define a function to calculate the memory address of the image
 location given by a vector of coordinates.  This is required in cases
 where we need to fall back to untyped surface access, which take a raw
 memory offset and know nothing about surface coordinates, type
 conversion or memory tiling and swizzling.  They are still useful
 because typed surface reads don't support any 64 or 128-bit formats on
 IVB, and they don't support any 128-bit formats on HSW and BDW.

 The tiling algorithm is implemented based on a number of parameters
 which are passed in as uniforms and determine whether the surface
 layout is X-tiled, Y-tiled or untiled.  This allows binding surfaces
 of different tiling layouts to the pipeline without recompiling the
 program.

 v2: Drop VEC4 suport.
 v3: Rebase.
 ---
  .../drivers/dri/i965/brw_fs_surface_builder.cpp| 108 
 +
  1 file changed, 108 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
 index 5ee04de..0c879db 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
 @@ -215,4 +215,112 @@ namespace {
   return BRW_PREDICATE_NORMAL;
}
 }
 +
 +   namespace image_coordinates {
 +  /**
 +   * Calculate the offset in memory of the texel given by \p coord.
 +   *
 +   * This is meant to be used with untyped surface messages to access a
 +   * tiled surface, what involves taking into account the tiling and
 +   * swizzling modes of the surface manually so it will hopefully not
 +   * happen very often.
 +   */
 +  fs_reg
 +  emit_address_calculation(const fs_builder bld, const fs_reg image,
 +   const fs_reg coord, unsigned dims)
 +  {
 + const brw_device_info *devinfo = bld.shader-devinfo;
 + const fs_reg off = offset(image, bld, 
 BRW_IMAGE_PARAM_OFFSET_OFFSET);
 + const fs_reg stride = offset(image, bld, 
 BRW_IMAGE_PARAM_STRIDE_OFFSET);
 + const fs_reg tile = offset(image, bld, 
 BRW_IMAGE_PARAM_TILING_OFFSET);
 + const fs_reg swz = offset(image, bld, 
 BRW_IMAGE_PARAM_SWIZZLING_OFFSET);
 + const fs_reg addr = bld.vgrf(BRW_REGISTER_TYPE_UD, 2);
 + const fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_UD, 2);
 + const fs_reg minor = bld.vgrf(BRW_REGISTER_TYPE_UD, 2);
 + const fs_reg major = bld.vgrf(BRW_REGISTER_TYPE_UD, 2);
 + const fs_reg dst = bld.vgrf(BRW_REGISTER_TYPE_UD);
 +
 + /* Shift the coordinates by the fixed surface offset. */
 + for (unsigned c = 0; c  2; ++c)
 +bld.ADD(offset(addr, bld, c), offset(off, bld, c),
 +(c  dims ?
 + offset(retype(coord, BRW_REGISTER_TYPE_UD), bld, c) :
 + fs_reg(0)));
 +
 + if (dims  2) {
 +/* Decompose z into a major (tmp.y) and a minor (tmp.x)
 + * index.
 + */
 +bld.BFE(offset(tmp, bld, 0), offset(tile, bld, 2), fs_reg(0),
 +offset(retype(coord, BRW_REGISTER_TYPE_UD), bld, 2));
 +bld.SHR(offset(tmp, bld, 1),
 +offset(retype(coord, BRW_REGISTER_TYPE_UD), bld, 2),
 +offset(tile, bld, 2));
 +
 +/* Take into account the horizontal (tmp.x) and vertical (tmp.y)
 + * slice offset.
 + */
 +for (unsigned c = 0; c  2; ++c) {
 +   bld.MUL(offset(tmp, bld, c),
 +   offset(stride, bld, 2 + c), offset(tmp, bld, c));
 +   bld.ADD(offset(addr, bld, c),
 +   offset(addr, bld, c), offset(tmp, bld, c));
 +}

Tiled surfaces are inherently 2-D, but it seems like you're treating
it as if it has 3-D tiles.  What's going on here?

 + }
 +
 + if (dims  1) {
 +for (unsigned c = 0; c  2; ++c) {
 +   /* Calculate the minor x and y indices. */
 +   

[Mesa-dev] [PATCH 1/2] mesa: adjust error message when there's a missing teximage

2015-07-22 Thread Ilia Mirkin
The current message makes it seem like the zoffset is invalid.

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---
 src/mesa/main/texgetimage.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index 59ec091..2f35ac6 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -1013,8 +1013,7 @@ dimensions_error_check(struct gl_context *ctx,
texImage = select_tex_image(texObj, target, level, zoffset);
if (!texImage) {
   /* missing texture image */
-  _mesa_error(ctx, GL_INVALID_OPERATION,
-  %s(zoffset = %d), caller, zoffset);
+  _mesa_error(ctx, GL_INVALID_OPERATION, %s(missing image), caller);
   return true;
}
 
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] mesa: rearrange texture error checking order

2015-07-22 Thread Ilia Mirkin
This moves the width/height/depth == 0 check to the front and avoids
doing any other checking when that is the case.

Also moves the dimensions check after the format/type checks so that we
don't bail out with success on a width/height/depth == 0 request when
the format/type don't match.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91425
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---
 src/mesa/main/texgetimage.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index 2f35ac6..cdbd618 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -928,6 +928,13 @@ dimensions_error_check(struct gl_context *ctx,
const struct gl_texture_image *texImage;
int i;
 
+   if (width == 0 || height == 0 || depth == 0) {
+  /* Not an error, but nothing to do.  Return 'true' so that the
+   * caller simply returns.
+   */
+  return true;
+   }
+
if (xoffset  0) {
   _mesa_error(ctx, GL_INVALID_VALUE, %s(xoffset = %d), caller, xoffset);
   return true;
@@ -1079,13 +1086,6 @@ dimensions_error_check(struct gl_context *ctx,
   }
}
 
-   if (width == 0 || height == 0 || depth == 0) {
-  /* Not an error, but nothing to do.  Return 'true' so that the
-   * caller simply returns.
-   */
-  return true;
-   }
-
return false;
 }
 
@@ -1164,18 +1164,18 @@ getteximage_error_check(struct gl_context *ctx,
   return true;
}
 
-   if (dimensions_error_check(ctx, texObj, target, level,
-  xoffset, yoffset, zoffset,
-  width, height, depth, caller)) {
-  return true;
-   }
-
err = _mesa_error_check_format_and_type(ctx, format, type);
if (err != GL_NO_ERROR) {
   _mesa_error(ctx, err, %s(format/type), caller);
   return true;
}
 
+   if (dimensions_error_check(ctx, texObj, target, level,
+  xoffset, yoffset, zoffset,
+  width, height, depth, caller)) {
+  return true;
+   }
+
if (pbo_error_check(ctx, target, width, height, depth,
format, type, bufSize, pixels, caller)) {
   return true;
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/20] i965/fs: Implement lowering of logical surface instructions.

2015-07-22 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 63 
 +++-
  1 file changed, 55 insertions(+), 8 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index a996676..0b0c5e1 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3881,6 +3881,20 @@ lower_sampler_logical_send(const fs_builder bld, 
 fs_inst *inst, opcode op)
 }
  }

 +/**
 + * Initialize the header present in some typed and untyped surface
 + * messages.
 + */
 +static fs_reg
 +emit_surface_header(const fs_builder bld, const fs_reg sample_mask)
 +{
 +   fs_builder ubld = bld.exec_all().group(8, 0);
 +   const fs_reg dst = ubld.vgrf(BRW_REGISTER_TYPE_UD);
 +   ubld.MOV(dst, fs_reg(0));
 +   ubld.MOV(component(dst, 7), sample_mask);
 +   return dst;
 +}
 +
  static void
  lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode op,
 const fs_reg sample_mask)
 @@ -3889,10 +3903,43 @@ lower_surface_logical_send(const fs_builder bld, 
 fs_inst *inst, opcode op,
 const fs_reg addr = inst-src[0];
 const fs_reg src = inst-src[1];
 const fs_reg surface = inst-src[2];
 -   const fs_reg dims = inst-src[3];
 +   const UNUSED fs_reg dims = inst-src[3];
 const fs_reg arg = inst-src[4];

 -   assert(!Not implemented);
 +   /* Calculate the total number of components of the payload. */
 +   const unsigned addr_sz = inst-components_read(0);
 +   const unsigned src_sz = inst-components_read(1);
 +   const unsigned header_sz = (sample_mask.file == BAD_FILE ? 0 : 1);
 +   const unsigned sz = header_sz + addr_sz + src_sz;
 +
 +   /* Allocate space for the payload. */
 +   fs_reg *const components = new fs_reg[sz];
 +   const fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, sz);
 +   unsigned n = 0;
 +
 +   /* Construct the payload. */
 +   if (header_sz)
 +  components[n++] = emit_surface_header(bld, sample_mask);
 +
 +   for (unsigned i = 0; i  addr_sz; i++)
 +  components[n++] = offset(addr, bld, i);
 +
 +   for (unsigned i = 0; i  src_sz; i++)
 +  components[n++] = offset(src, bld, i);
 +
 +   bld.LOAD_PAYLOAD(payload, components, sz, header_sz);
 +
 +   /* Update the original instruction. */
 +   inst-opcode = op;
 +   inst-mlen = header_sz + (addr_sz + src_sz) * inst-exec_size / 8;
 +   inst-header_size = header_sz;
 +
 +   inst-src[0] = payload;
 +   inst-src[1] = surface;
 +   inst-src[2] = arg;
 +   inst-resize_sources(3);
 +
 +   delete[] components;

 I don't think this is correct.  Unless something has changed, the
 fs_inst constructor that takes an array doesn't make a copy so you
 have to ralloc the array and *not* delete it.

It is, I fixed the fs_inst constructor some time ago to take the array
by value (e.g. so allocation on the stack and re-use of the same array
is possible for the caller).

  }

  bool
 @@ -3965,37 +4012,37 @@ fs_visitor::lower_logical_sends()
case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_SURFACE_READ,
 -fs_reg());
 +fs_reg(0x));
   break;

case SHADER_OPCODE_UNTYPED_SURFACE_WRITE_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_SURFACE_WRITE,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

case SHADER_OPCODE_UNTYPED_ATOMIC_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_ATOMIC,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_TYPED_SURFACE_READ,
 -fs_reg());
 +fs_reg(0x));
   break;

case SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_TYPED_SURFACE_WRITE,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

case SHADER_OPCODE_TYPED_ATOMIC_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_TYPED_ATOMIC,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

default:
 --
 2.4.3

 

Re: [Mesa-dev] [PATCH 03/20] i965/fs: Implement lowering of logical surface instructions.

2015-07-22 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Wed, Jul 22, 2015 at 10:02 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 63 
 +++-
  1 file changed, 55 insertions(+), 8 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index a996676..0b0c5e1 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3881,6 +3881,20 @@ lower_sampler_logical_send(const fs_builder bld, 
 fs_inst *inst, opcode op)
 }
  }

 +/**
 + * Initialize the header present in some typed and untyped surface
 + * messages.
 + */
 +static fs_reg
 +emit_surface_header(const fs_builder bld, const fs_reg sample_mask)
 +{
 +   fs_builder ubld = bld.exec_all().group(8, 0);
 +   const fs_reg dst = ubld.vgrf(BRW_REGISTER_TYPE_UD);
 +   ubld.MOV(dst, fs_reg(0));
 +   ubld.MOV(component(dst, 7), sample_mask);
 +   return dst;
 +}
 +
  static void
  lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode 
 op,
 const fs_reg sample_mask)
 @@ -3889,10 +3903,43 @@ lower_surface_logical_send(const fs_builder bld, 
 fs_inst *inst, opcode op,
 const fs_reg addr = inst-src[0];
 const fs_reg src = inst-src[1];
 const fs_reg surface = inst-src[2];
 -   const fs_reg dims = inst-src[3];
 +   const UNUSED fs_reg dims = inst-src[3];
 const fs_reg arg = inst-src[4];

 -   assert(!Not implemented);
 +   /* Calculate the total number of components of the payload. */
 +   const unsigned addr_sz = inst-components_read(0);
 +   const unsigned src_sz = inst-components_read(1);
 +   const unsigned header_sz = (sample_mask.file == BAD_FILE ? 0 : 1);
 +   const unsigned sz = header_sz + addr_sz + src_sz;
 +
 +   /* Allocate space for the payload. */
 +   fs_reg *const components = new fs_reg[sz];
 +   const fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, sz);
 +   unsigned n = 0;
 +
 +   /* Construct the payload. */
 +   if (header_sz)
 +  components[n++] = emit_surface_header(bld, sample_mask);
 +
 +   for (unsigned i = 0; i  addr_sz; i++)
 +  components[n++] = offset(addr, bld, i);
 +
 +   for (unsigned i = 0; i  src_sz; i++)
 +  components[n++] = offset(src, bld, i);
 +
 +   bld.LOAD_PAYLOAD(payload, components, sz, header_sz);
 +
 +   /* Update the original instruction. */
 +   inst-opcode = op;
 +   inst-mlen = header_sz + (addr_sz + src_sz) * inst-exec_size / 8;
 +   inst-header_size = header_sz;
 +
 +   inst-src[0] = payload;
 +   inst-src[1] = surface;
 +   inst-src[2] = arg;
 +   inst-resize_sources(3);
 +
 +   delete[] components;

 I don't think this is correct.  Unless something has changed, the
 fs_inst constructor that takes an array doesn't make a copy so you
 have to ralloc the array and *not* delete it.

 It is, I fixed the fs_inst constructor some time ago to take the array
 by value (e.g. so allocation on the stack and re-use of the same array
 is possible for the caller).

 Probably a good idea.  In that case, would you mind using a vla
 instead of new/delete?

I'd rather use new/delete (or the C++ standard library).  VLAs are a
non-standard feature only guaranteed to be available in C99, elsewhere
they are an extension with far from universal compiler support -- The
requirement was in fact again dropped from the C11 standard and the
functionally-similar C++ feature was also voted out by the committee
during the standardization of C++14.

I'd rather stick to the standard...

  }

  bool
 @@ -3965,37 +4012,37 @@ fs_visitor::lower_logical_sends()
case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_SURFACE_READ,
 -fs_reg());
 +fs_reg(0x));
   break;

case SHADER_OPCODE_UNTYPED_SURFACE_WRITE_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_SURFACE_WRITE,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

case SHADER_OPCODE_UNTYPED_ATOMIC_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_ATOMIC,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_TYPED_SURFACE_READ,
 -fs_reg());
 +fs_reg(0x));
   break;

case 

Re: [Mesa-dev] [PATCH 03/20] i965/fs: Implement lowering of logical surface instructions.

2015-07-22 Thread Jason Ekstrand
On Wed, Jul 22, 2015 at 10:16 AM, Francisco Jerez curroje...@riseup.net wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Wed, Jul 22, 2015 at 10:02 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 63 
 +++-
  1 file changed, 55 insertions(+), 8 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index a996676..0b0c5e1 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3881,6 +3881,20 @@ lower_sampler_logical_send(const fs_builder bld, 
 fs_inst *inst, opcode op)
 }
  }

 +/**
 + * Initialize the header present in some typed and untyped surface
 + * messages.
 + */
 +static fs_reg
 +emit_surface_header(const fs_builder bld, const fs_reg sample_mask)
 +{
 +   fs_builder ubld = bld.exec_all().group(8, 0);
 +   const fs_reg dst = ubld.vgrf(BRW_REGISTER_TYPE_UD);
 +   ubld.MOV(dst, fs_reg(0));
 +   ubld.MOV(component(dst, 7), sample_mask);
 +   return dst;
 +}
 +
  static void
  lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode 
 op,
 const fs_reg sample_mask)
 @@ -3889,10 +3903,43 @@ lower_surface_logical_send(const fs_builder bld, 
 fs_inst *inst, opcode op,
 const fs_reg addr = inst-src[0];
 const fs_reg src = inst-src[1];
 const fs_reg surface = inst-src[2];
 -   const fs_reg dims = inst-src[3];
 +   const UNUSED fs_reg dims = inst-src[3];
 const fs_reg arg = inst-src[4];

 -   assert(!Not implemented);
 +   /* Calculate the total number of components of the payload. */
 +   const unsigned addr_sz = inst-components_read(0);
 +   const unsigned src_sz = inst-components_read(1);
 +   const unsigned header_sz = (sample_mask.file == BAD_FILE ? 0 : 1);
 +   const unsigned sz = header_sz + addr_sz + src_sz;
 +
 +   /* Allocate space for the payload. */
 +   fs_reg *const components = new fs_reg[sz];
 +   const fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, sz);
 +   unsigned n = 0;
 +
 +   /* Construct the payload. */
 +   if (header_sz)
 +  components[n++] = emit_surface_header(bld, sample_mask);
 +
 +   for (unsigned i = 0; i  addr_sz; i++)
 +  components[n++] = offset(addr, bld, i);
 +
 +   for (unsigned i = 0; i  src_sz; i++)
 +  components[n++] = offset(src, bld, i);
 +
 +   bld.LOAD_PAYLOAD(payload, components, sz, header_sz);
 +
 +   /* Update the original instruction. */
 +   inst-opcode = op;
 +   inst-mlen = header_sz + (addr_sz + src_sz) * inst-exec_size / 8;
 +   inst-header_size = header_sz;
 +
 +   inst-src[0] = payload;
 +   inst-src[1] = surface;
 +   inst-src[2] = arg;
 +   inst-resize_sources(3);
 +
 +   delete[] components;

 I don't think this is correct.  Unless something has changed, the
 fs_inst constructor that takes an array doesn't make a copy so you
 have to ralloc the array and *not* delete it.

 It is, I fixed the fs_inst constructor some time ago to take the array
 by value (e.g. so allocation on the stack and re-use of the same array
 is possible for the caller).

 Probably a good idea.  In that case, would you mind using a vla
 instead of new/delete?

 I'd rather use new/delete (or the C++ standard library).  VLAs are a
 non-standard feature only guaranteed to be available in C99, elsewhere
 they are an extension with far from universal compiler support -- The
 requirement was in fact again dropped from the C11 standard and the
 functionally-similar C++ feature was also voted out by the committee
 during the standardization of C++14.

 I'd rather stick to the standard...

I can live with that.

  }

  bool
 @@ -3965,37 +4012,37 @@ fs_visitor::lower_logical_sends()
case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_SURFACE_READ,
 -fs_reg());
 +fs_reg(0x));
   break;

case SHADER_OPCODE_UNTYPED_SURFACE_WRITE_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_SURFACE_WRITE,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

case SHADER_OPCODE_UNTYPED_ATOMIC_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_UNTYPED_ATOMIC,
 -fs_reg());
 +ibld.sample_mask_reg());
   break;

case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL:
   lower_surface_logical_send(ibld, inst,
  SHADER_OPCODE_TYPED_SURFACE_READ,
 -fs_reg());
 

Re: [Mesa-dev] [PATCH 2/2] mesa: rearrange texture error checking order

2015-07-22 Thread Brian Paul

On 07/22/2015 11:29 AM, Ilia Mirkin wrote:

On Wed, Jul 22, 2015 at 1:20 PM, Brian Paul bri...@vmware.com wrote:

On 07/22/2015 11:02 AM, Ilia Mirkin wrote:


This moves the width/height/depth == 0 check to the front and avoids
doing any other checking when that is the case.

Also moves the dimensions check after the format/type checks so that we
don't bail out with success on a width/height/depth == 0 request when
the format/type don't match.

Bugzilla:
https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.freedesktop.org_show-5Fbug.cgi-3Fid-3D91425d=BQIBAgc=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEsr=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8m=Y1TotaJmX1E7sCopvmMJc8PQKFXlHu6QxfcYOydKuIIs=L0lmmuk31G6M4hZEiwnX-IC4CPhW7rsrSCgEYtDgXjwe=
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu



I suspect this isn't really needed in light of my patch to
getteximage-invalid-format-for-packed-type.c

I believe the test was in error, or at least sloppy, in that it was calling
glGetTexImage to test format/type validation when there wasn't even a
texture image to return.  The OpenGL spec doesn't specify an order for error
checking multiple things, so if there's multiple errors, you can't be sure
which one will be reported first.  My patch to the test removes that
ambiguity.


Well, irrespective of the piglit test, the current code is a little
suspect -- it does the width/height/depth == 0 checks after it ensures
that height == 1 for 1d and depth == 1 for 2d. Any thoughts on that?


Yeah, that makes sense.  I guess we don't have any tests that exercise 
getting a 0 x 0 texture image.


Reviewed-by: Brian Paul bri...@vmware.com


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 06/14] meta: Don't do fragment color clamping in case of ReadPixels

2015-07-22 Thread Anuj Phogat
On Tue, Jul 21, 2015 at 1:18 AM, Iago Toral ito...@igalia.com wrote:
 On Tue, 2015-07-21 at 09:24 +0200, Iago Toral wrote:
 On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote:
  Without this patch, piglit test arb_color_buffer_float-readpixels
  fails, when forced to use the meta pbo path.
 
  Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
  Cc: mesa-sta...@lists.freedesktop.org
  ---
   src/mesa/drivers/common/meta_tex_subimage.c | 5 +
   1 file changed, 5 insertions(+)
 
  diff --git a/src/mesa/drivers/common/meta_tex_subimage.c 
  b/src/mesa/drivers/common/meta_tex_subimage.c
  index 84cbc50..ccb7dfb 100644
  --- a/src/mesa/drivers/common/meta_tex_subimage.c
  +++ b/src/mesa/drivers/common/meta_tex_subimage.c
  @@ -25,6 +25,7 @@
*Jason Ekstrand jason.ekstr...@intel.com
*/
 
  +#include blend.h
   #include bufferobj.h
   #include buffers.h
   #include fbobject.h
  @@ -305,6 +306,10 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
  GLuint dims,
  _mesa_meta_begin(ctx, ~(MESA_META_PIXEL_TRANSFER |
  MESA_META_PIXEL_STORE));
 
  +   /* GL_CLAMP_FRAGMENT_COLOR doesn't affect ReadPixels. */
  +   if (!tex_image  ctx-Extensions.ARB_color_buffer_float)
  +  _mesa_ClampColor(GL_CLAMP_FRAGMENT_COLOR, GL_FALSE);
  +

 I think you need to restore the original value of
 GL_CLAMP_FRAGMENT_COLOR after you are done with ReadPixels,  otherwise
 it looks like you would be changing the setting for future API calls
 too.

 Oh, I just noticed that this setting is already being considered in the
 state flags passed to _mesa_meta_begin, so ignore that comment.

 Also, does this only apply to ReadPixels and not to GetTexImage? That
 looks odd to me.
You're right. This should apply to GetTexImage too. I'll send out v2 and
also add a piglit test for GetTexImage.


 Iago

  _mesa_GenFramebuffers(2, fbos);
 
  if (tex_image  tex_image-TexObject-Target == GL_TEXTURE_1D_ARRAY) {


 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 09/20] i965/fs: Import image format conversion primitives.

2015-07-22 Thread Jason Ekstrand
On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote:
 Define bitfield packing, unpacking and type conversion operations in
 terms of which the image format conversion code will be implemented.
 These don't directly know about image formats: The packing and
 unpacking functions take a 4-tuple of bit shifts and a 4-tuple of bit
 widths as arguments, determining the bitfield position of each
 component.  Most of the remaining functions perform integer, fixed
 point normalized, and floating point type conversions, mapping between
 a target type with per-component bit widths given by a parameter and a
 matching native representation of the same type.

 v2: Drop VEC4 suport.
 v3: Rebase.
 ---
  .../drivers/dri/i965/brw_fs_surface_builder.cpp| 263 
 +
  1 file changed, 263 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
 index 0c879db..ea1c4aa 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
 @@ -323,4 +323,267 @@ namespace {
   return dst;
}
 }
 +
 +   namespace image_format_conversion {
 +  using image_format_info::color_u;
 +
 +  namespace {
 + /**
 +  * Maximum representable value in an unsigned integer with the given
 +  * number of bits.
 +  */
 + inline unsigned
 + scale(unsigned n)
 + {
 +return (1  n) - 1;
 + }
 +  }
 +
 +  /**
 +   * Pack the vector \p src in a bitfield given the per-component bit
 +   * shifts and widths.

You should comment in here that it assumes that either everything fits
in 32 bits or that it is homogeneous with a power-of-two width.  More
specifically, it only works if no component spans a 32-bit boundary.

 +   */
 +  fs_reg
 +  emit_pack(const fs_builder bld, const fs_reg src,
 +const color_u shifts, const color_u widths)
 +  {
 + const fs_reg dst = bld.vgrf(BRW_REGISTER_TYPE_UD, 4);
 + bool seen[4] = {};
 +
 + for (unsigned c = 0; c  4; ++c) {
 +if (widths[c]) {
 +   const fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_UD);
 +
 +   /* Shift each component left to the correct bitfield 
 position. */
 +   bld.SHL(tmp, offset(src, bld, c), fs_reg(shifts[c] % 32));
 +
 +   /* Add everything up. */
 +   if (seen[shifts[c] / 32]) {
 +  bld.OR(offset(dst, bld, shifts[c] / 32),
 + offset(dst, bld, shifts[c] / 32), tmp);
 +   } else {
 +  bld.MOV(offset(dst, bld, shifts[c] / 32), tmp);
 +  seen[shifts[c] / 32] = true;
 +   }
 +}
 + }
 +
 + return dst;
 +  }
 +
 +  /**
 +   * Unpack a vector from the bitfield \p src given the per-component bit
 +   * shifts and widths.

Same comment here.

 +   */
 +  fs_reg
 +  emit_unpack(const fs_builder bld, const fs_reg src,
 +  const color_u shifts, const color_u widths)
 +  {
 + const fs_reg dst = bld.vgrf(src.type, 4);
 +
 + for (unsigned c = 0; c  4; ++c) {
 +if (widths[c]) {
 +   /* Shift left to discard the most significant bits. */
 +   bld.SHL(offset(dst, bld, c),
 +   offset(src, bld, shifts[c] / 32),
 +   fs_reg(32 - shifts[c] % 32 - widths[c]));
 +
 +   /* Shift back to the least significant bits using an 
 arithmetic
 +* shift to get sign extension on signed types.
 +*/
 +   bld.ASR(offset(dst, bld, c),
 +   offset(dst, bld, c), fs_reg(32 - widths[c]));
 +}
 + }
 +
 + return dst;
 +  }
 +
 +  /**
 +   * Convert a vector into an integer vector of the specified signedness
 +   * and bit widths, properly handling overflow.
 +   */
 +  fs_reg
 +  emit_convert_to_integer(const fs_builder bld, const fs_reg src,
 +  const color_u widths, bool is_signed)
 +  {
 + const unsigned s = (is_signed ? 1 : 0);
 + const fs_reg dst = bld.vgrf(
 +is_signed ? BRW_REGISTER_TYPE_D : BRW_REGISTER_TYPE_UD, 4);
 +
 + for (unsigned c = 0; c  4; ++c) {
 +if (widths[c]) {
 +   bld.MOV(offset(dst, bld, c), offset(src, bld, c));
 +
 +   /* Clamp to the minimum value. */
 +   if (is_signed)
 +  bld.emit_minmax(offset(dst, bld, c), offset(dst, bld, c),
 +  fs_reg(-(int)scale(widths[c] - s) - 1),
 +  BRW_CONDITIONAL_G);

If it isn't signed, shouldn't you still do a min/max with zero?  Also,
I think you want to do the minmax while its 

Re: [Mesa-dev] [PATCH] mesa: initialize variables to silence compiler warnings

2015-07-22 Thread Anuj Phogat
On Tue, Jul 21, 2015 at 5:42 PM, Brian Paul bri...@vmware.com wrote:
 ---
  src/mesa/main/fbobject.c  | 4 ++--
  src/mesa/main/shaderapi.c | 2 +-
  2 files changed, 3 insertions(+), 3 deletions(-)

 diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
 index f46554b..cc342c2 100644
 --- a/src/mesa/main/fbobject.c
 +++ b/src/mesa/main/fbobject.c
 @@ -3304,7 +3304,7 @@ _mesa_FramebufferTexture(GLenum target, GLenum 
 attachment,
 GET_CURRENT_CONTEXT(ctx);
 struct gl_framebuffer *fb;
 struct gl_texture_object *texObj;
 -   GLboolean layered;
 +   GLboolean layered = GL_FALSE;

 const char *func = FramebufferTexture;

 @@ -3347,7 +3347,7 @@ _mesa_NamedFramebufferTexture(GLuint framebuffer, 
 GLenum attachment,
 GET_CURRENT_CONTEXT(ctx);
 struct gl_framebuffer *fb;
 struct gl_texture_object *texObj;
 -   GLboolean layered;
 +   GLboolean layered = GL_FALSE;

 const char *func = glNamedFramebufferTexture;

 diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
 index 3365c7a..376874e 100644
 --- a/src/mesa/main/shaderapi.c
 +++ b/src/mesa/main/shaderapi.c
 @@ -1331,7 +1331,7 @@ void GLAPIENTRY
  _mesa_GetObjectParameterfvARB(GLhandleARB object, GLenum pname,
GLfloat *params)
  {
 -   GLint iparams[1];  /* XXX is one element enough? */
 +   GLint iparams[1] = {0};  /* XXX is one element enough? */
 _mesa_GetObjectParameterivARB(object, pname, iparams);
 params[0] = (GLfloat) iparams[0];
  }
 --
 1.9.1

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reviewed-by: Anuj Phogat anuj.pho...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glx: Fix build warnings for PURE being redefined.

2015-07-22 Thread Eric Anholt
We can't completely drop this #define because it's used in the xserver
generated code.  Until someone gets around to putting xml generation
directly in the server, we're stuck #defining it for them.
---
 src/mapi/glapi/gen/gl_XML.py | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/mapi/glapi/gen/gl_XML.py b/src/mapi/glapi/gen/gl_XML.py
index 67aba81..f19570b 100644
--- a/src/mapi/glapi/gen/gl_XML.py
+++ b/src/mapi/glapi/gen/gl_XML.py
@@ -183,10 +183,12 @@ class gl_print_base(object):
 The name is also added to the file's undef_list.
 
 self.undef_list.append(PURE)
-print #  if defined(__GNUC__) || (defined(__SUNPRO_C)  
(__SUNPRO_C = 0x590))
-#define PURE __attribute__((pure))
-#  else
-#define PURE
+print #  if !defined(PURE)
+#if defined(__GNUC__) || (defined(__SUNPRO_C)  (__SUNPRO_C = 0x590))
+#  define PURE __attribute__((pure))
+#else
+#  define PURE
+#endif
 #  endif
 return
 
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/7] nouveau: implement the nvif hardware performance counters interface

2015-07-22 Thread Martin Peres



On 01/07/15 01:01, Samuel Pitoiset wrote:

This commit implements the base interface for hardware performance
counters that will be shared between nv50 and nvc0 drivers.

TODO: Bump libdrm version of mesa when nvif will be merged.

Changes since v2:
- remove double-query thing for domains, signals and sources

Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com
---
  src/gallium/drivers/nouveau/Makefile.sources  |   2 +
  src/gallium/drivers/nouveau/nouveau_perfmon.c | 290 ++
  src/gallium/drivers/nouveau/nouveau_perfmon.h |  58 ++
  src/gallium/drivers/nouveau/nouveau_screen.c  |   5 +
  src/gallium/drivers/nouveau/nouveau_screen.h  |   1 +
  5 files changed, 356 insertions(+)
  create mode 100644 src/gallium/drivers/nouveau/nouveau_perfmon.c
  create mode 100644 src/gallium/drivers/nouveau/nouveau_perfmon.h

diff --git a/src/gallium/drivers/nouveau/Makefile.sources 
b/src/gallium/drivers/nouveau/Makefile.sources
index 3fae3bc..3da0bdc 100644
--- a/src/gallium/drivers/nouveau/Makefile.sources
+++ b/src/gallium/drivers/nouveau/Makefile.sources
@@ -10,6 +10,8 @@ C_SOURCES := \
nouveau_heap.h \
nouveau_mm.c \
nouveau_mm.h \
+   nouveau_perfmon.c \
+   nouveau_perfmon.h \
nouveau_screen.c \
nouveau_screen.h \
nouveau_statebuf.h \
diff --git a/src/gallium/drivers/nouveau/nouveau_perfmon.c 
b/src/gallium/drivers/nouveau/nouveau_perfmon.c
new file mode 100644
index 000..e1d4546
--- /dev/null
+++ b/src/gallium/drivers/nouveau/nouveau_perfmon.c
@@ -0,0 +1,290 @@
+/*
+ * Copyright 2015 Samuel Pitoiset
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the Software),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include errno.h
+
+#include util/u_memory.h
+
+#include nouveau_debug.h
+#include nouveau_winsys.h
+#include nouveau_perfmon.h
+
+static int
+nouveau_perfmon_query_sources(struct nouveau_perfmon *pm,
+  struct nouveau_perfmon_dom *dom,
+  struct nouveau_perfmon_sig *sig)
+{
+   struct nvif_perfmon_query_source_v0 args = {};
+
+   args.iter   = 1;


Why start iterating from 1 and not 0?


+   args.domain = dom-id;
+   args.signal = sig-signal;
+   do {
+  struct nouveau_perfmon_src *src;
+  int ret;
+
+  ret = nouveau_object_mthd(pm-object, NVIF_PERFMON_V0_QUERY_SOURCE,
+args, sizeof(args));
+  if (ret)
+ return ret;


 You do not check what happens if you do not expose any source for this 
signal. A test on args.iter != 0x with a return if not the case 
would be nice!



+
+  src = CALLOC_STRUCT(nouveau_perfmon_src);
+  if (!src)
+ return -ENOMEM;
+
+#if 0
+  debug_printf(id   = %d\n, args.source);
+  debug_printf(name = %s\n, args.name);
+  debug_printf(mask = %08x\n, args.mask);
+  debug_printf(\n);
+#endif
+
+  src-id = args.source;
+  strncpy(src-name, args.name, sizeof(src-name));
+  list_addtail(src-head, sig-sources);
+   } while (args.iter != 0xff);
+
+   return 0;
+}
+
+static int
+nouveau_perfmon_query_signals(struct nouveau_perfmon *pm,
+  struct nouveau_perfmon_dom *dom)
+{
+   struct nvif_perfmon_query_signal_v0 args = {};
+
+   args.iter   = 1;
+   args.domain = dom-id;
+   do {
+  struct nouveau_perfmon_sig *sig;
+  int ret;
+
+  ret = nouveau_object_mthd(pm-object, NVIF_PERFMON_V0_QUERY_SIGNAL,
+args, sizeof(args));
+  if (ret)
+ return ret;


You do not check what happens if you do not expose any source for this 
signal. A test on args.iter != 0x with a return if not the case 
would be nice!



+
+  sig = CALLOC_STRUCT(nouveau_perfmon_sig);
+  if (!sig)
+ return -ENOMEM;
+  list_inithead(sig-sources);
+
+#if 0
+  debug_printf(name  = %s\n, args.name);
+  debug_printf(signal= 0x%02x\n, args.signal);
+  debug_printf(source_nr = %d\n, 

Re: [Mesa-dev] [PATCH v4 3/6] mesa/es3.1: enable GL_ARB_texture_multisample for GLES 3.1

2015-07-22 Thread Ian Romanick
On 06/26/2015 07:38 AM, Ilia Mirkin wrote:
 On Fri, Jun 26, 2015 at 4:18 AM, Tapani Pälli tapani.pa...@intel.com wrote:


 On 06/26/2015 01:06 AM, Ilia Mirkin wrote:

 On Thu, Jun 25, 2015 at 4:22 PM, Ilia Mirkin imir...@alum.mit.edu wrote:

 On Thu, Jun 25, 2015 at 5:08 AM, Marta Lofstedt
 marta.lofst...@linux.intel.com wrote:

 From: Marta Lofstedt marta.lofst...@intel.com

 v4 : only expose GL_ARB_texture_multisample enums
 for gles 3.1 and desktop GL.


 I was suspicious of this logic. Based on my reading of the code, what
 your ARB_texture_multisample_es31 thing does is expose those enums
 when *either* the driver enables ARB_texture_multisample *or* the
 current context is ES3.1.

 ARB_draw_indirect_es31 has the same problem, btw.


 I'm not sure I understand the issue. It should work fine with 'OR', if we
 have either of these then we expose the enum?

 I could have misread the get.c extra_ext() logic, but I don't think I
 have. As far as I can tell there's no way to (generically) AND these
 things.


 Why do we need AND? If you have 3.1 context then you *need* to support this
 enum. There is no such condition where you would have gles 3.1 context but
 not the enum, right?


 What you really need to do is create a whole new [GL, GL_CORE, ES31]
 section in get_hash_params and update get_hash_generator.py
 accordingly.


 An alternative, BTW, is to just say screw it and expose the enums in
 GL ES 3.0. In that case, just move the enums into a section that
 includes GLES3. I'm not sure how big of a deal it is. But your current
 patches are definitely wrong.


 We do this, but we have explicit version check against version 3.1.
 
 OK, well, at the end of the day, the things anyone says don't matter,
 it just has to work, right? So here are the scenarios:
 
 Let's take it as a given that the driver sets
 Ext.ARB_texture_multisample to true, we have the following situations:
 
 GL compat glGet(SAMPLE_MASK)
 GL core glGet(SAMPLE_MASK)
 GL ES3.0 glGet(SAMPLE_MASK)
 GL ES3.1 glGet(SAMPLE_MASK)
 
 Where the ES3.0 one should fail, and all the others should succeed.
 OK, so first off, this is in the GL/GL_CORE/GLES3 section, which means
 that the logic will get hit for all 4, right? Then you do a check for

Yes, I think you're right, and I think I led Marta and Tapani astray.  I
think they originally had separate sections, and I told them to change
at least some of them.  The whole separate section mechanism is
confusing and leads to code churn (imagine if an ES3.0 extension comes
along that enables this functionality).

I think a good Python project for someone would be to reorganize that
file so that all the enums are in a single list.  Each enum would have a
list of minterms to describe its availability.  These would include
profile and version.  Instead of having a flat array of integers, we'd
have an array of structures:

struct requirement {
gl_api  API;
unsigned Version;   /** Minimum version required. */
unsigned extension[2];  /** Offsets of required extensions.  Use
 * dummy_true if none.
 */
enum value_extra extra; /** Other requirements. */
};

This is equivalent to the current system (which is also a maximum of
minterms), but I think it's a lot easier to specify things.

Either way, I think that can be follow-up work.

 Ext.ARB_texture_multisample || ES3.1
 
 and if that doesn't get set to true, then you return INVALID_ENUM,
 right? So tell me, given that the driver *always* sets
 Ext.ARB_texture_multisample to true, how can that ever be false? The
 ES3.1 thing could be useful for drivers that don't actually support
 ARB_texture_multisample and you want to fake it, but in practice all
 the relevant drivers support it. But none of that cause a
 GL_INVALID_ENUM to get returned in ES3.0 as it was before the patch.
 
 Hope that clears things up.
 
   -ilia
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 06/14] meta: Don't do fragment color clamping in _mesa_meta_pbo_GetTexSubImage

2015-07-22 Thread Anuj Phogat
_mesa_meta_pbo_GetTexSubImage() uses _mesa_meta_BlitFrameBuffer(),
which will do fragment clamping if enabled. But fragment clamping
doesn't affect ReadPixels and GetTexImage.

Without this patch, piglit test arb_color_buffer_float-clear fails,
when forced to use the meta pbo path.

v2: Apply this fix to both glReadPixels and glGetTexImage.

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
---
 src/mesa/drivers/common/meta_tex_subimage.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/common/meta_tex_subimage.c 
b/src/mesa/drivers/common/meta_tex_subimage.c
index 6d7c1eb..038d525 100644
--- a/src/mesa/drivers/common/meta_tex_subimage.c
+++ b/src/mesa/drivers/common/meta_tex_subimage.c
@@ -25,6 +25,7 @@
  *Jason Ekstrand jason.ekstr...@intel.com
  */
 
+#include blend.h
 #include bufferobj.h
 #include buffers.h
 #include fbobject.h
@@ -331,6 +332,10 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
GLuint dims,
_mesa_meta_begin(ctx, ~(MESA_META_PIXEL_TRANSFER |
MESA_META_PIXEL_STORE));
 
+   /* GL_CLAMP_FRAGMENT_COLOR doesn't affect ReadPixels and GettexImage */
+   if (ctx-Extensions.ARB_color_buffer_float)
+  _mesa_ClampColor(GL_CLAMP_FRAGMENT_COLOR, GL_FALSE);
+
_mesa_GenFramebuffers(2, fbos);
 
if (tex_image  tex_image-TexObject-Target == GL_TEXTURE_1D_ARRAY) {
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 3/6] mesa/es3.1: enable GL_ARB_texture_multisample for GLES 3.1

2015-07-22 Thread Ilia Mirkin
On Wed, Jul 22, 2015 at 2:55 PM, Ian Romanick i...@freedesktop.org wrote:
 On 06/26/2015 07:38 AM, Ilia Mirkin wrote:
 On Fri, Jun 26, 2015 at 4:18 AM, Tapani Pälli tapani.pa...@intel.com wrote:


 On 06/26/2015 01:06 AM, Ilia Mirkin wrote:

 On Thu, Jun 25, 2015 at 4:22 PM, Ilia Mirkin imir...@alum.mit.edu wrote:

 On Thu, Jun 25, 2015 at 5:08 AM, Marta Lofstedt
 marta.lofst...@linux.intel.com wrote:

 From: Marta Lofstedt marta.lofst...@intel.com

 v4 : only expose GL_ARB_texture_multisample enums
 for gles 3.1 and desktop GL.


 I was suspicious of this logic. Based on my reading of the code, what
 your ARB_texture_multisample_es31 thing does is expose those enums
 when *either* the driver enables ARB_texture_multisample *or* the
 current context is ES3.1.

 ARB_draw_indirect_es31 has the same problem, btw.


 I'm not sure I understand the issue. It should work fine with 'OR', if we
 have either of these then we expose the enum?

 I could have misread the get.c extra_ext() logic, but I don't think I
 have. As far as I can tell there's no way to (generically) AND these
 things.


 Why do we need AND? If you have 3.1 context then you *need* to support this
 enum. There is no such condition where you would have gles 3.1 context but
 not the enum, right?


 What you really need to do is create a whole new [GL, GL_CORE, ES31]
 section in get_hash_params and update get_hash_generator.py
 accordingly.


 An alternative, BTW, is to just say screw it and expose the enums in
 GL ES 3.0. In that case, just move the enums into a section that
 includes GLES3. I'm not sure how big of a deal it is. But your current
 patches are definitely wrong.


 We do this, but we have explicit version check against version 3.1.

 OK, well, at the end of the day, the things anyone says don't matter,
 it just has to work, right? So here are the scenarios:

 Let's take it as a given that the driver sets
 Ext.ARB_texture_multisample to true, we have the following situations:

 GL compat glGet(SAMPLE_MASK)
 GL core glGet(SAMPLE_MASK)
 GL ES3.0 glGet(SAMPLE_MASK)
 GL ES3.1 glGet(SAMPLE_MASK)

 Where the ES3.0 one should fail, and all the others should succeed.
 OK, so first off, this is in the GL/GL_CORE/GLES3 section, which means
 that the logic will get hit for all 4, right? Then you do a check for

 Yes, I think you're right, and I think I led Marta and Tapani astray.  I
 think they originally had separate sections, and I told them to change
 at least some of them.  The whole separate section mechanism is
 confusing and leads to code churn (imagine if an ES3.0 extension comes
 along that enables this functionality).

 I think a good Python project for someone would be to reorganize that
 file so that all the enums are in a single list.  Each enum would have a
 list of minterms to describe its availability.  These would include
 profile and version.  Instead of having a flat array of integers, we'd
 have an array of structures:

 struct requirement {
 gl_api  API;
 unsigned Version;   /** Minimum version required. */
 unsigned extension[2];  /** Offsets of required extensions.  Use
  * dummy_true if none.
  */
 enum value_extra extra; /** Other requirements. */
 };

 This is equivalent to the current system (which is also a maximum of
 minterms), but I think it's a lot easier to specify things.

 Either way, I think that can be follow-up work.

We currently have varying levels of correctness wrt exposing features
we're not supposed to expose. I haven't thought about your new
dispatch stuff, but would, e.g., glPolygonOffsetClampEXT be
retrievable in an ES context right now? If it would be, do we care?

I feel like a bit of a troll when I'm pointing out these
inconsistencies since they're fairly trivial, but I also see that some
thought and effort has gone into hiding various functionality where
it's not supposed to exist.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: Detect and provide macros for function attributes pure and const.

2015-07-22 Thread Jose Fonseca

On 22/07/15 17:13, Jose Fonseca wrote:

On 21/07/15 15:57, Emil Velikov wrote:

On 18 July 2015 at 08:13, Jose Fonseca jfons...@vmware.com wrote:

On 18/07/15 01:38, Eric Anholt wrote:


Emil Velikov emil.l.veli...@gmail.com writes:


On 14/07/15 19:45, Eric Anholt wrote:


These are really useful hints to the compiler in the absence of
link-time
optimization, and I'm going to use them in VC4.

I've made the const attribute be ATTRIBUTE_CONST unlike other
function
attributes, because we have other things in the tree #defining
CONST for
their own unrelated purposes.


Mindly related: how people feel about making these macros less
screamy,
by following the approach used in the kernel: PURE - __pure and so
on ?



I'd love it.



Less screamy is fine, but beware prefixing double underscore: the C
standard
stipulates that its use is reserved for for C/C++ runtime. [1]


I though about it before posting although I've seen others define
those, even do so in their public headers.
Now that I have some examples from my current /usr/include

Searching for __pure
dwarves/dutil.h:#define __pure __attribute__ ((pure))

Searching for __attribute_const__
sys/cdefs.h:# define __attribute_const__ __attribute__ ((__const__))
sys/cdefs.h:# define __attribute_const__ /* Ignore */

Searching for __printf

Searching for __always_unused

Searching for __noreturn

Searching for __packed
libvisual-0.4/libvisual/lv_defines.h:# define __packed __attribute__
((packed))
libvisual-0.4/libvisual/lv_defines.h:# define __packed /* no packed */
bsd/sys/cdefs.h:#  define __packed __attribute__((__packed__))
bsd/sys/cdefs.h:#  define __packed

Searching for __deprecated
pciaccess.h:#define __deprecated __attribute__((deprecated))
pciaccess.h:#define __deprecated

Searching for __weak

Searching for __alias

With a handful of other headers defining more double underscore
prefixed macros.


Look at stdlibc++ implementation: every internal variable has a double
underscore prefix.


Unless we're talking about STL/other template library we don't care
what library foo uses in it's internal implementation do we ? After
all these will be resolved at compile time.


Maybe kernel gets away on GLIBC (and because it doesn't use C++), but
there's no guarantee it will work on other C runtimes, and even if it
does,
it could start failing anytime.


True, it's not the best of ideas. Just worth pointing out that the
cat is already out, for other projects.



   There are already more than 12K #define __foo cases on my system.

These defines are reserved for system headers, so it's natural to be
lots of them in /usr/include.


MacOSX also defines some of these on its sys/cdefs.h:

   http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/bsd/sys/cdefs.h

The question is: can we expect that most systems will define these
__foo, or at least not use them for other purposes.

I don't know the answer.  At a glance MSVC doesn't seem to rely on them
for anything.  So it might work.  I don't oppose if you want to give it
a shot.


Jose


Ironically, Windows headers already define PURE (its used in COM 
interfaces).  Just realized this now after this was merged.


So we really need a different name...

Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] mesa: rearrange texture error checking order

2015-07-22 Thread Ilia Mirkin
On Wed, Jul 22, 2015 at 1:20 PM, Brian Paul bri...@vmware.com wrote:
 On 07/22/2015 11:02 AM, Ilia Mirkin wrote:

 This moves the width/height/depth == 0 check to the front and avoids
 doing any other checking when that is the case.

 Also moves the dimensions check after the format/type checks so that we
 don't bail out with success on a width/height/depth == 0 request when
 the format/type don't match.

 Bugzilla:
 https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.freedesktop.org_show-5Fbug.cgi-3Fid-3D91425d=BQIBAgc=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEsr=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8m=Y1TotaJmX1E7sCopvmMJc8PQKFXlHu6QxfcYOydKuIIs=L0lmmuk31G6M4hZEiwnX-IC4CPhW7rsrSCgEYtDgXjwe=
 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu


 I suspect this isn't really needed in light of my patch to
 getteximage-invalid-format-for-packed-type.c

 I believe the test was in error, or at least sloppy, in that it was calling
 glGetTexImage to test format/type validation when there wasn't even a
 texture image to return.  The OpenGL spec doesn't specify an order for error
 checking multiple things, so if there's multiple errors, you can't be sure
 which one will be reported first.  My patch to the test removes that
 ambiguity.

Well, irrespective of the piglit test, the current code is a little
suspect -- it does the width/height/depth == 0 checks after it ensures
that height == 1 for 1d and depth == 1 for 2d. Any thoughts on that?

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeon: Silence GCC unused-but-set-variable warnings.

2015-07-22 Thread Marek Olšák
Reviewed-by: Marek Olšák marek.ol...@amd.com

Marek

On Wed, Jul 22, 2015 at 7:04 AM, Vinson Lee v...@freedesktop.org wrote:
 radeon_fbo.c: In function 'radeon_map_renderbuffer_s8z24':
 radeon_fbo.c:162:9: warning: variable 'ret' set but not used 
 [-Wunused-but-set-variable]
  int ret;
  ^
 radeon_fbo.c: In function 'radeon_map_renderbuffer_z16':
 radeon_fbo.c:200:9: warning: variable 'ret' set but not used 
 [-Wunused-but-set-variable]
  int ret;
  ^
 radeon_fbo.c: In function 'radeon_map_renderbuffer':
 radeon_fbo.c:242:8: warning: variable 'ret' set but not used 
 [-Wunused-but-set-variable]
 int ret;
 ^
 radeon_fbo.c: In function 'radeon_unmap_renderbuffer':
 radeon_fbo.c:419:14: warning: variable 'ok' set but not used 
 [-Wunused-but-set-variable]
 GLboolean ok;
   ^

 Signed-off-by: Vinson Lee v...@freedesktop.org
 ---
  src/mesa/drivers/dri/radeon/radeon_fbo.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

 diff --git a/src/mesa/drivers/dri/radeon/radeon_fbo.c 
 b/src/mesa/drivers/dri/radeon/radeon_fbo.c
 index 5e4aaca..5eece51 100644
 --- a/src/mesa/drivers/dri/radeon/radeon_fbo.c
 +++ b/src/mesa/drivers/dri/radeon/radeon_fbo.c
 @@ -169,6 +169,7 @@ radeon_map_renderbuffer_s8z24(struct gl_context *ctx,
  rrb-map_buffer = malloc(w * h * 4);
  ret = radeon_bo_map(rrb-bo, !!(mode  GL_MAP_WRITE_BIT));
  assert(!ret);
 +(void) ret;
  untiled_s8z24_map = rrb-map_buffer;
  tiled_s8z24_map = rrb-bo-ptr;

 @@ -207,6 +208,7 @@ radeon_map_renderbuffer_z16(struct gl_context *ctx,
  rrb-map_buffer = malloc(w * h * 2);
  ret = radeon_bo_map(rrb-bo, !!(mode  GL_MAP_WRITE_BIT));
  assert(!ret);
 +(void) ret;

  untiled_z16_map = rrb-map_buffer;
  tiled_z16_map = rrb-bo-ptr;
 @@ -324,6 +326,7 @@ radeon_map_renderbuffer(struct gl_context *ctx,

 ret = radeon_bo_map(rrb-bo, !!(mode  GL_MAP_WRITE_BIT));
 assert(!ret);
 +   (void) ret;

 map = rrb-bo-ptr;
 stride = rrb-map_pitch;
 @@ -416,7 +419,6 @@ radeon_unmap_renderbuffer(struct gl_context *ctx,
  {
 struct radeon_context *const rmesa = RADEON_CONTEXT(ctx);
 struct radeon_renderbuffer *rrb = radeon_renderbuffer(rb);
 -   GLboolean ok;

 if ((rmesa-radeonScreen-chip_flags  RADEON_CHIPSET_DEPTH_ALWAYS_TILED) 
  !rrb-has_surface) {
 if (rb-Format == MESA_FORMAT_Z24_UNORM_S8_UINT || rb-Format == 
 MESA_FORMAT_Z24_UNORM_X8_UINT) {
 @@ -438,6 +440,7 @@ radeon_unmap_renderbuffer(struct gl_context *ctx,
 radeon_bo_unmap(rrb-map_bo);

 if (rrb-map_mode  GL_MAP_WRITE_BIT) {
 +  GLboolean ok;
ok = rmesa-vtbl.blit(ctx, rrb-map_bo, 0,
 rb-Format, rrb-map_pitch / rrb-cpp,
 rrb-map_w, rrb-map_h,
 @@ -449,6 +452,7 @@ radeon_unmap_renderbuffer(struct gl_context *ctx,
 rrb-map_w, rrb-map_h,
 GL_FALSE);
assert(ok);
 +  (void) ok;
 }

 radeon_bo_unref(rrb-map_bo);
 --
 2.1.4

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/7] nv50: allocate a software object class

2015-07-22 Thread Martin Peres

On 01/07/15 01:01, Samuel Pitoiset wrote:

This will allow to monitor global performance counters through the
command stream of the GPU instead of using ioctls.

Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com
---
  src/gallium/drivers/nouveau/nv50/nv50_screen.c | 11 +++
  src/gallium/drivers/nouveau/nv50/nv50_screen.h |  1 +
  src/gallium/drivers/nouveau/nv50/nv50_winsys.h |  1 +
  3 files changed, 13 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 6583a35..c985344 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -367,6 +367,7 @@ nv50_screen_destroy(struct pipe_screen *pscreen)
 nouveau_object_del(screen-eng2d);
 nouveau_object_del(screen-m2mf);
 nouveau_object_del(screen-sync);
+   nouveau_object_del(screen-sw);
  
 nouveau_screen_fini(screen-base);
  
@@ -437,6 +438,9 @@ nv50_screen_init_hwctx(struct nv50_screen *screen)

 BEGIN_NV04(push, SUBC_3D(NV01_SUBCHAN_OBJECT), 1);
 PUSH_DATA (push, screen-tesla-handle);
  
+   BEGIN_NV04(push, SUBC_SW(NV01_SUBCHAN_OBJECT), 1);

+   PUSH_DATA (push, screen-sw-handle);
+
 BEGIN_NV04(push, NV50_3D(COND_MODE), 1);
 PUSH_DATA (push, NV50_3D_COND_MODE_ALWAYS);
  
@@ -768,6 +772,13 @@ nv50_screen_create(struct nouveau_device *dev)

goto fail;
 }
  
+   ret = nouveau_object_new(chan, 0xbeef506e, 0x506e,

I guess the 0x506e needs to be defined in libdrm, right?

Other than that, it is Reviewed-by: Martin Peres martin.pe...@free.fr

+NULL, 0, screen-sw);
+   if (ret) {
+  NOUVEAU_ERR(Failed to allocate SW object: %d\n, ret);
+  goto fail;
+   }
+
 ret = nouveau_object_new(chan, 0xbeef5039, NV50_M2MF_CLASS,
  NULL, 0, screen-m2mf);
 if (ret) {
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
index 881051b..69fdfdb 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
@@ -93,6 +93,7 @@ struct nv50_screen {
 struct nouveau_object *tesla;
 struct nouveau_object *eng2d;
 struct nouveau_object *m2mf;
+   struct nouveau_object *sw;
  };
  
  static INLINE struct nv50_screen *

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_winsys.h 
b/src/gallium/drivers/nouveau/nv50/nv50_winsys.h
index e8578c8..5cb33ef 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_winsys.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_winsys.h
@@ -60,6 +60,7 @@ PUSH_REFN(struct nouveau_pushbuf *push, struct nouveau_bo 
*bo, uint32_t flags)
  #define SUBC_COMPUTE(m) 6, (m)
  #define NV50_COMPUTE(n) SUBC_COMPUTE(NV50_COMPUTE_##n)
  
+#define SUBC_SW(m) 7, (m)
  
  static INLINE uint32_t

  NV50_FIFO_PKHDR(int subc, int mthd, unsigned size)


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 07/14] mesa: Set green, blue channels to zero only for formats with these components

2015-07-22 Thread Iago Toral
On Tue, 2015-07-21 at 11:13 -0700, Anuj Phogat wrote:
 On Tue, Jul 21, 2015 at 12:50 AM, Iago Toral ito...@igalia.com wrote:
  On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote:
  Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
  ---
   src/mesa/drivers/common/meta.c | 13 ++---
   1 file changed, 10 insertions(+), 3 deletions(-)
 
  diff --git a/src/mesa/drivers/common/meta.c 
  b/src/mesa/drivers/common/meta.c
  index 214a68a..fceb25d 100644
  --- a/src/mesa/drivers/common/meta.c
  +++ b/src/mesa/drivers/common/meta.c
  @@ -3132,9 +3132,16 @@ decompress_texture_image(struct gl_context *ctx,
  * returned as red and two-channel texture values are returned as
  * red/alpha.
  */
  -  if ((baseTexFormat == GL_LUMINANCE ||
  -   baseTexFormat == GL_LUMINANCE_ALPHA ||
  -   baseTexFormat == GL_INTENSITY) ||
  +  if (((baseTexFormat == GL_LUMINANCE ||
  +baseTexFormat == GL_LUMINANCE_ALPHA ||
  +baseTexFormat == GL_INTENSITY) 
  +   (destBaseFormat == GL_RGBA ||
  +destBaseFormat == GL_RGB ||
  +destBaseFormat == GL_RG ||
  +destBaseFormat == GL_GREEN ||
  +destBaseFormat == GL_BLUE ||
  +destBaseFormat == GL_BGRA ||
  +destBaseFormat == GL_BGR)) ||
 
  Is this needed to achieve correct behavior or just an optimization? I
  would expect that if the dest format does not have G/B channels, setting
  pixel transfer options for these channels would not have any functional
  effect anyway.
 
 This is just an optimization. We set pixel transfer operations based on these
 conditions and then call _mesa_ReadPixels, which falls back to slower path
 if transfer operations are set. I'll bump up the commit message of this patch.

Yes, a note in the commit log is worth it, thanks,

Reviewed-by: Iago Toral Quiroga ito...@igalia.com

 /* If we're reading back an RGB(A) texture (using 
  glGetTexImage) as
   * luminance then we need to return L=tex(R).
   */
 
 
 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] i965/fs: Implement pass to lower instructions of unsupported SIMD width.

2015-07-22 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 A few comments below.  Mostly just asking for explanation.

 1-3 are

 Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

 Obviously, don't merge 4/4 until it actually has users.
 --Jason

Thanks.


 On Thu, Jul 16, 2015 at 8:35 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 This lowering pass implements an algorithm to expand SIMDN
 instructions into a sequence of SIMDM instructions in cases where the
 hardware doesn't support the original execution size natively for some
 particular instruction.  The most important use-cases are:

  - Lowering send message instructions that don't support SIMD16
natively into SIMD8 (several texturing, framebuffer write and typed
surface operations).

  - Lowering messages that don't support SIMD8 natively into SIMD16
(*cough*gen4*cough*).

  - 64-bit precision operations (e.g. FP64 and 64-bit integer
multiplication).

  - SIMD32.

 The algorithm works by splitting the sources of the original
 instruction into chunks of width appropriate for the lowered
 instructions, and then interleaving the results component-wise into
 the destination of the original instruction.  The pass is controlled
 by the get_lowered_simd_width() function that currently just returns
 the original execution size making the whole pass a no-op for the
 moment until some user is introduced.
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 142 
 +++
  src/mesa/drivers/dri/i965/brw_fs.h   |   1 +
  2 files changed, 143 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index d031352..eeb6938 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3204,6 +3204,147 @@ fs_visitor::lower_logical_sends()
 return progress;
  }

 +/**
 + * Get the closest native SIMD width supported by the hardware for 
 instruction
 + * \p inst.  The instruction will be left untouched by
 + * fs_visitor::lower_simd_width() if the returned value is equal to the
 + * original execution size.
 + */
 +static unsigned
 +get_lowered_simd_width(const struct brw_device_info *devinfo,
 +   const fs_inst *inst)
 +{
 +   switch (inst-opcode) {
 +   default:
 +  return inst-exec_size;
 +   }
 +}
 +
 +/**
 + * The \p rows array of registers represents a \p num_rows by \p num_columns
 + * matrix in row-major order, write it in column-major order into the 
 register
 + * passed as destination.  \p stride gives the separation between matrix
 + * elements in the input in fs_builder::dispatch_width() units.
 + */
 +static void
 +emit_transpose(const fs_builder bld,
 +   const fs_reg dst, const fs_reg *rows,
 +   unsigned num_rows, unsigned num_columns, unsigned stride)

 I'm not sure what I think about calling this emit_transpose.  I
 guess it is kind of a transpose, but maybe it's more of a gather.
 I'm not going to quibble about it though.

*Shrug*, it doesn't only gather the vectors of the rows array (that
would have been emit_collect :P), it copies them out in vertical, just
like a matrix transpose -- assuming you're not horrified by the idea of
considering the argument a matrix.


 +{
 +   fs_reg *const components = new fs_reg[num_rows * num_columns];
 +
 +   for (unsigned i = 0; i  num_columns; ++i) {
 +  for (unsigned j = 0; j  num_rows; ++j)
 + components[num_rows * i + j] = offset(rows[j], bld, stride * i);
 +   }
 +
 +   bld.LOAD_PAYLOAD(dst, components, num_rows * num_columns, 0);
 +
 +   delete[] components;
 +}
 +
 +bool
 +fs_visitor::lower_simd_width()
 +{
 +   bool progress = false;
 +
 +   foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
 +  const unsigned lower_width = get_lowered_simd_width(devinfo, inst);
 +
 +  if (lower_width != inst-exec_size) {
 + /* Builder matching the original instruction. */
 + const fs_builder ibld = bld.at(block, inst)
 +.exec_all(inst-force_writemask_all)
 +.group(inst-exec_size, 
 inst-force_sechalf);
 +
 + /* Split the copies in chunks of the execution width of either the
 +  * original or the lowered instruction, whichever is lower.
 +  */
 + const unsigned copy_width = MIN2(lower_width, inst-exec_size);
 + const unsigned n = inst-exec_size / copy_width;
 + const unsigned dst_size = inst-regs_written * REG_SIZE /
 +inst-dst.component_size(inst-exec_size);
 + fs_reg dsts[4];
 +
 + assert(n  0  n = ARRAY_SIZE(dsts) 
 +!inst-writes_accumulator  !inst-mlen);
 +
 + for (unsigned i = 0; i  n; i++) {
 +/* Emit a copy of the original instruction with the lowered 
 width.
 + * If the EOT flag was set throw it away except for the last
 + * instruction to avoid killing the thread prematurely.
 + */

Re: [Mesa-dev] [PATCH 09/12] i965/fs: Hook up SIMD lowering to handle texturing opcodes of unsupported width.

2015-07-22 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Sat, Jul 18, 2015 at 7:34 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 This should match the set of cases in which we currently call fail()
 or no16() from the emit_texture_*() methods and the ones in which
 emit_texture_gen4() enables the SIMD16 workaround.

 Hint for reviewers: It's not a big deal if I happen to have missed
 some case here, it will just lead to an assertion failure down the
 road which is easily fixable, however being stricter than necessary
 won't cause any visible breakage, it would just decrease performance
 silently due to the unnecessary message splitting, so feel free to
 double-check that all cases listed here already cause a SIMD8/16
 fall-back with the current texturing code -- You may want to skip over
 the Gen5-6 cases though if you don't have pencil and paper at hand.

 I'll believe you.  Even if you got it wrong, the worst that happens is
 we hit the fail case when we try and lower.

 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 27 +++
  1 file changed, 27 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 043d9e9..f291202 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3918,6 +3918,33 @@ get_lowered_simd_width(const struct brw_device_info 
 *devinfo,
/* Dual-source FB writes are unsupported in SIMD16 mode. */
return (inst-src[1].file != BAD_FILE ? 8 : inst-exec_size);

 +   case SHADER_OPCODE_TXD_LOGICAL:
 +  /* TXD is unsupported in SIMD16 mode. */
 +  return 8;
 +
 +   case SHADER_OPCODE_TG4_OFFSET_LOGICAL: {
 +  /* gather4_po_c is unsupported in SIMD16 mode. */
 +  const fs_reg shadow_c = inst-src[1];
 +  return (shadow_c.file != BAD_FILE ? 8 : inst-exec_size);
 +   }
 +   case SHADER_OPCODE_TXL_LOGICAL:
 +   case FS_OPCODE_TXB_LOGICAL: {
 +  /* Gen4 doesn't have SIMD8 non-shadow-compare bias/LOD instructions, 
 and
 +   * Gen4-6 don't support TXL and TXB with shadow comparison in SIMD16
 +   * mode.
 +   */
 +  const fs_reg shadow_c = inst-src[1];
 +  return (devinfo-gen == 4  shadow_c.file == BAD_FILE ? 16 :
 +  devinfo-gen  7  shadow_c.file != BAD_FILE ? 8 :
 +  inst-exec_size);

 Could we please use an if-ladder here.  Nested ternaries are hard to
 read and kind of pointless given that we can return multiple places in
 the if.

Heh, I always have the very opposite style itch -- If you can compute
something succinctly with a handful of pure expressions why venture
into the dangerous realm of statements?

I think John Backus' article Can programming be liberated from the von
Neumann style? although slightly outdated elaborates the point pretty
well.

But sure, if you'd like it better with an if-ladder why not.

 +   }
 +   case SHADER_OPCODE_TXF_LOGICAL:
 +   case SHADER_OPCODE_TXS_LOGICAL:
 +  /* Gen4 doesn't have SIMD8 variants for the RESINFO and LD-with-LOD
 +   * messages.  Use SIMD16 instead.
 +   */
 +  return (devinfo-gen == 4 ? 16 : inst-exec_size);

 I'd kind of also like this ternary to go, but I'm ok with it if you
 want to keep it.

 +
 default:
return inst-exec_size;
 }
 --
 2.4.3



signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 90264] [Regression, bisected] Tooltip corruption in Chrome

2015-07-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90264

--- Comment #38 from Heiko lil_...@web.de ---
(In reply to Antoine Labour from comment #37)
 Can you make glXWaitX re-evaluate window sizes? That's the reason we call
 it - we just resized the window, we want to make sure we're drawing to the
 right back buffer. This is a UI workload, we may draw only once and we don't
 want flashing (what a glXSwapBuffers would do), so we want to make sure
 we're drawing to a buffer of the right size - glXWaitX is meant to
 synchronize GL with changes in X.
 
 In this trace, you're only seeing parts of the calls - in particular there
 should be a glViewport and draws and a glXSwapBuffers after the resize.
 
 I don't think there's confusion from the app side. Parts of what you're
 seeing is chromium restoring state between its virtual contexts (we have
 different subsystems that all need to access GL, and are multiplexed,
 possibly on a single GL context, depending on the version), so it's not
 necessarily surprising to see scissor/viewport that doesn't match the window
 size - until we get to actually draw the picture on screen. In particular,
 the texture is correctly loaded with 370x72, so no confusion there.

Well, most of my findings were pretty vague, because I'm neither a GL expert
nor a mesa expert.

With your hint, I just noticed that dri2_wait_x()/dri2_wait_gl() don't do
anything, because of priv-have_fake_front being zero all the time. It would be
set to 1 for buffer attachments of __DRI_BUFFER_FAKE_FRONT_LEFT, but that's
nowhere set in mesa. Thus gcc just optimized it away completely along with
dri2_copy_drawable().

Seems to be related to [1]?


[1] http://lists.freedesktop.org/archives/mesa-dev/2011-June/008241.html

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH mesa 2/2] egl: android: remove DRM_GRALLOC_TOP hack

2015-07-22 Thread Varad Gautam
On Tuesday, July 21, 2015 03:36:32 PM Emil Velikov wrote:
 Now that the drm_gralloc module exports the correct includes we can get
 rid of this hack.
 
 Cc: Varad Gautam varadgau...@gmail.com
 Cc: Chih-Wei Huang cwhu...@android-x86.org
 Cc: Eric Anholt e...@anholt.net
 Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
 ---

these two Tested-by: Varad Gautam varadgau...@gmail.com

  Android.mk | 2 --
  src/egl/Android.mk | 1 -
  2 files changed, 3 deletions(-)
 
 diff --git a/Android.mk b/Android.mk
 index 17fd5f5..ed160fb 100644
 --- a/Android.mk
 +++ b/Android.mk
 @@ -45,8 +45,6 @@ endif
  MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk
  MESA_PYTHON2 := python
 
 -DRM_GRALLOC_TOP := hardware/drm_gralloc
 -
  classic_drivers := i915 i965
  gallium_drivers := swrast freedreno i915g ilo nouveau r300g r600g radeonsi
 vmwgfx vc4
 
 diff --git a/src/egl/Android.mk b/src/egl/Android.mk
 index 516b53c..ebd67af 100644
 --- a/src/egl/Android.mk
 +++ b/src/egl/Android.mk
 @@ -52,7 +52,6 @@ LOCAL_CFLAGS += -DDEFAULT_DRIVER_DIR=\/system/lib/dri\
  endif
 
  LOCAL_C_INCLUDES := \
 - $(DRM_GRALLOC_TOP) \
   $(MESA_TOP)/src/egl/main \
   $(MESA_TOP)/src/egl/drivers/dri2 \

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] android: rework the EGL build

2015-07-22 Thread Varad Gautam
On Tuesday, July 21, 2015 03:27:47 PM Emil Velikov wrote:
 See previous two commits for details.
 
 v2: Don't forget git mv, bring back DRM_GRALLOC_TOP. Spotted by Varad.
 
 Cc: Varad Gautam varadgau...@gmail.com
 Cc: Chih-Wei Huang cwhu...@android-x86.org
 Cc: Eric Anholt e...@anholt.net
 Signed-off-by: Emil Velikov emil.l.veli...@gmail.com

Tested-by: Varad Gautam varadgau...@gmail.com

 ---
 
 Patch generated with -M to ease diff/review process.
 
 -Emil
 
  Android.mk  |  3 +-
  src/egl/{main = }/Android.mk   | 36 ---
  src/egl/drivers/dri2/Android.mk | 63
 - 3 files changed, 26
 insertions(+), 76 deletions(-)
  rename src/egl/{main = }/Android.mk (76%)
  delete mode 100644 src/egl/drivers/dri2/Android.mk
 
 diff --git a/Android.mk b/Android.mk
 index 69e0d33..17fd5f5 100644
 --- a/Android.mk
 +++ b/Android.mk
 @@ -91,8 +91,7 @@ SUBDIRS := \
   src/glsl \
   src/mesa \
   src/util \
 - src/egl/main \
 - src/egl/drivers/dri2 \
 + src/egl \
   src/mesa/drivers/dri
 
  ifeq ($(strip $(MESA_BUILD_GALLIUM)),true)
 diff --git a/src/egl/main/Android.mk b/src/egl/Android.mk
 similarity index 76%
 rename from src/egl/main/Android.mk
 rename to src/egl/Android.mk
 index 270c165..516b53c 100644
 --- a/src/egl/main/Android.mk
 +++ b/src/egl/Android.mk
 @@ -27,19 +27,37 @@ LOCAL_PATH := $(call my-dir)
 
  include $(LOCAL_PATH)/Makefile.sources
 
 -SOURCES := \
 - ${LIBEGL_C_FILES}
 -
  # ---
  # Build libGLES_mesa
  # ---
 
  include $(CLEAR_VARS)
 
 -LOCAL_SRC_FILES := $(SOURCES)
 +LOCAL_SRC_FILES := \
 + $(LIBEGL_C_FILES) \
 + $(dri2_backend_core_FILES) \
 + drivers/dri2/platform_android.c
 
  LOCAL_CFLAGS := \
 - -D_EGL_NATIVE_PLATFORM=_EGL_PLATFORM_ANDROID
 + -D_EGL_NATIVE_PLATFORM=_EGL_PLATFORM_ANDROID \
 + -D_EGL_BUILT_IN_DRIVER_DRI2 \
 + -DHAVE_ANDROID_PLATFORM
 +
 +ifeq ($(MESA_LOLLIPOP_BUILD),true)
 +LOCAL_CFLAGS_arm := -DDEFAULT_DRIVER_DIR=\/system/lib/dri\
 +LOCAL_CFLAGS_x86 := -DDEFAULT_DRIVER_DIR=\/system/lib/dri\
 +LOCAL_CFLAGS_x86_64 := -DDEFAULT_DRIVER_DIR=\/system/lib64/dri\
 +else
 +LOCAL_CFLAGS += -DDEFAULT_DRIVER_DIR=\/system/lib/dri\
 +endif
 +
 +LOCAL_C_INCLUDES := \
 + $(DRM_GRALLOC_TOP) \
 + $(MESA_TOP)/src/egl/main \
 + $(MESA_TOP)/src/egl/drivers/dri2 \
 +
 +LOCAL_STATIC_LIBRARIES := \
 + libmesa_loader
 
  LOCAL_SHARED_LIBRARIES := \
   libdl \
 @@ -53,12 +71,11 @@ LOCAL_SHARED_LIBRARIES += libsync
  endif
 
  # add libdrm if there are hardware drivers
 -ifneq ($(MESA_GPU_DRIVERS),swrast)
 +ifneq ($(filter-out swrast,$(MESA_GPU_DRIVERS)),)
 +LOCAL_CFLAGS += -DHAVE_LIBDRM
  LOCAL_SHARED_LIBRARIES += libdrm
  endif
 
 -LOCAL_CFLAGS += -D_EGL_BUILT_IN_DRIVER_DRI2
 -
  ifeq ($(strip $(MESA_BUILD_CLASSIC)),true)
  # require i915_dri and/or i965_dri
  LOCAL_REQUIRED_MODULES += \
 @@ -69,9 +86,6 @@ ifeq ($(strip $(MESA_BUILD_GALLIUM)),true)
  LOCAL_REQUIRED_MODULES += gallium_dri
  endif # MESA_BUILD_GALLIUM
 
 -LOCAL_STATIC_LIBRARIES := \
 - libmesa_egl_dri2 \
 - libmesa_loader
 
  LOCAL_MODULE := libGLES_mesa
  ifeq ($(MESA_LOLLIPOP_BUILD),true)
 diff --git a/src/egl/drivers/dri2/Android.mk
 b/src/egl/drivers/dri2/Android.mk deleted file mode 100644
 index 76be3b2..000
 --- a/src/egl/drivers/dri2/Android.mk
 +++ /dev/null
 @@ -1,63 +0,0 @@
 -# Mesa 3-D graphics library
 -#
 -# Copyright (C) 2010-2011 Chia-I Wu olva...@gmail.com
 -# Copyright (C) 2010-2011 LunarG Inc.
 -#
 -# Permission is hereby granted, free of charge, to any person obtaining a
 -# copy of this software and associated documentation files (the
 Software), -# to deal in the Software without restriction, including
 without limitation -# the rights to use, copy, modify, merge, publish,
 distribute, sublicense, -# and/or sell copies of the Software, and to
 permit persons to whom the -# Software is furnished to do so, subject to
 the following conditions: -#
 -# The above copyright notice and this permission notice shall be included
 -# in all copies or substantial portions of the Software.
 -#
 -# THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS
 OR -# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 MERCHANTABILITY, -# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
 IN NO EVENT SHALL -# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
 CLAIM, DAMAGES OR OTHER -# LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
 TORT OR OTHERWISE, ARISING -# FROM, OUT OF OR IN CONNECTION WITH THE
 SOFTWARE OR THE USE OR OTHER -# DEALINGS IN THE SOFTWARE.
 -
 -# Android.mk for egl_dri2
 -
 -LOCAL_PATH := $(call my-dir)
 -
 -include $(CLEAR_VARS)
 -
 -LOCAL_SRC_FILES := \
 - egl_dri2.c \
 - platform_android.c
 -
 -LOCAL_CFLAGS := \
 - -DHAVE_ANDROID_PLATFORM
 -
 -ifeq ($(MESA_LOLLIPOP_BUILD),true)
 -LOCAL_CFLAGS_arm := -DDEFAULT_DRIVER_DIR=\/system/lib/dri\
 

Re: [Mesa-dev] [PATCH 05/14] meta: Abort meta pbo path if readpixels need signed-unsigned conversion

2015-07-22 Thread Iago Toral
On Tue, 2015-07-21 at 17:05 -0700, Anuj Phogat wrote:
 On Tue, Jul 21, 2015 at 1:36 AM, Iago Toral ito...@igalia.com wrote:
  On Tue, 2015-07-21 at 08:13 +0200, Iago Toral wrote:
  On Mon, 2015-07-20 at 10:56 -0700, Anuj Phogat wrote:
   On Mon, Jul 20, 2015 at 5:10 AM, Iago Toral ito...@igalia.com wrote:
On Fri, 2015-06-19 at 13:40 -0700, Anuj Phogat wrote:
On Tue, Jun 16, 2015 at 9:21 PM, Jason Ekstrand 
ja...@jlekstrand.net wrote:

 On Jun 16, 2015 11:15, Anuj Phogat anuj.pho...@gmail.com wrote:

 Without this patch, piglit test fbo_integer_readpixels_sint_uint 
 fails,
 when
 forced to use the meta pbo path.

 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 Cc: mesa-sta...@lists.freedesktop.org
 ---
  src/mesa/drivers/common/meta_tex_subimage.c | 3 +++
  1 file changed, 3 insertions(+)

 diff --git a/src/mesa/drivers/common/meta_tex_subimage.c
 b/src/mesa/drivers/common/meta_tex_subimage.c
 index 00364f8..84cbc50 100644
 --- a/src/mesa/drivers/common/meta_tex_subimage.c
 +++ b/src/mesa/drivers/common/meta_tex_subimage.c
 @@ -283,6 +283,9 @@ _mesa_meta_pbo_GetTexSubImage(struct 
 gl_context *ctx,
 GLuint dims,

if (_mesa_need_rgb_to_luminance_conversion(rb-Format, 
 format))
   return false;
 +
 +  if (_mesa_need_signed_unsigned_int_conversion(rb-Format, 
 format,
 type))
 + return false;

 Hrm... This seems fishy.  Isn't glBlitFramebuffers supposed to 
 handle format
 conversion with integers?  If so we should probably fix it rather 
 than just
 skip it for the meta pbo path.

As discussed offline, here is relevant text for glBlitFrameBuffer() 
from
OpenGL 4.5 spec, section 18.3.1:
An INVALID_OPERATION error is generated if format conversions are not
supported, which occurs under any of the following conditions:
-The read buffer contains fixed-point or floating-point values and 
any draw
  buffer contains neither fixed-point nor floating-point values.
-The read buffer contains unsigned integer values and any draw buffer 
does
  not contain unsigned integer values.
- The read buffer contains signed integer values and any draw buffer 
does
  not contain signed integer values.
   
I'll add a comment here explaining the reason to avoid meta path.
   
Is this code going to run only for glBlitFramebuffer? I see this
function being called from code paths that implement glReadPixels and
glGetTexImage too.
   
   _mesa_meta_pbo_GetTexSubImage() is used only for glReadPixels and
   glGetTexImage. I quoted the glBliFrameBuffer restriction above because
   the function is later using _mesa_meta_BlitFramebuffer(), which doesn't
   support some format conversions.
 
  If this function can be used to resolve ReadPixels and GetTexImage but
  the checks you add are *specific* to BlitFramebuffer, it does not look
  like this is the right place for them. Shouldn't you put them inside
  _mesa_meta_BlitFramebuffer instead? Otherwise they would affect to
  ReadPixels and GetTexImage too and I don't see the same restrictions
  applying to ReadPixels for example.
 We already have error checks in place for glBlitFrameBuffer(). Take a
 look at compatible_color_datatypes() in _mesa_blit_framebuffer().
 
  Specifically for ReadPixels I only see this in the spec:
 
  An INVALID_OPERATION error is generated if format is an integer format
  and the color buffer is not an integer format, or if the color buffer is
  an integer format and format is not an integer format.
 
  So, unlike BlitFramebuffer, it seems that ReadPixels is fine as long as
  both formats are integer, no matter if the types have the same sign or
  not.
 Right. That's the reason this patch doesn't generate any GL error for
 signed-unsigned int mismatch. It just decides not to use meta pbo path
 because of  unsupported format conversions in _mesa_meta_BlitFrameBuffer(),
 and fallback to using other paths.

Ah, I see now, thanks for clarifying this.

Reviewed-by: Iago Toral Quiroga ito...@igalia.com

 
  Iago
 
 
 }

 /* For arrays, use a tall (height * depth) 2D texture but 
 taking into
 --
 1.9.3

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
   
   
  
 
 
  ___
  mesa-dev mailing list
  mesa-dev@lists.freedesktop.org
  http://lists.freedesktop.org/mailman/listinfo/mesa-dev
 
 
 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org

Re: [Mesa-dev] [PATCH 18/78] i965: Take is_scalar_shader_stage() method out to allow reuse

2015-07-22 Thread Eduardo Lima Mitev
On 07/13/2015 01:38 PM, Jason Ekstrand wrote:
 On Fri, Jul 10, 2015 at 8:53 AM, Eduardo Lima Mitev el...@igalia.com wrote:
 On 06/30/2015 06:58 PM, Jason Ekstrand wrote:
 On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev el...@igalia.com 
 wrote:
 This patch makes public the is_scalar_shader_stage() method in brw_shader, 
 and
 renames it to brw_compiler_is_scalar_shader_stage(). The plan is to later 
 reuse it
 in brw_nir, to enable/disable optimization passes depending on the type
 of shader stage.

 I'm not so sure that this is a good plan.  It assumes that whether we
 are doing a scalar or vec4 compile is based entirely on the shader
 stage and some static information (possibly based on environment
 variables).  Ken and I were talking around the office and we may want
 to use both SIMD4x2 and SIMD8 mode for geometry shaders depending on
 the number of inputs, etc.  This won't work in the given paradigm.


 If I understand correctly, what you propose is having a function to
 dynamically choose the type of shader (scalar vs. vec4) when compiling
 the shader, using not only gen and stage, but also actual application
 data. I think this is a good idea and will allow experimenting with
 different combinations of shaders with real input data.

 However, I wonder if we this should be added later after more elaborated
 thoughts on what exactly do we need and where to plug it. I have been
 experimenting with a function following the use case you mentioned,
 choosing shader backend based on inputs to a GS. But honestly it feels
 like a blind guess from my side to what actually you and Ken have in mind.

 Current patch basically reuses the function we already have to select
 the shader, so what I propose is to move forward with it for the moment
 (adding the missing MESA_SHADER_COMPUTE) and discuss how to extend it to
 factor in dynamic data; or perhaps you can explain us your proposal with
 a bit more detail.

 WDYT?
 
 My primary issue was with having it based on a is_scalar_stage
 function at all.  It seems like a better long-term plan would be to
 simply pass an is_scalar boolean into those lowering functions.  That
 said, I haven't taken a real hard look as to what calls those
 functions and whether or not that's actually what we want either.
 Make sense?  If there's no better place to make that determination
 further up the call chain, then this is fine for now.
 

Yes, I like the idea of passing the is_scalar boolean to
brw_create_nir() directly, so that this decision is moved up to a higher
context. By doing this we also leave the is_scalar_shader_stage()
function private to brw_shader.cpp, meaning this patch is not relevant
anymore and we leave that function untouched.

A patch illustrating these changes is at:
https://github.com/Igalia/mesa/commit/9e7bc62d43eaa30de005c40b91551ad113c4065c


In the future, if we want to select shader type (scalar vs. vector)
based on dynamic info, the logic that needs changing is contained in
brw_shader. But this patch doesn't address that yet, lacking a more
defined use-case. My (wild) idea would be to have a
preferred_shader_type variable in brw_context for example, that can be
modifiable dynamically at any point, and will instruct the link phase to
use scalar or vec4 if possible (depending on gen, stage, etc).

What do you think?

 The new method accepts a brw_compiler instead of a brw_context. This is 
 done
 for consistency, since the actual info we need (scalar_vs) is in 
 brw_compiler,
 and fetching in through brw_content-intelScreen-compiler seems like too
 much indirection.

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
 ---
  src/mesa/drivers/dri/i965/brw_shader.cpp | 22 ++
  src/mesa/drivers/dri/i965/brw_shader.h   | 13 +
  2 files changed, 19 insertions(+), 16 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
 b/src/mesa/drivers/dri/i965/brw_shader.cpp
 index 0b53647..3b99046 100644
 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
 @@ -182,19 +182,6 @@ brw_shader_precompile(struct gl_context *ctx,
 return true;
  }

 -static inline bool
 -is_scalar_shader_stage(struct brw_context *brw, int stage)
 -{
 -   switch (stage) {
 -   case MESA_SHADER_FRAGMENT:
 -  return true;
 -   case MESA_SHADER_VERTEX:
 -  return brw-intelScreen-compiler-scalar_vs;
 -   default:
 -  return false;
 -   }
 -}
 -
  static void
  brw_lower_packing_builtins(struct brw_context *brw,
 gl_shader_stage shader_type,
 @@ -205,7 +192,8 @@ brw_lower_packing_builtins(struct brw_context *brw,
 | LOWER_PACK_UNORM_2x16
 | LOWER_UNPACK_UNORM_2x16;

 -   if (is_scalar_shader_stage(brw, shader_type)) {
 +   if (brw_compiler_is_scalar_shader_stage(brw-intelScreen-compiler,
 +   shader_type)) {
ops |= LOWER_UNPACK_UNORM_4x8
 | LOWER_UNPACK_SNORM_4x8
   

[Mesa-dev] [PATCH] i965/fs: Fix calculation of the number of registers read in opt_copy_propagate.

2015-07-22 Thread Francisco Jerez
This seems to have been multiplying by stride twice since
fs_inst::regs_read/regs_written were changed to return the value in
register units rather than in dispatch_width-wide components.  The
value returned by fs_inst::regs_read() already takes into account the
stride so it's wrong to do it again here.
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 269bdb5..3940158 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -309,7 +309,7 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, 
acp_entry *entry)
 */
if (inst-src[arg].reg_offset  entry-dst.reg_offset ||
(inst-src[arg].reg_offset * 32 + inst-src[arg].subreg_offset +
-inst-regs_read(arg) * inst-src[arg].stride * 32) 
+inst-regs_read(arg) * 32) 
(entry-dst.reg_offset + entry-regs_written) * 32)
   return false;
 
@@ -477,7 +477,7 @@ fs_visitor::try_constant_propagate(fs_inst *inst, acp_entry 
*entry)
*/
   if (inst-src[i].reg_offset  entry-dst.reg_offset ||
   (inst-src[i].reg_offset * 32 + inst-src[i].subreg_offset +
-   inst-regs_read(i) * inst-src[i].stride * 32) 
+   inst-regs_read(i) * 32) 
   (entry-dst.reg_offset + entry-regs_written) * 32)
  continue;
 
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 02/12] st/dri: fix EGL_KHR_fence_sync since the last radeonsi change broke it

2015-07-22 Thread Michel Dänzer
On 17.07.2015 01:54, Marek Olšák wrote:
 From: Marek Olšák marek.ol...@amd.com
 
 Broken by f1be3d8cdde17a9b9ae283e1bab2f46b992d3bf3, which returns NULL
 if no commands have been submitted.

I've fixed that in radeonsi, so you can drop this patch.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 16/78] i965/nir/vec4: Implement store_output intrinsic

2015-07-22 Thread Eduardo Lima Mitev
On 07/13/2015 01:57 PM, Jason Ekstrand wrote:
 On Wed, Jul 8, 2015 at 11:54 PM, Eduardo Lima Mitev el...@igalia.com wrote:
 On 06/30/2015 06:51 PM, Jason Ekstrand wrote:
 On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev el...@igalia.com 
 wrote:
 The index into the output_reg array where to store the destination 
 register is
 fetched from the nir_outputs map built during nir_setup_outputs stage.

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
 ---
  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 17 +++--
  1 file changed, 15 insertions(+), 2 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
 index 8a2d335..55d4490 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
 @@ -520,10 +520,23 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
 *instr)
 }

 case nir_intrinsic_store_output_indirect:
 +  has_indirect = true;
/* fallthrough */
 -   case nir_intrinsic_store_output:
 -  /* @TODO: Not yet implemented */
 +   case nir_intrinsic_store_output: {
 +  int offset = instr-const_index[0];
 +  int output = nir_outputs[offset];
 +
 +  src = get_nir_src(instr-src[0], nir_output_types[offset]);
 +  dest = dst_reg(src);
 +
 +  dest.writemask = brw_writemask_for_size(instr-num_components);
 +
 +  if (has_indirect)
 + dest.reladdr = new(mem_ctx) src_reg(get_nir_src(instr-src[1]));
 +
 +  output_reg[output] = dest;

 I'm very confused about the amount of indirection going on here.  It
 seems to me that we should be setting these outputs up in
 setup_outputs() rather than storring off a map from ints to other ints
 and setting it up here.  I didn't make this comment on the patch for
 setup_outputs() because I wanted to wait to see it used before I
 commented on it.

 I'm guessing you did it this way because the nir_assign_var_locations
 is giving you bogus values.  If so, then it might be better to just
 assign variable locations in setup_outputs() rather than having a
 remap table.  The whole point of nir_lower_io is to make IO easy for
 the back-end.  If you need a re-map table, then it's no longer making
 it easy and we need to think more about what's going on.
 --Jason


 That double indirection felt bad since the beginning, but it was needed
 to store the original variable's location (var-data.location). Let me
 explain:

 We are (re)using the plumbering in vec4_visitor to setup URB, so the
 only thing we need to do is to store the out register in output_reg
 map at the correct location. And that location is given by the original
 location in the shader (var-data.location).

 So, in this case, nir_assign_var_locations pass, which constructs
 var-data.driver_location, is not useful to us, except to give us
 consecutive indexes to construct the other map we have, the type map,
 which is needed to carry the correct type from the original variable to
 the output register.
 
 If nir_assign_var_locations isn't doing anything for you, don't call
 it.  You'll need to do something with var-data.driver_location.  If
 what you really want is var-data.location, then just copy that to
 var-data.driver_location when you do nir_setup_outputs.  Or
 (depending on how the URB setup works, I don't actually know), put the
 actual URB location in var-data.driver_location when you walk the
 outputs.
 
 From there, you have two options.  One would be to setup output_reg at
 the same time with the correct types right away and emit a MOV when
 you get a store_output.  (Copy propagation should clean up the MOV.)
 For what it's worth, I don't think the type matters; a URB write just
 writes data to something so as long as you don't have a type mismatch
 in a MOV, the hardware won't care.
 
 The other option, would be to directly emit the URB write in
 store_output.  At the moment, it may be better to take the first
 option since that better matches what the FS does right now.  But both
 should work fine.
 

Thanks for these hints, they were very useful.

I rewrote the implementation of store_output intrinsic to avoid the
setup phase completely. The type, as you suggested, was not important as
long as they match while MOVing the contents of output_reg. To guarantee
that, I had to patch the emit_urb_slot() to guarantee the types always
match. This code is shared with vec4_visitor, so it makes sense to move
the safeguards there instead of having both backends provide the correct
register type in output_reg entries.
For reference, this is the patch that implements it:
https://github.com/Igalia/mesa/commit/8c703937f285c0b3a1e7bf6681c7ed7fe09815aa
.

I also put var-data.location in const_index[1] of the intrinsic op, and
disabled nir_assign_var_locations() for output variables, since I don't
need var-data.driver_location. I could have used const_index[0], but I
prefer to leave driver_location there, and use const_index[1], to avoid

Re: [Mesa-dev] [PATCH 10/12] i965/fs: Reimplement emit_texture() in terms of logical send messages.

2015-07-22 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Sat, Jul 18, 2015 at 7:34 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 ---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 66 
 +---
  1 file changed, 49 insertions(+), 17 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 89fcc49..4011639 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -861,6 +861,14 @@ fs_visitor::emit_texture(ir_texture_opcode op,
}
 }

 +   if (op == ir_query_levels) {
 +  /* textureQueryLevels() is implemented in terms of TXS so we need to
 +   * pass a valid LOD argument.
 +   */
 +  assert(lod.file == BAD_FILE);
 +  lod = fs_reg(0u);

 Why can't this go as part of the switch below?  It seems kind of odd
 to split the logic up like this.

What I had in mind originally was to factor out the switch statement
below into a separate function -- Later on I noticed that you wanted to
rework emit_texture() and friends to stop using ir_texture_opcode as
representation for the texturing operation, so you would likely want to
get rid of the whole switch-case statement below or replace it with
something else, so I didn't bother.  What are your current plans for
emit_texture()?

 +   }
 +
 if (coordinate.file != BAD_FILE) {
/* FINISHME: Texture coordinate rescaling doesn't work with 
 non-constant
 * samplers.  This should only be a problem with GL_CLAMP on Gen7.
 @@ -873,26 +881,50 @@ fs_visitor::emit_texture(ir_texture_opcode op,
  * samples, so don't worry about them.
  */
 fs_reg dst = vgrf(glsl_type::get_instance(dest_type-base_type, 4, 1));
 +   const fs_reg srcs[] = {
 +  coordinate, shadow_c, lod, lod2,
 +  sample_index, mcs, sampler_reg, offset_value,
 +  fs_reg(coord_components), fs_reg(grad_components)
 +   };
 +   enum opcode opcode;

 -   if (devinfo-gen = 7) {
 -  inst = emit_texture_gen7(op, dst, coordinate, coord_components,
 -   shadow_c, lod, lod2, grad_components,
 -   sample_index, mcs, sampler_reg,
 -   offset_value);
 -   } else if (devinfo-gen = 5) {
 -  inst = emit_texture_gen5(op, dst, coordinate, coord_components,
 -   shadow_c, lod, lod2, grad_components,
 -   sample_index, sampler,
 -   offset_value.file != BAD_FILE);
 -   } else if (dispatch_width == 16) {
 -  inst = emit_texture_gen4_simd16(op, dst, coordinate, coord_components,
 -  shadow_c, lod, sampler);
 -   } else {
 -  inst = emit_texture_gen4(op, dst, coordinate, coord_components,
 -   shadow_c, lod, lod2, grad_components,
 -   sampler);
 +   switch (op) {
 +   case ir_tex:
 +  opcode = SHADER_OPCODE_TEX_LOGICAL;
 +  break;
 +   case ir_txb:
 +  opcode = FS_OPCODE_TXB_LOGICAL;
 +  break;
 +   case ir_txl:
 +  opcode = SHADER_OPCODE_TXL_LOGICAL;
 +  break;
 +   case ir_txd:
 +  opcode = SHADER_OPCODE_TXD_LOGICAL;
 +  break;
 +   case ir_txf:
 +  opcode = SHADER_OPCODE_TXF_LOGICAL;
 +  break;
 +   case ir_txf_ms:
 +  opcode = SHADER_OPCODE_TXF_CMS_LOGICAL;
 +  break;
 +   case ir_txs:
 +   case ir_query_levels:
 +  opcode = SHADER_OPCODE_TXS_LOGICAL;
 +  break;
 +   case ir_lod:
 +  opcode = SHADER_OPCODE_LOD_LOGICAL;
 +  break;
 +   case ir_tg4:
 +  opcode = (offset_value.file != BAD_FILE  offset_value.file != IMM ?
 +SHADER_OPCODE_TG4_OFFSET_LOGICAL : 
 SHADER_OPCODE_TG4_LOGICAL);
 +  break;
 +   default:
 +  unreachable(not reached);

 Please choose something more descriptive than not reached.  How
 about invalid texture opcode

Sure, I'll fix that, thanks.


 }

 +   inst = bld.emit(opcode, dst, srcs, ARRAY_SIZE(srcs));
 +   inst-regs_written = 4 * dispatch_width / 8;
 +
 if (shadow_c.file != BAD_FILE)
inst-shadow_compare = true;

 --
 2.4.3



signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 04/12] i965/fs: Pass a BAD_FILE header source to LOAD_PAYLOAD in emit_texture_gen7().

2015-07-22 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Sat, Jul 18, 2015 at 7:34 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 So that it's left uninitialized by LOAD_PAYLOAD, we only need to
 reserve space for it in the message since it will be initialized
 implicitly by the generator.
 ---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 522e13e..89fcc49 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -473,8 +473,9 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, 
 fs_reg dst,
 if (op == ir_tg4 || offset_value.file != BAD_FILE ||
 is_high_sampler(devinfo, sampler)) {
/* For general texture offsets (no txf workaround), we need a header 
 to
 -   * put them in.  Note that for SIMD16 we're making space for two 
 actual
 -   * hardware registers here, so the emit will have to fix up for this.
 +   * put them in.  Note that we're only reserving space for it in the
 +   * message payload as it will be initialized implicitly by the
 +   * generator.

 Mind splitting this in two?  Fixing the comment above is really
 unrelated to passing BAD_FILE below.

It seemed like the sentence I changed was trying to explain what the
line of code I changed was doing.  It may sound unrelated because the
comment was completely misleading (or just outdated), but it can only
refer to the three lines of code below...

 --Jason

 *
 * * ir4_tg4 needs to place its channel select in the header,
 * for interaction with ARB_texture_swizzle
 @@ -483,7 +484,7 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, 
 fs_reg dst,
 * need to offset the Sampler State Pointer in the header.
 */
header_size = 1;
 -  sources[0] = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD);
 +  sources[0] = fs_reg();
length++;
 }

 --
 2.4.3



signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/12] i965/fs: Reimplement emit_single_fb_write() in terms of logical framebuffer writes.

2015-07-22 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Thu, Jul 16, 2015 at 8:41 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 The only non-trivial thing it still has to do is figure out where to
 take the src/dst depth values from and predicate the instruction if
 discard is in use.  The manual SIMD unrolling logic in the dual-source
 case goes away because this is now handled transparently by the SIMD
 lowering pass.
 ---
  src/mesa/drivers/dri/i965/brw_fs.h   |   5 +-
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 203 
 +++
  2 files changed, 20 insertions(+), 188 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
 b/src/mesa/drivers/dri/i965/brw_fs.h
 index 1ae79a9..64f89d4 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.h
 +++ b/src/mesa/drivers/dri/i965/brw_fs.h
 @@ -290,13 +290,10 @@ public:
 bool optimize_frontfacing_ternary(nir_alu_instr *instr,
   const fs_reg result);

 -   void setup_color_payload(fs_reg *dst, fs_reg color, unsigned components,
 -unsigned exec_size, bool use_2nd_half);
 void emit_alpha_test();
 fs_inst *emit_single_fb_write(const brw::fs_builder bld,
   fs_reg color1, fs_reg color2,
 - fs_reg src0_alpha, unsigned components,
 - unsigned exec_size, bool use_2nd_half = 
 false);
 + fs_reg src0_alpha, unsigned components);
 void emit_fb_writes();
 void emit_urb_writes();
 void emit_cs_terminate();
 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index ba4b177..bcfeaa0 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -1409,33 +1409,6 @@ fs_visitor::emit_interpolation_setup_gen6()
 }
  }

 -void
 -fs_visitor::setup_color_payload(fs_reg *dst, fs_reg color, unsigned 
 components,
 -unsigned exec_size, bool use_2nd_half)
 -{
 -   brw_wm_prog_key *key = (brw_wm_prog_key*) this-key;
 -   fs_inst *inst;
 -
 -   if (key-clamp_fragment_color) {
 -  fs_reg tmp = vgrf(glsl_type::vec4_type);
 -  assert(color.type == BRW_REGISTER_TYPE_F);
 -  for (unsigned i = 0; i  components; i++) {
 - inst = bld.MOV(offset(tmp, bld, i), offset(color, bld, i));
 - inst-saturate = true;
 -  }
 -  color = tmp;
 -   }
 -
 -   if (exec_size  dispatch_width) {
 -  unsigned half_idx = use_2nd_half ? 1 : 0;
 -  for (unsigned i = 0; i  components; i++)
 - dst[i] = half(offset(color, bld, i), half_idx);
 -   } else {
 -  for (unsigned i = 0; i  components; i++)
 - dst[i] = offset(color, bld, i);
 -   }
 -}
 -
  static enum brw_conditional_mod
  cond_for_alpha_func(GLenum func)
  {
 @@ -1493,146 +1466,34 @@ fs_visitor::emit_alpha_test()
  fs_inst *
  fs_visitor::emit_single_fb_write(const fs_builder bld,
   fs_reg color0, fs_reg color1,
 - fs_reg src0_alpha, unsigned components,
 - unsigned exec_size, bool use_2nd_half)
 + fs_reg src0_alpha, unsigned components)
  {
 assert(stage == MESA_SHADER_FRAGMENT);
 brw_wm_prog_data *prog_data = (brw_wm_prog_data*) this-prog_data;
 -   brw_wm_prog_key *key = (brw_wm_prog_key*) this-key;
 -   const fs_builder ubld = bld.group(exec_size, use_2nd_half);
 -   int header_size = 2, payload_header_size;
 -
 -   /* We can potentially have a message length of up to 15, so we have to 
 set
 -* base_mrf to either 0 or 1 in order to fit in m0..m15.
 -*/
 -   fs_reg *sources = ralloc_array(mem_ctx, fs_reg, 15);
 -   int length = 0;
 -
 -   /* From the Sandy Bridge PRM, volume 4, page 198:
 -*
 -* Dispatched Pixel Enables. One bit per pixel indicating
 -*  which pixels were originally enabled when the thread was
 -*  dispatched. This field is only required for the end-of-
 -*  thread message and on all dual-source messages.
 -*/
 -   if (devinfo-gen = 6 
 -   (devinfo-is_haswell || devinfo-gen = 8 || !prog_data-uses_kill) 
 
 -   color1.file == BAD_FILE 
 -   key-nr_color_regions == 1) {
 -  header_size = 0;
 -   }
 -
 -   if (header_size != 0) {
 -  assert(header_size == 2);
 -  /* Allocate 2 registers for a header */
 -  length += 2;
 -   }
 -
 -   if (payload.aa_dest_stencil_reg) {
 -  sources[length] = fs_reg(GRF, alloc.allocate(1));
 -  bld.group(8, 0).exec_all().annotate(FB write stencil/AA alpha)
 - .MOV(sources[length],
 -  fs_reg(brw_vec8_grf(payload.aa_dest_stencil_reg, 0)));
 -  length++;
 -   }
 -
 -   if (prog_data-uses_omask) {
 -  sources[length] = fs_reg(GRF, alloc.allocate(1),
 -   BRW_REGISTER_TYPE_UD);
 -
 -  /* Hand 

Re: [Mesa-dev] [PATCH 01/12] i965/fs: Define logical framebuffer write opcode.

2015-07-22 Thread Francisco Jerez
Kenneth Graunke kenn...@whitecape.org writes:

 On Thursday, July 16, 2015 06:41:16 PM Francisco Jerez wrote:
 The logical variant is largely equivalent to the original opcode but
 instead of taking a single payload source it expects the arguments
 that make up the payload separately as individual sources, like:
 
  fb_write_logical null, color0, color1, src0_alpha,
 src_depth, dst_depth, sample_mask, num_components
 
 This patch defines the opcode and usual instruction boilerplate,
 including a placeholder lowering function provided mostly as
 self-documentation of its source registers.
 ---
  src/mesa/drivers/dri/i965/brw_defines.h  |  1 +
  src/mesa/drivers/dri/i965/brw_fs.cpp | 34 
 
  src/mesa/drivers/dri/i965/brw_fs.h   |  2 +-
  src/mesa/drivers/dri/i965/brw_shader.cpp |  2 ++
  4 files changed, 38 insertions(+), 1 deletion(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
 b/src/mesa/drivers/dri/i965/brw_defines.h
 index 5bf53e3..65685a9 100644
 --- a/src/mesa/drivers/dri/i965/brw_defines.h
 +++ b/src/mesa/drivers/dri/i965/brw_defines.h
 @@ -875,6 +875,7 @@ enum opcode {
  * instructions.
  */
 FS_OPCODE_FB_WRITE = 128,

 I'd like to start including block comments above our virtual opcodes
 describing their sources and semantics - it's a natural place to look
 for that information.

 For this, perhaps something like:

   /**
* Framebuffer write.
*
* Source 0: [required] Color 0
* Source 1: [optional] Color 1 (for dual source blend messages)
* Source 2: [optional] Src0 Alpha
* Source 3: [optional] Source Depth (passthrough from the thread payload)
* Source 4: [optional] Destination Depth (gl_FragDepth)
* Source 5: [optional] Sample Mask (gl_SampleMask)
* Source 6: [required] Number of color components (as a UD immediate).
*/


Sounds good to me.  I'll steal your comment for this and write similar
ones for the other logical-send virtual opcodes I've introduced lately.

 Patches 1-10 are:
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org

 Patches 11-12 are:
 Acked-by: Kenneth Graunke kenn...@whitecape.org

Thanks.

 +   FS_OPCODE_FB_WRITE_LOGICAL,
 FS_OPCODE_BLORP_FB_WRITE,
 FS_OPCODE_REP_FB_WRITE,
 SHADER_OPCODE_RCP,


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91425] [regression, bisected] Piglit spec/ext_packed_float/ getteximage-invalid-format-for-packed-type fails

2015-07-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91425

Bug ID: 91425
   Summary: [regression, bisected] Piglit spec/ext_packed_float/
getteximage-invalid-format-for-packed-type fails
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: sigles...@igalia.com
QA Contact: mesa-dev@lists.freedesktop.org
CC: bri...@vmware.com, imir...@alum.mit.edu

I found a piglit regression when testing today's master:

$ bin/getteximage-invalid-format-for-packed-type -auto -fbo
Mesa: User error: GL_INVALID_VALUE in glGetTextImage(depth = 0)
Unexpected GL error: GL_INVALID_VALUE 0x501
(Error at
/home/siglesias/devel/piglit/tests/spec/ext_packed_float/getteximage-invalid-format-for-packed-type.c:124)
Expected GL error: GL_INVALID_OPERATION 0x502
PIGLIT: {subtest: {GL_UNSIGNED_BYTE_3_3_2, GL_RGBA : fail}}
Mesa: User error: GL_INVALID_VALUE in glGetTextImage(depth = 0)
Unexpected GL error: GL_INVALID_VALUE 0x501
(Error at
/home/siglesias/devel/piglit/tests/spec/ext_packed_float/getteximage-invalid-format-for-packed-type.c:118)
PIGLIT: {subtest: {GL_UNSIGNED_BYTE_3_3_2, GL_RGB : fail}}
Mesa: User error: GL_INVALID_VALUE in glGetTextImage(depth = 0)
Unexpected GL error: GL_INVALID_VALUE 0x501
(Error at
/home/siglesias/devel/piglit/tests/spec/ext_packed_float/getteximage-invalid-format-for-packed-type.c:124)
Expected GL error: GL_INVALID_OPERATION 0x502
PIGLIT: {subtest: {GL_UNSIGNED_BYTE_3_3_2, GL_RED : fail}}
Mesa: User error: GL_INVALID_VALUE in glGetTextImage(depth = 0)
Unexpected GL error: GL_INVALID_VALUE 0x501
(Error at
/home/siglesias/devel/piglit/tests/spec/ext_packed_float/getteximage-invalid-format-for-packed-type.c:124)
[...]

I bisected it and found the first bad commit:

---

f20cfc5a409b69d5ae3d10a870d90e0b4e493ddf is the first bad commit
commit f20cfc5a409b69d5ae3d10a870d90e0b4e493ddf
Author: Brian Paul bri...@vmware.com
Date:   Tue Jul 21 18:35:38 2015 -0600

mesa: overhaul the glGetTexImage code

1. Reorganize the error checking code.
2. Lay groundwork for getting sub images by passing image offset and
   dimensions to the error checking code.
3. Implement _mesa_GetnTexImageARB(), _mesa_GetTexImage() and
   _mesa_GetTextureImage() all in terms of get_texture_image().

v2: pass offset/width/height/depth arguments to the error checking
function, avoid using magic width/height/depth values.
v3: remove unused bufSize param to get_texture_image()

Reviewed-by: Ilia Mirkin imir...@alum.mit.edu

---

Mesa master HEAD: 800efb0690e962750b9a072bcbab279fdaae24a1

$ glxinfo | grep git
OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.7.0-devel
(git-800efb0)
OpenGL version string: 3.0 Mesa 10.7.0-devel (git-800efb0)
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 10.7.0-devel (git-800efb0)

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 4/7] nv50: configure the ring buffer for reading back PM counters

2015-07-22 Thread Martin Peres



On 01/07/15 01:01, Samuel Pitoiset wrote:

To write data at the right offset, the kernel has to know some
parameters of this ring buffer, like the number of domains and the
maximum number of queries.

Changes since v2:
- only configure the ring buffer if the notifier BO is allocated
- only use one BEGIN_NV04()

Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com
---
  src/gallium/drivers/nouveau/nv50/nv50_screen.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index ab95d65..335bff1 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -442,6 +442,16 @@ nv50_screen_init_hwctx(struct nv50_screen *screen)
 BEGIN_NV04(push, SUBC_SW(NV01_SUBCHAN_OBJECT), 1);
 PUSH_DATA (push, screen-sw-handle);
  
+   if (screen-query) {

+  /* Do not need to configure the ring buffer used to read back
+   * global performance counters when it is not allocated. */
+  BEGIN_NV04(push, SUBC_SW(0x0190), 1);
+  PUSH_DATA (push, screen-query-handle);
+  BEGIN_NV04(push, SUBC_SW(0x0600), 2);


Shouldn't we have the sw method's handle defined in libdrm?

Anyway, patches 3 and 4 are

Reviewed-by: Martin Peres martin.pe...@free.fr

+  PUSH_DATA (push, NV50_HW_PM_RING_BUFFER_MAX_QUERIES);
+  PUSH_DATA (push, NV50_HW_PM_RING_BUFFER_NUM_DOMAINS);
+   }
+
 BEGIN_NV04(push, NV50_3D(COND_MODE), 1);
 PUSH_DATA (push, NV50_3D_COND_MODE_ALWAYS);
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 17/20] i965/fs: Handle image uniforms in NIR programs.

2015-07-22 Thread Jason Ekstrand
On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote:
 v2: Move the image_params array back to brw_stage_prog_data.
 ---
  src/mesa/drivers/dri/i965/brw_fs.h   |  1 +
  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 50 
 +++-
  2 files changed, 43 insertions(+), 8 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
 b/src/mesa/drivers/dri/i965/brw_fs.h
 index 4749c47..97df784 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.h
 +++ b/src/mesa/drivers/dri/i965/brw_fs.h
 @@ -264,6 +264,7 @@ public:
nir_jump_instr *instr);
 fs_reg get_nir_src(nir_src src);
 fs_reg get_nir_dest(nir_dest dest);
 +   fs_reg get_nir_image_deref(const nir_deref_var *deref);
 void emit_percomp(const brw::fs_builder bld, const fs_inst inst,
   unsigned wr_mask);

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 index 6ec96c4..31024b7 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 @@ -234,17 +234,26 @@ fs_visitor::nir_setup_uniform(nir_variable *var)
   continue;
}

 -  unsigned slots = storage-type-component_slots();
 -  if (storage-array_elements)
 - slots *= storage-array_elements;
 +  if (storage-type-is_image()) {
 + /* Images don't get a valid location assigned by nir_lower_io()
 +  * because their size is driver-specific, so we need to allocate
 +  * space for them here at the end of the parameter array.
 +  */
 + var-data.driver_location = uniforms;
 + param_size[uniforms] =
 +BRW_IMAGE_PARAM_SIZE * MAX2(storage-array_elements, 1);
 +
 + setup_image_uniform_values(storage);
 +  } else {
 + unsigned slots = storage-type-component_slots();
 + if (storage-array_elements)
 +slots *= storage-array_elements;

 -  for (unsigned i = 0; i  slots; i++) {
 - stage_prog_data-param[index++] = storage-storage[i];
 + for (unsigned i = 0; i  slots; i++) {
 +stage_prog_data-param[index++] = storage-storage[i];
 + }
}
 }
 -
 -   /* Make sure we actually initialized the right amount of stuff here. */
 -   assert(var-data.driver_location + var-type-component_slots() == index);
  }

  void
 @@ -1193,6 +1202,31 @@ fs_visitor::get_nir_dest(nir_dest dest)
   dest.reg.indirect);
  }

 +fs_reg
 +fs_visitor::get_nir_image_deref(const nir_deref_var *deref)
 +{
 +   fs_reg image(UNIFORM, deref-var-data.driver_location,
 +BRW_REGISTER_TYPE_UD);
 +
 +   if (deref-deref.child) {
 +  const nir_deref_array *deref_array =
 + nir_deref_as_array(deref-deref.child);
 +  assert(deref_array-deref.child == NULL);

You should probably assert that the deref type of the child is an
array before casting.

 +
 +  image = offset(image, bld,
 + deref_array-base_offset * BRW_IMAGE_PARAM_SIZE);
 +
 +  if (deref_array-deref_array_type == nir_deref_array_type_indirect) {
 + fs_reg *tmp = new(mem_ctx) fs_reg(vgrf(glsl_type::int_type));
 + bld.MUL(*tmp, get_nir_src(deref_array-indirect),
 + fs_reg(BRW_IMAGE_PARAM_SIZE));
 + image.reladdr = tmp;
 +  }
 +   }
 +
 +   return image;
 +}
 +
  void
  fs_visitor::emit_percomp(const fs_builder bld, const fs_inst inst,
   unsigned wr_mask)
 --
 2.4.3

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 18/20] i965/fs: Translate image load, store and atomic NIR intrinsics.

2015-07-22 Thread Jason Ekstrand
On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote:
 ---
  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 149 
 +++
  1 file changed, 149 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 index 31024b7..76297b7 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 @@ -24,6 +24,7 @@
  #include glsl/ir.h
  #include glsl/ir_optimization.h
  #include glsl/nir/glsl_to_nir.h
 +#include main/shaderimage.h
  #include program/prog_to_nir.h
  #include brw_fs.h
  #include brw_fs_surface_builder.h
 @@ -1245,6 +1246,97 @@ fs_visitor::emit_percomp(const fs_builder bld, const 
 fs_inst inst,
 }
  }

 +/**
 + * Get the matching channel register datatype for an image intrinsic of the
 + * specified GLSL image type.
 + */
 +static brw_reg_type
 +get_image_base_type(const glsl_type *type)
 +{
 +   switch ((glsl_base_type)type-sampler_type) {
 +   case GLSL_TYPE_UINT:
 +  return BRW_REGISTER_TYPE_UD;
 +   case GLSL_TYPE_INT:
 +  return BRW_REGISTER_TYPE_D;
 +   case GLSL_TYPE_FLOAT:
 +  return BRW_REGISTER_TYPE_F;
 +   default:
 +  unreachable(Not reached.);
 +   }
 +}
 +
 +/**
 + * Get the appropriate atomic op for an image atomic intrinsic.
 + */
 +static unsigned
 +get_image_atomic_op(nir_intrinsic_op op, const glsl_type *type)
 +{
 +   switch (op) {
 +   case nir_intrinsic_image_atomic_add:
 +  return BRW_AOP_ADD;
 +   case nir_intrinsic_image_atomic_min:
 +  return (get_image_base_type(type) == BRW_REGISTER_TYPE_D ?
 +  BRW_AOP_IMIN : BRW_AOP_UMIN);
 +   case nir_intrinsic_image_atomic_max:
 +  return (get_image_base_type(type) == BRW_REGISTER_TYPE_D ?
 +  BRW_AOP_IMAX : BRW_AOP_UMAX);
 +   case nir_intrinsic_image_atomic_and:
 +  return BRW_AOP_AND;
 +   case nir_intrinsic_image_atomic_or:
 +  return BRW_AOP_OR;
 +   case nir_intrinsic_image_atomic_xor:
 +  return BRW_AOP_XOR;
 +   case nir_intrinsic_image_atomic_exchange:
 +  return BRW_AOP_MOV;
 +   case nir_intrinsic_image_atomic_comp_swap:
 +  return BRW_AOP_CMPWR;
 +   default:
 +  unreachable(Not reachable.);
 +   }
 +}
 +
 +/**
 + * Return true if the image is a 1D array and the implementation
 + * requires the array index to be passed in as the Z component of the
 + * coordinate vector.
 + */
 +static bool
 +needs_zero_y_image_coordinate(const fs_builder bld,
 +  const nir_variable *image)
 +{
 +   const glsl_type *type = image-type-without_array();
 +   const mesa_format format =
 +  _mesa_get_shader_image_format(image-data.image.format);
 +   /* HSW in vec4 mode and our software coordinate handling for untyped
 +* reads want the array index to be at the Z component.
 +*/
 +   const bool array_index_at_z = (!image-data.image.write_only 
 +  
 !image_format_info::has_matching_typed_format(
 + bld.shader-devinfo, format));
 +
 +   return (type-sampler_dimensionality == GLSL_SAMPLER_DIM_1D 
 +   type-sampler_array  array_index_at_z);
 +}
 +
 +/**
 + * Transform image coordinates into the form expected by the
 + * implementation.
 + */
 +static fs_reg
 +fix_image_address(const fs_builder bld,
 +  const nir_variable *image, const fs_reg addr)

Is there a good reason why this is happening here and not inside of
emit_image_load/store?  It seems to break encaptulation substantially
if the code that calls emit_image_load/store has to think about this
stuff and adjust the address type based on whether a typed or untyped
message will eventually be used.

 +{
 +   if (needs_zero_y_image_coordinate(bld, image)) {
 +  assert(image-type-without_array()-coordinate_components() == 2);
 +  const fs_reg srcs[] = { addr, fs_reg(0), offset(addr, bld, 1) };
 +  const fs_reg tmp = bld.vgrf(addr.type, ARRAY_SIZE(srcs));
 +  bld.LOAD_PAYLOAD(tmp, srcs, ARRAY_SIZE(srcs), 0);
 +  return tmp;
 +   } else {
 +  return addr;
 +   }
 +}
 +
  void
  fs_visitor::nir_emit_intrinsic(const fs_builder bld, nir_intrinsic_instr 
 *instr)
  {
 @@ -1319,6 +1411,63 @@ fs_visitor::nir_emit_intrinsic(const fs_builder bld, 
 nir_intrinsic_instr *instr
break;
 }

 +   case nir_intrinsic_image_load:
 +   case nir_intrinsic_image_store:
 +   case nir_intrinsic_image_atomic_add:
 +   case nir_intrinsic_image_atomic_min:
 +   case nir_intrinsic_image_atomic_max:
 +   case nir_intrinsic_image_atomic_and:
 +   case nir_intrinsic_image_atomic_or:
 +   case nir_intrinsic_image_atomic_xor:
 +   case nir_intrinsic_image_atomic_exchange:
 +   case nir_intrinsic_image_atomic_comp_swap: {
 +  using namespace image_access;
 +
 +  /* Get the referenced image variable and type. */
 +  const nir_variable *var = instr-variables[0]-var;
 +  const glsl_type *type = var-type-without_array();
 +  

[Mesa-dev] [Bug 90264] [Regression, bisected] Tooltip corruption in Chrome

2015-07-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90264

Antoine Labour pi...@chromium.org changed:

   What|Removed |Added

 CC||pi...@chromium.org

--- Comment #39 from Antoine Labour pi...@chromium.org ---
(In reply to Heiko from comment #38)
 (In reply to Antoine Labour from comment #37)
  Can you make glXWaitX re-evaluate window sizes? That's the reason we call
  it - we just resized the window, we want to make sure we're drawing to the
  right back buffer. This is a UI workload, we may draw only once and we don't
  want flashing (what a glXSwapBuffers would do), so we want to make sure
  we're drawing to a buffer of the right size - glXWaitX is meant to
  synchronize GL with changes in X.
  
  In this trace, you're only seeing parts of the calls - in particular there
  should be a glViewport and draws and a glXSwapBuffers after the resize.
  
  I don't think there's confusion from the app side. Parts of what you're
  seeing is chromium restoring state between its virtual contexts (we have
  different subsystems that all need to access GL, and are multiplexed,
  possibly on a single GL context, depending on the version), so it's not
  necessarily surprising to see scissor/viewport that doesn't match the window
  size - until we get to actually draw the picture on screen. In particular,
  the texture is correctly loaded with 370x72, so no confusion there.
 
 Well, most of my findings were pretty vague, because I'm neither a GL expert
 nor a mesa expert.
 
 With your hint, I just noticed that dri2_wait_x()/dri2_wait_gl() don't do
 anything, because of priv-have_fake_front being zero all the time. It would
 be set to 1 for buffer attachments of __DRI_BUFFER_FAKE_FRONT_LEFT, but
 that's nowhere set in mesa. Thus gcc just optimized it away completely along
 with dri2_copy_drawable().
 
 Seems to be related to [1]?
 
 
 [1] http://lists.freedesktop.org/archives/mesa-dev/2011-June/008241.html

Maybe, I'm not a DRI2 (or 3 or n) expert.

There is fundamentally a race between X and GL during resize, because the back
buffer is supposed to be the same size as the window, but the window gets
resized on the X server, whereas the back buffer is a property of the client
and/or the GL stream. So, at some point, the client needs to take a snapshot
of the window dimensions to operate, and problems come up if the window gets
resized between the time the snapshot is taken and the time glXSwapBuffers
occur. I think that's what's happening in this bug. It sounds like the
referenced commit changes when the snapshot is taken.

From my quick reading of the radeon driver (sorry, I'm absolutely not familiar
with the codebase), if I understand correctly, it looks like that snapshot is
taken with the DRI2GetBuffer{,WithFormat}, via radeon_update_renderbuffers,
which is called in a few places, notably in radeon_viewport, which was always
called on every glViewport before
http://cgit.freedesktop.org/mesa/mesa/commit/?id=95073a2dca03a48f4c77bc846a4a6d1f0eb81ae6
, but after which is only called on non-redundant glViewport.
radeon_update_renderbuffers is also called on radeonMakeCurrent, which is
called on non-redundant glXMakeCurrent, and on radeon_prepare_render when dri2
stamps don't match which I think is meant to capture querying new buffers after
glXSwapBuffers. There's a couple of other places too, but they may not matter
for this.


So, I guess there could be a couple of workarounds in chrome to force
radeon_update_renderbuffers (e.g. non-idempotent glViewport or glXMakeCurrent),
but they're hacks. Or the driver could make sure it invalidates whatever is
needed in glXWaitX, because that's what the user uses to synchronize between X
and GL (i.e. you can assume that something in X has changed).

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl/tests: fix varying_test since tess changes.

2015-07-22 Thread Dave Airlie
From: Dave Airlie airl...@redhat.com

This fixes make check since the tess changes.

Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/glsl/tests/varyings_test.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/tests/varyings_test.cpp b/src/glsl/tests/varyings_test.cpp
index 4573529..62f8c6b 100644
--- a/src/glsl/tests/varyings_test.cpp
+++ b/src/glsl/tests/varyings_test.cpp
@@ -70,7 +70,7 @@ public:
hash_table *consumer_interface_inputs;
 
const glsl_type *simple_interface;
-   ir_variable *junk[VARYING_SLOT_MAX];
+   ir_variable *junk[VARYING_SLOT_TESS_MAX];
 };
 
 link_varyings::link_varyings()
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: fix warning since tess merge.

2015-07-22 Thread Dave Airlie
From: Dave Airlie airl...@redhat.com

Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 6c97933..33469d4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -102,6 +102,8 @@ fs_visitor::nir_setup_inputs(nir_shader *shader)
   }
   case MESA_SHADER_GEOMETRY:
   case MESA_SHADER_COMPUTE:
+  case MESA_SHADER_TESS_CTRL:
+  case MESA_SHADER_TESS_EVAL:
  unreachable(fs_visitor not used for these stages yet.);
  break;
   case MESA_SHADER_FRAGMENT:
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl/tests: fix varying_test since tess changes.

2015-07-22 Thread Michel Dänzer
On 23.07.2015 10:14, Dave Airlie wrote:
 From: Dave Airlie airl...@redhat.com
 
 This fixes make check since the tess changes.
 
 Signed-off-by: Dave Airlie airl...@redhat.com

Tested-by: Michel Dänzer michel.daen...@amd.com


Please push this to fix make check.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] i965: fix warning since tess merge.

2015-07-22 Thread Dave Airlie
From: Dave Airlie airl...@redhat.com

Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 6c97933..33469d4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -102,6 +102,8 @@ fs_visitor::nir_setup_inputs(nir_shader *shader)
   }
   case MESA_SHADER_GEOMETRY:
   case MESA_SHADER_COMPUTE:
+  case MESA_SHADER_TESS_CTRL:
+  case MESA_SHADER_TESS_EVAL:
  unreachable(fs_visitor not used for these stages yet.);
  break;
   case MESA_SHADER_FRAGMENT:
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] docs/GL3.txt: ARB_shader_precision

2015-07-22 Thread Dave Airlie
From: Dave Airlie airl...@redhat.com

This extension is about setting expectation on GL4.1 implementations
rather than actually enforcing things. So once you support GLSL 410
then you support this in theory.

Signed-off-by: Dave Airlie airl...@redhat.com
---
 docs/GL3.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 197b676..e299cb9 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -126,7 +126,7 @@ GL 4.1, GLSL 4.10:
   GL_ARB_ES2_compatibility DONE (i965, nv50, nvc0, 
r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_get_program_binaryDONE (0 binary formats)
   GL_ARB_separate_shader_objects   DONE (all drivers)
-  GL_ARB_shader_precision  started (Micah)
+  GL_ARB_shader_precision  DONE (all drivers)
   GL_ARB_vertex_attrib_64bit   DONE (nvc0, radeonsi, 
llvmpipe, softpipe)
   GL_ARB_viewport_arrayDONE (i965, nv50, nvc0, 
r600, radeonsi, llvmpipe)
 
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] docs/GL3.txt: ARB_shader_precision

2015-07-22 Thread Ilia Mirkin
In actuality it'll only be exposed on nvc0 atm...

On Wed, Jul 22, 2015 at 9:20 PM, Dave Airlie airl...@gmail.com wrote:
 From: Dave Airlie airl...@redhat.com

 This extension is about setting expectation on GL4.1 implementations
 rather than actually enforcing things. So once you support GLSL 410
 then you support this in theory.

 Signed-off-by: Dave Airlie airl...@redhat.com
 ---
  docs/GL3.txt | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/docs/GL3.txt b/docs/GL3.txt
 index 197b676..e299cb9 100644
 --- a/docs/GL3.txt
 +++ b/docs/GL3.txt
 @@ -126,7 +126,7 @@ GL 4.1, GLSL 4.10:
GL_ARB_ES2_compatibility DONE (i965, nv50, 
 nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_get_program_binaryDONE (0 binary 
 formats)
GL_ARB_separate_shader_objects   DONE (all drivers)
 -  GL_ARB_shader_precision  started (Micah)
 +  GL_ARB_shader_precision  DONE (all drivers)
GL_ARB_vertex_attrib_64bit   DONE (nvc0, radeonsi, 
 llvmpipe, softpipe)
GL_ARB_viewport_arrayDONE (i965, nv50, 
 nvc0, r600, radeonsi, llvmpipe)

 --
 2.4.3

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: Allow token pasting in ESSL

2015-07-22 Thread Ryan Houdek
All vendors I've tested allow token pasting in their ESSL shaders.
At least one application expects this to be supported
even though it is against spec.
This would be the Dolphin GameCube/Wii emulator that requires this.
The vendors I've tested that is known to support this is:
ARM Mali
Qualcomm Adreno
IMGTec PowerVR
Nvidia Tegra K1/X1

The vendors that I don't have access to testing:
Intel Android
Vivante
---
 src/glsl/glcpp/glcpp-lex.l | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/glsl/glcpp/glcpp-lex.l b/src/glsl/glcpp/glcpp-lex.l
index fa9aa50..3e2ad23 100644
--- a/src/glsl/glcpp/glcpp-lex.l
+++ b/src/glsl/glcpp/glcpp-lex.l
@@ -494,8 +494,6 @@ HEXADECIMAL_INTEGER 0[xX][0-9a-fA-F]+[uU]?
 
 ## {
if (! parser-skipping) {
-   if (parser-is_gles)
-   glcpp_error(yylloc, yyextra, Token pasting (##) is 
illegal in GLES);
RETURN_TOKEN (PASTE);
}
 }
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 06/14] meta: Don't do fragment color clamping in _mesa_meta_pbo_GetTexSubImage

2015-07-22 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga ito...@igalia.com

On Wed, 2015-07-22 at 11:54 -0700, Anuj Phogat wrote:
 _mesa_meta_pbo_GetTexSubImage() uses _mesa_meta_BlitFrameBuffer(),
 which will do fragment clamping if enabled. But fragment clamping
 doesn't affect ReadPixels and GetTexImage.
 
 Without this patch, piglit test arb_color_buffer_float-clear fails,
 when forced to use the meta pbo path.
 
 v2: Apply this fix to both glReadPixels and glGetTexImage.
 
 Cc: mesa-sta...@lists.freedesktop.org
 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 ---
  src/mesa/drivers/common/meta_tex_subimage.c | 5 +
  1 file changed, 5 insertions(+)
 
 diff --git a/src/mesa/drivers/common/meta_tex_subimage.c 
 b/src/mesa/drivers/common/meta_tex_subimage.c
 index 6d7c1eb..038d525 100644
 --- a/src/mesa/drivers/common/meta_tex_subimage.c
 +++ b/src/mesa/drivers/common/meta_tex_subimage.c
 @@ -25,6 +25,7 @@
   *Jason Ekstrand jason.ekstr...@intel.com
   */
  
 +#include blend.h
  #include bufferobj.h
  #include buffers.h
  #include fbobject.h
 @@ -331,6 +332,10 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
 GLuint dims,
 _mesa_meta_begin(ctx, ~(MESA_META_PIXEL_TRANSFER |
 MESA_META_PIXEL_STORE));
  
 +   /* GL_CLAMP_FRAGMENT_COLOR doesn't affect ReadPixels and GettexImage */
 +   if (ctx-Extensions.ARB_color_buffer_float)
 +  _mesa_ClampColor(GL_CLAMP_FRAGMENT_COLOR, GL_FALSE);
 +
 _mesa_GenFramebuffers(2, fbos);
  
 if (tex_image  tex_image-TexObject-Target == GL_TEXTURE_1D_ARRAY) {


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 18/20] mesa: fill out the ARB_shader_subroutine APIs

2015-07-22 Thread Kenneth Graunke
On Tuesday, July 21, 2015 03:19:25 PM Dave Airlie wrote:
 From: Dave Airlie airl...@redhat.com
 
 This fleshes out the APIs, using the program resource
 APIs where they should match.
 
 It also sets the default values to valid subroutines.
 
 Signed-off-by: Dave Airlie airl...@redhat.com
 ---
  src/mesa/main/shaderapi.c | 450 
 +-
  src/mesa/main/shaderapi.h |   3 +
  2 files changed, 450 insertions(+), 3 deletions(-)

So, one question I have...according to the spec...

   Subroutine uniform variables are similar to uniform variables, except they
are context state rather than program state. Having subroutine uniforms be
context state allows them to have different values if the program is used
in multiple contexts simultaneously.  There is a set of subroutine
uniforms for each shader stage.

but it looks like we're storing them in the shader, rather than
per-context.  Bug?  Or am I missing something?

At this point I'm just inclined to give everything I haven't reviewed an
Acked-by: Kenneth Graunke kenn...@whitecape.org

I don't think anybody wants to spend much more time on this, and it'd be
nice to check off the box and move on.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 16/78] i965/nir/vec4: Implement store_output intrinsic

2015-07-22 Thread Jason Ekstrand
On Wed, Jul 22, 2015 at 4:37 AM, Eduardo Lima Mitev el...@igalia.com wrote:
 On 07/13/2015 01:57 PM, Jason Ekstrand wrote:
 On Wed, Jul 8, 2015 at 11:54 PM, Eduardo Lima Mitev el...@igalia.com wrote:
 On 06/30/2015 06:51 PM, Jason Ekstrand wrote:
 On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev el...@igalia.com 
 wrote:
 The index into the output_reg array where to store the destination 
 register is
 fetched from the nir_outputs map built during nir_setup_outputs stage.

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
 ---
  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 17 +++--
  1 file changed, 15 insertions(+), 2 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
 index 8a2d335..55d4490 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
 @@ -520,10 +520,23 @@ 
 vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
 }

 case nir_intrinsic_store_output_indirect:
 +  has_indirect = true;
/* fallthrough */
 -   case nir_intrinsic_store_output:
 -  /* @TODO: Not yet implemented */
 +   case nir_intrinsic_store_output: {
 +  int offset = instr-const_index[0];
 +  int output = nir_outputs[offset];
 +
 +  src = get_nir_src(instr-src[0], nir_output_types[offset]);
 +  dest = dst_reg(src);
 +
 +  dest.writemask = brw_writemask_for_size(instr-num_components);
 +
 +  if (has_indirect)
 + dest.reladdr = new(mem_ctx) src_reg(get_nir_src(instr-src[1]));
 +
 +  output_reg[output] = dest;

 I'm very confused about the amount of indirection going on here.  It
 seems to me that we should be setting these outputs up in
 setup_outputs() rather than storring off a map from ints to other ints
 and setting it up here.  I didn't make this comment on the patch for
 setup_outputs() because I wanted to wait to see it used before I
 commented on it.

 I'm guessing you did it this way because the nir_assign_var_locations
 is giving you bogus values.  If so, then it might be better to just
 assign variable locations in setup_outputs() rather than having a
 remap table.  The whole point of nir_lower_io is to make IO easy for
 the back-end.  If you need a re-map table, then it's no longer making
 it easy and we need to think more about what's going on.
 --Jason


 That double indirection felt bad since the beginning, but it was needed
 to store the original variable's location (var-data.location). Let me
 explain:

 We are (re)using the plumbering in vec4_visitor to setup URB, so the
 only thing we need to do is to store the out register in output_reg
 map at the correct location. And that location is given by the original
 location in the shader (var-data.location).

 So, in this case, nir_assign_var_locations pass, which constructs
 var-data.driver_location, is not useful to us, except to give us
 consecutive indexes to construct the other map we have, the type map,
 which is needed to carry the correct type from the original variable to
 the output register.

 If nir_assign_var_locations isn't doing anything for you, don't call
 it.  You'll need to do something with var-data.driver_location.  If
 what you really want is var-data.location, then just copy that to
 var-data.driver_location when you do nir_setup_outputs.  Or
 (depending on how the URB setup works, I don't actually know), put the
 actual URB location in var-data.driver_location when you walk the
 outputs.

 From there, you have two options.  One would be to setup output_reg at
 the same time with the correct types right away and emit a MOV when
 you get a store_output.  (Copy propagation should clean up the MOV.)
 For what it's worth, I don't think the type matters; a URB write just
 writes data to something so as long as you don't have a type mismatch
 in a MOV, the hardware won't care.

 The other option, would be to directly emit the URB write in
 store_output.  At the moment, it may be better to take the first
 option since that better matches what the FS does right now.  But both
 should work fine.


 Thanks for these hints, they were very useful.

 I rewrote the implementation of store_output intrinsic to avoid the
 setup phase completely. The type, as you suggested, was not important as
 long as they match while MOVing the contents of output_reg. To guarantee
 that, I had to patch the emit_urb_slot() to guarantee the types always
 match. This code is shared with vec4_visitor, so it makes sense to move
 the safeguards there instead of having both backends provide the correct
 register type in output_reg entries.
 For reference, this is the patch that implements it:
 https://github.com/Igalia/mesa/commit/8c703937f285c0b3a1e7bf6681c7ed7fe09815aa

Seems reasonable.

 I also put var-data.location in const_index[1] of the intrinsic op, and
 disabled nir_assign_var_locations() for output variables, since I don't
 need var-data.driver_location. I could 

[Mesa-dev] [Bug 91425] [regression, bisected] Piglit spec/ext_packed_float/ getteximage-invalid-format-for-packed-type fails

2015-07-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91425

Samuel Iglesias sigles...@igalia.com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Samuel Iglesias sigles...@igalia.com ---
(In reply to Brian Paul from comment #4)
 OK, the problem is my local copy of
 getteximage-invalid-format-for-packed-type.c has the patch
 ext_packed_float: fix getteximage-invalid-format-for-packed-type test
 which I posted for review (and was R-b'd by Jose) but not yet pushed to
 master.
 
 I believe the piglit test was defective as-is, per my check-in comment:
 
 The GL spec doesn't explicitly say that glGetTexImage should generate
 GL_INVALID_OPERATION when attempting to retrieve a non-existant texture
 image, but that's what NVIDIA's driver does.
 
 The purpose of this test is to check the format/type parameters, so let's
 define a packed float texture to avoid the undefined texture situation.
 
 Test now passes with NVIDIA.
 
 I'll push that patch.

It passes now. Thanks Brian!

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] doxygen: Link GLvector4f struct members properly, avoiding invalid XML/HTML warning

2015-07-22 Thread rhyskidd
From: Rhys Kidd rhysk...@gmail.com

Signed-off-by: Rhys Kidd rhysk...@gmail.com
---
 src/mesa/math/m_vector.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/math/m_vector.h b/src/mesa/math/m_vector.h
index 3b7f583..5bd76b8 100644
--- a/src/mesa/math/m_vector.h
+++ b/src/mesa/math/m_vector.h
@@ -61,7 +61,7 @@
  */
 typedef struct {
GLfloat (*data)[4]; /** may be malloc'd or point to client data */
-   GLfloat *start; /** points somewhere inside of data */
+   GLfloat *start; /** points somewhere inside of GLvector4f::data */
GLuint count;   /** size of the vector (in elements) */
GLuint stride;  /** stride from one element to the next (in bytes) */
GLuint size;/** 2-4 for vertices and 1-4 for texcoords */
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] doxygen: Correct grammatical typo in math/m_vector.h

2015-07-22 Thread rhyskidd
From: Rhys Kidd rhysk...@gmail.com

Signed-off-by: Rhys Kidd rhysk...@gmail.com
---
 src/mesa/math/m_vector.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/math/m_vector.h b/src/mesa/math/m_vector.h
index 8551ee7..3b7f583 100644
--- a/src/mesa/math/m_vector.h
+++ b/src/mesa/math/m_vector.h
@@ -51,7 +51,7 @@
 
 /**
  * Wrap all the information about vectors up in a struct.  Has
- * additional fields compared to the other vectors to help us track of
+ * additional fields compared to the other vectors to help us track
  * different vertex sizes, and whether we need to clean columns out
  * because they contain non-(0,0,0,1) values.
  *
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: Fix a bug where LHS swizzles of swizzles were too small.

2015-07-22 Thread Kenneth Graunke
A simple shader such as

   vec4 color;
   color.xy.x = 1.0;

would cause ir_assignment::set_lhs() to generate bogus IR:

   (swiz xy (swiz x (constant float (1.0

We were setting the number of components of each new RHS swizzle based
on the highest channel used in the LHS swizzle.  So, .xy.y would
generate (swiz xy (swiz xx ...)), while .xy.x would break.

Our existing Piglit test happened to use .xzy.z, which worked, since
'z' is the third component, resulting in an xxx swizzle.

This patch sets the number of swizzle components based on the size of
the LHS swizzle's inner value, so we always have the correct number
at each step.

Fixes new Piglit tests glsl-vs-swizzle-swizzle-lhs-[23].
Fixes ir_validate assertions in in Metro 2033 Redux.

Cc: i...@freedesktop.org
Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/ir.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index 8fb7ca4..7350dfa 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -95,6 +95,8 @@ ir_assignment::set_lhs(ir_rvalue *lhs)
 
 write_mask |= (((this-write_mask  i)  1)  c);
 update_rhs_swizzle(rhs_swiz, i, c);
+ assert(rhs_swiz.num_components = swiz-val-type-vector_elements);
+ rhs_swiz.num_components = swiz-val-type-vector_elements;
   }
 
   this-write_mask = write_mask;
-- 
2.4.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 18/20] mesa: fill out the ARB_shader_subroutine APIs

2015-07-22 Thread Dave Airlie
On 23 July 2015 at 15:01, Kenneth Graunke kenn...@whitecape.org wrote:
 On Tuesday, July 21, 2015 03:19:25 PM Dave Airlie wrote:
 From: Dave Airlie airl...@redhat.com

 This fleshes out the APIs, using the program resource
 APIs where they should match.

 It also sets the default values to valid subroutines.

 Signed-off-by: Dave Airlie airl...@redhat.com
 ---
  src/mesa/main/shaderapi.c | 450 
 +-
  src/mesa/main/shaderapi.h |   3 +
  2 files changed, 450 insertions(+), 3 deletions(-)

 So, one question I have...according to the spec...

Subroutine uniform variables are similar to uniform variables, except they
 are context state rather than program state. Having subroutine uniforms be
 context state allows them to have different values if the program is used
 in multiple contexts simultaneously.  There is a set of subroutine
 uniforms for each shader stage.

 but it looks like we're storing them in the shader, rather than
 per-context.  Bug?  Or am I missing something?

No I think it's a bug in that I may have convinced myself the constant
buffer we give to drivers is per context, when it really isn't.

The question is whether this is some feature anyone is ever going to
care about at this point, I'd still like to land this, and I try and
go write a piglit test that isn't crap to test for that case.


 At this point I'm just inclined to give everything I haven't reviewed an
 Acked-by: Kenneth Graunke kenn...@whitecape.org

 I don't think anybody wants to spend much more time on this, and it'd be
 nice to check off the box and move on.

Yup me to :-0

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 5/7] nv50: add support for compute/graphics global performance counters

2015-07-22 Thread Martin Peres

On 01/07/15 01:01, Samuel Pitoiset wrote:

This commit adds support for both compute and graphics global
performance counters which have been reverse engineered with
CUPTI (Linux) and PerfKit (Windows).

Currently, only one query type can be monitored at the same time because
the Gallium's HUD doesn't fit pretty well. This will be improved later.

Changes since v2:
- replace \% by percentage
- remove one extra call to PUSH_SPACE
- use nouveau_fence instead of my hand-made fence mechanism

Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com
---
  src/gallium/drivers/nouveau/nv50/nv50_query.c  | 1066 +++-
  src/gallium/drivers/nouveau/nv50/nv50_screen.h |   35 +
  2 files changed, 1096 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
b/src/gallium/drivers/nouveau/nv50/nv50_query.c
index 81f7474..7fb6f3a 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
@@ -27,6 +27,8 @@
  #include nv50/nv50_context.h
  #include nv_object.xml.h
  
+#include nouveau_perfmon.h

+
  #define NV50_QUERY_STATE_READY   0
  #define NV50_QUERY_STATE_ACTIVE  1
  #define NV50_QUERY_STATE_ENDED   2
@@ -51,10 +53,25 @@ struct nv50_query {
 boolean is64bit;
 struct nouveau_mm_allocation *mm;
 struct nouveau_fence *fence;
+   struct nouveau_object *perfdom;
  };
  
  #define NV50_QUERY_ALLOC_SPACE 256
  
+#ifdef DEBUG
No need to guard the definition of this function. The compiler will get 
rid of it if it has no users.

+static void nv50_hw_pm_dump_perfdom(struct nvif_perfdom_v0 *args);
+#endif
+
+static boolean
+nv50_hw_pm_query_create(struct nv50_context *, struct nv50_query *);
+static void
+nv50_hw_pm_query_destroy(struct nv50_context *, struct nv50_query *);
+static boolean
+nv50_hw_pm_query_begin(struct nv50_context *, struct nv50_query *);
+static void nv50_hw_pm_query_end(struct nv50_context *, struct nv50_query *);
+static boolean nv50_hw_pm_query_result(struct nv50_context *,
+struct nv50_query *, boolean, void *);
+
  static INLINE struct nv50_query *
  nv50_query(struct pipe_query *pipe)
  {
@@ -96,9 +113,15 @@ nv50_query_allocate(struct nv50_context *nv50, struct 
nv50_query *q, int size)
  static void
  nv50_query_destroy(struct pipe_context *pipe, struct pipe_query *pq)
  {
-   nv50_query_allocate(nv50_context(pipe), nv50_query(pq), 0);
-   nouveau_fence_ref(NULL, nv50_query(pq)-fence);
-   FREE(nv50_query(pq));
+   struct nv50_context *nv50 = nv50_context(pipe);
+   struct nv50_query *q = nv50_query(pq);
+
+   if ((q-type = NV50_HW_PM_QUERY(0)  q-type = NV50_HW_PM_QUERY_LAST))
+  nv50_hw_pm_query_destroy(nv50, q);
+
+   nv50_query_allocate(nv50, q, 0);
+   nouveau_fence_ref(NULL, q-fence);
+   FREE(q);
  }
  
  static struct pipe_query *

@@ -120,6 +143,12 @@ nv50_query_create(struct pipe_context *pipe, unsigned 
type, unsigned index)
   type == PIPE_QUERY_PRIMITIVES_EMITTED ||
   type == PIPE_QUERY_SO_STATISTICS ||
   type == PIPE_QUERY_PIPELINE_STATISTICS);
+   if (type = NV50_HW_PM_QUERY(0)  q-type = NV50_HW_PM_QUERY_LAST) {
+  /* Hardware global performance counters are not 64 bits, but we also use
+   * a fence to make sure the query is ready. */


I do not understand the logic of this comment.

+  q-is64bit = TRUE;
+   }
+
 q-type = type;
  
 if (q-type == PIPE_QUERY_OCCLUSION_COUNTER) {

@@ -127,6 +156,11 @@ nv50_query_create(struct pipe_context *pipe, unsigned 
type, unsigned index)
q-data -= 32 / sizeof(*q-data); /* we advance before query_begin ! */
 }
  
+   if ((q-type = NV50_HW_PM_QUERY(0)  q-type = NV50_HW_PM_QUERY_LAST)) {

+  if (!nv50_hw_pm_query_create(nv50, q))
+ return NULL;
+   }
+
 return (struct pipe_query *)q;
  }
  
@@ -151,6 +185,7 @@ nv50_query_begin(struct pipe_context *pipe, struct pipe_query *pq)

 struct nv50_context *nv50 = nv50_context(pipe);
 struct nouveau_pushbuf *push = nv50-base.pushbuf;
 struct nv50_query *q = nv50_query(pq);
+   boolean ret = TRUE;
  
 /* For occlusion queries we have to change the storage, because a previous

  * query might set the initial render conition to FALSE even *after* we re-
@@ -205,10 +240,13 @@ nv50_query_begin(struct pipe_context *pipe, struct 
pipe_query *pq)
nv50_query_get(push, q, 0x10, 0x5002);
break;
 default:
+  if ((q-type = NV50_HW_PM_QUERY(0)  q-type = 
NV50_HW_PM_QUERY_LAST)) {
+ ret = nv50_hw_pm_query_begin(nv50, q);
+  }
break;
 }
 q-state = NV50_QUERY_STATE_ACTIVE;
-   return true;
+   return ret;
  }
  
  static void

@@ -265,7 +303,9 @@ nv50_query_end(struct pipe_context *pipe, struct pipe_query 
*pq)
q-state = NV50_QUERY_STATE_READY;
break;
 default:
-  assert(0);
+  if ((q-type = NV50_HW_PM_QUERY(0)  q-type = 
NV50_HW_PM_QUERY_LAST)) {
+ 

Re: [Mesa-dev] [PATCH 1/2] mesa: Detect and provide macros for function attributes pure and const.

2015-07-22 Thread Jose Fonseca

On 22/07/15 21:01, Jose Fonseca wrote:

On 22/07/15 17:13, Jose Fonseca wrote:

On 21/07/15 15:57, Emil Velikov wrote:

On 18 July 2015 at 08:13, Jose Fonseca jfons...@vmware.com wrote:

On 18/07/15 01:38, Eric Anholt wrote:


Emil Velikov emil.l.veli...@gmail.com writes:


On 14/07/15 19:45, Eric Anholt wrote:


These are really useful hints to the compiler in the absence of
link-time
optimization, and I'm going to use them in VC4.

I've made the const attribute be ATTRIBUTE_CONST unlike other
function
attributes, because we have other things in the tree #defining
CONST for
their own unrelated purposes.


Mindly related: how people feel about making these macros less
screamy,
by following the approach used in the kernel: PURE - __pure and so
on ?



I'd love it.



Less screamy is fine, but beware prefixing double underscore: the C
standard
stipulates that its use is reserved for for C/C++ runtime. [1]


I though about it before posting although I've seen others define
those, even do so in their public headers.
Now that I have some examples from my current /usr/include

Searching for __pure
dwarves/dutil.h:#define __pure __attribute__ ((pure))

Searching for __attribute_const__
sys/cdefs.h:# define __attribute_const__ __attribute__ ((__const__))
sys/cdefs.h:# define __attribute_const__ /* Ignore */

Searching for __printf

Searching for __always_unused

Searching for __noreturn

Searching for __packed
libvisual-0.4/libvisual/lv_defines.h:# define __packed __attribute__
((packed))
libvisual-0.4/libvisual/lv_defines.h:# define __packed /* no packed */
bsd/sys/cdefs.h:#  define __packed __attribute__((__packed__))
bsd/sys/cdefs.h:#  define __packed

Searching for __deprecated
pciaccess.h:#define __deprecated __attribute__((deprecated))
pciaccess.h:#define __deprecated

Searching for __weak

Searching for __alias

With a handful of other headers defining more double underscore
prefixed macros.


Look at stdlibc++ implementation: every internal variable has a double
underscore prefix.


Unless we're talking about STL/other template library we don't care
what library foo uses in it's internal implementation do we ? After
all these will be resolved at compile time.


Maybe kernel gets away on GLIBC (and because it doesn't use C++), but
there's no guarantee it will work on other C runtimes, and even if it
does,
it could start failing anytime.


True, it's not the best of ideas. Just worth pointing out that the
cat is already out, for other projects.



   There are already more than 12K #define __foo cases on my system.

These defines are reserved for system headers, so it's natural to be
lots of them in /usr/include.


MacOSX also defines some of these on its sys/cdefs.h:


http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/bsd/sys/cdefs.h

The question is: can we expect that most systems will define these
__foo, or at least not use them for other purposes.

I don't know the answer.  At a glance MSVC doesn't seem to rely on them
for anything.  So it might work.  I don't oppose if you want to give it
a shot.


Jose


Ironically, Windows headers already define PURE (its used in COM
interfaces).  Just realized this now after this was merged.

So we really need a different name...


Question is which name: ATTRIBUTE_PURE, __pure, something else?

As I said, I don't oppose __pure, but some logic will be necessary to 
avoid redefition when system headers (like sys/cdefs.h) already define 
it, which is not trivial.  It will need something like



   #if HAVE_SYS_CDEFS_H
   #include sys/cdefs.h  // for __pure
   #endif
   #ifndef __pure
   
   #endif


Jose




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 7/7] nv50: enable GL_AMD_performance_monitor

2015-07-22 Thread Martin Peres

On 01/07/15 01:01, Samuel Pitoiset wrote:

This exposes a group of global performance counters that enables
GL_AMD_performance_monitor. All piglit tests are okay.

Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com
---
  src/gallium/drivers/nouveau/nv50/nv50_query.c  | 35 ++
  src/gallium/drivers/nouveau/nv50/nv50_screen.c |  1 +
  src/gallium/drivers/nouveau/nv50/nv50_screen.h |  6 +
  3 files changed, 42 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
b/src/gallium/drivers/nouveau/nv50/nv50_query.c
index 7dadb77..6d57305 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
@@ -1563,6 +1563,7 @@ nv50_screen_get_driver_query_info(struct pipe_screen 
*pscreen,
  
   info-name = cfg-event-name;

   info-query_type = NV50_HW_PM_QUERY(id);
+ info-group_id = NV50_HW_PM_QUERY_GROUP;
   info-max_value.u64 =
  (cfg-event-display == NV50_HW_PM_EVENT_DISPLAY_RATIO) ? 100 : 0;
   return 1;
@@ -1573,6 +1574,40 @@ nv50_screen_get_driver_query_info(struct pipe_screen 
*pscreen,
 return 0;
  }
  
+int

+nv50_screen_get_driver_query_group_info(struct pipe_screen *pscreen,
+unsigned id,
+struct pipe_driver_query_group_info 
*info)
+{
+   struct nv50_screen *screen = nv50_screen(pscreen);
+   int count = 0;
+
+   // TODO: Check DRM version when nvif will be merged in libdrm!
+   if (screen-base.perfmon) {
+  count++; /* NV50_HW_PM_QUERY_GROUP */
+   }
+
+   if (!info)
+  return count;
+
+   if (id == NV50_HW_PM_QUERY_GROUP) {
+  if (screen-base.perfmon) {
+ info-name = Global performance counters;
+ info-type = PIPE_DRIVER_QUERY_GROUP_TYPE_GPU;
+ info-num_queries = NV50_HW_PM_QUERY_COUNT;
+ info-max_active_queries = 1; /* TODO: get rid of this limitation! */
+ return 1;
+  }
+   }
+
+   /* user asked for info about non-existing query group */
+   info-name = this_is_not_the_query_group_you_are_looking_for;
+   info-max_active_queries = 0;
+   info-num_queries = 0;
+   info-type = 0;
+   return 0;
+}
+
  void
  nv50_init_query_functions(struct nv50_context *nv50)
  {
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index ac1acd1..05f921d 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -749,6 +749,7 @@ nv50_screen_create(struct nouveau_device *dev)
 pscreen-get_shader_param = nv50_screen_get_shader_param;
 pscreen-get_paramf = nv50_screen_get_paramf;
 pscreen-get_driver_query_info = nv50_screen_get_driver_query_info;
+   pscreen-get_driver_query_group_info = 
nv50_screen_get_driver_query_group_info;
  
 nv50_screen_init_resource_functions(pscreen);
  
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h b/src/gallium/drivers/nouveau/nv50/nv50_screen.h

index 69127c0..807ae0e 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
@@ -114,6 +114,9 @@ nv50_screen(struct pipe_screen *screen)
 return (struct nv50_screen *)screen;
  }
  
+/* Hardware global performance counters groups. */

+#define NV50_HW_PM_QUERY_GROUP 0
+
  /* Hardware global performance counters. */
  #define NV50_HW_PM_QUERY_COUNT  24
  #define NV50_HW_PM_QUERY(i)(PIPE_QUERY_DRIVER_SPECIFIC + (i))
@@ -146,6 +149,9 @@ nv50_screen(struct pipe_screen *screen)
  int nv50_screen_get_driver_query_info(struct pipe_screen *, unsigned,
struct pipe_driver_query_info *);
  
+int nv50_screen_get_driver_query_group_info(struct pipe_screen *, unsigned,

+struct 
pipe_driver_query_group_info *);
+
  boolean nv50_blitter_create(struct nv50_screen *);
  void nv50_blitter_destroy(struct nv50_screen *);
  

Everything looks good to me! Excellent work Samuel!

Reviewed-by: Martin Peres martin.pe...@free.fr

IIRC, the kernel patches were supposed to land in 4.2 as there was no 
pull request from Ben, it will likely end up in 4.3. As for the libdrm 
patches, did you review them? Looking forward to seeing this series merged!


It would also be nice to start a discussion to rework the Gallium HUD to 
avoid the stupid problem that we have where we can only monitor one 
signal at a time!


Congrats again Samuel!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 6/7] nv50: expose global performance counters to the HUD

2015-07-22 Thread Martin Peres

On 01/07/15 01:01, Samuel Pitoiset wrote:

Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com
---
  src/gallium/drivers/nouveau/nv50/nv50_query.c  | 41 ++
  src/gallium/drivers/nouveau/nv50/nv50_screen.c |  1 +
  src/gallium/drivers/nouveau/nv50/nv50_screen.h |  3 ++
  3 files changed, 45 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c 
b/src/gallium/drivers/nouveau/nv50/nv50_query.c
index 7fb6f3a..7dadb77 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
@@ -1532,6 +1532,47 @@ nv50_hw_pm_query_result(struct nv50_context *nv50, 
struct nv50_query *q,
 return TRUE;
  }
  
+int

+nv50_screen_get_driver_query_info(struct pipe_screen *pscreen,
+  unsigned id,
+  struct pipe_driver_query_info *info)
+{
+   struct nv50_screen *screen = nv50_screen(pscreen);
+   int count = 0;
+
+   // TODO: Check DRM version when nvif will be merged in libdrm!
+   if (screen-base.perfmon) {
+  nv50_identify_events(screen);
+  count += NV50_HW_PM_QUERY_COUNT;
+   }
+
+   if (!info)
+  return count;
+
+   /* Init default values. */
+   info-name = this_is_not_the_query_you_are_looking_for;


The spirit of calim is still in the code. Must be the ghost in the shell!

The patch is

Reviewed-by: Martin Peres martin.pe...@free.fr

+   info-query_type = 0xdeadd01d;
+   info-type = PIPE_DRIVER_QUERY_TYPE_UINT64;
+   info-max_value.u64 = 0;
+   info-group_id = -1;
+
+   if (id  count) {
+  if (screen-base.perfmon) {
+ const struct nv50_hw_pm_query_cfg *cfg =
+nv50_hw_pm_query_get_cfg(screen, NV50_HW_PM_QUERY(id));
+
+ info-name = cfg-event-name;
+ info-query_type = NV50_HW_PM_QUERY(id);
+ info-max_value.u64 =
+(cfg-event-display == NV50_HW_PM_EVENT_DISPLAY_RATIO) ? 100 : 0;
+ return 1;
+  }
+   }
+
+   /* User asked for info about non-existing query. */
+   return 0;
+}
+
  void
  nv50_init_query_functions(struct nv50_context *nv50)
  {
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 335bff1..ac1acd1 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -748,6 +748,7 @@ nv50_screen_create(struct nouveau_device *dev)
 pscreen-get_param = nv50_screen_get_param;
 pscreen-get_shader_param = nv50_screen_get_shader_param;
 pscreen-get_paramf = nv50_screen_get_paramf;
+   pscreen-get_driver_query_info = nv50_screen_get_driver_query_info;
  
 nv50_screen_init_resource_functions(pscreen);
  
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h b/src/gallium/drivers/nouveau/nv50/nv50_screen.h

index 0449659..69127c0 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
@@ -143,6 +143,9 @@ nv50_screen(struct pipe_screen *screen)
  #define NV50_HW_PM_QUERY_TEX_CACHE_HIT  22
  #define NV50_HW_PM_QUERY_TEX_WAITS_FOR_FB   23
  
+int nv50_screen_get_driver_query_info(struct pipe_screen *, unsigned,

+  struct pipe_driver_query_info *);
+
  boolean nv50_blitter_create(struct nv50_screen *);
  void nv50_blitter_destroy(struct nv50_screen *);
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 17/16] glsl: add AoA support for atmoic counters

2015-07-22 Thread Timothy Arceri
---
 src/glsl/link_atomics.cpp | 73 +++
 1 file changed, 55 insertions(+), 18 deletions(-)

diff --git a/src/glsl/link_atomics.cpp b/src/glsl/link_atomics.cpp
index 100d03c..21f9af7 100644
--- a/src/glsl/link_atomics.cpp
+++ b/src/glsl/link_atomics.cpp
@@ -95,6 +95,56 @@ namespace {
y-data.atomic.offset  x-data.atomic.offset + 
x-type-atomic_size()));
}
 
+   void
+   process_atomic_variable(const glsl_type *t, struct gl_shader_program *prog,
+   char **name, size_t name_length, ir_variable *var,
+   active_atomic_buffer *const buffers,
+   unsigned *num_buffers, int *offset,
+   const unsigned shader_stage)
+   {
+  /* FIXME: Arrays of arrays get counted separately. For example:
+   * x1[3][3][2] = 9 counters
+   * x2[3][2]= 3 counters
+   * x3[2]   = 1 counter
+   *
+   * However this code marks all the counters as active even when they
+   * might not be used.
+   */
+  if (t-is_array()  t-fields.array-is_array()) {
+ for (unsigned i = 0; i  t-length; i++) {
+   size_t new_length = name_length;
+
+   /* Append the subscript to the current variable name */
+   ralloc_asprintf_rewrite_tail(name, new_length, [%u], i);
+
+process_atomic_variable(t-fields.array, prog, name, new_length,
+var, buffers, num_buffers, offset,
+shader_stage);
+ }
+  } else {
+ unsigned id = 0;
+ bool found = prog-UniformHash-get(id, *name);
+ assert(found);
+ (void) found;
+ active_atomic_buffer *buf = buffers[var-data.binding];
+ gl_uniform_storage *const storage = prog-UniformStorage[id];
+
+ /* If this is the first time the buffer is used, increment
+  * the counter of buffers used.
+  */
+ if (buf-size == 0)
+(*num_buffers)++;
+
+ buf-push_back(id, var);
+
+ buf-stage_references[shader_stage]++;
+ buf-size = MAX2(buf-size, *offset + t-atomic_size());
+
+ storage-offset = *offset;
+ *offset += t-atomic_size();
+  }
+   }
+
active_atomic_buffer *
find_active_atomic_counters(struct gl_context *ctx,
struct gl_shader_program *prog,
@@ -114,23 +164,11 @@ namespace {
 ir_variable *var = node-as_variable();
 
 if (var  var-type-contains_atomic()) {
-   unsigned id = 0;
-   bool found = prog-UniformHash-get(id, var-name);
-   assert(found);
-   (void) found;
-   active_atomic_buffer *buf = buffers[var-data.binding];
-
-   /* If this is the first time the buffer is used, increment
-* the counter of buffers used.
-*/
-   if (buf-size == 0)
-  (*num_buffers)++;
-
-   buf-push_back(id, var);
-
-   buf-stage_references[i]++;
-   buf-size = MAX2(buf-size, var-data.atomic.offset +
-var-type-atomic_size());
+   char *name = ralloc_strdup(NULL, var-name);
+   int offset = var-data.atomic.offset;
+   process_atomic_variable(var-type, prog, name, strlen(name),
+   var, buffers, num_buffers, offset, i);
+   ralloc_free(name);
 }
  }
   }
@@ -205,7 +243,6 @@ link_assign_atomic_counter_resources(struct gl_context *ctx,
 var-data.binding = i;
 
  storage-atomic_buffer_index = i;
- storage-offset = var-data.atomic.offset;
  storage-array_stride = (var-type-is_array() ?
   var-type-without_array()-atomic_size() : 
0);
   }
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Using the right context in st_texture_release_all_sampler_views()

2015-07-22 Thread Brian Paul

Hi Marek,

This is regarding your commit st/mesa: use pipe_sampler_view_release 
for releasing sampler views from last October.


Basically, we have:

void
st_texture_release_all_sampler_views(struct st_context *st,
 struct st_texture_object *stObj)
{
   GLuint i;

   /* XXX This should use sampler_views[i]-pipe, not st-pipe */
   for (i = 0; i  stObj-num_sampler_views; ++i)
  pipe_sampler_view_release(st-pipe, stObj-sampler_views[i]);
}

Our VMware/svga driver has an issue when 
pipe_context::sampler_view_destroy() is called with one context and a 
sampler view which was created with another context.  I can hack around 
it in our driver code, but it would be nice to fix this in the state 
tracker.


Ideally, the above code should be something like:

   for (i = 0; i  stObj-num_sampler_views; ++i)
  pipe_sampler_view_reference(stObj-sampler_views[i], NULL);

The current code which uses the st-pipe context came from the bug
https://bugs.freedesktop.org/show_bug.cgi?id=81680

AFAICT, you were just working around an R600 driver issue.  Any chance 
we could fix the state tracker and re-test Firefox on R600?


-Brian
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 18/16] nir: wrapper for glsl_type arrays_of_arrays_size()

2015-07-22 Thread Timothy Arceri
---
 src/glsl/nir/nir_types.cpp | 6 ++
 src/glsl/nir/nir_types.h   | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/src/glsl/nir/nir_types.cpp b/src/glsl/nir/nir_types.cpp
index 62176f5..06f5b0a 100644
--- a/src/glsl/nir/nir_types.cpp
+++ b/src/glsl/nir/nir_types.cpp
@@ -106,6 +106,12 @@ glsl_get_length(const struct glsl_type *type)
return type-is_matrix() ? type-matrix_columns : type-length;
 }
 
+unsigned
+glsl_get_aoa_size(const struct glsl_type *type)
+{
+   return type-arrays_of_arrays_size();
+}
+
 const char *
 glsl_get_struct_elem_name(const struct glsl_type *type, unsigned index)
 {
diff --git a/src/glsl/nir/nir_types.h b/src/glsl/nir/nir_types.h
index 276d4ad..3f75e46 100644
--- a/src/glsl/nir/nir_types.h
+++ b/src/glsl/nir/nir_types.h
@@ -59,6 +59,8 @@ unsigned glsl_get_matrix_columns(const struct glsl_type 
*type);
 
 unsigned glsl_get_length(const struct glsl_type *type);
 
+unsigned glsl_get_aoa_size(const struct glsl_type *type);
+
 const char *glsl_get_struct_elem_name(const struct glsl_type *type,
   unsigned index);
 
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 19/16] nir: add atomic lowering support for AoA

2015-07-22 Thread Timothy Arceri
---
 src/glsl/nir/nir_lower_atomics.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/src/glsl/nir/nir_lower_atomics.c b/src/glsl/nir/nir_lower_atomics.c
index ce3615a..e5ec008 100644
--- a/src/glsl/nir/nir_lower_atomics.c
+++ b/src/glsl/nir/nir_lower_atomics.c
@@ -72,15 +72,17 @@ lower_instr(nir_intrinsic_instr *instr, nir_function_impl 
*impl)
 
nir_ssa_def *offset_def = offset_const-def;
 
-   if (instr-variables[0]-deref.child != NULL) {
-  assert(instr-variables[0]-deref.child-deref_type ==
- nir_deref_type_array);
-  nir_deref_array *deref_array =
- nir_deref_as_array(instr-variables[0]-deref.child);
-  assert(deref_array-deref.child == NULL);
-
-  offset_const-value.u[0] +=
- deref_array-base_offset * ATOMIC_COUNTER_SIZE;
+   nir_deref *tail = instr-variables[0]-deref;
+   while (tail-child != NULL) {
+  assert(tail-child-deref_type == nir_deref_type_array);
+  nir_deref_array *deref_array = nir_deref_as_array(tail-child);
+  tail = tail-child;
+
+  unsigned child_array_elements = tail != NULL ?
+ glsl_get_aoa_size(tail-type) : 1;
+
+  offset_const-value.u[0] += deref_array-base_offset *
+ child_array_elements * ATOMIC_COUNTER_SIZE;
 
   if (deref_array-deref_array_type == nir_deref_array_type_indirect) {
  nir_load_const_instr *atomic_counter_size =
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 90311] Fail to build libglx with clang at linking stage

2015-07-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90311

Julien Isorce julien.iso...@gmail.com changed:

   What|Removed |Added

 CC||emil.l.veli...@gmail.com

--- Comment #3 from Julien Isorce julien.iso...@gmail.com ---
Emil I am not gonna try your solution soon so I copy/past your comment from
mailing list here in case someone else finds this bug and wants to go ahead:


So as mentioned previously glx (libGL) should interact with the
dispatch directly, rather than have any knowledge in the
implementation detail (which lies in mesa).

src/glx/indirect_glx.c is a nice example, where __glXNewIndirectAPI
(autogenerated from
mapi/glapi/gen/glX_proto_send.py) is used to create the dispatch table
and then set it with _glapi_set_dispatch().

AppleDRI uses similar approach
_glapi_create_table_from_handle(generated from
mapi/glapi/gen/gl_gentable.py), but there are things that could be
improved/fixed.

 - __ogl_framework_api does not need to be over a thousand entries
long. There are six overrides + __ogl_framework_api-Scissor used.
So tweaking the generator to directly produce __applegl_api and(?) a
slimmed down __ogl_framework_api sounds like a good idea to me.

Further on one could also:
 - Drop the multiple forward declarations of __ogl_framework_api.
 - Wrap/implement appledri around __GLXDRI(display|screen) shoving
storing things like dl_handle, thus allowing things to be torn down on
exit. Namely I'm suggesting to cleanup and test Jon's earlier work.
 - apple_glapi_set_dispatch is called too late - at
applegl_bind_context. According to the docs/spec one should be able to
fetch the function pointers without any context let alone a currently
bound one.
 - As a follow up from the above struct
glx_context_vtable::get_proc_address could be nuked, making the struct
fit nicely into 32/64 byte cache :-)

Obviously if you can find an alternative way (but not as hacky as this
one) that'll be great.

Cheers,
Emil

P.S. Why is there no appledriproto package, similar to xf86driproto
and friends ? Having appledri{,str}.h duplicated seems fragile and
ill-advised.



-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/20] i965/fs: Implement image load, store and atomic.

2015-07-22 Thread Jason Ekstrand
This all looks correct as far as I can tell.  However, I'm very
concerned about the number of checks such as
has_matching_typed_format() that are built-in to the compiler (via
surface_builder) where we then go on to do something that is highly
dependant on state setup doing the exact same check (not via
surface_builder) and doing the right thing.  Another example would be
the interplay with has_split_bit_layout.  How do we know these won't
get out of sync?

On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote:
 v2: Drop VEC4 suport.
 v3: Rebase.
 ---
  .../drivers/dri/i965/brw_fs_surface_builder.cpp| 216 
 +
  src/mesa/drivers/dri/i965/brw_fs_surface_builder.h |  17 ++
  2 files changed, 233 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
 index ea1c4aa..46b449f 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp
 @@ -587,3 +587,219 @@ namespace {
}
 }
  }
 +
 +namespace brw {
 +   namespace image_access {
 +  /**
 +   * Load a vector from a surface of the given format and dimensionality
 +   * at the given coordinates.
 +   */
 +  fs_reg
 +  emit_image_load(const fs_builder bld,
 +  const fs_reg image, const fs_reg addr,
 +  mesa_format format, unsigned dims)
 +  {
 + using namespace image_format_info;
 + using namespace image_format_conversion;
 + using namespace image_validity;
 + using namespace image_coordinates;
 + using namespace surface_access;
 + const brw_device_info *devinfo = bld.shader-devinfo;
 + const mesa_format lower_format =
 +brw_lower_mesa_image_format(devinfo, format);
 + fs_reg tmp;
 +
 + if (has_matching_typed_format(devinfo, format)) {
 +/* Hopefully we get here most of the time... */
 +tmp = emit_typed_read(bld, image, addr, dims,
 +  _mesa_format_num_components(lower_format));
 + } else {
 +/* Untyped surface reads return 32 bits of the surface per
 + * component, without any sort of unpacking or type conversion,
 + */
 +const unsigned size = _mesa_get_format_bytes(format) / 4;
 +
 +/* they don't properly handle out of bounds access, so we have to
 + * check manually if the coordinates are valid and predicate the
 + * surface read on the result,
 + */
 +const brw_predicate pred =
 +   emit_bounds_check(bld, image, addr, dims);
 +
 +/* and they don't know about surface coordinates, we need to
 + * convert them to a raw memory offset.
 + */
 +const fs_reg laddr = emit_address_calculation(bld, image, addr, 
 dims);
 +
 +tmp = emit_untyped_read(bld, image, laddr, 1, size, pred);
 +
 +/* An out of bounds surface access should give zero as result. */
 +for (unsigned c = 0; c  4; ++c)
 +   set_predicate(pred, bld.SEL(offset(tmp, bld, c),
 +   offset(tmp, bld, c), fs_reg(0)));
 + }
 +
 + /* Set the register type to D instead of UD if the data type is
 +  * represented as a signed integer in memory so that sign extension
 +  * is handled correctly by unpack.
 +  */
 + if (needs_sign_extension(format))
 +tmp = retype(tmp, BRW_REGISTER_TYPE_D);
 +
 + if (!has_supported_bit_layout(devinfo, format)) {
 +/* Unpack individual vector components from the bitfield if the
 + * hardware is unable to do it for us.
 + */
 +if (has_split_bit_layout(devinfo, format))
 +   tmp = emit_pack(bld, tmp, get_bit_shifts(lower_format),
 +   get_bit_widths(lower_format));
 +else
 +   tmp = emit_unpack(bld, tmp, get_bit_shifts(format),
 + get_bit_widths(format));
 +
 + } else if ((needs_sign_extension(format) 
 + !is_conversion_trivial(devinfo, format)) ||
 +has_undefined_high_bits(devinfo, format)) {
 +/* Perform a trivial unpack even though the bit layout matches in
 + * order to get the most significant bits of each component
 + * initialized properly.
 + */
 +tmp = emit_unpack(bld, tmp, color_u(0, 32, 64, 96),
 +  get_bit_widths(format));
 + }
 +
 + if (!_mesa_is_format_integer(format)) {
 +if (is_conversion_trivial(devinfo, format)) {
 +   /* Just need to cast the vector to the target type. */
 +   tmp = retype(tmp, 

Re: [Mesa-dev] [PATCH 20/20] i965: Expose ARB_shader_image_load_store.

2015-07-22 Thread Jason Ekstrand
On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote:
 Reviewed-by: Paul Berry stereotype...@gmail.com

I'm sure that Paul still thinks this patch does what the commit
message says.  However, does the r-b really still apply to the rest of
it?

 ---
  src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
 b/src/mesa/drivers/dri/i965/intel_extensions.c
 index 6b3bd12..5a5e308 100644
 --- a/src/mesa/drivers/dri/i965/intel_extensions.c
 +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
 @@ -324,6 +324,7 @@ intelInitExtensions(struct gl_context *ctx)
ctx-Extensions.ARB_framebuffer_no_attachments = true;
ctx-Extensions.ARB_gpu_shader5 = true;
ctx-Extensions.ARB_shader_atomic_counters = true;
 +  ctx-Extensions.ARB_shader_image_load_store = true;
ctx-Extensions.ARB_texture_compression_bptc = true;
ctx-Extensions.ARB_texture_view = true;

 --
 2.4.3

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 07/20] glsl/types: add new subroutine type (v3.1)

2015-07-22 Thread Kenneth Graunke
On Tuesday, July 21, 2015 03:19:14 PM Dave Airlie wrote:
 From: Dave Airlie airl...@redhat.com
 
 This type will be used to store the name of subroutine types
 
 as in subroutine void myfunc(void);
 will store myfunc into a subroutine type.
 
 This is required to the parser can identify a subroutine
 type in a uniform decleration as a valid type, and also for
 looking up the type later.
 
 Also add contains_subroutine method.
 
 v2: handle subroutine to int comparisons, needed
 for lowering pass.
 v3: do subroutine to int with it's own IR
 operation to avoid hacking on asserts (Kayden)
 v3.1: fix warnings in this patch, fix nir,
 fix tgsi
 
 Reviewed-by: Chris Forbes chr...@ijw.co.nz
 Signed-off-by: Dave Airlie airl...@redhat.com

Patches 7-9 are:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] i965/fs: Implement pass to lower instructions of unsupported SIMD width.

2015-07-22 Thread Jason Ekstrand
On Wed, Jul 22, 2015 at 10:05 AM, Francisco Jerez curroje...@riseup.net wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Wed, Jul 22, 2015 at 12:31 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 A few comments below.  Mostly just asking for explanation.

 1-3 are

 Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

 Obviously, don't merge 4/4 until it actually has users.
 --Jason

 Thanks.


 On Thu, Jul 16, 2015 at 8:35 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 This lowering pass implements an algorithm to expand SIMDN
 instructions into a sequence of SIMDM instructions in cases where the
 hardware doesn't support the original execution size natively for some
 particular instruction.  The most important use-cases are:

  - Lowering send message instructions that don't support SIMD16
natively into SIMD8 (several texturing, framebuffer write and typed
surface operations).

  - Lowering messages that don't support SIMD8 natively into SIMD16
(*cough*gen4*cough*).

  - 64-bit precision operations (e.g. FP64 and 64-bit integer
multiplication).

  - SIMD32.

 The algorithm works by splitting the sources of the original
 instruction into chunks of width appropriate for the lowered
 instructions, and then interleaving the results component-wise into
 the destination of the original instruction.  The pass is controlled
 by the get_lowered_simd_width() function that currently just returns
 the original execution size making the whole pass a no-op for the
 moment until some user is introduced.
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 142 
 +++
  src/mesa/drivers/dri/i965/brw_fs.h   |   1 +
  2 files changed, 143 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index d031352..eeb6938 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3204,6 +3204,147 @@ fs_visitor::lower_logical_sends()
 return progress;
  }

 +/**
 + * Get the closest native SIMD width supported by the hardware for 
 instruction
 + * \p inst.  The instruction will be left untouched by
 + * fs_visitor::lower_simd_width() if the returned value is equal to the
 + * original execution size.
 + */
 +static unsigned
 +get_lowered_simd_width(const struct brw_device_info *devinfo,
 +   const fs_inst *inst)
 +{
 +   switch (inst-opcode) {
 +   default:
 +  return inst-exec_size;
 +   }
 +}
 +
 +/**
 + * The \p rows array of registers represents a \p num_rows by \p 
 num_columns
 + * matrix in row-major order, write it in column-major order into the 
 register
 + * passed as destination.  \p stride gives the separation between matrix
 + * elements in the input in fs_builder::dispatch_width() units.
 + */
 +static void
 +emit_transpose(const fs_builder bld,
 +   const fs_reg dst, const fs_reg *rows,
 +   unsigned num_rows, unsigned num_columns, unsigned stride)

 I'm not sure what I think about calling this emit_transpose.  I
 guess it is kind of a transpose, but maybe it's more of a gather.
 I'm not going to quibble about it though.

 *Shrug*, it doesn't only gather the vectors of the rows array (that
 would have been emit_collect :P), it copies them out in vertical, just
 like a matrix transpose -- assuming you're not horrified by the idea of
 considering the argument a matrix.

 That's fine.  Feel free to ignore that suggestion.


 +{
 +   fs_reg *const components = new fs_reg[num_rows * num_columns];
 +
 +   for (unsigned i = 0; i  num_columns; ++i) {
 +  for (unsigned j = 0; j  num_rows; ++j)
 + components[num_rows * i + j] = offset(rows[j], bld, stride * i);
 +   }
 +
 +   bld.LOAD_PAYLOAD(dst, components, num_rows * num_columns, 0);
 +
 +   delete[] components;
 +}
 +
 +bool
 +fs_visitor::lower_simd_width()
 +{
 +   bool progress = false;
 +
 +   foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
 +  const unsigned lower_width = get_lowered_simd_width(devinfo, inst);
 +
 +  if (lower_width != inst-exec_size) {
 + /* Builder matching the original instruction. */
 + const fs_builder ibld = bld.at(block, inst)
 +.exec_all(inst-force_writemask_all)
 +.group(inst-exec_size, 
 inst-force_sechalf);
 +
 + /* Split the copies in chunks of the execution width of either 
 the
 +  * original or the lowered instruction, whichever is lower.
 +  */
 + const unsigned copy_width = MIN2(lower_width, inst-exec_size);
 + const unsigned n = inst-exec_size / copy_width;
 + const unsigned dst_size = inst-regs_written * REG_SIZE /
 +inst-dst.component_size(inst-exec_size);
 + fs_reg dsts[4];
 +
 + assert(n  0  n = ARRAY_SIZE(dsts) 
 +!inst-writes_accumulator  !inst-mlen);
 +
 + for 

[Mesa-dev] [RFC] AoA atomic counter follow-up fixes

2015-07-22 Thread Timothy Arceri
Hi Guys,

These patches fix up AoA support for atomic counters following
my change in approach to uniforms in V2 of my AoA GLES series.

However I wanted to get some feedback mainly on the first patch
as the end result is the whole AoA is marked as active which
doesnt happen on Nvidia drivers.

I guess my question is do you think this patch is ok as a first
pass at just getting this stuff up and running or should I spend
a bunch of time writting a pass that splits the outer arrays into
separate variables to make them  easier to handle separatly.
At this stage in the glsl ir life cycle I don't know if its worth
the extra effort.

Thanks,
Tim

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Using the right context in st_texture_release_all_sampler_views()

2015-07-22 Thread Brian Paul

On 07/22/2015 05:31 PM, Jose Fonseca wrote:

On 22/07/15 23:32, Brian Paul wrote:

Hi Marek,

This is regarding your commit st/mesa: use pipe_sampler_view_release
for releasing sampler views from last October.

Basically, we have:

void
st_texture_release_all_sampler_views(struct st_context *st,
  struct st_texture_object *stObj)
{
GLuint i;

/* XXX This should use sampler_views[i]-pipe, not st-pipe */
for (i = 0; i  stObj-num_sampler_views; ++i)
   pipe_sampler_view_release(st-pipe, stObj-sampler_views[i]);
}

Our VMware/svga driver has an issue when
pipe_context::sampler_view_destroy() is called with one context and a
sampler view which was created with another context.  I can hack around
it in our driver code, but it would be nice to fix this in the state
tracker.

Ideally, the above code should be something like:

for (i = 0; i  stObj-num_sampler_views; ++i)
   pipe_sampler_view_reference(stObj-sampler_views[i], NULL);

The current code which uses the st-pipe context came from the bug
https://bugs.freedesktop.org/show_bug.cgi?id=81680

AFAICT, you were just working around an R600 driver issue.  Any chance
we could fix the state tracker and re-test Firefox on R600?


Freeing a view from a different context with another context is wrong.

But freeing a view with a context that might be current on another
thread is also wrong, as pipe_context are not thread safe.


If that was the previous behavior, then maybe Firefox crashed due to the
race conditions.  Looking at the stack trace, I suspect that what
happened was that one context was freeing the st_texture, and all its
views, while the other context was destroying itself, and all its own
views.  In short, st_texture_release_sampler_view was being called
simultanouesly from two different threads.


The proper fix IMO is to let the pipe context that owns the view to
destroyed from whatever thread it is current (or whenever it is made
current.)


OK, in our off-list discussion it wasn't clear to me that 
multi-threading was your main concern until your last message.  I see 
what you're saying now.


So, when we destroy a texture object which may be shared by multiple 
contexts, we theoretically need to move the texture's sampler views to 
new, per-context lists.  At some later time, when the context is used, 
we'd check if the list of sampler views to be destroyed was non-empty 
and free them.


Unfortunately, that involves one context/thread reaching into another 
context.  And that sounds messy.


The ultimate solution may be to get rid of the per-texture list of 
sampler views and instead store sampler views in a context-private data 
structure.  I'll have to think that through.


-Brian

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Using the right context in st_texture_release_all_sampler_views()

2015-07-22 Thread Jose Fonseca

On 22/07/15 23:32, Brian Paul wrote:

Hi Marek,

This is regarding your commit st/mesa: use pipe_sampler_view_release
for releasing sampler views from last October.

Basically, we have:

void
st_texture_release_all_sampler_views(struct st_context *st,
  struct st_texture_object *stObj)
{
GLuint i;

/* XXX This should use sampler_views[i]-pipe, not st-pipe */
for (i = 0; i  stObj-num_sampler_views; ++i)
   pipe_sampler_view_release(st-pipe, stObj-sampler_views[i]);
}

Our VMware/svga driver has an issue when
pipe_context::sampler_view_destroy() is called with one context and a
sampler view which was created with another context.  I can hack around
it in our driver code, but it would be nice to fix this in the state
tracker.

Ideally, the above code should be something like:

for (i = 0; i  stObj-num_sampler_views; ++i)
   pipe_sampler_view_reference(stObj-sampler_views[i], NULL);

The current code which uses the st-pipe context came from the bug
https://bugs.freedesktop.org/show_bug.cgi?id=81680

AFAICT, you were just working around an R600 driver issue.  Any chance
we could fix the state tracker and re-test Firefox on R600?


Freeing a view from a different context with another context is wrong.

But freeing a view with a context that might be current on another 
thread is also wrong, as pipe_context are not thread safe.



If that was the previous behavior, then maybe Firefox crashed due to the 
race conditions.  Looking at the stack trace, I suspect that what 
happened was that one context was freeing the st_texture, and all its 
views, while the other context was destroying itself, and all its own 
views.  In short, st_texture_release_sampler_view was being called 
simultanouesly from two different threads.



The proper fix IMO is to let the pipe context that owns the view to 
destroyed from whatever thread it is current (or whenever it is made 
current.)



Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Using the right context in st_texture_release_all_sampler_views()

2015-07-22 Thread Jose Fonseca

On 23/07/15 01:00, Brian Paul wrote:

On 07/22/2015 05:31 PM, Jose Fonseca wrote:

On 22/07/15 23:32, Brian Paul wrote:

Hi Marek,

This is regarding your commit st/mesa: use pipe_sampler_view_release
for releasing sampler views from last October.

Basically, we have:

void
st_texture_release_all_sampler_views(struct st_context *st,
  struct st_texture_object *stObj)
{
GLuint i;

/* XXX This should use sampler_views[i]-pipe, not st-pipe */
for (i = 0; i  stObj-num_sampler_views; ++i)
   pipe_sampler_view_release(st-pipe, stObj-sampler_views[i]);
}

Our VMware/svga driver has an issue when
pipe_context::sampler_view_destroy() is called with one context and a
sampler view which was created with another context.  I can hack around
it in our driver code, but it would be nice to fix this in the state
tracker.

Ideally, the above code should be something like:

for (i = 0; i  stObj-num_sampler_views; ++i)
   pipe_sampler_view_reference(stObj-sampler_views[i], NULL);

The current code which uses the st-pipe context came from the bug
https://bugs.freedesktop.org/show_bug.cgi?id=81680

AFAICT, you were just working around an R600 driver issue.  Any chance
we could fix the state tracker and re-test Firefox on R600?


Freeing a view from a different context with another context is wrong.

But freeing a view with a context that might be current on another
thread is also wrong, as pipe_context are not thread safe.


If that was the previous behavior, then maybe Firefox crashed due to the
race conditions.  Looking at the stack trace, I suspect that what
happened was that one context was freeing the st_texture, and all its
views, while the other context was destroying itself, and all its own
views.  In short, st_texture_release_sampler_view was being called
simultanouesly from two different threads.


The proper fix IMO is to let the pipe context that owns the view to
destroyed from whatever thread it is current (or whenever it is made
current.)


OK, in our off-list discussion it wasn't clear to me that
multi-threading was your main concern until your last message.  I see
what you're saying now.

So, when we destroy a texture object which may be shared by multiple
contexts, we theoretically need to move the texture's sampler views to
new, per-context lists.  At some later time, when the context is used,
we'd check if the list of sampler views to be destroyed was non-empty
and free them.

Unfortunately, that involves one context/thread reaching into another
context.  And that sounds messy.


Yes.  We could have a sepeare mutex for this list.

Still, it's hard to prevent dead-locks (e.g, two contexts trying to add 
a deferred view to each-others defferred list.)



The ultimate solution may be to get rid of the per-texture list of
sampler views and instead store sampler views in a context-private data
structure.  I'll have to think that through.


Yes, that's another possibility. I haven't thought it through either.

One way or another, you'll still want to notify a context to garbagge 
collect its views, even if it's just an atomic flag on each context, 
otherwise a texture that was shared by two contexts might end up 
lingering around indefinitely because contexts are holding to its views.



Jose

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] dispatch_sanity.cpp: remove commented out tess entries

2015-07-22 Thread Dave Airlie
From: Dave Airlie airl...@redhat.com

These entries were put in the GL4.0 section, so removed the commented
out ones.

Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/mesa/main/tests/dispatch_sanity.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index 7e3a97b..d1bd0ce 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -1577,8 +1577,6 @@ const struct function gl_core_functions_possible[] = {
 // { glUniformSubroutinesuiv, 43, -1 },   // XXX: Add to xml
 // { glGetUniformSubroutineuiv, 43, -1 }, // XXX: Add to xml
 // { glGetProgramStageiv, 43, -1 },   // XXX: Add to xml
-// { glPatchParameteri, 43, -1 }, // XXX: Add to xml
-// { glPatchParameterfv, 43, -1 },// XXX: Add to xml
 
{ glBindTransformFeedback, 43, -1 },
{ glDeleteTransformFeedbacks, 43, -1 },
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/14] mesa: Change the signature of _mesa_need_rgb_to_luminance_conversion()

2015-07-22 Thread Iago Toral
Looks good to me, if this did not introduce any regressions:

Reviewed-by: Iago Toral Quiroga ito...@igalia.com

On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote:
 This allows us to handle cases when texImage-_BaseFormat doesn't match
 _mesa_format_get_base_format(texImage-Format). _BaseFormat is what we
 care about in this function.
 
 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 ---
  src/mesa/drivers/common/meta_tex_subimage.c |  4 +++-
  src/mesa/main/readpix.c | 28 +++-
  src/mesa/main/readpix.h |  3 ++-
  3 files changed, 20 insertions(+), 15 deletions(-)
 
 diff --git a/src/mesa/drivers/common/meta_tex_subimage.c 
 b/src/mesa/drivers/common/meta_tex_subimage.c
 index 6d52014..43e1210 100644
 --- a/src/mesa/drivers/common/meta_tex_subimage.c
 +++ b/src/mesa/drivers/common/meta_tex_subimage.c
 @@ -262,6 +262,7 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
 GLuint dims,
 int full_height, image_height;
 struct gl_texture_image *pbo_tex_image;
 struct gl_renderbuffer *rb = NULL;
 +   GLenum dstBaseFormat = _mesa_unpack_format_to_base_format(format);
 GLenum status, base_format;
 bool success = false, clear_channels_to_zero = false;
 float save_clear_color[4];
 @@ -284,7 +285,8 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
 GLuint dims,
  type, GL_FALSE))
   return false;
  
 -  if (_mesa_need_rgb_to_luminance_conversion(rb-Format, format))
 +  if (_mesa_need_rgb_to_luminance_conversion(rb-_BaseFormat,
 + dstBaseFormat))
   return false;
  
if (_mesa_need_signed_unsigned_int_conversion(rb-Format, format, 
 type))
 diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c
 index c98975f..3a9b766 100644
 --- a/src/mesa/main/readpix.c
 +++ b/src/mesa/main/readpix.c
 @@ -47,17 +47,14 @@
   * Return true if the conversion L=R+G+B is needed.
   */
  GLboolean
 -_mesa_need_rgb_to_luminance_conversion(mesa_format texFormat, GLenum format)
 +_mesa_need_rgb_to_luminance_conversion(GLenum srcBaseFormat,
 +   GLenum dstBaseFormat)
  {
 -   GLenum baseTexFormat = _mesa_get_format_base_format(texFormat);
 -
 -   return (baseTexFormat == GL_RG ||
 -   baseTexFormat == GL_RGB ||
 -   baseTexFormat == GL_RGBA) 
 -  (format == GL_LUMINANCE ||
 -   format == GL_LUMINANCE_ALPHA ||
 -   format == GL_LUMINANCE_INTEGER_EXT ||
 -   format == GL_LUMINANCE_ALPHA_INTEGER_EXT);
 +   return (srcBaseFormat == GL_RG ||
 +   srcBaseFormat == GL_RGB ||
 +   srcBaseFormat == GL_RGBA) 
 +  (dstBaseFormat == GL_LUMINANCE ||
 +   dstBaseFormat == GL_LUMINANCE_ALPHA);
  }
  
  /**
 @@ -89,6 +86,8 @@ _mesa_get_readpixels_transfer_ops(const struct gl_context 
 *ctx,
GLboolean uses_blit)
  {
 GLbitfield transferOps = ctx-_ImageTransferState;
 +   GLenum srcBaseFormat = _mesa_get_format_base_format(texFormat);
 +   GLenum dstBaseFormat = _mesa_unpack_format_to_base_format(format);
  
 if (format == GL_DEPTH_COMPONENT ||
 format == GL_DEPTH_STENCIL ||
 @@ -125,7 +124,7 @@ _mesa_get_readpixels_transfer_ops(const struct gl_context 
 *ctx,
  * have any effect anyway.
  */
 if (_mesa_get_format_datatype(texFormat) == GL_UNSIGNED_NORMALIZED 
 -   !_mesa_need_rgb_to_luminance_conversion(texFormat, format)) {
 +   !_mesa_need_rgb_to_luminance_conversion(srcBaseFormat, 
 dstBaseFormat)) {
transferOps = ~IMAGE_CLAMP_BIT;
 }
  
 @@ -164,6 +163,7 @@ _mesa_readpixels_needs_slow_path(const struct gl_context 
 *ctx, GLenum format,
  {
 struct gl_renderbuffer *rb =
   _mesa_get_read_renderbuffer_for_format(ctx, format);
 +   GLenum dstBaseFormat = _mesa_unpack_format_to_base_format(format);
  
 assert(rb);
  
 @@ -184,7 +184,8 @@ _mesa_readpixels_needs_slow_path(const struct gl_context 
 *ctx, GLenum format,
  
 default:
/* Color formats. */
 -  if (_mesa_need_rgb_to_luminance_conversion(rb-Format, format)) {
 +  if (_mesa_need_rgb_to_luminance_conversion(rb-_BaseFormat,
 + dstBaseFormat)) {
   return GL_TRUE;
}
  
 @@ -458,6 +459,7 @@ read_rgba_pixels( struct gl_context *ctx,
 uint8_t rebase_swizzle[4];
 struct gl_framebuffer *fb = ctx-ReadBuffer;
 struct gl_renderbuffer *rb = fb-_ColorReadBuffer;
 +   GLenum dstBaseFormat = _mesa_unpack_format_to_base_format(format);
  
 if (!rb)
return;
 @@ -469,7 +471,7 @@ read_rgba_pixels( struct gl_context *ctx,
 dst_stride = _mesa_image_row_stride(packing, width, format, type);
 dst_format = _mesa_format_from_format_and_type(format, type);
 convert_rgb_to_lum =
 -  _mesa_need_rgb_to_luminance_conversion(rb-Format, format);
 +  

Re: [Mesa-dev] [PATCH 14/14] meta: Use _mesa_need_rgb_to_luminance_conversion() in decompress_texture_image()

2015-07-22 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga ito...@igalia.com

On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote:
 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 ---
  src/mesa/drivers/common/meta.c | 7 ++-
  1 file changed, 2 insertions(+), 5 deletions(-)
 
 diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
 index 6108d98..e123500 100644
 --- a/src/mesa/drivers/common/meta.c
 +++ b/src/mesa/drivers/common/meta.c
 @@ -3137,11 +3137,8 @@ decompress_texture_image(struct gl_context *ctx,
/* If we're reading back an RGB(A) texture (using glGetTexImage) as
  * luminance then we need to return L=tex(R).
  */
 -  ((baseTexFormat == GL_RGBA ||
 -baseTexFormat == GL_RGB  ||
 -baseTexFormat == GL_RG) 
 -  (destBaseFormat == GL_LUMINANCE ||
 -   destBaseFormat == GL_LUMINANCE_ALPHA))) {
 +  _mesa_need_rgb_to_luminance_conversion(baseTexFormat,
 + destBaseFormat)) {
   /* Green and blue must be zero */
   _mesa_PixelTransferf(GL_GREEN_SCALE, 0.0f);
   _mesa_PixelTransferf(GL_BLUE_SCALE, 0.0f);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91425] [regression, bisected] Piglit spec/ext_packed_float/ getteximage-invalid-format-for-packed-type fails

2015-07-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91425

--- Comment #2 from Michel Dänzer mic...@daenzer.net ---
FWIW, I got the same regression with radeonsi.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] targets/dri: scons: add missing link against libdrm

2015-07-22 Thread Emil Velikov
Otherwise the final dri module will have (additional) unresolved
symbols.

Cc: Brian Paul bri...@vmware.com
Cc: Jose Fonseca jfons...@vmware.com
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---

We can only fix the remaining unresolved symbols (_glapi_foo), as we 
remove the non-shared glapi when building with DRI.

With this we at least match the autotools build.

-Emil

 src/gallium/targets/dri/SConscript | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/targets/dri/SConscript 
b/src/gallium/targets/dri/SConscript
index 8d29f3b..2fb0da0 100644
--- a/src/gallium/targets/dri/SConscript
+++ b/src/gallium/targets/dri/SConscript
@@ -25,6 +25,8 @@ if env['llvm']:
 env.Append(CPPDEFINES = 'GALLIUM_LLVMPIPE')
 env.Prepend(LIBS = [llvmpipe])
 
+env.PkgUseModules('DRM')
+
 env.Append(CPPDEFINES = [
 'GALLIUM_VMWGFX',
 'GALLIUM_SOFTPIPE',
-- 
2.4.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91425] [regression, bisected] Piglit spec/ext_packed_float/ getteximage-invalid-format-for-packed-type fails

2015-07-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91425

--- Comment #1 from Brian Paul bri...@vmware.com ---
Hmm, I can't reproduce that.  The test passes completely for me.

What driver are you using?  I'm testing llvmpipe/softpipe.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91387] Mesa 10.6.1 implementation error: invalid target in _swrast_choose_texture_sample_func

2015-07-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91387

--- Comment #3 from Michael Godfrey michaeldgodf...@gmail.com ---
Created attachment 117297
  -- https://bugs.freedesktop.org/attachment.cgi?id=117297action=edit
backtrace

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 12/14] meta: Fix reading luminance texture as rgba in _mesa_meta_pbo_GetTexSubImage()

2015-07-22 Thread Iago Toral
The problem here is that the _mesa_meta_BlitFramebuffer is not setting
G/B channels to 0.0 when doing Luminance/Intensity to RGBA conversions,
so why not implement the fix in _mesa_meta_BlitFramebuffer directly? The
GL spec expects frambuffer blits to handle these conversions properly,
so it looks like a win for all uses of that function.

Iago

On Fri, 2015-07-17 at 10:28 -0700, Anuj Phogat wrote:
 After recent addition of pbo testing in piglit test getteximage-luminance,
 it fails on i965. This patch makes a sub test pass.
 
 This patch adds a clear color operation to meta pbo path, which I think is
 better than falling back to software path.
 
 V2: Fix color mask for GL_LUMINANCE_ALPHA
 
 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 Cc: mesa-sta...@lists.freedesktop.org
 ---
  src/mesa/drivers/common/meta_tex_subimage.c | 36 
 +++--
  1 file changed, 34 insertions(+), 2 deletions(-)
 
 diff --git a/src/mesa/drivers/common/meta_tex_subimage.c 
 b/src/mesa/drivers/common/meta_tex_subimage.c
 index 13f8292..f4d5ac3 100644
 --- a/src/mesa/drivers/common/meta_tex_subimage.c
 +++ b/src/mesa/drivers/common/meta_tex_subimage.c
 @@ -28,6 +28,7 @@
  #include blend.h
  #include bufferobj.h
  #include buffers.h
 +#include clear.h
  #include fbobject.h
  #include glformats.h
  #include glheader.h
 @@ -278,8 +279,9 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
 GLuint dims,
 int full_height, image_height;
 struct gl_texture_image *pbo_tex_image;
 struct gl_renderbuffer *rb = NULL;
 -   GLenum status;
 -   bool success = false;
 +   GLenum status, src_base_format;
 +   bool success = false, clear_channels_to_zero = false;
 +   float save_clear_color[4];
 int z;
  
 if (!_mesa_is_bufferobj(packing-BufferObj))
 @@ -380,6 +382,27 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
 GLuint dims,
GL_COLOR_BUFFER_BIT, GL_NEAREST))
goto fail;
  
 +   src_base_format = tex_image ?
 + tex_image-_BaseFormat :
 + ctx-ReadBuffer-_ColorReadBuffer-_BaseFormat;
 +
 +   /* Depending on the base formats involved we might need to rebase some
 +* values. For example if we download from a Luminance format to RGBA
 +* format, we want G=0 and B=0.
 +*/
 +   clear_channels_to_zero =
 +  _mesa_need_luminance_to_rgb_conversion(src_base_format,
 + pbo_tex_image-_BaseFormat);
 +
 +   if (clear_channels_to_zero) {
 +  memcpy(save_clear_color, ctx-Color.ClearColor.f, 4 * sizeof(float));
 +  /* Clear the Green, Blue channels. */
 +  _mesa_ColorMask(GL_FALSE, GL_TRUE, GL_TRUE,
 +  src_base_format != GL_LUMINANCE_ALPHA);
 +  _mesa_ClearColor(0.0, 0.0, 0.0, 1.0);
 +  _mesa_Clear(GL_COLOR_BUFFER_BIT);
 +   }
 +
 for (z = 1; z  depth; z++) {
_mesa_meta_bind_fbo_image(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
  tex_image, zoffset + z);
 @@ -392,6 +415,15 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
 GLuint dims,
   0, z * image_height,
   width, z * image_height + height,
   GL_COLOR_BUFFER_BIT, GL_NEAREST);
 +  if (clear_channels_to_zero)
 + _mesa_Clear(GL_COLOR_BUFFER_BIT);
 +   }
 +
 +   /* Unmask the color channels and restore the saved clear color values. */
 +   if (clear_channels_to_zero) {
 +  _mesa_ColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
 +  _mesa_ClearColor(save_clear_color[0], save_clear_color[1],
 +   save_clear_color[2], save_clear_color[3]);
 }
  
 success = true;


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/14] meta: Use _mesa_need_luminance_to_rgb_conversion() in decompress_texture_image()

2015-07-22 Thread Iago Toral
Patches 10-11 are

Reviewed-by: Iago Toral Quiroga ito...@igalia.com

On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote:
 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 ---
  src/mesa/drivers/common/meta.c | 12 ++--
  1 file changed, 2 insertions(+), 10 deletions(-)
 
 diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
 index c9e58d8..6108d98 100644
 --- a/src/mesa/drivers/common/meta.c
 +++ b/src/mesa/drivers/common/meta.c
 @@ -3132,16 +3132,8 @@ decompress_texture_image(struct gl_context *ctx,
 * returned as red and two-channel texture values are returned as
 * red/alpha.
 */
 -  if (((baseTexFormat == GL_LUMINANCE ||
 -baseTexFormat == GL_LUMINANCE_ALPHA ||
 -baseTexFormat == GL_INTENSITY) 
 -   (destBaseFormat == GL_RGBA ||
 -destBaseFormat == GL_RGB ||
 -destBaseFormat == GL_RG ||
 -destBaseFormat == GL_GREEN ||
 -destBaseFormat == GL_BLUE ||
 -destBaseFormat == GL_BGRA ||
 -destBaseFormat == GL_BGR)) ||
 +  if (_mesa_need_luminance_to_rgb_conversion(baseTexFormat,
 + destBaseFormat) ||
/* If we're reading back an RGB(A) texture (using glGetTexImage) as
  * luminance then we need to return L=tex(R).
  */


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91387] Mesa 10.6.1 implementation error: invalid target in _swrast_choose_texture_sample_func

2015-07-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91387

Brian Paul bri...@vmware.com changed:

   What|Removed |Added

 CC||anuj.pho...@gmail.com

--- Comment #4 from Brian Paul bri...@vmware.com ---
Ugh, looks like there's some sort of unintended recursion going on here.
I think Anuj might want to take a look at this.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   >