Re: [Mesa-dev] radv submission 2

2016-10-04 Thread Edward O'Callaghan


On 10/05/2016 01:05 PM, Timothy Arceri wrote:
> On Wed, 2016-10-05 at 10:48 +1000, Dave Airlie wrote:
>> Again I'm sure this will hit limits, and I've asked Michel to
>> drop the big patch before it gets here.
>>
>> All of these are in a new branch radv-submit2.
>> https://github.com/airlied/mesa/commits/radv-submit2
>>
>> I've incorporated all the feedback except two things:
>>
>> a) #pragma once - this is used in lots of places already, if
>> someone enforces a paint color on the shed, then do so, but I don't
>> think radv should be the rallying cry for it.
> 
> I think it was decided (at least for i965) to get rid of them back in
> March [1].
> 
> https://lists.freedesktop.org/archives/mesa-dev/2016-March/109847.html

I will handle it.

Kind Regards,
Edward.

> 
> 
>>
>> b) HAVE_LLVM defines, in some places I've left these in, esp the
>> "common" code that we could possibly reuse in radeonsi in the future,
>> as we might need to use the common stuff with older LLVMs.
>>
>> I think I've caught every other suggestion/comments on the way past.
>>
>> Thanks for all the interest,
>> Dave.
>>
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] winsys/radeon: (trivial) rename variable for consistency

2016-10-04 Thread Alexandre Demers
Signed-off-by: Alexandre Demers 
---
 src/gallium/drivers/radeon/radeon_uvd.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_uvd.c b/src/gallium/drivers/radeon/radeon_uvd.c
index fb1491a..81fba95 100644
--- a/src/gallium/drivers/radeon/radeon_uvd.c
+++ b/src/gallium/drivers/radeon/radeon_uvd.c
@@ -108,7 +108,7 @@ static void set_reg(struct ruvd_decoder *dec, unsigned reg, uint32_t val)
 
 /* send a command to the VCPU through the GPCOM registers */
 static void send_cmd(struct ruvd_decoder *dec, unsigned cmd,
-struct pb_buffer* buf, uint32_t off,
+struct pb_buffer* buf, uint32_t offset,
 enum radeon_bo_usage usage, enum radeon_bo_domain domain)
 {
int reloc_idx;
@@ -119,12 +119,12 @@ static void send_cmd(struct ruvd_decoder *dec, unsigned cmd,
if (!dec->use_legacy) {
uint64_t addr;
addr = dec->ws->buffer_get_virtual_address(buf);
-   addr = addr + off;
+   addr = addr + offset;
set_reg(dec, RUVD_GPCOM_VCPU_DATA0, addr);
set_reg(dec, RUVD_GPCOM_VCPU_DATA1, addr >> 32);
} else {
-   off += dec->ws->buffer_get_reloc_offset(buf);
-   set_reg(dec, RUVD_GPCOM_VCPU_DATA0, off);
+   offset += dec->ws->buffer_get_reloc_offset(buf);
+   set_reg(dec, RUVD_GPCOM_VCPU_DATA0, offset);
set_reg(dec, RUVD_GPCOM_VCPU_DATA1, reloc_idx * 4);
}
set_reg(dec, RUVD_GPCOM_VCPU_CMD, cmd << 1);
-- 
2.10.0



[Mesa-dev] [PATCH] i965/l3: Add explicit way size calculation for bxt

2016-10-04 Thread Ben Widawsky
There should be no functional change here because Broxton and CHV are
both gt1. Without this code however, it might seem like broxton support
is missing.

While here, put the gt1 check in front to hopefully short-circuit the
condition for the mobile cases.

Cc: Francisco Jerez 
Signed-off-by: Ben Widawsky 
Reviewed-by: Anuj Phogat 
---
 src/intel/common/gen_l3_config.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/intel/common/gen_l3_config.c b/src/intel/common/gen_l3_config.c
index 0d99f12..0783217 100644
--- a/src/intel/common/gen_l3_config.c
+++ b/src/intel/common/gen_l3_config.c
@@ -257,7 +257,9 @@ get_l3_way_size(const struct gen_device_info *devinfo)
if (devinfo->is_baytrail)
   return 2;
 
-   else if (devinfo->is_cherryview || devinfo->gt == 1)
+   else if (devinfo->gt == 1 ||
+devinfo->is_cherryview ||
+devinfo->is_broxton)
   return 4;
 
else
-- 
2.10.0
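
To make the "no functional change" claim concrete, here is a minimal sketch in C that compares the old and new conditions. The struct is a hypothetical, cut-down stand-in for gen_device_info (only the fields the check reads), and the helper names are made up for illustration. The Baytrail branch and the final else branch are left out because they are not affected by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Mesa's struct gen_device_info; only the
 * fields read by the way-size condition are modelled. */
struct sketch_device_info {
   bool is_cherryview;
   bool is_broxton;
   int gt;
};

/* Condition before the patch. */
static bool
sketch_old_check(const struct sketch_device_info *devinfo)
{
   assert(devinfo != NULL);
   return devinfo->is_cherryview || devinfo->gt == 1;
}

/* Condition after the patch: gt check first, Broxton named explicitly. */
static bool
sketch_new_check(const struct sketch_device_info *devinfo)
{
   assert(devinfo != NULL);
   return devinfo->gt == 1 ||
          devinfo->is_cherryview ||
          devinfo->is_broxton;
}
```

As long as Broxton reports gt == 1, both helpers agree. They only differ for a Broxton description whose gt field is something other than 1, which is the case the explicit check guards against.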



[Mesa-dev] [Bug 97542] mesa-12.0.1 with llvm-3.9.0_rc3 - src/gallium/state_trackers/clover/llvm/invocation.cpp:212:75: error: no matching function for call to clang::CompilerInvocation::setLangDefault

2016-10-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97542

--- Comment #12 from Alexander Tsoy  ---
(In reply to Clément Guérin from comment #11)
> Hello, I built mesa 12.0.3 against llvm 3.9.0 on arch linux. Rocket League
> and Portal were working properly, however Tomb Raider was crashing right
> before the Feral logo.

Tomb Raider's crash is caused by the following error:
"LLVM ERROR: branch size exceeds simm16"

Reverting the relevant LLVM commit [1] works around this issue; however, running
Tomb Raider at Ultra settings completely hangs the GPU. Other settings work fine.

Another issue is the following warning, which appears when running any GL app:
"Warning: LLVM emitted unknown config register: 0x4"

So yes, as Michel said, mesa-12.0 doesn't fully support llvm-3.9.

[1]
https://github.com/llvm-mirror/llvm/commit/76e32dfbc0acecb33e2141a0c2faf5b23e1342fc



Re: [Mesa-dev] Mesa include guard style. (Was: [PATCH] i965/cfg: Remove redundant #pragma once.)

2016-10-04 Thread Dave Airlie
On 13 March 2016 at 11:29, Ian Romanick  wrote:
> On 03/11/2016 03:46 PM, Eric Anholt wrote:
>> Ian Romanick  writes:
>>
>>> On 03/10/2016 05:53 PM, Francisco Jerez wrote:
>>>> Iago Toral  writes:
>>>>
>>>>> On Wed, 2016-03-09 at 19:04 -0800, Francisco Jerez wrote:
>>>>>> Matt Turner  writes:
>>>>>>
>>>>>>> On Wed, Mar 9, 2016 at 1:37 PM, Francisco Jerez  wrote:
>>>>>>>> Iago Toral  writes:
>>>>>>>>
>>>>>>>>> On Tue, 2016-03-08 at 17:42 -0800, Francisco Jerez wrote:
>>>>>>>>>> brw_cfg.h already has include guards, remove the "#pragma once" which
>>>>>>>>>> is redundant and non-standard.
>>>>>>>>>
>>>>>>>>> FWIW, I think using both #pragma once and include guards is a way to
>>>>>>>>> keep portability while still getting the performance advantage of
>>>>>>>>> #pragma once where it is supported.
>>>>>>>>>
>>>>>>>> It's highly unlikely to make any significant difference on any
>>>>>>>> reasonably modern compiler.  I cannot measure any change in compilation
>>>>>>>> time locally from my cleanup.
>>>>>>>>
>>>>>>>>> Also it seems that we do the same thing in many other files...
>>>>>>>>>
>>>>>>>> Really?  I'm not aware of any other file where we use both.
>>>>>>>
>>>>>>> There are quite a few in glsl/
>>>>>>
>>>>>> Heh, apparently you're right.  Anyway it seems rather pointless to use
>>>>>> '#pragma once' in a bunch of scattered header files with the expectation
>>>>>> to gain some speed, the improvement from a single header file is so
>>>>>> minuscule (if it will make any difference at all on a modern compiler
>>>>>> and compilation workload, which I doubt) that we would have to use it
>>>>>> universally in order to have the chance to measure any improvement.
>>>>>>
>>>>>> Can we please just decide for one of the include guard styles and use it
>>>>>> consistently?  Given that the majority of header files in the Mesa
>>>>>> codebase use old-school define guards, that it's the only standard
>>>>>> option, that it has well-defined semantics in presence of file copies
>>>>>> and hardlinks, and that the performance argument against it is rather
>>>>>> dubious (although I definitely find '#pragma once' prettier and more
>>>>>> concise), I'd vote for using preprocessor define guards universally.
>>>>>>
>>>>>> What do other people think?
>>>>>
>>>>> I think we have to use define guards necessarily since #pragma once is
>>>>> not standard even if it has wide support. So the question is whether we
>>>>> want to use only define guards or define guards plus #pragma once. I am
>>>>> fine with doing only define guards as you propose.
>>>>
>>>> *Shrug* I have the impression that the only real advantage of '#pragma
>>>> once' is that you no longer need to do the ifndef/define dance, so I
>>>> don't think I can see much benefit in doing both.
>>>
>>> Several compilers will cache the file name where '#pragma once' occurs
>>> and never read that file again.  A #include of a file previously seen
>>> with '#pragma once' becomes a no-op.  Since the file is never read, the
>>> compiler avoids all the I/O and the parsing.  That is true of MSVC and,
>>> I thought, some versions of GCC.  As Iago points out, some compilers
>>> ignore the #pragma altogether.  Since Mesa supports (or does it?) some
>>> of these compilers, we have to have the ifdef/define/endif guards.
>>
>> Compilers have noticed that ifdef/define/endif is a thing and optimized
>> it, anyway.
>>
>> https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html
>
> That's cool!  I don't think GCC did that when I looked into this in
> 2010.  It sounds like the #pragma actually breaks the GCC optimization,
> so let's get rid of them all.

Just to reignite this, I don't think this statement is in any way true. Using
#pragma once doesn't break the GCC optimisation; the optimisation just isn't
needed in the presence of #pragma once, as gcc will never read those files
again.

Dave.
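
For anyone skimming the thread, here is a minimal sketch in C of what a define guard does. The header body and the names EXAMPLE_POINT_H, example_point and example_point_sum are hypothetical; the text of one small "header" is pasted twice into a single file to stand in for a double #include. A header using '#pragma once' would simply carry that line instead of the #ifndef/#define/#endif wrapper.

```c
#include <assert.h>
#include <stddef.h>

/* First "inclusion" of the hypothetical header: the guard macro is not
 * yet defined, so the body is compiled and the macro gets defined. */
#ifndef EXAMPLE_POINT_H
#define EXAMPLE_POINT_H

struct example_point {
   int x;
   int y;
};

static inline int
example_point_sum(const struct example_point *p)
{
   assert(p != NULL);
   return p->x + p->y;
}

#endif /* EXAMPLE_POINT_H */

/* Second "inclusion" of the same text: the guard macro is already
 * defined, so the preprocessor skips the body and the struct is not
 * redefined. */
#ifndef EXAMPLE_POINT_H
#define EXAMPLE_POINT_H

struct example_point {
   int x;
   int y;
};

#endif /* EXAMPLE_POINT_H */
```

Without the guard, the second struct definition would be a redefinition error. The GCC page linked above describes how the preprocessor remembers such guards so that it can skip re-reading a guarded file.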


Re: [Mesa-dev] radv submission 2

2016-10-04 Thread Timothy Arceri
On Wed, 2016-10-05 at 10:48 +1000, Dave Airlie wrote:
> Again I'm sure this will hit limits, and I've asked Michel to
> drop the big patch before it gets here.
> 
> All of these are in a new branch radv-submit2.
> https://github.com/airlied/mesa/commits/radv-submit2
> 
> I've incorporated all the feedback except two things:
> 
> a) #pragma once - this is used in lots of places already, if
> someone enforces a paint color on the shed, then do so, but I don't
> think radv should be the rallying cry for it.

I think it was decided (at least for i965) to get rid of them back in
March [1].

https://lists.freedesktop.org/archives/mesa-dev/2016-March/109847.html


> 
> b) HAVE_LLVM defines, in some places I've left these in, esp the
> "common" code that we could possibly reuse in radeonsi in the future,
> as we might need to use the common stuff with older LLVMs.
> 
> I think I've caught every other suggestion/comments on the way past.
> 
> Thanks for all the interest,
> Dave.
> 
> 


Re: [Mesa-dev] [PATCH 05/11] nir: Add a LCSAA-pass

2016-10-04 Thread Timothy Arceri
On Tue, 2016-10-04 at 16:47 -0700, Jason Ekstrand wrote:
> On Fri, Sep 16, 2016 at 6:24 AM, Timothy Arceri wrote:
> > From: Thomas Helland 
> > 
> > V2: Do a "depth first search" to convert to LCSSA
> > 
> > V3: Small comment fixup
> > 
> > V4: Rebase, adapt to removal of function overloads
> > 
> > V5: Rebase, adapt to relocation of nir to compiler/nir
> >     Still need to adapt to potential if-uses
> >     Work around nir_validate issue
> > 
> > V6 (Timothy):
> >  - tidy lcssa and stop leaking memory
> >  - dont rewrite the src for the lcssa phi node
> >  - validate lcssa phi srcs to avoid postvalidate assert
> >  - don't add new phi if one already exists
> >  - more lcssa phi validation fixes
> >  - Rather than marking ssa defs inside a loop just mark blocks
> > inside
> >    a loop. This is simpler and fixes lcssa for intrinsics which do
> >    not have a destination.
> >  - don't create LCSSA phis for loops we won't unroll
> >  - require loop metadata for lcssa pass
> >  - handle case where the ssa defs use outside the loop is already a
> > phi
> > 
> > V7: (Timothy)
> > - pass indirect mask to metadata call
> > ---
> >  src/compiler/Makefile.sources   |   1 +
> >  src/compiler/nir/nir.h          |   6 ++
> >  src/compiler/nir/nir_to_lcssa.c | 227
> > 
> >  src/compiler/nir/nir_validate.c |  11 +-
> >  4 files changed, 242 insertions(+), 3 deletions(-)
> >  create mode 100644 src/compiler/nir/nir_to_lcssa.c
> > 
> > diff --git a/src/compiler/Makefile.sources
> > b/src/compiler/Makefile.sources
> > index 7ed26a9..8ef6080 100644
> > --- a/src/compiler/Makefile.sources
> > +++ b/src/compiler/Makefile.sources
> > @@ -247,6 +247,7 @@ NIR_FILES = \
> >         nir/nir_search_helpers.h \
> >         nir/nir_split_var_copies.c \
> >         nir/nir_sweep.c \
> > +       nir/nir_to_lcssa.c \
> >         nir/nir_to_ssa.c \
> >         nir/nir_validate.c \
> >         nir/nir_vla.h \
> > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > index cc8f4b6..29a6f45 100644
> > --- a/src/compiler/nir/nir.h
> > +++ b/src/compiler/nir/nir.h
> > @@ -1387,6 +1387,8 @@ typedef struct {
> >     struct exec_list srcs; /** < list of nir_phi_src */
> > 
> >     nir_dest dest;
> > +
> > +   bool is_lcssa_phi;
> >  } nir_phi_instr;
> > 
> >  typedef struct {
> > @@ -2643,6 +2645,10 @@ void nir_convert_to_ssa(nir_shader *shader);
> >  bool nir_repair_ssa_impl(nir_function_impl *impl);
> >  bool nir_repair_ssa(nir_shader *shader);
> > 
> > +void nir_to_lcssa_impl(nir_function_impl *impl,
> > +                       nir_variable_mode indirect_mask);
> > +void nir_to_lcssa(nir_shader *shader, nir_variable_mode
> > indirect_mask);
> > +
> >  /* If phi_webs_only is true, only convert SSA values involved in
> > phi nodes to
> >   * registers.  If false, convert all values (even those not
> > involved in a phi
> >   * node) to registers.
> > diff --git a/src/compiler/nir/nir_to_lcssa.c
> > b/src/compiler/nir/nir_to_lcssa.c
> > new file mode 100644
> > index 000..25d0bdb
> > --- /dev/null
> > +++ b/src/compiler/nir/nir_to_lcssa.c
> > @@ -0,0 +1,227 @@
> > +/*
> > + * Copyright © 2015 Thomas Helland
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > obtaining a
> > + * copy of this software and associated documentation files (the
> > "Software"),
> > + * to deal in the Software without restriction, including without
> > limitation
> > + * the rights to use, copy, modify, merge, publish, distribute,
> > sublicense,
> > + * and/or sell copies of the Software, and to permit persons to
> > whom the
> > + * Software is furnished to do so, subject to the following
> > conditions:
> > + *
> > + * The above copyright notice and this permission notice
> > (including the next
> > + * paragraph) shall be included in all copies or substantial
> > portions of the
> > + * Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> > EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
> > EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > OTHER DEALINGS
> > + * IN THE SOFTWARE.
> > + */
> > +
> > +/*
> > + * This pass converts the ssa-graph into "Loop Closed SSA form".
> > This is
> > + * done by placing phi nodes at the exits of the loop for all
> > values
> > + * that are used outside the loop. The result is it transforms:
> > + *
> > + * loop {                    ->      loop {
> > + *    ssa2 =             ->          ssa2 = ...
> > + *    if (cond)              ->          if (cond) {
> > + *       break;              ->             break;
> > + *    ssa3 

Re: [Mesa-dev] [PATCH] i965/l3: Remove redundant is_cherryview check

2016-10-04 Thread Francisco Jerez
Ben Widawsky  writes:

> All mobile parts (so far) are GT1. The check added extra confusion
> because it appeared Broxton was missing when it wasn't. Replace it with
> a comment.
>
> Alternatively, I'd be willing to add an is_broxton check.
>
> Cc: Francisco Jerez 
> Signed-off-by: Ben Widawsky 
> ---
>  src/intel/common/gen_l3_config.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/common/gen_l3_config.c b/src/intel/common/gen_l3_config.c
> index b172ef6..eb4e8ae 100644
> --- a/src/intel/common/gen_l3_config.c
> +++ b/src/intel/common/gen_l3_config.c
> @@ -258,7 +258,8 @@ get_l3_way_size(const struct gen_device_info *devinfo)
> if (devinfo->is_baytrail)
>return 2;
>  
> -   else if (devinfo->is_cherryview || devinfo->gt == 1)
> +   /* XXX: Cherryview and Broxton are always gt1 */
> +   else if (devinfo->gt == 1)

An explicit devinfo->is_broxton check would be as informative as the
comment (XXX?) and more obviously correct, because the GTn naming
doesn't officially apply to mobile parts as far as I'm aware, so I'd
definitely prefer the additional check.

>return 4;
>  
> else
> -- 
> 2.10.0




Re: [Mesa-dev] [PATCH 1/3] i965: solve cubemap negative x/y/z faces buffer offset issue in dEQP.

2016-10-04 Thread Xu, Randy
Hi, Jason & Tapani

Thanks for your review. Let me describe the dEQP failure first.

In dEQP-EGL.functional.image.create.gles2_cubemap_negative_*_texture, 2D 
textures are created from all 6 faces of a 64x64 cubemap texture and then 
rendered through glDrawXXX.
In brw_miptree_get_vertical_slice_pitch, mt->qpitch is computed as 144:
  return h0 + h1 + (brw->gen >= 7 ? 12 : 11) * mt->valign;  // 64+32+12*4 = 144

Taking the negative_x face as an example, the total offset in the bo is 
144(y)*64(x)*4(bpp) = 36864.
The buffer is TILING_Y, and since y (144) is not 32-aligned (mask_y = 31 from 
intel_region_get_tile_masks), the total bo offset is split into two parts: 
36864 = 32768 (offset, 128*64*4) + 4096 (16(tile_y)*64*4)
   case I915_TILING_Y:
  *mask_x = 128 / cpp - 1;
  *mask_y = 31;


Both tile_y and offset are passed to texture2D in create_mt_for_dri_image, 
but tile_y is not used when computing the total offset in the rendering path, 
which is why I added this patch.
Please check and comment further.
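
The numbers above can be reproduced with a minimal sketch in C. It follows the arithmetic in this message only: it treats the surface as rows of width*cpp bytes and does not model the real Y-tiled layout. The struct and the sketch_* helpers are hypothetical names, not i965 functions.

```c
#include <assert.h>
#include <stdint.h>

/* Result of splitting a row index into a tile-aligned byte offset and
 * the leftover rows inside the tile (hypothetical helper type). */
struct sketch_split {
   uint32_t aligned_offset; /* bytes, from the mask-aligned rows */
   uint32_t tile_y;         /* leftover rows, 0..mask_y */
};

/* Same expression as quoted from brw_miptree_get_vertical_slice_pitch. */
static uint32_t
sketch_qpitch(uint32_t h0, uint32_t h1, int gen, uint32_t valign)
{
   return h0 + h1 + (gen >= 7 ? 12u : 11u) * valign;
}

/* Split y using mask_y (31 for I915_TILING_Y in the quoted code). */
static struct sketch_split
sketch_split_y(uint32_t y, uint32_t width, uint32_t cpp, uint32_t mask_y)
{
   struct sketch_split s;

   assert(width > 0 && cpp > 0);
   s.tile_y = y & mask_y;
   s.aligned_offset = (y & ~mask_y) * width * cpp;
   return s;
}
```

With h0 = 64, h1 = 32, gen >= 7 and valign = 4 this gives a qpitch of 144, and splitting y = 144 with width 64, cpp 4 and mask_y 31 gives an aligned offset of 32768 bytes plus a tile_y of 16 rows (another 16*64*4 = 4096 bytes), which adds up to the 36864 quoted above.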

Thanks,
Randy




From: Jason Ekstrand [mailto:ja...@jlekstrand.net]
Sent: Tuesday, October 4, 2016 11:59 PM
To: Palli, Tapani 
Cc: Xu, Randy ; mesa-dev@lists.freedesktop.org; 
x...@freedesktop.org
Subject: Re: [Mesa-dev] [PATCH 1/3] i965: solve cubemap negative x/y/z faces 
buffer offset issue in dEQP.

On Tue, Oct 4, 2016 at 8:55 AM, Tapani Pälli wrote:
On 10/04/2016 06:09 PM, Jason Ekstrand wrote:
On Thu, Sep 29, 2016 at 11:27 PM, Xu,Randy wrote:
Add the miptree level/slice x/y_offset when count the surface offset
in brw_emit_surface_state. The surface offset has two parts, one is
from mt->offset, which should be 32 aligned in width/height for tiled
buffer; another is from mt->level[current_level].slice[current_slice].
x/y_offset.

This fix will solve 12 deqp failure
dEQP-EGL.functional.image.create.gles2_cubemap_negative_*_texture

Signed-off-by: Xu,Randy
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 61a4b94..3a5c573 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -85,7 +85,8 @@ brw_emit_surface_state(struct brw_context *brw,
unsigned read_domains, unsigned write_domains)
 {
const struct surface_state_info ss_info = surface_state_infos[brw->gen];
-   uint32_t tile_x = 0, tile_y = 0;
+   uint32_t tile_x = mt->level[0].slice[0].x_offset;
+   uint32_t tile_y = mt->level[0].slice[0].y_offset;

This isn't correct.  First off, there are some fairly strict restrictions on 
what we can do with tile_x and tile_y and we can't just shove x/y_offset in 
there.  We need to use intel_miptree_get_tile_offsets to get both a byte offset 
and an intratile offset.  Second, we should already be taking slices into 
account for cube maps via base_array_layer where needed.
Unfortunately, I'm not 100% sure what the correct patch is without a bit more 
information about what the test is doing that causes a problem.

I did take a brief look and when running the set mentioned above (for example 
with ./deqp-egl 
--deqp-case=*EGL.functional.image.create.gles2_cubemap_negative_*_texture) what 
happens is that we never end up to the part of code calling 
intel_miptree_get_tile_offsets in that function (because surf.dim_layout != 
dim_layout condition does not trigger). This is just what I observed, should we 
just call intel_miptree_get_tile_offsets() unconditionally then?

No.  Very much no.  The intel_miptree_get_tile_offsets() stuff is a hack that 
lets us convert non-2D things to 2D things and it comes with piles of 
restrictions.



--Jason

uint32_t offset = mt->offset;

struct isl_surf surf;
--
2.7.4



[Mesa-dev] [PATCH 2/4] radv/winsys: import the amdgpu winsys for the radv vulkan driver. (v1.1)

2016-10-04 Thread Dave Airlie
From: Dave Airlie 

This just brings these files into the tree, it doesn't integrate
them with the build system.

The radv winsys is based on the gallium one with some changes,
due to how command buffers are built and lack of flushing behaviour.

v1.1: cleanup whitespace issues, move Makefiles to other patch.
add missing copyright headers

Authors: Bas Nieuwenhuizen and Dave Airlie
Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_radeon_winsys.h| 336 +
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c  | 297 
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.h  |  50 ++
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c  | 778 +
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.h  |  51 ++
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c | 523 ++
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.h |  29 +
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c  | 359 ++
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h  |  57 ++
 .../winsys/amdgpu/radv_amdgpu_winsys_public.h  |  30 +
 10 files changed, 2510 insertions(+)
 create mode 100644 src/amd/vulkan/radv_radeon_winsys.h
 create mode 100644 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c
 create mode 100644 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.h
 create mode 100644 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
 create mode 100644 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.h
 create mode 100644 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c
 create mode 100644 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.h
 create mode 100644 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c
 create mode 100644 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h
 create mode 100644 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys_public.h

diff --git a/src/amd/vulkan/radv_radeon_winsys.h b/src/amd/vulkan/radv_radeon_winsys.h
new file mode 100644
index 000..29a4ee3
--- /dev/null
+++ b/src/amd/vulkan/radv_radeon_winsys.h
@@ -0,0 +1,336 @@
+/*
+ * Copyright © 2016 Red Hat.
+ * Copyright © 2016 Bas Nieuwenhuizen
+ *
+ * Based on radeon_winsys.h which is:
+ * Copyright 2008 Corbin Simpson 
+ * Copyright 2010 Marek Olšák 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+#pragma once
+
+#include 
+#include 
+#include 
+#include "main/macros.h"
+#include "amd_family.h"
+
+#define FREE(x) free(x)
+
+enum radeon_bo_domain { /* bitfield */
+   RADEON_DOMAIN_GTT  = 2,
+   RADEON_DOMAIN_VRAM = 4,
+   RADEON_DOMAIN_VRAM_GTT = RADEON_DOMAIN_VRAM | RADEON_DOMAIN_GTT
+};
+
+enum radeon_bo_flag { /* bitfield */
+   RADEON_FLAG_GTT_WC =(1 << 0),
+   RADEON_FLAG_CPU_ACCESS =(1 << 1),
+   RADEON_FLAG_NO_CPU_ACCESS = (1 << 2),
+};
+
+enum radeon_bo_usage { /* bitfield */
+   RADEON_USAGE_READ = 2,
+   RADEON_USAGE_WRITE = 4,
+   RADEON_USAGE_READWRITE = RADEON_USAGE_READ | RADEON_USAGE_WRITE
+};
+
+enum ring_type {
+   RING_GFX = 0,
+   RING_COMPUTE,
+   RING_DMA,
+   RING_UVD,
+   RING_VCE,
+   RING_LAST,
+};
+
+struct radeon_winsys_cs {
+   unsigned cdw;  /* Number of used dwords. */
+   unsigned max_dw; /* Maximum number of dwords. */
+   uint32_t *buf; /* The base pointer of the chunk. */
+};
+
+struct radeon_info {
+   /* PCI info: domain:bus:dev:func */
+   uint32_tpci_domain;
+   uint32_tpci_bus;
+   uint32_tpci_dev;
+   uint32_tpci_func;
+
+   /* Device info. */
+   uint32_tpci_id;
+   enum radeon_family  family;
+   const char  *name;
+   enum chip_class chip_class;
+   uint32_tgart_page_size;
+   uint64_tgart_size;
+   uint64_t   

[Mesa-dev] [PATCH 4/4] radv: toplevel configure/make changes required to build (v1.1)

2016-10-04 Thread Dave Airlie
From: Dave Airlie 

This moves some of the llvm checks around to allow them
to be used for non-gallium drivers as well.

radv requires llvm 3.9.0 as vulkan requires compute shaders.

v1.1: add all make infrastructure to this patch for easier
review.

Authors: Bas Nieuwenhuizen and Dave Airlie
Signed-off-by: Dave Airlie 
---
 configure.ac|  33 ++--
 src/Makefile.am |   8 +-
 src/amd/common/Makefile.am  |  51 +
 src/amd/common/Makefile.sources |  29 +++
 src/amd/vulkan/Makefile.am  | 165 
 src/amd/vulkan/Makefile.sources |  67 
 6 files changed, 345 insertions(+), 8 deletions(-)
 create mode 100644 src/amd/common/Makefile.am
 create mode 100644 src/amd/common/Makefile.sources
 create mode 100644 src/amd/vulkan/Makefile.am
 create mode 100644 src/amd/vulkan/Makefile.sources

diff --git a/configure.ac b/configure.ac
index 1bfac3b..634f3c3 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1704,6 +1704,10 @@ if test -n "$with_vulkan_drivers"; then
 HAVE_INTEL_VULKAN=yes;
 
 ;;
+xradeon)
+PKG_CHECK_MODULES([AMDGPU], [libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
+HAVE_RADEON_VULKAN=yes;
+   ;;
 *)
 AC_MSG_ERROR([Vulkan driver '$driver' does not exist])
 ;;
@@ -2187,7 +2191,7 @@ if test "x$enable_gallium_llvm" = xauto; then
 i*86|x86_64|amd64) enable_gallium_llvm=yes;;
 esac
 fi
-if test "x$enable_gallium_llvm" = xyes; then
+if test "x$enable_gallium_llvm" = xyes || test "x$HAVE_RADEON_VULKAN" = xyes; then
 if test -n "$llvm_prefix"; then
 AC_PATH_TOOL([LLVM_CONFIG], [llvm-config], [no], ["$llvm_prefix/bin"])
 else
@@ -2357,10 +2361,7 @@ radeon_llvm_check() {
 else
 amdgpu_llvm_target_name='amdgpu'
 fi
-if test "x$enable_gallium_llvm" != "xyes"; then
-AC_MSG_ERROR([--enable-gallium-llvm is required when building $1])
-fi
-llvm_check_version_for "3" "6" "0" $1
+llvm_check_version_for $2 $3 $4 $1
 if test true && $LLVM_CONFIG --targets-built | grep -iqvw $amdgpu_llvm_target_name ; then
 AC_MSG_ERROR([LLVM $amdgpu_llvm_target_name not enabled in your LLVM build.])
 fi
@@ -2371,6 +2372,13 @@ radeon_llvm_check() {
 fi
 }
 
+radeon_gallium_llvm_check() {
+if test "x$enable_gallium_llvm" != "xyes"; then
+AC_MSG_ERROR([--enable-gallium-llvm is required when building $1])
+fi
+radeon_llvm_check $*
+}
+
 swr_llvm_check() {
 gallium_require_llvm $1
 if test ${LLVM_VERSION_INT} -lt 306; then
@@ -2455,7 +2463,7 @@ if test -n "$with_gallium_drivers"; then
 gallium_require_drm "Gallium R600"
 gallium_require_drm_loader
 if test "x$enable_opencl" = xyes; then
-radeon_llvm_check "r600g"
+radeon_gallium_llvm_check "r600g" "3" "6" "0"
 LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser"
 fi
 ;;
@@ -2465,7 +2473,7 @@ if test -n "$with_gallium_drivers"; then
 PKG_CHECK_MODULES([AMDGPU], [libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
 gallium_require_drm "radeonsi"
 gallium_require_drm_loader
-radeon_llvm_check "radeonsi"
+radeon_gallium_llvm_check "radeonsi" "3" "6" "0"
 require_egl_drm "radeonsi"
 ;;
 xnouveau)
@@ -2584,6 +2592,10 @@ if test "x$MESA_LLVM" != x0; then
 fi
 fi
 
+if test "x$HAVE_RADEON_VULKAN" != "x0"; then
+radeon_llvm_check "radv" "3" "9" "0"
+fi
+
 AM_CONDITIONAL(HAVE_GALLIUM_SVGA, test "x$HAVE_GALLIUM_SVGA" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_I915, test "x$HAVE_GALLIUM_I915" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_ILO, test "x$HAVE_GALLIUM_ILO" = xyes)
@@ -2621,8 +2633,13 @@ AM_CONDITIONAL(HAVE_R200_DRI, test x$HAVE_R200_DRI = xyes)
 AM_CONDITIONAL(HAVE_RADEON_DRI, test x$HAVE_RADEON_DRI = xyes)
 AM_CONDITIONAL(HAVE_SWRAST_DRI, test x$HAVE_SWRAST_DRI = xyes)
 
+AM_CONDITIONAL(HAVE_RADEON_VULKAN, test "x$HAVE_RADEON_VULKAN" = xyes)
 AM_CONDITIONAL(HAVE_INTEL_VULKAN, test "x$HAVE_INTEL_VULKAN" = xyes)
 
+AM_CONDITIONAL(HAVE_AMD_DRIVERS, test "x$HAVE_GALLIUM_R600" = xyes -o \
+  "x$HAVE_GALLIUM_RADEONSI" = xyes -o \
+  "x$HAVE_RADEON_VULKAN" = xyes)
+
 AM_CONDITIONAL(HAVE_INTEL_DRIVERS, test "x$HAVE_INTEL_VULKAN" = xyes -o \
 "x$HAVE_I965_DRI" = xyes)
 
@@ -2713,6 +2730,8 @@ dnl Substitute the config
 AC_CONFIG_FILES([Makefile
src/Makefile
src/amd/Makefile
+   src/amd/common/Makefile
+   src/amd/vulkan/Makefile
src/compiler/Makefile
src/egl/Makefile
src/egl/main/egl.pc
diff --git a/src/Makefile.am b/src/Makefile.am
index 551f431..1cb02c6 

[Mesa-dev] radv submission 2

2016-10-04 Thread Dave Airlie
Again I'm sure this will hit limits, and I've asked Michel to
drop the big patch before it gets here.

All of these are in a new branch radv-submit2.
https://github.com/airlied/mesa/commits/radv-submit2

I've incorporated all the feedback except two things:

a) #pragma once - this is used in lots of places already, if
someone enforces a paint color on the shed, then do so, but I don't
think radv should be the rallying cry for it.

b) HAVE_LLVM defines, in some places I've left these in, esp the
"common" code that we could possibly reuse in radeonsi in the future,
as we might need to use the common stuff with older LLVMs.

I think I've caught every other suggestion/comments on the way past.

Thanks for all the interest,
Dave.




Re: [Mesa-dev] [PATCH] radeonsi: fix texture border colors for compute shaders

2016-10-04 Thread Edward O'Callaghan
Acked-by: Edward O'Callaghan 

On 10/05/2016 10:51 AM, Marek Olšák wrote:
> From: Marek Olšák 
> 
> There are VM faults without this.
> 
> Cc: 12.0 
> ---
>  src/gallium/drivers/radeonsi/si_compute.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
> index 9a5a4a9..1d1df2f 100644
> --- a/src/gallium/drivers/radeonsi/si_compute.c
> +++ b/src/gallium/drivers/radeonsi/si_compute.c
> @@ -201,20 +201,21 @@ static void si_set_global_binding(
>   offset = util_le32_to_cpu(*handles[i]);
>   va += offset;
>   va = util_cpu_to_le64(va);
>   memcpy(handles[i], &va, sizeof(va));
>   }
>  }
>  
>  static void si_initialize_compute(struct si_context *sctx)
>  {
>   struct radeon_winsys_cs *cs = sctx->b.gfx.cs;
> + uint64_t bc_va;
>  
>   radeon_set_sh_reg_seq(cs, R_00B810_COMPUTE_START_X, 3);
>   radeon_emit(cs, 0);
>   radeon_emit(cs, 0);
>   radeon_emit(cs, 0);
>  
>   radeon_set_sh_reg_seq(cs, R_00B858_COMPUTE_STATIC_THREAD_MGMT_SE0, 2);
>   /* R_00B858_COMPUTE_STATIC_THREAD_MGMT_SE0 / SE1 */
>   radeon_emit(cs, S_00B858_SH0_CU_EN(0x) | 
> S_00B858_SH1_CU_EN(0x));
>   radeon_emit(cs, S_00B85C_SH0_CU_EN(0x) | 
> S_00B85C_SH1_CU_EN(0x));
> @@ -235,20 +236,31 @@ static void si_initialize_compute(struct si_context 
> *sctx)
>* which is now 0x22f.
>*/
>   if (sctx->b.chip_class <= SI) {
>   /* XXX: This should be:
>* (number of compute units) * 4 * (waves per simd) - 1 */
>  
>   radeon_set_sh_reg(cs, R_00B82C_COMPUTE_MAX_WAVE_ID,
> 0x190 /* Default value */);
>   }
>  
> + /* Set the pointer to border colors. */
> + bc_va = sctx->border_color_buffer->gpu_address;
> +
> + if (sctx->b.chip_class >= CIK) {
> + radeon_set_uconfig_reg_seq(cs, R_030E00_TA_CS_BC_BASE_ADDR, 2);
> + radeon_emit(cs, bc_va >> 8);  /* R_030E00_TA_CS_BC_BASE_ADDR */
> + radeon_emit(cs, bc_va >> 40); /* R_030E04_TA_CS_BC_BASE_ADDR_HI 
> */
> + } else {
> + radeon_set_config_reg(cs, R_00950C_TA_CS_BC_BASE_ADDR, bc_va >> 
> 8);
> + }
> +
>   sctx->cs_shader_state.emitted_program = NULL;
>   sctx->cs_shader_state.initialized = true;
>  }
>  
>  static bool si_setup_compute_scratch_buffer(struct si_context *sctx,
>  struct si_shader *shader,
>  struct si_shader_config *config)
>  {
>   uint64_t scratch_bo_size, scratch_needed;
>   scratch_bo_size = 0;
> 





Re: [Mesa-dev] [PATCH V2 10/11] genX/cmd_buffer: Enable fast depth clears

2016-10-04 Thread Nanley Chery
On Tue, Oct 04, 2016 at 03:55:13PM -0700, Chad Versace wrote:
> On Tue 04 Oct 2016, Nanley Chery wrote:
> > On Mon, Oct 03, 2016 at 06:21:30PM -0700, Jason Ekstrand wrote:
> > > On Mon, Oct 3, 2016 at 6:11 PM, Jason Ekstrand  
> > > wrote:
> > > 
> > > > On Tue, Sep 27, 2016 at 3:23 PM, Nanley Chery 
> > > > wrote:
> > > >
> > > >> On Tue, Sep 27, 2016 at 03:12:17PM -0700, Chad Versace wrote:
> > > >> > On Tue 27 Sep 2016, Nanley Chery wrote:
> > > >> > > On Tue, Sep 27, 2016 at 11:00:21AM -0700, Chad Versace wrote:
> > > >> >
> > > >> > > > As a consequence of that reasoning, we should set
> > > >> 3DSTATE_CLEAR_PARAMS.DepthClearValueValid = 1
> > > >> > > > whenever hiz is enabled, even if we don't care about the actual
> > > >> clear value.
> > > >>
> > > >
> > > > The logic seems to imply that we can't trust the context to save/restore
> > > > our depth clear value so we have to set it every time.  At the very 
> > > > least,
> > > > once per batch?  In any case, I doubt there's all that much cost 
> > > > involved
> > > > in emitting 3DSTATE_CLEAR_PARAMS so I don't think re-emitting it is that
> > > > big of a deal.
> > > >
> > > 
> > > Thinking about it a bit more...
> > > 
> > > > We only set up depth/stencil packets once per subpass and we only do clears
> > > once per subpass so... I don't think we're actually saving anything by
> > > emitting it at clear time rather than at depth/stencil setup time.  It is 
> > > a
> > > bit more convenient because the clear values may be more accessible at
> > > clear time.
> > > 
> > > As far as "should we emit 3DSTATE_CLEAR_PARAMS all the time?"  Let's not 
> > > go
> > > to any heroics to try and avoid re-emitting it.  Once per subpass is not a
> > > big deal at all.
> > > 
> > 
> > I wouldn't say the code to implement it was complex, but I'm fine with
> > trading off efficiency for simplicity here. I'll add a comment describing 
> > the
> > situation.
> 
> I'm happy with this conclusion.

After testing the V3, I found an interesting result. Emitting two
3DSTATE_CLEAR_PARAMS that both have DepthClearValueValid set to 1 is
causing Vulkan CTS failures on BDW+. The GPU doesn't seem to be picking
up on the latter packet's clear color. I'll look into this more
tomorrow.


Re: [Mesa-dev] [PATCH 1/4] amd/common: add nir->llvm translation.

2016-10-04 Thread Dave Airlie
On 4 October 2016 at 21:05, Emil Velikov  wrote:
> Hi Dave,
>
> On 4 October 2016 at 02:48, Dave Airlie  wrote:
>> From: Bas Nieuwenhuizen 
>>
>> This adds the basic files for the NIR->LLVM translation layer,
>> along with some hopefully generic code to load the binary
>> result, and other helpers required.
>>
>> The hope is in the future we could share this with an
>> GL_ARB_spirv implementation or a push to replace TGSI
>> with NIR in radeonsi.
>>
> Some of this code is copied from/based on existing one in mesa. Please
> mention so in the commit summary and the respective files ?

Okay, I've added that at the tops of the files. For some of them the code has
changed quite a bit, but at least two are definitely based on other files.

>> +AM_CPPFLAGS = \
>> +   $(VALGRIND_CFLAGS) \
>> +   $(DEFINES) \
>> +   -I$(top_srcdir)/include \
>> +   -I$(top_builddir)/src \
>> +   -I$(top_srcdir)/src \
>> +   -I$(top_builddir)/src/compiler \
>> +   -I$(top_builddir)/src/compiler/nir \
>> +   -I$(top_srcdir)/src/compiler \
>> +   -I$(top_srcdir)/src/mapi \
>> +   -I$(top_srcdir)/src/mesa \
>> +   -I$(top_srcdir)/src/mesa/drivers/dri/common \
>> +   -I$(top_srcdir)/src/gallium/auxiliary \
>> +   -I$(top_srcdir)/src/gallium/include
>> +
> I'm leaning that at least some of the above can be nuked, but if it's
> too much of a hassle just add a comment on top - XXX/TODO or other.

I've just added a TODO, it's all fun and games until you hit main/macros.h.

I'm envisaging a future where I'm not including this, anv uses macros from it,
but only includes it via the brw_compiler.h file. I think migrating more stuff
to util/macros.h might make life better.

>> +AM_CFLAGS = -Wno-override-init -msse2 \
> From a cursory skim through, neither of these two is required. Please drop
> them.

Done.

>
>> +   $(VISIBILITY_CFLAGS) \
>> +   $(PTHREAD_CFLAGS) \
>> +   $(LLVM_CFLAGS) \
>> +   $(LIBELF_CFLAGS)
>> +
>> +AM_CXXFLAGS = \
>> +   $(VISIBILITY_CXXFLAGS) \
>> +   $(MSVC2013_COMPAT_CXXFLAGS) \
> I don't think we're about to build things with MSVC anytime soon, so
> we can drop this line.

Done.

>> +AMD_COMPILER_SOURCES := \
>> +   ac_binary.c \
>> +   ac_llvm_util.c \
>> +   ac_nir_to_llvm.c \
>> +   ac_llvm_helper.cpp
> Please list all the files.
>
> ac_binary.c
> ac_binary.h
> ac_llvm_helper.cpp
> ac_llvm_util.c
> ac_llvm_util.h
> ac_nir_to_llvm.c
> ac_nir_to_llvm.h
> ac_radeon_winsys.h
>

I've realised I can kill ac_radeon_winsys.h, so I've done that
and listed the rest of the files as well.

>
>
>> +#pragma once
>> +
> Please don't use pragma once, but ifdef FOO/define FOO guards.

We have a lot of these in the tree already, I don't envisage building
this with any sort of old compilers since it depends on llvm 3.9,
which requires a modern compiler to build, but pragma once
goes back a long way, and unless someone wants to purge Mesa
of all of them I'd prefer to use them.

Dave.
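
For reference, the include-guard style requested in the review can be sketched
as below. This is only an illustration: the guard macro AC_EXAMPLE_H and the
function ac_example_answer() are made up for the example and are not part of
the patch. The guarded region is pasted twice to mimic one header being
included twice in the same translation unit.

```c
/* First "inclusion" of a hypothetical header. */
#ifndef AC_EXAMPLE_H
#define AC_EXAMPLE_H

/* Hypothetical helper; not a real Mesa function. */
static inline int ac_example_answer(void)
{
   return 42;
}

#endif /* AC_EXAMPLE_H */

/* Second "inclusion" of the same header: AC_EXAMPLE_H is already defined,
 * so the preprocessor skips this body and there is no redefinition error. */
#ifndef AC_EXAMPLE_H
#define AC_EXAMPLE_H

static inline int ac_example_answer(void)
{
   return 0;
}

#endif /* AC_EXAMPLE_H */
```

A single "#pragma once" line at the top of the header has the same effect on
compilers that support it, which is the trade-off being argued above.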


Re: [Mesa-dev] [PATCH 1/4] amd/common: add nir->llvm translation.

2016-10-04 Thread Dave Airlie
On 4 October 2016 at 20:09, Nicolai Hähnle  wrote:
> On 04.10.2016 03:48, Dave Airlie wrote:
>>
>> From: Bas Nieuwenhuizen 
>>
>> This adds the basic files for the NIR->LLVM translation layer,
>> along with some hopefully generic code to load the binary
>> result, and other helpers required.
>>
>> The hope is in the future we could share this with an
>> GL_ARB_spirv implementation or a push to replace TGSI
>> with NIR in radeonsi.
>
>
> Indeed. :)
>
> Skimming over this, two remarks:
>
> 1. In ac_binary.c, just include sid.h instead of duplicating #defines.
> 2. There are several files with missing copyright headers -- that's an
> absolute no-go.

Okay done.

>> +AM_CFLAGS = -Wno-override-init -msse2 \
>
>
> Why the -Wno-override-init? The -msse2 is probably fine, but let's not leak
> it to someone who uses r300 on an ancient system...

Possibly copied from anv; I've dropped them for now.
>>
>> diff --git a/src/amd/common/ac_llvm_helper.cpp
>> b/src/amd/common/ac_llvm_helper.cpp
>> new file mode 100644
>> index 000..feafdaf
>> --- /dev/null
>> +++ b/src/amd/common/ac_llvm_helper.cpp
>> @@ -0,0 +1,22 @@
>> +// Workaround http://llvm.org/PR23628
>> +#if HAVE_LLVM >= 0x0307
>> +#  pragma push_macro("DEBUG")
>> +#  undef DEBUG
>> +#endif
>> +
>> +#include "ac_nir_to_llvm.h"
>> +#include 
>> +#include 
>> +#include 
>> +
>> +extern "C" void
>> +ac_add_attr_dereferenceable(LLVMValueRef val, uint64_t bytes)
>> +{
>> +#if HAVE_LLVM >= 0x0306
>
>
> We only build with LLVM >= 3.6, so drop this check.

Well, in the common code it would be nice in the future to share it with radeonsi
if we can, and that probably means having it work on earlier LLVM, so where I've
copied code from other places that does LLVM version checks I've kept them
for now. So in the Vulkan driver itself I think dropping them is fine; in the
common code I'd like to keep them for now.

Dave.
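
A minimal sketch of the kind of version check under discussion, assuming
HAVE_LLVM encodes the version as (major << 8) | minor (so 0x0306 would be
LLVM 3.6, which is what the quoted checks suggest). The helper name
example_have_dereferenceable_attr() is hypothetical, and the fallback
definition of HAVE_LLVM exists only so the sketch compiles on its own.

```c
#include <stdbool.h>

/* Assumption for this sketch: the build system normally defines HAVE_LLVM.
 * Default it to 0x0309 (LLVM 3.9) so the example is self-contained. */
#ifndef HAVE_LLVM
#define HAVE_LLVM 0x0309
#endif

/* Hypothetical helper: reports whether the code path guarded by the
 * version check would be compiled in. */
static bool example_have_dereferenceable_attr(void)
{
#if HAVE_LLVM >= 0x0306
   return true;   /* newer LLVM: the guarded code is built */
#else
   return false;  /* older LLVM: the guarded code is compiled out */
#endif
}
```

Keeping the check costs one preprocessor conditional per call site; dropping
it ties the shared code to the minimum LLVM version of the strictest user.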


[Mesa-dev] [PATCH] radeonsi: fix texture border colors for compute shaders

2016-10-04 Thread Marek Olšák
From: Marek Olšák 

There are VM faults without this.

Cc: 12.0 
---
 src/gallium/drivers/radeonsi/si_compute.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 9a5a4a9..1d1df2f 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -201,20 +201,21 @@ static void si_set_global_binding(
offset = util_le32_to_cpu(*handles[i]);
va += offset;
va = util_cpu_to_le64(va);
memcpy(handles[i], &va, sizeof(va));
}
 }
 
 static void si_initialize_compute(struct si_context *sctx)
 {
struct radeon_winsys_cs *cs = sctx->b.gfx.cs;
+   uint64_t bc_va;
 
radeon_set_sh_reg_seq(cs, R_00B810_COMPUTE_START_X, 3);
radeon_emit(cs, 0);
radeon_emit(cs, 0);
radeon_emit(cs, 0);
 
radeon_set_sh_reg_seq(cs, R_00B858_COMPUTE_STATIC_THREAD_MGMT_SE0, 2);
/* R_00B858_COMPUTE_STATIC_THREAD_MGMT_SE0 / SE1 */
radeon_emit(cs, S_00B858_SH0_CU_EN(0x) | 
S_00B858_SH1_CU_EN(0x));
radeon_emit(cs, S_00B85C_SH0_CU_EN(0x) | 
S_00B85C_SH1_CU_EN(0x));
@@ -235,20 +236,31 @@ static void si_initialize_compute(struct si_context *sctx)
 * which is now 0x22f.
 */
if (sctx->b.chip_class <= SI) {
/* XXX: This should be:
 * (number of compute units) * 4 * (waves per simd) - 1 */
 
radeon_set_sh_reg(cs, R_00B82C_COMPUTE_MAX_WAVE_ID,
  0x190 /* Default value */);
}
 
+   /* Set the pointer to border colors. */
+   bc_va = sctx->border_color_buffer->gpu_address;
+
+   if (sctx->b.chip_class >= CIK) {
+   radeon_set_uconfig_reg_seq(cs, R_030E00_TA_CS_BC_BASE_ADDR, 2);
+   radeon_emit(cs, bc_va >> 8);  /* R_030E00_TA_CS_BC_BASE_ADDR */
+   radeon_emit(cs, bc_va >> 40); /* R_030E04_TA_CS_BC_BASE_ADDR_HI 
*/
+   } else {
+   radeon_set_config_reg(cs, R_00950C_TA_CS_BC_BASE_ADDR, bc_va >> 
8);
+   }
+
sctx->cs_shader_state.emitted_program = NULL;
sctx->cs_shader_state.initialized = true;
 }
 
 static bool si_setup_compute_scratch_buffer(struct si_context *sctx,
 struct si_shader *shader,
 struct si_shader_config *config)
 {
uint64_t scratch_bo_size, scratch_needed;
scratch_bo_size = 0;
-- 
2.7.4
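
The register values written by the patch can be illustrated with a small
sketch. It assumes that each emitted dword is the low 32 bits of the shifted
address and that the border color buffer is 256-byte aligned (the low 8 bits
are shifted out). The struct and function names are hypothetical and are not
part of radeonsi.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two register values. */
struct example_bc_regs {
   uint32_t base_addr;     /* bits 8..39 of the GPU address */
   uint32_t base_addr_hi;  /* bits 40 and up (written on CIK and later) */
};

static struct example_bc_regs example_split_bc_va(uint64_t bc_va)
{
   struct example_bc_regs regs;

   /* Assumption: the buffer is 256-byte aligned, so no address bits
    * are lost when the low 8 bits are dropped. */
   assert((bc_va & 0xff) == 0);

   regs.base_addr = (uint32_t)(bc_va >> 8);
   regs.base_addr_hi = (uint32_t)(bc_va >> 40);
   return regs;
}
```

On SI only the first value is written, which matches the single
radeon_set_config_reg() call in the else branch of the patch.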



Re: [Mesa-dev] [PATCH 05/11] nir: Add a LCSAA-pass

2016-10-04 Thread Jason Ekstrand
On Fri, Sep 16, 2016 at 6:24 AM, Timothy Arceri <
timothy.arc...@collabora.com> wrote:

> From: Thomas Helland 
>
> V2: Do a "depth first search" to convert to LCSSA
>
> V3: Small comment fixup
>
> V4: Rebase, adapt to removal of function overloads
>
> V5: Rebase, adapt to relocation of nir to compiler/nir
> Still need to adapt to potential if-uses
> Work around nir_validate issue
>
> V6 (Timothy):
>  - tidy lcssa and stop leaking memory
>  - don't rewrite the src for the lcssa phi node
>  - validate lcssa phi srcs to avoid postvalidate assert
>  - don't add new phi if one already exists
>  - more lcssa phi validation fixes
>  - Rather than marking ssa defs inside a loop just mark blocks inside
>a loop. This is simpler and fixes lcssa for intrinsics which do
>not have a destination.
>  - don't create LCSSA phis for loops we won't unroll
>  - require loop metadata for lcssa pass
>  - handle the case where the ssa def's use outside the loop is already a phi
>
> V7: (Timothy)
> - pass indirect mask to metadata call
> ---
>  src/compiler/Makefile.sources   |   1 +
>  src/compiler/nir/nir.h  |   6 ++
>  src/compiler/nir/nir_to_lcssa.c | 227 ++
> ++
>  src/compiler/nir/nir_validate.c |  11 +-
>  4 files changed, 242 insertions(+), 3 deletions(-)
>  create mode 100644 src/compiler/nir/nir_to_lcssa.c
>
> diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
> index 7ed26a9..8ef6080 100644
> --- a/src/compiler/Makefile.sources
> +++ b/src/compiler/Makefile.sources
> @@ -247,6 +247,7 @@ NIR_FILES = \
> nir/nir_search_helpers.h \
> nir/nir_split_var_copies.c \
> nir/nir_sweep.c \
> +   nir/nir_to_lcssa.c \
> nir/nir_to_ssa.c \
> nir/nir_validate.c \
> nir/nir_vla.h \
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index cc8f4b6..29a6f45 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -1387,6 +1387,8 @@ typedef struct {
> struct exec_list srcs; /** < list of nir_phi_src */
>
> nir_dest dest;
> +
> +   bool is_lcssa_phi;
>  } nir_phi_instr;
>
>  typedef struct {
> @@ -2643,6 +2645,10 @@ void nir_convert_to_ssa(nir_shader *shader);
>  bool nir_repair_ssa_impl(nir_function_impl *impl);
>  bool nir_repair_ssa(nir_shader *shader);
>
> +void nir_to_lcssa_impl(nir_function_impl *impl,
> +   nir_variable_mode indirect_mask);
> +void nir_to_lcssa(nir_shader *shader, nir_variable_mode indirect_mask);
> +
>  /* If phi_webs_only is true, only convert SSA values involved in phi
> nodes to
>   * registers.  If false, convert all values (even those not involved in a
> phi
>   * node) to registers.
> diff --git a/src/compiler/nir/nir_to_lcssa.c b/src/compiler/nir/nir_to_
> lcssa.c
> new file mode 100644
> index 000..25d0bdb
> --- /dev/null
> +++ b/src/compiler/nir/nir_to_lcssa.c
> @@ -0,0 +1,227 @@
> +/*
> + * Copyright © 2015 Thomas Helland
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +/*
> + * This pass converts the ssa-graph into "Loop Closed SSA form". This is
> + * done by placing phi nodes at the exits of the loop for all values
> + * that are used outside the loop. The result is it transforms:
> + *
> + * loop {                    ->      loop {
> + *    ssa2 = ...             ->         ssa2 = ...
> + *    if (cond)              ->         if (cond) {
> + *       break;              ->            break;
> + *    ssa3 = ssa2 * ssa4     ->         }
> + * }                         ->         ssa3 = ssa2 * ssa4
> + * ssa6 = ssa2 + 4           ->      }
> + *                                   ssa5 = lcssa_phi(ssa2)
> + *                                   ssa6 = ssa5 + 4
> + */
>

Let me make sure I understand this correctly.  The point here seems to be
to ensure 

Re: [Mesa-dev] [PATCH v2 01/13] anv: Use blorp for VkCmdFillBuffer

2016-10-04 Thread Nanley Chery
On Mon, Oct 03, 2016 at 02:22:15PM -0700, Jason Ekstrand wrote:
> Signed-off-by: Jason Ekstrand 
> ---
>  src/intel/vulkan/anv_blorp.c  | 106 +
>  src/intel/vulkan/anv_meta_clear.c | 120 
> --
>  2 files changed, 96 insertions(+), 130 deletions(-)
> 

This patch is
Reviewed-by: Nanley Chery 

> diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> index cb61070..f149f84 100644
> --- a/src/intel/vulkan/anv_blorp.c
> +++ b/src/intel/vulkan/anv_blorp.c
> @@ -480,6 +480,20 @@ void anv_CmdBlitImage(
> blorp_batch_finish();
>  }
>  
> +static enum isl_format
> +isl_format_for_size(unsigned size_B)
> +{
> +   switch (size_B) {
> +   case 1:  return ISL_FORMAT_R8_UINT;
> +   case 2:  return ISL_FORMAT_R8G8_UINT;
> +   case 4:  return ISL_FORMAT_R8G8B8A8_UINT;
> +   case 8:  return ISL_FORMAT_R16G16B16A16_UINT;
> +   case 16: return ISL_FORMAT_R32G32B32A32_UINT;
> +   default:
> +  unreachable("Not a power-of-two format size");
> +   }
> +}
> +
>  static void
>  do_buffer_copy(struct blorp_batch *batch,
> struct anv_bo *src, uint64_t src_offset,
> @@ -491,16 +505,7 @@ do_buffer_copy(struct blorp_batch *batch,
> /* The actual format we pick doesn't matter as blorp will throw it away.
>  * The only thing that actually matters is the size.
>  */
> -   enum isl_format format;
> -   switch (block_size) {
> -   case 1:  format = ISL_FORMAT_R8_UINT;  break;
> -   case 2:  format = ISL_FORMAT_R8G8_UINT;break;
> -   case 4:  format = ISL_FORMAT_R8G8B8A8_UNORM;   break;
> -   case 8:  format = ISL_FORMAT_R16G16B16A16_UNORM;   break;
> -   case 16: format = ISL_FORMAT_R32G32B32A32_UINT;break;
> -   default:
> -  unreachable("Not a power-of-two format size");
> -   }
> +   enum isl_format format = isl_format_for_size(block_size);
>  
> struct isl_surf surf;
> isl_surf_init(>isl_dev, ,
> @@ -667,6 +672,87 @@ void anv_CmdUpdateBuffer(
> blorp_batch_finish();
>  }
>  
> +void anv_CmdFillBuffer(
> +VkCommandBuffer commandBuffer,
> +VkBufferdstBuffer,
> +VkDeviceSizedstOffset,
> +VkDeviceSizefillSize,
> +uint32_tdata)
> +{
> +   ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
> +   ANV_FROM_HANDLE(anv_buffer, dst_buffer, dstBuffer);
> +   struct blorp_surf surf;
> +   struct isl_surf isl_surf;
> +
> +   struct blorp_batch batch;
> +   blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer);
> +
> +   if (fillSize == VK_WHOLE_SIZE) {
> +  fillSize = dst_buffer->size - dstOffset;
> +  /* Make sure fillSize is a multiple of 4 */
> +  fillSize &= ~3ull;
> +   }
> +
> +   /* First, we compute the biggest format that can be used with the
> +* given offsets and size.
> +*/
> +   int bs = 16;
> +   bs = gcd_pow2_u64(bs, dstOffset);
> +   bs = gcd_pow2_u64(bs, fillSize);
> +   enum isl_format isl_format = isl_format_for_size(bs);
> +
> +   union isl_color_value color = {
> +  .u32 = { data, data, data, data },
> +   };
> +
> +   const uint64_t max_fill_size = MAX_SURFACE_DIM * MAX_SURFACE_DIM * bs;
> +   while (fillSize >= max_fill_size) {
> +  get_blorp_surf_for_anv_buffer(cmd_buffer->device,
> +dst_buffer, dstOffset,
> +MAX_SURFACE_DIM, MAX_SURFACE_DIM,
> +MAX_SURFACE_DIM * bs, isl_format,
> +&surf, &isl_surf);
> +
> +  blorp_clear(&batch, &surf, isl_format, ISL_SWIZZLE_IDENTITY,
> +  0, 0, 1, 0, 0, MAX_SURFACE_DIM, MAX_SURFACE_DIM,
> +  color, NULL);
> +  fillSize -= max_fill_size;
> +  dstOffset += max_fill_size;
> +   }
> +
> +   uint64_t height = fillSize / (MAX_SURFACE_DIM * bs);
> +   assert(height < MAX_SURFACE_DIM);
> +   if (height != 0) {
> +  const uint64_t rect_fill_size = height * MAX_SURFACE_DIM * bs;
> +  get_blorp_surf_for_anv_buffer(cmd_buffer->device,
> +dst_buffer, dstOffset,
> +MAX_SURFACE_DIM, height,
> +MAX_SURFACE_DIM * bs, isl_format,
> +&surf, &isl_surf);
> +
> +  blorp_clear(&batch, &surf, isl_format, ISL_SWIZZLE_IDENTITY,
> +  0, 0, 1, 0, 0, MAX_SURFACE_DIM, height,
> +  color, NULL);
> +  fillSize -= rect_fill_size;
> +  dstOffset += rect_fill_size;
> +   }
> +
> +   if (fillSize != 0) {
> +  const uint32_t width = fillSize / bs;
> +  get_blorp_surf_for_anv_buffer(cmd_buffer->device,
> +dst_buffer, dstOffset,
> +width, 1,
> +   

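The block-size selection and the three-stage split in the quoted
anv_CmdFillBuffer can be sketched in isolation as follows. This is a
simplified model, not the driver code: every example_* name is hypothetical,
EXAMPLE_MAX_SURFACE_DIM is assumed to be 1 << 14, and the power-of-two GCD is
computed here as the lowest set bit of (a | b), which may differ from how
Mesa's gcd_pow2_u64 is written.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_MAX_SURFACE_DIM (1ull << 14) /* assumed value */

/* Largest power of two dividing both a and b: the lowest set bit of a | b. */
static uint64_t example_gcd_pow2_u64(uint64_t a, uint64_t b)
{
   assert(a != 0 || b != 0);
   uint64_t bits = a | b;
   return bits & (~bits + 1);
}

/* Biggest texel size (at most 16 bytes) usable with this offset and size. */
static unsigned example_fill_block_size(uint64_t dst_offset, uint64_t fill_size)
{
   uint64_t bs = 16;
   bs = example_gcd_pow2_u64(bs, dst_offset);
   bs = example_gcd_pow2_u64(bs, fill_size);
   return (unsigned)bs;
}

struct example_fill_plan {
   uint64_t full_squares;    /* MAX_DIM x MAX_DIM clears */
   uint64_t rect_height;     /* rows of one MAX_DIM-wide clear */
   uint64_t last_row_width;  /* texels in the final single-row clear */
};

static struct example_fill_plan
example_plan_fill(uint64_t fill_size, unsigned bs)
{
   const uint64_t row_bytes = EXAMPLE_MAX_SURFACE_DIM * bs;
   const uint64_t max_fill_size = EXAMPLE_MAX_SURFACE_DIM * row_bytes;
   struct example_fill_plan plan;

   plan.full_squares = fill_size / max_fill_size;
   fill_size %= max_fill_size;
   plan.rect_height = fill_size / row_bytes;
   fill_size %= row_bytes;
   plan.last_row_width = fill_size / bs;
   return plan;
}
```

For example, with a 4-byte block size a fill of 2 GiB + 3 rows + 20 bytes
becomes two full squares, one 3-row rectangle and one 5-texel row.
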
[Mesa-dev] [Bug 98048] Mesa CANNOT use libpthread-stubs

2016-10-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98048

Ian Romanick  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jfons...@vmware.com,
                   |                            |jon.tur...@dronecode.org.uk
                   |                            |, matts...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 98048] Mesa CANNOT use libpthread-stubs

2016-10-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98048

            Bug ID: 98048
           Summary: Mesa CANNOT use libpthread-stubs
           Product: Mesa
           Version: git
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Mesa core
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: i...@freedesktop.org
        QA Contact: mesa-dev@lists.freedesktop.org

At least as far back as ecaa81b, Mesa has used pthread_mutexattr_init and
pthread_mutexattr_settype.  Neither of these functions is available in libc6,
so both are supplied by libpthread-stubs.  However, the implementations in
libpthread-stubs don't do anything.  That's a huge problem because
pthread_mutex_init expects that pthread_mutexattr_t to contain valid data.

I have observed two adverse effects of this in piglit runs on multiple systems.

1. On softpipe, there are numerous sporadic crashes inside malloc or free (see
file:///home/idr/devel/graphics/piglit-results/results/problems.html).  There
are also numerous valgrind warnings like:

==27195== Conditional jump or move depends on uninitialised value(s)
==27195==    at 0x99E82A2: pthread_mutex_init (in /usr/lib64/libpthread-2.22.so)
==27195==    by 0xBA9EA1D: mtx_init (threads_posix.h:217)
==27195==    by 0xBA9EA1D: _mesa_alloc_shared_state (shared.c:122)
==27195==    by 0xB9DA749: _mesa_initialize_context (context.c:1188)
==27195==    by 0xBB4478F: st_create_context (st_context.c:544)
==27195==    by 0xBB6C64D: st_api_create_context (st_manager.c:669)
==27195==    by 0xBCA6DD6: dri_create_context (dri_context.c:123)
==27195==    by 0xBCA63EE: driCreateContextAttribs (dri_util.c:448)
==27195==    by 0x52515E8: drisw_create_context_attribs (drisw_glx.c:476)
==27195==    by 0x522B5A2: glXCreateContextAttribsARB (create_context.c:78)
==27195==    by 0x6413353: ??? (in /usr/lib64/libwaffle-1.so.0.5.0)
==27195==    by 0x640F394: waffle_context_create (in /usr/lib64/libwaffle-1.so.0.5.0)
==27195==    by 0x4F72238: make_context_current_singlepass (piglit_wfl_framework.c:476)


2. On NV20 I have observed semi-random deadlocks from within meta.  I have a
patch to work around these, but I now believe the real problem is the context
shared texture mutex is not being created PTHREAD_MUTEX_RECURSIVE because
pthread_mutexattr_settype does nothing.
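
The failure mode can be checked outside Mesa with a small sketch like the one
below. It assumes the program is linked against a real pthread implementation;
the example_* names are made up for illustration. The second lock uses
pthread_mutex_trylock so that a mutex that silently ended up non-recursive
reports an error instead of deadlocking.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>

/* Returns 0 on success or the first pthread error code encountered. */
static int example_recursive_mutex_init(pthread_mutex_t *mtx)
{
   pthread_mutexattr_t attr;
   int err;

   err = pthread_mutexattr_init(&attr);
   if (err != 0)
      return err;

   err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
   if (err == 0)
      err = pthread_mutex_init(mtx, &attr);

   pthread_mutexattr_destroy(&attr);
   return err;
}

/* Locks the mutex twice from the same thread. Returns 0 only if both
 * acquisitions succeed, i.e. the mutex really is recursive. */
static int example_lock_twice(void)
{
   pthread_mutex_t mtx;
   int err = example_recursive_mutex_init(&mtx);
   if (err != 0)
      return err;

   err = pthread_mutex_lock(&mtx);
   if (err == 0) {
      int err2 = pthread_mutex_trylock(&mtx);
      if (err2 == 0)
         pthread_mutex_unlock(&mtx);
      pthread_mutex_unlock(&mtx);
      err = err2;
   }

   pthread_mutex_destroy(&mtx);
   return err;
}
```

If the attribute calls resolve to do-nothing stubs, as described above, the
attribute object is never properly set up and the second acquisition is not
guaranteed to succeed.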

piglit tests link with libpthread.  I have confirmed that the versions from
libpthread-stubs are still called from within Mesa:

[idr@dynamic104 piglit]$ LIBGL_ALWAYS_SOFTWARE=y gdb --args bin/asmparsertest
ARBvp1.0 tests/asmparsertest/shaders/ARBvp1.0/sne-02.txt -auto -fbo
GNU gdb (GDB) Fedora 7.10.1-31.fc23
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from bin/asmparsertest...done.
(gdb) b pthread_mutexattr_init
Function "pthread_mutexattr_init" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (pthread_mutexattr_init) pending.
(gdb) r
Starting program: /home/idr/devel/graphics/piglit/bin/asmparsertest ARBvp1.0
/home/idr/devel/graphics/piglit/tests/asmparsertest/shaders/ARBvp1.0/sne-02.txt
-auto -fbo
Missing separate debuginfos, use: dnf debuginfo-install
glibc-2.22-18.fc23.x86_64
warning: Unable to find libthread_db matching inferior's thread library, thread
debugging will not be available.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
couldn't open libtxc_dxtn.so, software DXTn compression/decompression
unavailable

Breakpoint 1, __pthread_zero_stub () at stubs.c:203
203 }
(gdb) bt
#0  __pthread_zero_stub () at stubs.c:203
#1  0x70fb6a06 in mtx_init (type=4, mtx=0x6fb9d8) at
../../include/c11/threads_posix.h:215
#2  _mesa_alloc_shared_state (ctx=ctx@entry=0x77fd0010) at
main/shared.c:122
#3  0x70ef274a in _mesa_initialize_context
(ctx=ctx@entry=0x77fd0010, api=api@entry=API_OPENGL_COMPAT,
visual=visual@entry=0x7fffd140, share_list=share_list@entry=0x0,
driverFunctions=driverFunctions@entry=0x7fffcbe0) at main/context.c:1188
#4  0x7105c790 in st_create_context (api=api@entry=API_OPENGL_COMPAT,
pipe=pipe@entry=0x63a2e0, visual=visual@entry=0x7fffd140,
share=share@entry=0x0, options=options@entry=0x7fffd278) at

Re: [Mesa-dev] [PATCH V2 08/11] anv/cmd_buffer: Add code for performing HZ operations

2016-10-04 Thread Chad Versace
On Tue 27 Sep 2016, Nanley Chery wrote:
> On Tue, Sep 27, 2016 at 11:00:14AM -0700, Chad Versace wrote:
> > On Mon 26 Sep 2016, Nanley Chery wrote:
> > > Create a function that performs one of three HiZ operations -
> > > depth/stencil clears, HiZ resolve, and depth resolves.
> > > 
> > > Signed-off-by: Nanley Chery 
> > > 
> > > ---
> > > 
> > > v2. Add documentation
> > > Fix the alignment check
> > > Don't minify clear rectangle (Jason)
> > > Use blorp enums (Jason)
> > > Enable depth stalls and flushes
> > > Use full RT rectangle for resolve ops
> > > Add stencil clear todo
> > > 
> > >  src/intel/vulkan/anv_genX.h|   3 +
> > >  src/intel/vulkan/gen7_cmd_buffer.c |   6 ++
> > >  src/intel/vulkan/gen8_cmd_buffer.c | 167 
> > > +
> > >  3 files changed, 176 insertions(+)
> > 
> > 
> > 
> > > +/**
> > > + * Emit the HZ_OP packet in the sequence specified by the BDW PRM section
> > > + * entitled: "Optimized Depth Buffer Clear and/or Stencil Buffer Clear."
> > > + *
> > > + * \todo Enable Stencil Buffer-only clears
> > > + */
> > > +void
> > > +genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer *cmd_buffer,
> > > +  enum blorp_hiz_op op)
> > > +{
> > 
> > All other "emission" functions in gen8_cmd_buffer.c are named
> > > gen8_cmd_buffer_emit_foo(). I think this function should be named
> > gen8_cmd_buffer_emit_hz_op for consistency.
> > 
> 
> Sounds good. I'll fix that in the v3.

Ok.
> 
> > > > +   struct anv_cmd_state *cmd_state = &cmd_buffer->state;
> > > +   const struct anv_image_view *iview =
> > > +  anv_cmd_buffer_get_depth_stencil_view(cmd_buffer);
> > > +
> > > +   if (iview == NULL || !anv_image_has_hiz(iview->image))
> > > +  return;
> > 
> > Shouldn't this check for subpass_count > 1, like the previous patches
> > do?
> > 
> 
> The following patch in the series adds this check. [...]

Ok.

> > > +  if (op != BLORP_HIZ_OP_DEPTH_CLEAR) {
> > > + /* The Optimized HiZ resolve rectangle must be the size of the 
> > > full RT
> > > +  * and aligned to 8x4. The non-optimized Depth resolve 
> > > rectangle must
> > > +  * be the size of the full RT. The same alignment is assumed to 
> > > be
> > > +  * required.
> > > +  *
> > > +  * TODO:
> > > +  * Consider changing halign of non-D16 depth formats to 8 as 
> > > mip 2 may
> > > +  * get clobbered.
> > 
> > Jason and I did some experiments on BDW and SKL. The SKL hardware aligns
> > the hiz surface correctly for all miplevels, so the clobbered-miplevel-2
> > issue is a non-issue. If I recall correctly, BDW hardware also
> > eliminates the clobbered-miplevel-2 issue; but I'm not 100% sure, so ask
> > Jason. Pre-gen8 definitely suffers from the clobbered-miplevel-2 issue.
> > It would be very very good to list in the comment which hardware does
> > and does not suffer from the issue, as that's not documented anywhere.
> > 
> 
> I'd like to have these findings documented as well. However, collecting
> that data would require creating new tests and testing on platforms that are
> outside the scope of this series (single-miplevel, gen8+ HiZ). I wouldn't
> mind doing this in a series where multiple mip-level support is added though.
> Would it be better to keep the TODO task locally and omit the comment?

I see no harm in keeping the TODO in the code. I also see no harm in
removing it. It's up to you.

> 
> > > +  */
> > 
> > > For readability, please explicitly do
> > 
> > hzp.ClearRectangleXMin = 0;
> > hzp.ClearRectangleYMin = 0;
> > 
> 
> This will be present in the v3.

Ok.
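
As a rough sketch of the resolve rectangle discussed above: the minimum
corner is set to 0 explicitly and the extent of the full render target is
rounded up to the 8x4 alignment. The struct is a hypothetical stand-in for
the packet fields, and treating the maximum as an exclusive bound is an
assumption of this example rather than something stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the rectangle fields of the HZ_OP packet. */
struct example_hz_rect {
   uint32_t x_min, y_min, x_max, y_max;
};

/* Round v up to a multiple of a; a must be a power of two. */
static uint32_t example_align_u32(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

static struct example_hz_rect
example_resolve_rect(uint32_t rt_width, uint32_t rt_height)
{
   struct example_hz_rect r;

   r.x_min = 0;                                /* explicit, for readability */
   r.y_min = 0;
   r.x_max = example_align_u32(rt_width, 8);   /* 8 pixels horizontally */
   r.y_max = example_align_u32(rt_height, 4);  /* 4 pixels vertically */
   return r;
}
```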


Re: [Mesa-dev] [PATCH] aubinator: use the correct format specifier for printing ptrdiff_t.

2016-10-04 Thread Timothy Arceri
Reviewed-by: Timothy Arceri 



Re: [Mesa-dev] [PATCH V2 05/11] anv: Allocate hiz surface

2016-10-04 Thread Jason Ekstrand
On Tue, Oct 4, 2016 at 3:48 PM, Nanley Chery  wrote:

> On Mon, Oct 03, 2016 at 04:27:53PM -0700, Jason Ekstrand wrote:
> > On Mon, Sep 26, 2016 at 5:10 PM, Nanley Chery 
> wrote:
> >
> > > From: Chad Versace 
> > >
> > > Nanley Chery:
> > > (rebase)
> > >  - Use isl_surf_get_hiz_surf()
> > > (amend)
> > >  - Only add a HiZ surface onto a depth/stencil attachment
> > >  - Add comment above HiZ surface addition
> > >  - Hide HiZ behind INTEL_VK_HIZ prior to BDW
> > >  - Disable HiZ for untested cases
> > >  - Remove DISABLE_AUX_BIT instead of preventing it from being added
> > >
> > > Signed-off-by: Nanley Chery 
> > > Reviewed-by: Jason Ekstrand 
> > > Reviewed-by: Chad Versace  (v1)
> > >
> > > ---
> > >
> > > v2: Disable certain HiZ cases here (Jason)
> > >
> > >  src/intel/vulkan/anv_image.c | 39 ++
> ++---
> > >  1 file changed, 36 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/src/intel/vulkan/anv_image.c
> b/src/intel/vulkan/anv_image.c
> > > index f6e8672..d408819 100644
> > > --- a/src/intel/vulkan/anv_image.c
> > > +++ b/src/intel/vulkan/anv_image.c
> > > @@ -28,6 +28,7 @@
> > >  #include 
> > >
> > >  #include "anv_private.h"
> > > +#include "util/debug.h"
> > >
> > >  #include "vk_format_info.h"
> > >
> > > @@ -60,6 +61,7 @@ choose_isl_surf_usage(VkImageUsageFlags vk_usage,
> > >default:
> > >   unreachable("bad VkImageAspect");
> > >case VK_IMAGE_ASPECT_DEPTH_BIT:
> > > + isl_usage &= ~ISL_SURF_USAGE_DISABLE_AUX_BIT;
> > >   isl_usage |= ISL_SURF_USAGE_DEPTH_BIT;
> > >   break;
> > >case VK_IMAGE_ASPECT_STENCIL_BIT:
> > > @@ -99,6 +101,16 @@ get_surface(struct anv_image *image,
> > > VkImageAspectFlags aspect)
> > > }
> > >  }
> > >
> > > +static void
> > > +add_surface(struct anv_image *image, struct anv_surface *surf)
> > > +{
> > > +   assert(surf->isl.size > 0); /* isl surface must be initialized */
> > > +
> > > +   surf->offset = align_u32(image->size, surf->isl.alignment);
> > > +   image->size = surf->offset + surf->isl.size;
> > > +   image->alignment = MAX(image->alignment, surf->isl.alignment);
> > > +}
> > > +
> > >  /**
> > >   * Initialize the anv_image::*_surface selected by \a aspect. Then
> update
> > > the
> > >   * image's memory requirements (that is, the image's size and
> alignment).
> > > @@ -160,9 +172,30 @@ make_surface(const struct anv_device *dev,
> > >  */
> > > assert(ok);
> > >
> > > -   anv_surf->offset = align_u32(image->size, anv_surf->isl.alignment);
> > > -   image->size = anv_surf->offset + anv_surf->isl.size;
> > > -   image->alignment = MAX(image->alignment, anv_surf->isl.alignment);
> > > +   add_surface(image, anv_surf);
> > >
> >
> > In my CCS series, I split this in two and had a precursor patch that
> added
> > the add_surface helper.  Do with that information what you will.  I'm
> fine
> > with having it all in one patch.
> >
> >
> > > +
> > > +   /* Allow the user to control HiZ enabling. Disable by default on
> gen7
> > > +* because resolves are not currently implemented pre-BDW.
> > > +*/
> > > +   if (!env_var_as_boolean("INTEL_VK_HIZ", dev->info.gen >= 8)) {
> > > +  anv_finishme("Implement gen7 HiZ");
> > > +  return VK_SUCCESS;
> > > +   } else if (vk_info->mipLevels > 1) {
> > > +  anv_finishme("Test multi-LOD HiZ");
> > > +  return VK_SUCCESS;
> > > +   } else if (dev->info.gen == 8 && vk_info->samples > 1) {
> > > +  anv_finishme("Test gen8 multisampled HiZ");
> > > +  return VK_SUCCESS;
> > > +   }
> > >
> >
> > It may be better to pull this (and the usage check below) into an
> > image_supports_hiz helper.  The early returns work for now but as soon as
> > we have multiple kinds of aux surfaces, we'll have to do that refactor
> > anyway.
> >
> >
>
> I'll nest the hunk above into the following if statement. That should
> prevent conflicts with other aux surfaces.
>

I suppose that works.  If it's a problem, we can always do that refactor
later.
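
For reference, a rough sketch of what an image_supports_hiz helper of the
kind discussed above might look like. This is only an illustration: the
struct hiz_query type and all of its fields are made up here, and the real
code would presumably read these values from the device, the image and the
create info instead. The conditions mirror the checks in the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, pared-down stand-in for the inputs the driver looks at. */
struct hiz_query {
   int gen;                           /* hardware generation, e.g. 7 or 8 */
   uint32_t mip_levels;               /* number of LODs in the image */
   uint32_t samples;                  /* sample count */
   bool is_depth_aspect;              /* aspect is the depth aspect */
   bool is_depth_stencil_attachment;  /* image will be rendered to as depth */
   bool hiz_enabled;                  /* result of the INTEL_VK_HIZ lookup */
};

/* Return true only when every condition from the patch allows HiZ. */
static bool
image_supports_hiz(const struct hiz_query *q)
{
   /* HiZ only makes sense on a depth buffer used for rendering. */
   if (!q->is_depth_aspect || !q->is_depth_stencil_attachment)
      return false;

   /* User override; the patch defaults this to "gen >= 8". */
   if (!q->hiz_enabled)
      return false;

   /* Multi-LOD HiZ is untested in the patch. */
   if (q->mip_levels > 1)
      return false;

   /* Multisampled HiZ on gen8 is untested in the patch. */
   if (q->gen == 8 && q->samples > 1)
      return false;

   return true;
}
```

With a helper shaped like this, the caller would only need a single
"if (image_supports_hiz(...))" around the HiZ surface addition, and other
aux surface types could get their own predicates.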


> > > +
> > > +   /* Add a HiZ surface to a depth buffer that will be used for
> rendering.
> > > +*/
> > > +   if (aspect == VK_IMAGE_ASPECT_DEPTH_BIT &&
> > > +   (image->usage & VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT))
> {
>
> -Nanley
>
> > > +  isl_surf_get_hiz_surf(&dev->isl_dev, &image->depth_surface.isl,
> > > +&image->hiz_surface.isl);
> > > +  add_surface(image, &image->hiz_surface);
> > > +   }
> > >
> > > return VK_SUCCESS;
> > >  }
> > > --
> > > 2.10.0
> > >
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > >
>

Re: [Mesa-dev] [PATCH V2 10/11] genX/cmd_buffer: Enable fast depth clears

2016-10-04 Thread Chad Versace
On Tue 04 Oct 2016, Nanley Chery wrote:
> On Mon, Oct 03, 2016 at 06:21:30PM -0700, Jason Ekstrand wrote:
> > On Mon, Oct 3, 2016 at 6:11 PM, Jason Ekstrand  wrote:
> > 
> > > On Tue, Sep 27, 2016 at 3:23 PM, Nanley Chery 
> > > wrote:
> > >
> > >> On Tue, Sep 27, 2016 at 03:12:17PM -0700, Chad Versace wrote:
> > >> > On Tue 27 Sep 2016, Nanley Chery wrote:
> > >> > > On Tue, Sep 27, 2016 at 11:00:21AM -0700, Chad Versace wrote:
> > >> >
> > >> > > > As a consequence of that reasoning, we should set
> > >> 3DSTATE_CLEAR_PARAMS.DepthClearValueValid = 1
> > >> > > > whenever hiz is enabled, even if we don't care about the actual
> > >> clear value.
> > >>
> > >
> > > The logic seems to imply that we can't trust the context to save/restore
> > > our depth clear value so we have to set it every time.  At the very least,
> > > once per batch?  In any case, I doubt there's all that much cost involved
> > > in emitting 3DSTATE_CLEAR_PARAMS so I don't think re-emitting it is that
> > > big of a deal.
> > >
> > 
> > Thinking about it a bit more...
> > 
> > > We only set up depth/stencil packets once per subpass and we only do clears
> > once per subpass so... I don't think we're actually saving anything by
> > emitting it at clear time rather than at depth/stencil setup time.  It is a
> > bit more convenient because the clear values may be more accessible at
> > clear time.
> > 
> > As far as "should we emit 3DSTATE_CLEAR_PARAMS all the time?"  Let's not go
> > to any heroics to try and avoid re-emitting it.  Once per subpass is not a
> > big deal at all.
> > 
> 
> I wouldn't say the code to implement it was complex, but I'm fine with
> trading off efficiency for simplicity here. I'll add a comment describing the
> situation.

I'm happy with this conclusion.


[Mesa-dev] [Bug 97952] /usr/include/string.h:518:12: error: exception specification in declaration does not match previous declaration

2016-10-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97952

--- Comment #3 from Timothy Arceri  ---
I believe it's because the ffs param in string.h is an int while the one in
bitscan.h is unsigned.
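
If that is indeed the cause, one way to sidestep the clash would be to not
redeclare ffs at all and to give the unsigned variant its own name. A minimal
sketch, assuming a GCC/Clang-style compiler; the name util_ffs_u32 is
hypothetical and this is not the fix proposed in the bug:

```c
#include <assert.h>

/* Hypothetical name, chosen so it cannot collide with libc's ffs(int). */
static inline int
util_ffs_u32(unsigned v)
{
   /* __builtin_ffs is a GCC/Clang builtin that takes an int and returns the
    * 1-based index of the least significant set bit, or 0 if no bit is set.
    */
   return __builtin_ffs((int)v);
}
```

Callers that want the unsigned behaviour would use the wrapper, and the
declaration in string.h is left untouched.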

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH V2 05/11] anv: Allocate hiz surface

2016-10-04 Thread Nanley Chery
On Mon, Oct 03, 2016 at 04:27:53PM -0700, Jason Ekstrand wrote:
> On Mon, Sep 26, 2016 at 5:10 PM, Nanley Chery  wrote:
> 
> > From: Chad Versace 
> >
> > Nanley Chery:
> > (rebase)
> >  - Use isl_surf_get_hiz_surf()
> > (amend)
> >  - Only add a HiZ surface onto a depth/stencil attachment
> >  - Add comment above HiZ surface addition
> >  - Hide HiZ behind INTEL_VK_HIZ prior to BDW
> >  - Disable HiZ for untested cases
> >  - Remove DISABLE_AUX_BIT instead of preventing it from being added
> >
> > Signed-off-by: Nanley Chery 
> > Reviewed-by: Jason Ekstrand 
> > Reviewed-by: Chad Versace  (v1)
> >
> > ---
> >
> > v2: Disable certain HiZ cases here (Jason)
> >
> >  src/intel/vulkan/anv_image.c | 39 ---
> >  1 file changed, 36 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
> > index f6e8672..d408819 100644
> > --- a/src/intel/vulkan/anv_image.c
> > +++ b/src/intel/vulkan/anv_image.c
> > @@ -28,6 +28,7 @@
> >  #include 
> >
> >  #include "anv_private.h"
> > +#include "util/debug.h"
> >
> >  #include "vk_format_info.h"
> >
> > @@ -60,6 +61,7 @@ choose_isl_surf_usage(VkImageUsageFlags vk_usage,
> >default:
> >   unreachable("bad VkImageAspect");
> >case VK_IMAGE_ASPECT_DEPTH_BIT:
> > + isl_usage &= ~ISL_SURF_USAGE_DISABLE_AUX_BIT;
> >   isl_usage |= ISL_SURF_USAGE_DEPTH_BIT;
> >   break;
> >case VK_IMAGE_ASPECT_STENCIL_BIT:
> > @@ -99,6 +101,16 @@ get_surface(struct anv_image *image,
> > VkImageAspectFlags aspect)
> > }
> >  }
> >
> > +static void
> > +add_surface(struct anv_image *image, struct anv_surface *surf)
> > +{
> > +   assert(surf->isl.size > 0); /* isl surface must be initialized */
> > +
> > +   surf->offset = align_u32(image->size, surf->isl.alignment);
> > +   image->size = surf->offset + surf->isl.size;
> > +   image->alignment = MAX(image->alignment, surf->isl.alignment);
> > +}
> > +
> >  /**
> >   * Initialize the anv_image::*_surface selected by \a aspect. Then update
> > the
> >   * image's memory requirements (that is, the image's size and alignment).
> > @@ -160,9 +172,30 @@ make_surface(const struct anv_device *dev,
> >  */
> > assert(ok);
> >
> > -   anv_surf->offset = align_u32(image->size, anv_surf->isl.alignment);
> > -   image->size = anv_surf->offset + anv_surf->isl.size;
> > -   image->alignment = MAX(image->alignment, anv_surf->isl.alignment);
> > +   add_surface(image, anv_surf);
> >
> 
> In my CCS series, I split this in two and had a precursor patch that added
> the add_surface helper.  Do with that information what you will.  I'm fine
> with having it all in one patch.
> 
> 
> > +
> > +   /* Allow the user to control HiZ enabling. Disable by default on gen7
> > +* because resolves are not currently implemented pre-BDW.
> > +*/
> > +   if (!env_var_as_boolean("INTEL_VK_HIZ", dev->info.gen >= 8)) {
> > +  anv_finishme("Implement gen7 HiZ");
> > +  return VK_SUCCESS;
> > +   } else if (vk_info->mipLevels > 1) {
> > +  anv_finishme("Test multi-LOD HiZ");
> > +  return VK_SUCCESS;
> > +   } else if (dev->info.gen == 8 && vk_info->samples > 1) {
> > +  anv_finishme("Test gen8 multisampled HiZ");
> > +  return VK_SUCCESS;
> > +   }
> >
> 
> It may be better to pull this (and the usage check below) into an
> image_supports_hiz helper.  The early returns work for now but as soon as
> we have multiple kinds of aux surfaces, we'll have to do that refactor
> anyway.
> 
> 

I'll nest the hunk above into the following if statement. That should
prevent conflicts with other aux surfaces.
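
Roughly what I have in mind, as a stand-alone sketch rather than the actual
v3 hunk. The struct layouts, maybe_add_hiz and its parameters are simplified
stand-ins invented for this example; add_surface follows the helper in the
patch, and align_u32 is reimplemented locally so the snippet compiles on its
own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Round v up to a multiple of a; a must be a power of two. */
static uint32_t
align_u32(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Simplified stand-ins for anv_surface and anv_image. */
struct surf  { uint32_t size, alignment, offset; };
struct image { uint32_t size, alignment; struct surf depth, hiz; bool has_hiz; };

/* Same arithmetic as add_surface() in the patch: place the surface at the
 * next suitably aligned offset and grow the image to cover it.
 */
static void
add_surface(struct image *image, struct surf *surf)
{
   assert(surf->size > 0); /* surface must be initialized */

   surf->offset = align_u32(image->size, surf->alignment);
   image->size = surf->offset + surf->size;
   image->alignment = MAX(image->alignment, surf->alignment);
}

/* The "unsupported" checks are nested inside the depth-attachment check, so
 * no early return is needed and other aux surfaces can be handled after it.
 */
static void
maybe_add_hiz(struct image *image, bool depth_attachment, bool hiz_enabled,
              int gen, uint32_t mip_levels, uint32_t samples)
{
   if (depth_attachment) {
      if (!hiz_enabled) {
         /* anv_finishme("Implement gen7 HiZ") in the patch */
      } else if (mip_levels > 1) {
         /* anv_finishme("Test multi-LOD HiZ") in the patch */
      } else if (gen == 8 && samples > 1) {
         /* anv_finishme("Test gen8 multisampled HiZ") in the patch */
      } else {
         add_surface(image, &image->hiz);
         image->has_hiz = true;
      }
   }
}
```

For example, with a 5000-byte depth surface and a 100-byte HiZ surface, both
4096-aligned, the depth surface lands at offset 0 and the HiZ surface at the
next 4096-byte boundary after it.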

> > +
> > +   /* Add a HiZ surface to a depth buffer that will be used for rendering.
> > +*/
> > +   if (aspect == VK_IMAGE_ASPECT_DEPTH_BIT &&
> > +   (image->usage & VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT)) {

-Nanley

> > +  isl_surf_get_hiz_surf(&dev->isl_dev, &image->depth_surface.isl,
> > +&image->hiz_surface.isl);
> > +  add_surface(image, &image->hiz_surface);
> > +   }
> >
> > return VK_SUCCESS;
> >  }
> > --
> > 2.10.0
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >


Re: [Mesa-dev] radv initial submission

2016-10-04 Thread Dave Airlie
On 4 October 2016 at 20:47, Albert Freeman  wrote:
> It might be a good idea to consider moving the nir -> llvm code out of
> the amd folder for reuse by a Vulkan software rasterizer or something.
> I have heard of another developer's recent interest in developing a Vulkan
> software rasterizer. I at one stage was interested in developing a
> Vulkan software rasterizer but ended up just buying a computer with
> hardware Vulkan support. These days I am focused more towards
> compositors and other just above gpu driver stuff (esp. since
> everything in mesa looks to be going smoothly with the current
> developers except for OpenCL and not as many driver tweaking options
> as there could be and lack of visually attractive guis for that and
> Nvidia and other drivers etc). Of course some of the nir -> llvm code
> is for radeons (meaning it takes a bit of effort to move out) and it
> could be a while before a Vulkan software rasterizer is developed and
> it might not even use llvm (but use of llvm does seem likely).

I don't think the nir->llvm code is that splittable; it uses a lot of
radeon-specific intrinsics. Until a second user shows up who isn't me, I doubt
it's worth it just yet.

>
> The Vulkan drivers are missing out on the gallium HUD and postprocess
> from a users perspective. But gallium-less does help with stepping
> through Vulkan drivers in a debugger. And perhaps makes it easier to
> understand the driver.

These things would be implemented as Vulkan layers outside the scope
of the drivers. For the HUD we'd have to add some query APIs to Vulkan
for the HUD rendering layer to use but that wouldn't be too crazy a job.

The main problem I foresee with Vulkan layers is that it's someone else's
job to do that, so nobody will ever do it.

Dave.


[Mesa-dev] [PATCH 2/4] i965/sync: Replace 'intel' prefix with 'brw'

2016-10-04 Thread Chad Versace
This is yet another patch for the great renaming begun long ago.
---
 src/mesa/drivers/dri/i965/brw_context.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_context.h   |  2 +-
 src/mesa/drivers/dri/i965/intel_syncobj.c | 70 +++
 3 files changed, 37 insertions(+), 37 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index db63d92..7a39fe2 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -444,7 +444,7 @@ brw_init_driver_functions(struct brw_context *brw,
intelInitBufferFuncs(functions);
intelInitPixelFuncs(functions);
intelInitBufferObjectFuncs(functions);
-   intel_init_syncobj_functions(functions);
+   brw_init_syncobj_functions(functions);
brw_init_object_purgeable_functions(functions);
 
brwInitFragProgFuncs( functions );
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index b27fe51..a737c2d 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1604,7 +1604,7 @@ extern int intel_translate_stencil_op(GLenum op);
 extern int intel_translate_logic_op(GLenum opcode);
 
 /* intel_syncobj.c */
-void intel_init_syncobj_functions(struct dd_function_table *functions);
+void brw_init_syncobj_functions(struct dd_function_table *functions);
 
 /* gen6_sol.c */
 struct gl_transform_feedback_object *
diff --git a/src/mesa/drivers/dri/i965/intel_syncobj.c 
b/src/mesa/drivers/dri/i965/intel_syncobj.c
index 4276f3f..cecf3c3 100644
--- a/src/mesa/drivers/dri/i965/intel_syncobj.c
+++ b/src/mesa/drivers/dri/i965/intel_syncobj.c
@@ -52,8 +52,8 @@ struct brw_fence {
bool signalled;
 };
 
-struct intel_gl_sync_object {
-   struct gl_sync_object Base;
+struct brw_gl_sync {
+   struct gl_sync_object gl;
struct brw_fence fence;
 };
 
@@ -169,80 +169,80 @@ brw_fence_server_wait(struct brw_context *brw, struct 
brw_fence *fence)
 }
 
 static struct gl_sync_object *
-intel_gl_new_sync_object(struct gl_context *ctx, GLuint id)
+brw_gl_new_sync(struct gl_context *ctx, GLuint id)
 {
-   struct intel_gl_sync_object *sync;
+   struct brw_gl_sync *sync;
 
sync = calloc(1, sizeof(*sync));
if (!sync)
   return NULL;
 
-   return &sync->Base;
+   return &sync->gl;
 }
 
 static void
-intel_gl_delete_sync_object(struct gl_context *ctx, struct gl_sync_object *s)
+brw_gl_delete_sync(struct gl_context *ctx, struct gl_sync_object *_sync)
 {
-   struct intel_gl_sync_object *sync = (struct intel_gl_sync_object *)s;
+   struct brw_gl_sync *sync = (struct brw_gl_sync *) _sync;
 
    brw_fence_finish(&sync->fence);
free(sync);
 }
 
 static void
-intel_gl_fence_sync(struct gl_context *ctx, struct gl_sync_object *s,
-GLenum condition, GLbitfield flags)
+brw_gl_fence_sync(struct gl_context *ctx, struct gl_sync_object *_sync,
+  GLenum condition, GLbitfield flags)
 {
struct brw_context *brw = brw_context(ctx);
-   struct intel_gl_sync_object *sync = (struct intel_gl_sync_object *)s;
+   struct brw_gl_sync *sync = (struct brw_gl_sync *) _sync;
 
    brw_fence_init(brw, &sync->fence);
    brw_fence_insert(brw, &sync->fence);
 }
 
 static void
-intel_gl_client_wait_sync(struct gl_context *ctx, struct gl_sync_object *s,
-  GLbitfield flags, GLuint64 timeout)
+brw_gl_client_wait_sync(struct gl_context *ctx, struct gl_sync_object *_sync,
+GLbitfield flags, GLuint64 timeout)
 {
struct brw_context *brw = brw_context(ctx);
-   struct intel_gl_sync_object *sync = (struct intel_gl_sync_object *)s;
+   struct brw_gl_sync *sync = (struct brw_gl_sync *) _sync;
 
    if (brw_fence_client_wait(brw, &sync->fence, timeout))
-  s->StatusFlag = 1;
+  sync->gl.StatusFlag = 1;
 }
 
 static void
-intel_gl_server_wait_sync(struct gl_context *ctx, struct gl_sync_object *s,
+brw_gl_server_wait_sync(struct gl_context *ctx, struct gl_sync_object *_sync,
   GLbitfield flags, GLuint64 timeout)
 {
struct brw_context *brw = brw_context(ctx);
-   struct intel_gl_sync_object *sync = (struct intel_gl_sync_object *)s;
+   struct brw_gl_sync *sync = (struct brw_gl_sync *) _sync;
 
    brw_fence_server_wait(brw, &sync->fence);
 }
 
 static void
-intel_gl_check_sync(struct gl_context *ctx, struct gl_sync_object *s)
+brw_gl_check_sync(struct gl_context *ctx, struct gl_sync_object *_sync)
 {
-   struct intel_gl_sync_object *sync = (struct intel_gl_sync_object *)s;
+   struct brw_gl_sync *sync = (struct brw_gl_sync *) _sync;
 
    if (brw_fence_has_completed(&sync->fence))
-  s->StatusFlag = 1;
+  sync->gl.StatusFlag = 1;
 }
 
 void
-intel_init_syncobj_functions(struct dd_function_table *functions)
+brw_init_syncobj_functions(struct dd_function_table *functions)
 {
-   functions->NewSyncObject = intel_gl_new_sync_object;
-   functions->DeleteSyncObject = intel_gl_delete_sync_object;
-   functions->FenceSync = intel_gl_fence_sync;
-   

[Mesa-dev] [PATCH 3/4] i965/sync: Rename intel_syncobj.c -> brw_sync.c

2016-10-04 Thread Chad Versace
---
 src/mesa/drivers/dri/i965/Makefile.sources| 2 +-
 src/mesa/drivers/dri/i965/brw_context.h   | 2 +-
 src/mesa/drivers/dri/i965/{intel_syncobj.c => brw_sync.c} | 0
 3 files changed, 2 insertions(+), 2 deletions(-)
 rename src/mesa/drivers/dri/i965/{intel_syncobj.c => brw_sync.c} (100%)

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index df90cb4..4917358 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -154,6 +154,7 @@ i965_FILES = \
brw_state_upload.c \
brw_structs.h \
brw_surface_formats.c \
+   brw_sync.c \
brw_tcs.c \
brw_tcs_surface_state.c \
brw_tes.c \
@@ -239,7 +240,6 @@ i965_FILES = \
intel_screen.c \
intel_screen.h \
intel_state.c \
-   intel_syncobj.c \
intel_tex.c \
intel_tex_copy.c \
intel_tex.h \
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index a737c2d..dcda574 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1603,7 +1603,7 @@ extern int intel_translate_compare_func(GLenum func);
 extern int intel_translate_stencil_op(GLenum op);
 extern int intel_translate_logic_op(GLenum opcode);
 
-/* intel_syncobj.c */
+/* brw_sync.c */
 void brw_init_syncobj_functions(struct dd_function_table *functions);
 
 /* gen6_sol.c */
diff --git a/src/mesa/drivers/dri/i965/intel_syncobj.c 
b/src/mesa/drivers/dri/i965/brw_sync.c
similarity index 100%
rename from src/mesa/drivers/dri/i965/intel_syncobj.c
rename to src/mesa/drivers/dri/i965/brw_sync.c
-- 
2.10.0



[Mesa-dev] [PATCH 0/4] i965: Fixes and cleanups for intel_syncobj.c

2016-10-04 Thread Chad Versace
I'm preparing to implement EGL_ANDROID_native_fence_sync, and I wanted
to land these fixes and cleanups before doing the real work.

Patch 1 is a bugfix. The other patches are cleanups.

This series lives at 
http://git.kiwitree.net/cgit/~chadv/mesa/log/?h=review/brw-sync-v02

Chad Versace (4):
  i965/sync: Fix uninitalized usage and leak of mutex
  i965/sync: Replace 'intel' prefix with 'brw'
  i965/sync: Rename intel_syncobj.c -> brw_sync.c
  i965/sync: Rename awkward variable

 src/mesa/drivers/dri/i965/Makefile.sources |  2 +-
 src/mesa/drivers/dri/i965/brw_context.c|  2 +-
 src/mesa/drivers/dri/i965/brw_context.h|  4 +-
 .../dri/i965/{intel_syncobj.c => brw_sync.c}   | 90 --
 4 files changed, 54 insertions(+), 44 deletions(-)
 rename src/mesa/drivers/dri/i965/{intel_syncobj.c => brw_sync.c} (73%)

-- 
2.10.0



[Mesa-dev] [PATCH 4/4] i965/sync: Rename awkward variable

2016-10-04 Thread Chad Versace
What is the difference between a 'driver_fence' and a 'fence'? Do the
characters 'driver_' add anything helpful? Nope. They do, though, add an
extra 7 chars and pull your eyeballs away to ask "huh? what's that?" one
microsecond too many.
---
 src/mesa/drivers/dri/i965/brw_sync.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_sync.c 
b/src/mesa/drivers/dri/i965/brw_sync.c
index cecf3c3..1df5610 100644
--- a/src/mesa/drivers/dri/i965/brw_sync.c
+++ b/src/mesa/drivers/dri/i965/brw_sync.c
@@ -258,27 +258,27 @@ brw_dri_create_fence(__DRIcontext *ctx)
 }
 
 static void
-brw_dri_destroy_fence(__DRIscreen *dri_screen, void *driver_fence)
+brw_dri_destroy_fence(__DRIscreen *dri_screen, void *_fence)
 {
-   struct brw_fence *fence = driver_fence;
+   struct brw_fence *fence = _fence;
 
brw_fence_finish(fence);
free(fence);
 }
 
 static GLboolean
-brw_dri_client_wait_sync(__DRIcontext *ctx, void *driver_fence, unsigned flags,
+brw_dri_client_wait_sync(__DRIcontext *ctx, void *_fence, unsigned flags,
  uint64_t timeout)
 {
-   struct brw_fence *fence = driver_fence;
+   struct brw_fence *fence = _fence;
 
return brw_fence_client_wait(fence->brw, fence, timeout);
 }
 
 static void
-brw_dri_server_wait_sync(__DRIcontext *ctx, void *driver_fence, unsigned flags)
+brw_dri_server_wait_sync(__DRIcontext *ctx, void *_fence, unsigned flags)
 {
-   struct brw_fence *fence = driver_fence;
+   struct brw_fence *fence = _fence;
 
/* We might be called here with a NULL fence as a result of WaitSyncKHR
 * on a EGL_KHR_reusable_sync fence. Nothing to do here in such case.
-- 
2.10.0



[Mesa-dev] [PATCH 1/4] i965/sync: Fix uninitalized usage and leak of mutex

2016-10-04 Thread Chad Versace
We locked an uninitialized mutex in the call stack

    glClientWaitSync
    intel_gl_client_wait_sync
    brw_fence_client_wait

because we forgot to initialize it in intel_gl_fence_sync.
(The EGLSync codepath didn't have this bug. It initialized the mutex in
intel_dri_create_fence.)

We also forgot to tear down (mtx_destroy) the mutex when destroying
the sync object.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/drivers/dri/i965/intel_syncobj.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_syncobj.c 
b/src/mesa/drivers/dri/i965/intel_syncobj.c
index dfda448..4276f3f 100644
--- a/src/mesa/drivers/dri/i965/intel_syncobj.c
+++ b/src/mesa/drivers/dri/i965/intel_syncobj.c
@@ -58,10 +58,20 @@ struct intel_gl_sync_object {
 };
 
 static void
+brw_fence_init(struct brw_context *brw, struct brw_fence *fence)
+{
+   fence->brw = brw;
+   fence->batch_bo = NULL;
+   mtx_init(&fence->mutex, mtx_plain);
+}
+
+static void
 brw_fence_finish(struct brw_fence *fence)
 {
if (fence->batch_bo)
   drm_intel_bo_unreference(fence->batch_bo);
+
+   mtx_destroy(&fence->mutex);
 }
 
 static void
@@ -186,6 +196,7 @@ intel_gl_fence_sync(struct gl_context *ctx, struct gl_sync_object *s,
    struct brw_context *brw = brw_context(ctx);
    struct intel_gl_sync_object *sync = (struct intel_gl_sync_object *)s;
 
+   brw_fence_init(brw, &sync->fence);
    brw_fence_insert(brw, &sync->fence);
 }
 
@@ -240,8 +251,7 @@ intel_dri_create_fence(__DRIcontext *ctx)
if (!fence)
   return NULL;
 
-   mtx_init(&fence->mutex, mtx_plain);
-   fence->brw = brw;
+   brw_fence_init(brw, fence);
brw_fence_insert(brw, fence);
 
return fence;
-- 
2.10.0
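
The pattern the patch moves to, sketched outside the driver: every path that
creates a fence goes through one init helper that sets up the mutex, and
every teardown path goes through one finish helper that destroys it. This is
a simplified illustration, assuming a libc that ships C11 <threads.h>; the
struct and the function names are stand-ins, not the i965 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <threads.h>

/* Stand-in for the driver's fence: a mutex guarding a flag. */
struct fence {
   mtx_t mutex;
   bool signalled;
};

/* Single place where the mutex is initialized. */
static bool
fence_init(struct fence *f)
{
   f->signalled = false;
   return mtx_init(&f->mutex, mtx_plain) == thrd_success;
}

/* Single place where the mutex is torn down. */
static void
fence_finish(struct fence *f)
{
   mtx_destroy(&f->mutex);
}

/* Locks the mutex, so it is only valid after fence_init(). */
static bool
fence_signal_and_check(struct fence *f)
{
   bool done;

   mtx_lock(&f->mutex);
   f->signalled = true;
   done = f->signalled;
   mtx_unlock(&f->mutex);

   return done;
}

/* Heap-allocated path (compare the DRI fence path in the patch). */
static struct fence *
fence_create(void)
{
   struct fence *f = calloc(1, sizeof(*f));
   if (!f)
      return NULL;

   if (!fence_init(f)) {
      free(f);
      return NULL;
   }
   return f;
}

static void
fence_destroy(struct fence *f)
{
   fence_finish(f);
   free(f);
}
```

A fence embedded in a larger object (compare the GL sync object path) would
call fence_init() and fence_finish() directly on the embedded member, which
is the call that was missing before this patch.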



[Mesa-dev] [PATCH] gallium/radeon/winsyses: set reasonable max_alloc_size

2016-10-04 Thread Marek Olšák
From: Marek Olšák 

which is returned for GL_MAX_TEXTURE_BUFFER_SIZE.
It doesn't have any other use at the moment.
Bigger allocations are not rejected.

This fixes GL45-CTS.texture_buffer.texture_buffer_max_size on Bonaire.
---
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 2 +-
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
index c28e1ca..1deb8bd 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
@@ -312,21 +312,21 @@ static bool do_winsys_init(struct amdgpu_winsys *ws, int 
fd)
}
 
/* Set which chips have dedicated VRAM. */
ws->info.has_dedicated_vram =
   !(ws->amdinfo.ids_flags & AMDGPU_IDS_FLAGS_FUSION);
 
/* Set hardware information. */
ws->info.gart_size = gtt.heap_size;
ws->info.vram_size = vram.heap_size;
/* TODO: the kernel reports vram/gart.max_allocation == 251 MB (bug?) */
-   ws->info.max_alloc_size = MAX2(ws->info.vram_size, ws->info.gart_size);
+   ws->info.max_alloc_size = MAX2(ws->info.vram_size, ws->info.gart_size) * 0.9;
/* convert the shader clock from KHz to MHz */
ws->info.max_shader_clock = ws->amdinfo.max_engine_clk / 1000;
ws->info.max_se = ws->amdinfo.num_shader_engines;
ws->info.max_sh_per_se = ws->amdinfo.num_shader_arrays_per_engine;
ws->info.has_uvd = uvd.available_rings != 0;
ws->info.uvd_fw_version =
  uvd.available_rings ? uvd_version : 0;
ws->info.vce_fw_version =
  vce.available_rings ? vce_version : 0;
ws->info.has_userptr = true;
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index c7ceee2..515f5cc 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -366,21 +366,21 @@ static bool do_winsys_init(struct radeon_drm_winsys *ws)
 retval = drmCommandWriteRead(ws->fd, DRM_RADEON_GEM_INFO,
                                  &gem_info, sizeof(gem_info));
 if (retval) {
 fprintf(stderr, "radeon: Failed to get MM info, error number %d\n",
 retval);
 return false;
 }
 ws->info.gart_size = gem_info.gart_size;
 ws->info.vram_size = gem_info.vram_size;
 
-ws->info.max_alloc_size = MAX2(ws->info.vram_size, ws->info.gart_size);
+ws->info.max_alloc_size = MAX2(ws->info.vram_size, ws->info.gart_size) * 0.7;
 if (ws->info.drm_minor < 40)
 ws->info.max_alloc_size = MIN2(ws->info.max_alloc_size, 256*1024*1024);
 
 /* Get max clock frequency info and convert it to MHz */
 radeon_get_drm_value(ws->fd, RADEON_INFO_MAX_SCLK, NULL,
                          &ws->info.max_shader_clock);
 ws->info.max_shader_clock /= 1000;
 
 radeon_get_drm_value(ws->fd, RADEON_INFO_SI_BACKEND_ENABLED_MASK, NULL,
                          &ws->info.enabled_rb_mask);
-- 
2.7.4
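
To make the arithmetic concrete, here is a stand-alone sketch of the two
computations. The helper names are made up for the example; the 0.9 and 0.7
factors and the 256 MiB cap for DRM minor versions below 40 come from the
patch. Note that multiplying a 64-bit integer by a double is evaluated in
floating point and truncated when stored back into an integer.

```c
#include <assert.h>
#include <stdint.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))
#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper: scale the larger of the two heaps by "factor".
 * The product is computed in double and truncated on conversion back.
 */
static uint64_t
scaled_max_alloc(uint64_t vram_size, uint64_t gart_size, double factor)
{
   return (uint64_t)(MAX2(vram_size, gart_size) * factor);
}

/* Hypothetical helper mirroring the radeon winsys hunk: 70% of the larger
 * heap, additionally capped at 256 MiB when the DRM minor version is < 40.
 */
static uint64_t
radeon_max_alloc(uint64_t vram_size, uint64_t gart_size, int drm_minor)
{
   uint64_t max_alloc = scaled_max_alloc(vram_size, gart_size, 0.7);

   if (drm_minor < 40)
      max_alloc = MIN2(max_alloc, 256ull * 1024 * 1024);

   return max_alloc;
}
```

So a card with 2 GiB of VRAM and 1 GiB of GART would report a bit under
2 GiB with the amdgpu factor, and an old radeon kernel would still be capped
at 256 MiB regardless of heap size.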



[Mesa-dev] [PATCH] ddebug: dump most driver information with GALLIUM_DDEBUG=always

2016-10-04 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/ddebug/dd_draw.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/ddebug/dd_draw.c 
b/src/gallium/drivers/ddebug/dd_draw.c
index 511daf4..970712c 100644
--- a/src/gallium/drivers/ddebug/dd_draw.c
+++ b/src/gallium/drivers/ddebug/dd_draw.c
@@ -1103,21 +1103,25 @@ dd_after_draw(struct dd_context *dctx, struct dd_call 
*call)
 /* Terminate the process to prevent future hangs. */
 dd_kill_process();
  }
  break;
   case DD_DETECT_HANGS_PIPELINED:
  dd_pipelined_process_draw(dctx, call);
  break;
   case DD_DUMP_ALL_CALLS:
  if (!dscreen->no_flush)
 pipe->flush(pipe, NULL, 0);
- dd_write_report(dctx, call, 0, false);
+ dd_write_report(dctx, call,
+ PIPE_DUMP_CURRENT_STATES |
+ PIPE_DUMP_CURRENT_SHADERS |
+ PIPE_DUMP_LAST_COMMAND_BUFFER,
+ false);
  break;
   case DD_DUMP_APITRACE_CALL:
  if (dscreen->apitrace_dump_call ==
  dctx->draw_state.apitrace_call_number) {
 dd_write_report(dctx, call,
 PIPE_DUMP_CURRENT_STATES |
 PIPE_DUMP_CURRENT_SHADERS,
 false);
 /* No need to continue. */
 exit(0);
-- 
2.7.4



[Mesa-dev] [PATCH 3/4] radeonsi: add assertions to validate interpolation flags

2016-10-04 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 34 +
 1 file changed, 34 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 9662625..f6bd129 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -686,20 +686,54 @@ static void si_shader_ps(struct si_shader *shader)
 
/* we need to enable at least one of them, otherwise we hang the GPU */
assert(G_0286CC_PERSP_SAMPLE_ENA(input_ena) ||
   G_0286CC_PERSP_CENTER_ENA(input_ena) ||
   G_0286CC_PERSP_CENTROID_ENA(input_ena) ||
   G_0286CC_PERSP_PULL_MODEL_ENA(input_ena) ||
   G_0286CC_LINEAR_SAMPLE_ENA(input_ena) ||
   G_0286CC_LINEAR_CENTER_ENA(input_ena) ||
   G_0286CC_LINEAR_CENTROID_ENA(input_ena) ||
   G_0286CC_LINE_STIPPLE_TEX_ENA(input_ena));
+   /* POS_W_FLOAT_ENA requires one of the perspective weights. */
+   assert(!G_0286CC_POS_W_FLOAT_ENA(input_ena) ||
+  G_0286CC_PERSP_SAMPLE_ENA(input_ena) ||
+  G_0286CC_PERSP_CENTER_ENA(input_ena) ||
+  G_0286CC_PERSP_CENTROID_ENA(input_ena) ||
+  G_0286CC_PERSP_PULL_MODEL_ENA(input_ena));
+
+   /* Validate interpolation optimization flags (read as implications). */
+   assert(!shader->key.ps.prolog.bc_optimize_for_persp ||
+  (G_0286CC_PERSP_CENTER_ENA(input_ena) &&
+   G_0286CC_PERSP_CENTROID_ENA(input_ena)));
+   assert(!shader->key.ps.prolog.bc_optimize_for_linear ||
+  (G_0286CC_LINEAR_CENTER_ENA(input_ena) &&
+   G_0286CC_LINEAR_CENTROID_ENA(input_ena)));
+   assert(!shader->key.ps.prolog.force_persp_center_interp ||
+  (!G_0286CC_PERSP_SAMPLE_ENA(input_ena) &&
+   !G_0286CC_PERSP_CENTROID_ENA(input_ena)));
+   assert(!shader->key.ps.prolog.force_linear_center_interp ||
+  (!G_0286CC_LINEAR_SAMPLE_ENA(input_ena) &&
+   !G_0286CC_LINEAR_CENTROID_ENA(input_ena)));
+   assert(!shader->key.ps.prolog.force_persp_sample_interp ||
+  (!G_0286CC_PERSP_CENTER_ENA(input_ena) &&
+   !G_0286CC_PERSP_CENTROID_ENA(input_ena)));
+   assert(!shader->key.ps.prolog.force_linear_sample_interp ||
+  (!G_0286CC_LINEAR_CENTER_ENA(input_ena) &&
+   !G_0286CC_LINEAR_CENTROID_ENA(input_ena)));
+
+   /* Validate cases when the optimizations are off (read as implications). */
+   assert(shader->key.ps.prolog.bc_optimize_for_persp ||
+  !G_0286CC_PERSP_CENTER_ENA(input_ena) ||
+  !G_0286CC_PERSP_CENTROID_ENA(input_ena));
+   assert(shader->key.ps.prolog.bc_optimize_for_linear ||
+  !G_0286CC_LINEAR_CENTER_ENA(input_ena) ||
+  !G_0286CC_LINEAR_CENTROID_ENA(input_ena));
 
pm4 = si_get_shader_pm4_state(shader);
if (!pm4)
return;
 
/* SPI_BARYC_CNTL.POS_FLOAT_LOCATION
 * Possible vaules:
 * 0 -> Position = pixel center
 * 1 -> Position = pixel centroid
 * 2 -> Position = at sample position
-- 
2.7.4
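
In case the "read as implications" form is unfamiliar: assert(!a || b)
states "a implies b". The sketch below spells that out for the
bc_optimize_for_persp pair of assertions. The struct and function names are
invented for this example, with the enable bits reduced to plain booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* "a implies b": false only when a holds and b does not. */
static inline bool
implies(bool a, bool b)
{
   return !a || b;
}

/* Hypothetical flattened view of the shader key and enable bits. */
struct interp_flags {
   bool bc_optimize_for_persp;
   bool persp_center_ena;
   bool persp_centroid_ena;
};

/* Combines the two directions checked by the patch:
 *   optimization on  => CENTER and CENTROID are both enabled
 *   optimization off => CENTER and CENTROID are not both enabled
 */
static bool
bc_optimize_flags_valid(const struct interp_flags *f)
{
   bool both = f->persp_center_ena && f->persp_centroid_ena;

   return implies(f->bc_optimize_for_persp, both) &&
          implies(!f->bc_optimize_for_persp, !both);
}
```

Taken together the two implications say that the optimization flag is set
exactly when both the center and centroid weights are enabled.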



[Mesa-dev] [PATCH 4/4] radeonsi: fix interpolateAt opcodes for .zw components

2016-10-04 Thread Marek Olšák
From: Marek Olšák 

Not returning garbage in .zw seems pretty important.

This fixes:
GL45-CTS.shader_multisample_interpolation.render.interpolate_at_*_check.*

Cc: 11.2 12.0 
---
 src/gallium/drivers/radeonsi/si_shader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 590ae64..f856064 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -5201,21 +5201,21 @@ static void build_interp_intrinsic(const struct lp_build_tgsi_action *action,
 
 temp2 = LLVMBuildFAdd(gallivm->builder, temp2, temp1, "");
 
 ij_out[i] = LLVMBuildBitCast(gallivm->builder,
  temp2, ctx->i32, "");
 }
 interp_param = lp_build_gather_values(bld_base->base.gallivm, ij_out, 2);
}
 
intr_name = interp_param ? "llvm.SI.fs.interp" : "llvm.SI.fs.constant";
-   for (chan = 0; chan < 2; chan++) {
+   for (chan = 0; chan < 4; chan++) {
LLVMValueRef args[4];
LLVMValueRef llvm_chan;
unsigned schan;
 
 schan = tgsi_util_get_full_src_register_swizzle(&inst->Src[0], chan);
llvm_chan = lp_build_const_int32(gallivm, schan);
 
args[0] = llvm_chan;
args[1] = attr_number;
args[2] = params;
-- 
2.7.4
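
A toy model of why the loop bound matters, for anyone skimming: with a bound
of 2 only .xy of the result are ever written, so .zw keep whatever happened
to be there. Everything below (function names, the write mask) is invented
for illustration and has nothing to do with the LLVM builder calls in the
real function.

```c
#include <assert.h>

#define NUM_CHANNELS 4   /* x, y, z, w */

/* Hypothetical stand-in for emitting one interpolated channel. */
static float
interp_channel(const float attr[NUM_CHANNELS], unsigned swizzled_chan)
{
   return attr[swizzled_chan];
}

/* Writes the first num_chans channels of out[] and returns a bitmask of the
 * channels that were actually written.
 */
static unsigned
build_result(const float attr[NUM_CHANNELS],
             const unsigned swizzle[NUM_CHANNELS],
             unsigned num_chans, float out[NUM_CHANNELS])
{
   unsigned written = 0;

   for (unsigned chan = 0; chan < num_chans; chan++) {
      out[chan] = interp_channel(attr, swizzle[chan]);
      written |= 1u << chan;
   }

   return written;
}
```

Calling build_result() with num_chans = 2 reports a write mask covering only
x and y; with num_chans = 4 all four channels are covered.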



[Mesa-dev] [PATCH 2/4] radeonsi: interpolate colors after interpolation weight shuffling

2016-10-04 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 96 
 1 file changed, 48 insertions(+), 48 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 30bf093..590ae64 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -7268,68 +7268,20 @@ static bool si_compile_ps_prolog(struct si_screen 
*sscreen,
/* Select LINEAR_CENTROID. */
for (i = 0; i < 2; i++) {
tmp = LLVMBuildSelect(gallivm->builder, 
bc_optimize,
  center[i], centroid[i], 
"");
ret = LLVMBuildInsertValue(gallivm->builder, 
ret,
   tmp, base + 10 + i, 
"");
}
}
}
 
-   /* Interpolate colors. */
-   for (i = 0; i < 2; i++) {
-   unsigned writemask = (key->ps_prolog.colors_read >> (i * 4)) & 
0xf;
-   unsigned face_vgpr = key->ps_prolog.num_input_sgprs +
-key->ps_prolog.face_vgpr_index;
-   LLVMValueRef interp[2], color[4];
-   LLVMValueRef interp_ij = NULL, prim_mask = NULL, face = NULL;
-
-   if (!writemask)
-   continue;
-
-   /* If the interpolation qualifier is not CONSTANT (-1). */
-   if (key->ps_prolog.color_interp_vgpr_index[i] != -1) {
-   unsigned interp_vgpr = key->ps_prolog.num_input_sgprs +
-  
key->ps_prolog.color_interp_vgpr_index[i];
-
-   /* Get the (i,j) updated by bc_optimize handling. */
-   interp[0] = LLVMBuildExtractValue(gallivm->builder, ret,
- interp_vgpr, "");
-   interp[1] = LLVMBuildExtractValue(gallivm->builder, ret,
- interp_vgpr + 1, "");
-   interp_ij = lp_build_gather_values(gallivm, interp, 2);
-   interp_ij = LLVMBuildBitCast(gallivm->builder, 
interp_ij,
-ctx.v2i32, "");
-   }
-
-   /* Use the absolute location of the input. */
-   prim_mask = LLVMGetParam(func, SI_PS_NUM_USER_SGPR);
-
-   if (key->ps_prolog.states.color_two_side) {
-   face = LLVMGetParam(func, face_vgpr);
-   face = LLVMBuildBitCast(gallivm->builder, face, 
ctx.i32, "");
-   }
-
-   interp_fs_input(&ctx,
-   key->ps_prolog.color_attr_index[i],
-   TGSI_SEMANTIC_COLOR, i,
-   key->ps_prolog.num_interp_inputs,
-   key->ps_prolog.colors_read, interp_ij,
-   prim_mask, face, color);
-
-   while (writemask) {
-   unsigned chan = u_bit_scan(&writemask);
-   ret = LLVMBuildInsertValue(gallivm->builder, ret, 
color[chan],
-  num_params++, "");
-   }
-   }
-
/* Force per-sample interpolation. */
if (key->ps_prolog.states.force_persp_sample_interp) {
unsigned i, base = key->ps_prolog.num_input_sgprs;
LLVMValueRef persp_sample[2];
 
/* Read PERSP_SAMPLE. */
for (i = 0; i < 2; i++)
persp_sample[i] = LLVMGetParam(func, base + i);
/* Overwrite PERSP_CENTER. */
for (i = 0; i < 2; i++)
@@ -7384,20 +7336,68 @@ static bool si_compile_ps_prolog(struct si_screen 
*sscreen,
/* Overwrite LINEAR_SAMPLE. */
for (i = 0; i < 2; i++)
ret = LLVMBuildInsertValue(gallivm->builder, ret,
   linear_center[i], base + 6 + 
i, "");
/* Overwrite LINEAR_CENTROID. */
for (i = 0; i < 2; i++)
ret = LLVMBuildInsertValue(gallivm->builder, ret,
   linear_center[i], base + 10 
+ i, "");
}
 
+   /* Interpolate colors. */
+   for (i = 0; i < 2; i++) {
+   unsigned writemask = (key->ps_prolog.colors_read >> (i * 4)) & 
0xf;
+   unsigned face_vgpr = key->ps_prolog.num_input_sgprs +
+key->ps_prolog.face_vgpr_index;
+   LLVMValueRef interp[2], color[4];
+   LLVMValueRef interp_ij = NULL, prim_mask = NULL, face = NULL;
+
+   if (!writemask)
+   continue;
+
+ 

[Mesa-dev] [PATCH 1/4] tgsi/scan: don't set interp flags for inputs only used by INTERP (v2)

2016-10-04 Thread Marek Olšák
From: Marek Olšák 

(v1 pushed, then reverted)

This fixes 9 randomly failing tests on radeonsi:
  GL45-CTS.shader_multisample_interpolation.render.interpolate_at_centroid.*

v2: use input_interpolate[input] (correct) instead of
input_interpolate[index] (incorrect)
---
 src/gallium/auxiliary/tgsi/tgsi_scan.c | 105 ++---
 1 file changed, 57 insertions(+), 48 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c 
b/src/gallium/auxiliary/tgsi/tgsi_scan.c
index a3b0d9f..c7745ce 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
@@ -95,20 +95,21 @@ computes_derivative(unsigned opcode)
 }
 
 
 static void
 scan_instruction(struct tgsi_shader_info *info,
  const struct tgsi_full_instruction *fullinst,
  unsigned *current_depth)
 {
unsigned i;
bool is_mem_inst = false;
+   bool is_interp_instruction = false;
 
assert(fullinst->Instruction.Opcode < TGSI_OPCODE_LAST);
info->opcode_count[fullinst->Instruction.Opcode]++;
 
switch (fullinst->Instruction.Opcode) {
case TGSI_OPCODE_IF:
case TGSI_OPCODE_UIF:
case TGSI_OPCODE_BGNLOOP:
   (*current_depth)++;
   info->max_depth = MAX2(info->max_depth, *current_depth);
@@ -120,20 +121,22 @@ scan_instruction(struct tgsi_shader_info *info,
default:
   break;
}
 
if (fullinst->Instruction.Opcode == TGSI_OPCODE_INTERP_CENTROID ||
fullinst->Instruction.Opcode == TGSI_OPCODE_INTERP_OFFSET ||
fullinst->Instruction.Opcode == TGSI_OPCODE_INTERP_SAMPLE) {
   const struct tgsi_full_src_register *src0 = &fullinst->Src[0];
   unsigned input;
 
+  is_interp_instruction = true;
+
   if (src0->Register.Indirect && src0->Indirect.ArrayID)
  input = info->input_array_first[src0->Indirect.ArrayID];
   else
  input = src0->Register.Index;
 
   /* For the INTERP opcodes, the interpolation is always
* PERSPECTIVE unless LINEAR is specified.
*/
   switch (info->input_interpolate[input]) {
   case TGSI_INTERPOLATE_COLOR:
@@ -183,43 +186,91 @@ scan_instruction(struct tgsi_shader_info *info,
  if (src->Register.Indirect) {
 for (ind = 0; ind < info->num_inputs; ++ind) {
info->input_usage_mask[ind] |= usage_mask;
 }
  } else {
 assert(ind >= 0);
 assert(ind < PIPE_MAX_SHADER_INPUTS);
 info->input_usage_mask[ind] |= usage_mask;
  }
 
- if (info->processor == PIPE_SHADER_FRAGMENT &&
- !src->Register.Indirect) {
-unsigned name =
-   info->input_semantic_name[src->Register.Index];
-unsigned index =
-   info->input_semantic_index[src->Register.Index];
+ if (info->processor == PIPE_SHADER_FRAGMENT) {
+unsigned name, index, input;
+
+if (src->Register.Indirect && src->Indirect.ArrayID)
+   input = info->input_array_first[src->Indirect.ArrayID];
+else
+   input = src->Register.Index;
+
+name = info->input_semantic_name[input];
+index = info->input_semantic_index[input];
 
 if (name == TGSI_SEMANTIC_POSITION &&
 (src->Register.SwizzleX == TGSI_SWIZZLE_Z ||
  src->Register.SwizzleY == TGSI_SWIZZLE_Z ||
  src->Register.SwizzleZ == TGSI_SWIZZLE_Z ||
  src->Register.SwizzleW == TGSI_SWIZZLE_Z))
info->reads_z = TRUE;
 
 if (name == TGSI_SEMANTIC_COLOR) {
unsigned mask =
   (1 << src->Register.SwizzleX) |
   (1 << src->Register.SwizzleY) |
   (1 << src->Register.SwizzleZ) |
   (1 << src->Register.SwizzleW);
 
info->colors_read |= mask << (index * 4);
 }
+
+/* Process only interpolated varyings. Don't include POSITION.
+ * Don't include integer varyings, because they are not
+ * interpolated. Don't process inputs interpolated by INTERP
+ * opcodes. Those are tracked separately.
+ */
+if ((!is_interp_instruction || i != 0) &&
+(name == TGSI_SEMANTIC_GENERIC ||
+ name == TGSI_SEMANTIC_TEXCOORD ||
+ name == TGSI_SEMANTIC_COLOR ||
+ name == TGSI_SEMANTIC_BCOLOR ||
+ name == TGSI_SEMANTIC_FOG ||
+ name == TGSI_SEMANTIC_CLIPDIST)) {
+   switch (info->input_interpolate[input]) {
+   case TGSI_INTERPOLATE_COLOR:
+   case TGSI_INTERPOLATE_PERSPECTIVE:
+  switch (info->input_interpolate_loc[input]) {
+  case TGSI_INTERPOLATE_LOC_CENTER:
+ info->uses_persp_center = TRUE;
+ break;
+  case 
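
The colors_read bookkeeping in the hunk above can be sketched in isolation. This is a rough illustration only: record_color_read and color_writemask are hypothetical helper names, the mask is assumed to be 8 bits wide (4 bits per color input), and swizzle values 0..3 are taken to select x, y, z, w. The second helper mirrors the writemask extraction that the radeonsi PS prolog performs on the same field.

```c
#include <assert.h>
#include <stdint.h>

/* Mark every component selected by `swizzle` as read for the given
 * color input (0 or 1). Each color input owns one 4-bit nibble. */
static uint8_t record_color_read(uint8_t colors_read, unsigned color_index,
                                 const unsigned swizzle[4])
{
    unsigned mask = 0;

    assert(color_index < 2);
    for (unsigned c = 0; c < 4; c++) {
        assert(swizzle[c] < 4);
        mask |= 1u << swizzle[c];
    }
    return (uint8_t)(colors_read | (mask << (color_index * 4)));
}

/* Extract the 4-bit component mask for one color input. */
static unsigned color_writemask(uint8_t colors_read, unsigned color_index)
{
    assert(color_index < 2);
    return (colors_read >> (color_index * 4)) & 0xf;
}
```

For example, reading COLOR1 with an .xxyy swizzle sets only the x and y bits of the upper nibble, so the prolog would interpolate two components for that color and none for COLOR0.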

Re: [Mesa-dev] [PATCH 2/6] i965/sync: Stop cacheing fence's signal status

2016-10-04 Thread Chad Versace
On Sun 02 Oct 2016, Kenneth Graunke wrote:
> On Tuesday, September 27, 2016 11:51:20 PM PDT Chad Versace wrote:
> > Cacheing the signal status complicates the code for questionable
> > performance benefit. I added the cacheing long ago, and I now think it
> > was the wrong decision.
> > 
> > When we later add support for fences based on sync fds (that is, a fd
> > backed by struct sync_file in Linux 4.8), the cacheing becomes even more
> > hairy. So it's best to eliminate it now.
> > ---
> >  src/mesa/drivers/dri/i965/intel_syncobj.c | 27 ++-
> >  1 file changed, 2 insertions(+), 25 deletions(-)
> > 
> 
> Aside from making it faster to answer repeated "is it done yet?" queries
> (which I agree is of dubious value)...this also makes it possible to
> release batch_bo earlier.  Now you have to keep it around until the
> fence object is destroyed.  That seems less than ideal?

That's fair. To reduce the controversy in this patch series, I'll submit
a v2 without this patch.


Re: [Mesa-dev] radv initial submission

2016-10-04 Thread Dave Airlie
> It would have been great if the build/integration bits were properly
> split, but I won't sweat too much over it.

I thought they were, all the bits outside src/amd are in this patch really.

I suppose I could move the Makefile.am and Makefile.sources into this
patch to make it easier to review.

I'm going to squash merge all of this anyways.

>
> That aside there's a few small suggestions:
>  - please don't use pragma once

why not? lots of mesa uses it, 134 instances in my tree now.

>  - whenever possible reference the source where file X/Y is based on.
>  - (patch 3/4) add a comment that libvulkan-test.la is currently
> unused and/or comment it out.
>  - (patch 3/4) drop empty noinst_HEADERS
>  - (patch 3/4) please have all the sources (.c and .h) sorted
> alphabetically in Makefile.sources - ls + sed helps a lot.
>  - (patch 3/4) add a reference, and perhaps keep it in a radv
> README/TODO file, about the latest anv commit the driver is based
> upon.
> It'll be worth when porting newer anv changes.

So far it isn't really based on an anv commit, so I'm not sure there's much
value in that, really only the WSI code is a possibility for sharing, and
possibly not even that. I don't see us merging stuff across and back really
ever.

> As a follow up we might want to check that the common headers don't
> get overwritten at install stage (factor out the vulkan_include*
> and/or move to reuse the Khronos ones).

This would be a problem with anv now wouldn't it?

Dave.


Re: [Mesa-dev] [PATCH 7/7] egl: Unify the EGLint/EGLAttrib paths in eglCreateSync*

2016-10-04 Thread Chad Versace
On Thu 29 Sep 2016, Emil Velikov wrote:
> On 28 September 2016 at 07:28, Chad Versace  wrote:
> > Pre-patch, there were two code paths for parsing EGLSync attribute
> > lists: one path for old-style EGLint lists, used by eglCreateSyncKHR,
> > and another for new-style EGLAttrib lists, used by eglCreateSync (1.5)
> > and eglCreateSync64 (EGL_KHR_cl_event2).
> >
> Actually we might want to use the same helper instead of
> _eglConvertAttribsToInt for all entry points where the pre-1.5 entry
> point was using EGLint while the EGL 1.5 one uses EGLAttrib.

> In those cases we currently a) lose the upper bits (admittedly they
> aren't used afaics) and b) we'll error out if the user provides an
> empty/null list (not the most useful thing to do, but still).

I don't follow. What exactly are you proposing?


Re: [Mesa-dev] [PATCH 7/7] egl: Unify the EGLint/EGLAttrib paths in eglCreateSync*

2016-10-04 Thread Chad Versace
On Thu 29 Sep 2016, Emil Velikov wrote:
> On 28 September 2016 at 07:28, Chad Versace  wrote:
> 
> > +   if (sizeof(int_list[0]) == sizeof(attrib_list[0])) {
> > +  attrib_list = (EGLAttrib *) int_list;
> > +   } else {
> > +  err = _eglConvertIntsToAttribs(int_list, &attrib_list);
> > +  if (err != EGL_SUCCESS)
> > + RETURN_EGL_ERROR(disp, err, EGL_NO_SYNC);
> > +   }
> > +
> > +   sync = _eglCreateSync(disp, type, attrib_list, EGL_FALSE,
> >   EGL_BAD_ATTRIBUTE);
> > +
> > +   if ((void *) int_list != (void *) attrib_list)

> Please use the same conditional as above - sizeof(int_list[0]) !=
> sizeof(attrib_list[0]).

Done.
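
For readers following the thread, the widening step under discussion might look roughly like the sketch below. It is an assumption-laden illustration, not Mesa's helper: EGLint_t, EGLAttrib_t, ATTR_NONE, convert_ints_to_attribs and needs_conversion are local stand-ins (for EGLint, EGLAttrib, EGL_NONE and the conversion path), and the return convention is invented for the example. The point is that the copy only exists when the two element sizes differ, which is why the same sizeof comparison is the natural guard for freeing it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef int32_t  EGLint_t;    /* stand-in for EGLint */
typedef intptr_t EGLAttrib_t; /* stand-in for EGLAttrib */
#define ATTR_NONE 0x3038      /* stand-in for EGL_NONE */

/* Widen a key/value list terminated by ATTR_NONE.
 * Returns 0 on success, -1 on allocation failure.
 * A NULL input list yields a NULL output list. */
static int convert_ints_to_attribs(const EGLint_t *int_list,
                                   EGLAttrib_t **out)
{
    size_t len = 0;
    EGLAttrib_t *list;

    assert(out != NULL);
    *out = NULL;
    if (int_list == NULL)
        return 0;

    /* Step over key/value pairs so a value that happens to equal
     * ATTR_NONE is not mistaken for the terminator. */
    while (int_list[len] != ATTR_NONE)
        len += 2;
    len += 1; /* include the terminator */

    list = malloc(len * sizeof(*list));
    if (list == NULL)
        return -1;
    for (size_t i = 0; i < len; i++)
        list[i] = int_list[i];

    *out = list;
    return 0;
}

/* True when the caller received a fresh copy that it must free. */
static int needs_conversion(void)
{
    return sizeof(EGLint_t) != sizeof(EGLAttrib_t);
}
```

A caller would use needs_conversion() both to decide whether to convert and, after creating the sync object, whether to free the converted list, so the two decisions cannot drift apart.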


Re: [Mesa-dev] [PATCH] i965/l3: Remove redundant is_cherryview check

2016-10-04 Thread Anuj Phogat
On Tue, Oct 4, 2016 at 1:30 PM, Ben Widawsky  wrote:
> All mobile parts (so far) are GT1. The check added extra confusion
> because it appeared Broxton was missing when it wasn't. Replace it with
> a comment.
>
> Alternatively, I'd be willing to add an is_broxton check.
>
> Cc: Francisco Jerez 
> Signed-off-by: Ben Widawsky 
> ---
>  src/intel/common/gen_l3_config.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/common/gen_l3_config.c 
> b/src/intel/common/gen_l3_config.c
> index b172ef6..eb4e8ae 100644
> --- a/src/intel/common/gen_l3_config.c
> +++ b/src/intel/common/gen_l3_config.c
> @@ -258,7 +258,8 @@ get_l3_way_size(const struct gen_device_info *devinfo)
> if (devinfo->is_baytrail)
>return 2;
>
> -   else if (devinfo->is_cherryview || devinfo->gt == 1)
> +   /* XXX: Cherryview and Broxton are always gt1 */
> +   else if (devinfo->gt == 1)
>return 4;
>
> else
> --
> 2.10.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reviewed-by: Anuj Phogat 


[Mesa-dev] [PATCH] i965/l3: Remove redundant is_cherryview check

2016-10-04 Thread Ben Widawsky
All mobile parts (so far) are GT1. The check added extra confusion
because it appeared Broxton was missing when it wasn't. Replace it with
a comment.

Alternatively, I'd be willing to add an is_broxton check.

Cc: Francisco Jerez 
Signed-off-by: Ben Widawsky 
---
 src/intel/common/gen_l3_config.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/common/gen_l3_config.c b/src/intel/common/gen_l3_config.c
index b172ef6..eb4e8ae 100644
--- a/src/intel/common/gen_l3_config.c
+++ b/src/intel/common/gen_l3_config.c
@@ -258,7 +258,8 @@ get_l3_way_size(const struct gen_device_info *devinfo)
if (devinfo->is_baytrail)
   return 2;
 
-   else if (devinfo->is_cherryview || devinfo->gt == 1)
+   /* XXX: Cherryview and Broxton are always gt1 */
+   else if (devinfo->gt == 1)
   return 4;
 
else
-- 
2.10.0



Re: [Mesa-dev] [PATCH 0/2] Enable aubinator to decode a running application

2016-10-04 Thread Gandikota, Sirisha
I like the cleanup of the code and making it more modular. My only comment is:
please include the patch version number in the subject. I found it very hard
to find the latest version of the patches after others' comments were
addressed.

Thanks
Sirisha

>-Original Message-
>From: Lionel Landwerlin [mailto:llandwer...@gmail.com]
>Sent: Tuesday, October 04, 2016 7:39 AM
>To: mesa-dev@lists.freedesktop.org
>Cc: Landwerlin, Lionel G ; Kristian Høgsberg
>; Gandikota, Sirisha ; Ben
>Widawsky ; Kenneth Graunke 
>Subject: [PATCH 0/2] Enable aubinator to decode a running application
>
>Hi,
>
>Discussing with Kristian about ksim the other week, it came up that it would be
>interesting to be able to look at an application's output while it's running.
>The end goal being that we could remove some hand written code from the driver
>(brw_state_dump.c) and have more complete output.
>
>This series enables aubinator to decode its standard input like it would with
>a normal aubdump file.
>
>This change requires a slight modification to intel_aubdump (so it sets up the
>communication between the running application and aubinator) :
>
>https://patchwork.freedesktop.org/patch/113618/
>
>Looking forward to comments from Ben and Kenneth who seem to rely a fair bit
>on brw_state_dump.
>
>Cheers,
>
>Lionel Landwerlin (2):
>  intel: aubinator: generate a standalone binary
>  intel: aubinator: enable loading dumps from standard input
>
> src/intel/Makefile.am   |   1 +
> src/intel/Makefile.aubinator.am |  36 ++
> src/intel/Makefile.sources  |   7 ++
> src/intel/tools/.gitignore  |   5 +
> src/intel/tools/aubinator.c | 253 ++--
> src/intel/tools/decoder.c   |  82 -
> src/intel/tools/decoder.h   |   4 +-
> 7 files changed, 268 insertions(+), 120 deletions(-)
> create mode 100644 src/intel/Makefile.aubinator.am
>
>--
>2.9.3


[Mesa-dev] [Bug 97952] /usr/include/string.h:518:12: error: exception specification in declaration does not match previous declaration

2016-10-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97952

--- Comment #2 from Rob Clark  ---
/me shrugs..

I don't really have a setup for building with clang..  I guess the issue is
introduction of #include "bitscan.h" in mtypes.h.  Although bitscan.h was
already used elsewhere, so not entirely sure why it worked before if this is
breaking things.  I guess something that someone who has a setup to build w/
clang will have to debug.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v2] clover: add GetKernelArgInfo (CL 1.2)

2016-10-04 Thread Serge Martin
On Saturday 01 October 2016 15:54:49 Serge Martin wrote:


CC curro

> ---
>  src/gallium/state_trackers/clover/api/kernel.cpp   | 47 --
>  src/gallium/state_trackers/clover/core/kernel.cpp  |  6 +++
>  src/gallium/state_trackers/clover/core/kernel.hpp  |  1 +
>  src/gallium/state_trackers/clover/core/module.hpp  | 19 +--
>  .../state_trackers/clover/llvm/codegen/common.cpp  | 58
> +- .../state_trackers/clover/llvm/metadata.hpp|
> 16 ++
>  .../state_trackers/clover/tgsi/compiler.cpp|  2 +-
>  7 files changed, 141 insertions(+), 8 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/clover/api/kernel.cpp
> b/src/gallium/state_trackers/clover/api/kernel.cpp index 73ba34a..13cfc08
> 100644
> --- a/src/gallium/state_trackers/clover/api/kernel.cpp
> +++ b/src/gallium/state_trackers/clover/api/kernel.cpp
> @@ -192,9 +192,50 @@ clGetKernelWorkGroupInfo(cl_kernel d_kern, cl_device_id
> d_dev, CLOVER_API cl_int
>  clGetKernelArgInfo(cl_kernel d_kern,
> cl_uint idx, cl_kernel_arg_info param,
> -   size_t size, void *r_buf, size_t *r_size) {
> -   CLOVER_NOT_SUPPORTED_UNTIL("1.2");
> -   return CL_KERNEL_ARG_INFO_NOT_AVAILABLE;
> +   size_t size, void *r_buf, size_t *r_size) try {
> +   property_buffer buf { r_buf, size, r_size };
> +   const auto &kern = obj(d_kern);
> +   const auto args_info = kern.args_info();
> +
> +   if (args_info.size() == 0)
> +  throw error(CL_KERNEL_ARG_INFO_NOT_AVAILABLE);
> +
> +   if (idx >= args_info.size())
> +  throw error(CL_INVALID_ARG_INDEX);
> +
> +   const auto &info = args_info[idx];
> +
> +   switch (param) {
> +   case CL_KERNEL_ARG_ADDRESS_QUALIFIER:
> +  buf.as_scalar() =
> + info.address_qualifier;
> +  break;
> +
> +   case CL_KERNEL_ARG_ACCESS_QUALIFIER:
> +  buf.as_scalar() =
> + info.access_qualifier;
> +  break;
> +   case CL_KERNEL_ARG_TYPE_NAME:
> +  buf.as_string() = info.type_name;
> +  break;
> +
> +   case CL_KERNEL_ARG_TYPE_QUALIFIER:
> +  buf.as_scalar() = info.type_qualifier;
> +  break;
> +
> +   case CL_KERNEL_ARG_NAME:
> +  buf.as_string() = info.arg_name;
> +  break;
> +
> +   default:
> +  throw error(CL_INVALID_VALUE);
> +   }
> +
> +   return CL_SUCCESS;
> +
> +} catch (error &e) {
> +   return e.get();
>  }
> 
>  namespace {
> diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp
> b/src/gallium/state_trackers/clover/core/kernel.cpp index 962f555..18dcd5c
> 100644
> --- a/src/gallium/state_trackers/clover/core/kernel.cpp
> +++ b/src/gallium/state_trackers/clover/core/kernel.cpp
> @@ -140,6 +140,12 @@ kernel::args() const {
> return map(derefs(), _args);
>  }
> 
> +std::vector
> +kernel::args_info() const {
> +   const auto &syms = program().symbols();
> +   return find(name_equals(_name), syms).args_info;
> +}
> +
>  const module &
>  kernel::module(const command_queue &q) const {
> return program().build(q.device()).binary;
> diff --git a/src/gallium/state_trackers/clover/core/kernel.hpp
> b/src/gallium/state_trackers/clover/core/kernel.hpp index 4ba6ff4..aae51bc
> 100644
> --- a/src/gallium/state_trackers/clover/core/kernel.hpp
> +++ b/src/gallium/state_trackers/clover/core/kernel.hpp
> @@ -134,6 +134,7 @@ namespace clover {
> 
>argument_range args();
>const_argument_range args() const;
> +  std::vector args_info() const;
> 
>const intrusive_ref program;
> 
> diff --git a/src/gallium/state_trackers/clover/core/module.hpp
> b/src/gallium/state_trackers/clover/core/module.hpp index 5db0548..5ce9492
> 100644
> --- a/src/gallium/state_trackers/clover/core/module.hpp
> +++ b/src/gallium/state_trackers/clover/core/module.hpp
> @@ -102,16 +102,29 @@ namespace clover {
>   semantic semantic;
>};
> 
> +  struct argument_info {
> + argument_info() { }
> +
> + uint32_t address_qualifier;
> + uint32_t access_qualifier;
> + std::string type_name;
> + uint32_t type_qualifier;
> + std::string arg_name;
> +  };
> +
>struct symbol {
>   symbol(const std::string &name, resource_id section,
> -size_t offset, const std::vector<argument> &args) :
> -name(name), section(section), offset(offset), args(args) { }
> - symbol() : name(), section(0), offset(0), args() { }
> +size_t offset, const std::vector<argument> &args,
> +const std::vector<argument_info> &args_info) :
> +name(name), section(section), offset(offset),
> +args(args), args_info(args_info) { }
> + symbol() : name(), section(0), offset(0), args(), args_info() { }
> 
>   std::string name;
>   resource_id section;
>   size_t offset;
>   std::vector<argument> args;
> + std::vector<argument_info> args_info;
>};
> 
>void serialize(std::ostream 
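
The order of the checks in the new clGetKernelArgInfo body (no metadata at all first, then an out-of-range index) can be sketched independently of clover. This is a simplified C rendering under stated assumptions: enum query_status, struct arg_info and lookup_arg_info are hypothetical names, and the three status values only stand in for CL_SUCCESS, CL_KERNEL_ARG_INFO_NOT_AVAILABLE and CL_INVALID_ARG_INDEX.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the OpenCL status codes used by the patch. */
enum query_status {
    QUERY_OK,
    QUERY_INFO_NOT_AVAILABLE,
    QUERY_INVALID_INDEX
};

/* Hypothetical per-argument metadata record. */
struct arg_info {
    const char *type_name;
    const char *arg_name;
};

/* Look up the metadata for argument `idx`. An empty table means the
 * kernel carries no argument info at all, which is reported before
 * the index is validated. */
static enum query_status
lookup_arg_info(const struct arg_info *infos, size_t count, size_t idx,
                const struct arg_info **out)
{
    assert(out != NULL);
    *out = NULL;

    if (count == 0)
        return QUERY_INFO_NOT_AVAILABLE;
    if (idx >= count)
        return QUERY_INVALID_INDEX;

    *out = &infos[idx];
    return QUERY_OK;
}
```

With this ordering, a kernel built without argument metadata reports "not available" for every index rather than "invalid index", which seems to be the intent of the two throw statements in the patch.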

Re: [Mesa-dev] [PATCH v2 0/2] Add support for some of the missing CL1.2 queries

2016-10-04 Thread Serge Martin
On Saturday 01 October 2016 18:51:09 Serge Martin wrote:
> Updated serie, please review.

CC curro

> 
> Serge Martin (2):
>   clover: add CL_PROGRAM_BINARY_TYPE support (CL1.2)
>   clover: add missing clGetDeviceInfo CL1.2 queries
> 
>  src/gallium/state_trackers/clover/api/device.cpp   | 23
> ++ src/gallium/state_trackers/clover/api/program.cpp  |
>  5 +
>  src/gallium/state_trackers/clover/core/device.cpp  | 10 ++
>  src/gallium/state_trackers/clover/core/device.hpp  |  2 ++
>  src/gallium/state_trackers/clover/core/module.cpp  |  1 +
>  src/gallium/state_trackers/clover/core/module.hpp  |  5 +
>  src/gallium/state_trackers/clover/llvm/codegen.hpp |  3 +++
>  .../state_trackers/clover/llvm/codegen/bitcode.cpp | 11 +--
>  .../state_trackers/clover/llvm/codegen/common.cpp  |  2 +-
>  .../state_trackers/clover/llvm/invocation.cpp  |  2 +-
>  .../state_trackers/clover/tgsi/compiler.cpp|  2 +-
>  11 files changed, 61 insertions(+), 5 deletions(-)



Re: [Mesa-dev] [PATCH] clover: clGetExtensionFunctionAddressForPlatform

2016-10-04 Thread Serge Martin
On Saturday 01 October 2016 19:03:11 Serge Martin wrote:
> On Sunday 27 September 2015 11:15:14 Serge Martin wrote:
> > add clGetExtensionFunctionAddressForPlatform (CL 1.2)
> 
> ping (one year reminder :p )

CC curro

> 
> > ---
> > 
> >  src/gallium/state_trackers/clover/api/dispatch.cpp |  2 +-
> >  src/gallium/state_trackers/clover/api/dispatch.hpp |  4 
> >  src/gallium/state_trackers/clover/api/platform.cpp | 16 
> >  3 files changed, 21 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/gallium/state_trackers/clover/api/dispatch.cpp
> > b/src/gallium/state_trackers/clover/api/dispatch.cpp index
> > f10babe..8f4cfdc
> > 100644
> > --- a/src/gallium/state_trackers/clover/api/dispatch.cpp
> > +++ b/src/gallium/state_trackers/clover/api/dispatch.cpp
> > @@ -131,7 +131,7 @@ namespace clover {
> > 
> >clEnqueueMigrateMemObjects,
> >clEnqueueMarkerWithWaitList,
> >clEnqueueBarrierWithWaitList,
> > 
> > -  NULL, // clGetExtensionFunctionAddressForPlatform
> > +  GetExtensionFunctionAddressForPlatform,
> > 
> >NULL, // clCreateFromGLTexture
> >NULL, // clGetDeviceIDsFromD3D11KHR
> >NULL, // clCreateFromD3D11BufferKHR
> > 
> > diff --git a/src/gallium/state_trackers/clover/api/dispatch.hpp
> > b/src/gallium/state_trackers/clover/api/dispatch.hpp index
> > 7f62282..0ec1b51
> > 100644
> > --- a/src/gallium/state_trackers/clover/api/dispatch.hpp
> > +++ b/src/gallium/state_trackers/clover/api/dispatch.hpp
> > @@ -777,6 +777,10 @@ namespace clover {
> > 
> > void *
> > GetExtensionFunctionAddress(const char *p_name);
> > 
> > +   void *
> > +   GetExtensionFunctionAddressForPlatform(cl_platform_id d_platform,
> > +  const char *p_name);
> > +
> > 
> > cl_int
> > IcdGetPlatformIDsKHR(cl_uint num_entries, cl_platform_id
> > *rd_platforms,
> > 
> >  cl_uint *rnum_platforms);
> > 
> > diff --git a/src/gallium/state_trackers/clover/api/platform.cpp
> > b/src/gallium/state_trackers/clover/api/platform.cpp index
> > cf71593..2bde194
> > 100644
> > --- a/src/gallium/state_trackers/clover/api/platform.cpp
> > +++ b/src/gallium/state_trackers/clover/api/platform.cpp
> > @@ -87,6 +87,16 @@ clover::GetPlatformInfo(cl_platform_id d_platform,
> > cl_platform_info param, }
> > 
> >  void *
> > 
> > +clover::GetExtensionFunctionAddressForPlatform(cl_platform_id d_platform,
> > +   const char *p_name) try {
> > +   obj(d_platform);
> > +   return GetExtensionFunctionAddress(p_name);
> > +
> > +} catch (error ) {
> > +   return NULL;
> > +}
> > +
> > +void *
> > 
> >  clover::GetExtensionFunctionAddress(const char *p_name) {
> >  
> > std::string name { p_name };
> > 
> > @@ -113,6 +123,12 @@ clGetExtensionFunctionAddress(const char *p_name) {
> > 
> > return GetExtensionFunctionAddress(p_name);
> >  
> >  }
> > 
> > +CLOVER_ICD_API void *
> > +clGetExtensionFunctionAddressForPlatform(cl_platform_id d_platform,
> > + const char *p_name) {
> > +   return GetExtensionFunctionAddressForPlatform(d_platform, p_name);
> > +}
> > +
> > 
> >  CLOVER_ICD_API cl_int
> >  clIcdGetPlatformIDsKHR(cl_uint num_entries, cl_platform_id *rd_platforms,
> >  
> > cl_uint *rnum_platforms) {
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
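
The shape of the quoted patch (validate the platform handle, then defer to the existing name lookup, returning NULL on any failure) can be sketched in plain C as follows. All names here are hypothetical: struct platform, the_platform, demo_extension and the table entry "demoExtensionKHR" are invented for the illustration, and a pointer comparison stands in for clover's obj() validation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*ext_fn)(void);

struct platform { int unused; };

/* The single valid platform object in this sketch. */
static struct platform the_platform;

static void demo_extension(void) { }

/* Hypothetical extension table: name to entry point. */
static const struct {
    const char *name;
    ext_fn fn;
} ext_table[] = {
    { "demoExtensionKHR", demo_extension },
};

/* Platform-independent lookup by name. */
static ext_fn get_extension_address(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(ext_table) / sizeof(ext_table[0]); i++) {
        if (strcmp(ext_table[i].name, name) == 0)
            return ext_table[i].fn;
    }
    return NULL;
}

/* Platform-aware variant: reject unknown handles, then reuse the
 * platform-independent lookup. */
static ext_fn get_extension_address_for_platform(const struct platform *p,
                                                 const char *name)
{
    if (p != &the_platform)
        return NULL;
    return get_extension_address(name);
}
```

Keeping the platform check in a thin wrapper means both entry points share one lookup table, which matches how the patch forwards to GetExtensionFunctionAddress.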



Re: [Mesa-dev] [PATCH 0/2] clover: clEnqueueMigrateMemObjects

2016-10-04 Thread Serge Martin
On Monday 03 October 2016 19:37:53 Serge Martin wrote:
> Ping

CC curro

> 
> On Saturday 12 September 2015 21:08:20 Serge Martin wrote:
> > Now that mem objects can be moved back to the host, I think we should later
> > come up with a way to optimize read mapping for such objects. For the moment,
> > if they are mapped for reading after being moved to the host, they will be
> > sent back to the device...
> > 
> > Serge Martin (2):
> >   clover: clEnqueueMigrateMemObjects (device)
> >   clover: clEnqueueMigrateMemObjects (host)
> >  
> >  src/gallium/state_trackers/clover/api/transfer.cpp | 51
> > 
> > ++ src/gallium/state_trackers/clover/core/memory.cpp 
> > |
> > 38 +--- src/gallium/state_trackers/clover/core/memory.hpp  |
> > 16
> > +--
> > 
> >  .../state_trackers/clover/core/resource.cpp|  5 ++-
> >  .../state_trackers/clover/core/resource.hpp|  3 +-
> >  5 files changed, 92 insertions(+), 21 deletions(-)
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev



Re: [Mesa-dev] [PATCH V2 10/11] genX/cmd_buffer: Enable fast depth clears

2016-10-04 Thread Nanley Chery
On Mon, Oct 03, 2016 at 06:21:30PM -0700, Jason Ekstrand wrote:
> On Mon, Oct 3, 2016 at 6:11 PM, Jason Ekstrand  wrote:
> 
> > On Tue, Sep 27, 2016 at 3:23 PM, Nanley Chery 
> > wrote:
> >
> >> On Tue, Sep 27, 2016 at 03:12:17PM -0700, Chad Versace wrote:
> >> > On Tue 27 Sep 2016, Nanley Chery wrote:
> >> > > On Tue, Sep 27, 2016 at 11:00:21AM -0700, Chad Versace wrote:
> >> >
> >> > > > As a consequence of that reasoning, we should set
> >> 3DSTATE_CLEAR_PARAMS.DepthClearValueValid = 1
> >> > > > whenever hiz is enabled, even if we don't care about the actual
> >> clear value.
> >>
> >
> > The logic seems to imply that we can't trust the context to save/restore
> > our depth clear value so we have to set it every time.  At the very least,
> > once per batch?  In any case, I doubt there's all that much cost involved
> > in emitting 3DSTATE_CLEAR_PARAMS so I don't think re-emitting it is that
> > big of a deal.
> >
> 
> Thinking about it a bit more...
> 
> We only set up depth/stencil packets once per subpass and we only do clears
> once per subpass so... I don't think we're actually saving anything by
> emitting it at clear time rather than at depth/stencil setup time.  It is a
> bit more convenient because the clear values may be more accessible at
> clear time.
> 
> As far as "should we emit 3DSTATE_CLEAR_PARAMS all the time?"  Let's not go
> to any heroics to try and avoid re-emitting it.  Once per subpass is not a
> big deal at all.
> 

I wouldn't say the code to implement it was complex, but I'm fine with
trading off efficiency for simplicity here. I'll add a comment describing the
situation.

-Nanley

> --Jason
> 
> 
> > > >
> >> > > In the V3, I plan to emit that packet once at device initialization
> >> time
> >> > > HSW+, and to always emit it (in the expected location) for IVB/BYT.
> >> Only
> >> > > the latter platforms have the restriction that it must always be
> >> > > programmed with the other depth/stencil commands.
> >> >
> >> > Is there any benefit to emitting it multiple times on ivb/byt? Does
> >> > emitting once during initialization, as for hsw, also work for ivb/byt?
> >> > If so, the code is cleaner if the two gens share the same workaround
> >> > code.
> >>
> >> The benefit for emitting it multiple times on IVB/BYT is that we're
> >> (possibly) following the oddly-worded programming note for the packet:
> >>
> >>From the IVB PRM Vol2P1, 11.5.5.4 3DSTATE_CLEAR_PARAMS:
> >>
> >>   3DSTATE_CLEAR_PARAMS must always be programmed in the along with
> >>   the other Depth/Stencil state commands(i.e.  3DSTATE_DEPTH_BUFFER,
> >>   3DSTATE_STENCIL_BUFFER, or 3DSTATE_HIER_DEPTH_BUFFER)
> >>
> >> HSW+ doesn't have this restriction, so we're free to only do it once.
> >>
> >
> >


Re: [Mesa-dev] [PATCH V2 08/11] anv/cmd_buffer: Add code for performing HZ operations

2016-10-04 Thread Nanley Chery
On Mon, Oct 03, 2016 at 05:23:27PM -0700, Jason Ekstrand wrote:
> On Mon, Sep 26, 2016 at 5:10 PM, Nanley Chery  wrote:
> 
> > Create a function that performs one of three HiZ operations -
> > depth/stencil clears, HiZ resolve, and depth resolves.
> >
> > Signed-off-by: Nanley Chery 
> >
> > ---
> >
> > v2. Add documentation
> > Fix the alignment check
> > Don't minify clear rectangle (Jason)
> > Use blorp enums (Jason)
> > Enable depth stalls and flushes
> > Use full RT rectangle for resolve ops
> > Add stencil clear todo
> >
> >  src/intel/vulkan/anv_genX.h|   3 +
> >  src/intel/vulkan/gen7_cmd_buffer.c |   6 ++
> >  src/intel/vulkan/gen8_cmd_buffer.c | 167 ++
> > +++
> >  3 files changed, 176 insertions(+)
> >
> > diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
> > index 02e79c2..ad3bec9 100644
> > --- a/src/intel/vulkan/anv_genX.h
> > +++ b/src/intel/vulkan/anv_genX.h
> > @@ -58,6 +58,9 @@ genX(emit_urb_setup)(struct anv_device *device, struct
> > anv_batch *batch,
> >   unsigned vs_entry_size, unsigned gs_entry_size,
> >   const struct gen_l3_config *l3_config);
> >
> > +void genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer *cmd_buffer,
> > +   enum blorp_hiz_op op);
> > +
> >  VkResult
> >  genX(graphics_pipeline_create)(VkDevice _device,
> > struct anv_pipeline_cache *cache,
> > diff --git a/src/intel/vulkan/gen7_cmd_buffer.c
> > b/src/intel/vulkan/gen7_cmd_buffer.c
> > index b627ef0..78b5ac7 100644
> > --- a/src/intel/vulkan/gen7_cmd_buffer.c
> > +++ b/src/intel/vulkan/gen7_cmd_buffer.c
> > @@ -323,6 +323,12 @@ genX(cmd_buffer_flush_dynamic_state)(struct
> > anv_cmd_buffer *cmd_buffer)
> > cmd_buffer->state.dirty = 0;
> >  }
> >
> > +void
> > +genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer *cmd_buffer,
> > +  enum blorp_hiz_op op)
> > +{
> >
> 
> This should have an anv_finishme in it.
> 
> 

I'll add this in V3.

> > +}
> > +
> >  void genX(CmdSetEvent)(
> >  VkCommandBuffer commandBuffer,
> >  VkEvent event,
> > diff --git a/src/intel/vulkan/gen8_cmd_buffer.c
> > b/src/intel/vulkan/gen8_cmd_buffer.c
> > index 7058608..a13413c 100644
> > --- a/src/intel/vulkan/gen8_cmd_buffer.c
> > +++ b/src/intel/vulkan/gen8_cmd_buffer.c
> > @@ -399,6 +399,173 @@ genX(cmd_buffer_flush_compute_state)(struct
> > anv_cmd_buffer *cmd_buffer)
> > genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer);
> >  }
> >
> > +
> > +/**
> > + * Emit the HZ_OP packet in the sequence specified by the BDW PRM section
> > + * entitled: "Optimized Depth Buffer Clear and/or Stencil Buffer Clear."
> > + *
> > + * \todo Enable Stencil Buffer-only clears
> > + */
> > +void
> > +genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer *cmd_buffer,
> > +  enum blorp_hiz_op op)
> > +{
> > > +   struct anv_cmd_state *cmd_state = &cmd_buffer->state;
> > +   const struct anv_image_view *iview =
> > +  anv_cmd_buffer_get_depth_stencil_view(cmd_buffer);
> > +
> > +   if (iview == NULL || !anv_image_has_hiz(iview->image))
> > +  return;
> > +
> > +   const uint32_t ds = cmd_state->subpass->depth_stencil_attachment;
> > +   const bool full_surface_op =
> > + cmd_state->render_area.extent.width == iview->extent.width
> > &&
> > + cmd_state->render_area.extent.height ==
> > iview->extent.height;
> >
> 
> It's probably a bit redundant, but we might as well check
> render_area.offset == 0.  I realize that, from API requirements, if the
> extents match then offset must be 0, but it's not incredibly obvious and
> the check won't hurt that much.
> 
> 

Sure. I'll add an assertion and a comment.

> > +
> > +   /* Validate that we can perform the HZ operation and that it's
> > necessary. */
> > +   switch (op) {
> > +   case BLORP_HIZ_OP_DEPTH_CLEAR:
> > +  if (cmd_buffer->state.pass->attachments[ds].load_op !=
> > +  VK_ATTACHMENT_LOAD_OP_CLEAR)
> > + return;
> > +
> > +  /* Apply alignment restrictions. Despite the BDW PRM mentioning
> > this is
> > +   * only needed for a depth buffer surface type of D16_UNORM, testing
> > +   * showed it to be necessary for other depth formats as well
> > +   * (e.g., D32_FLOAT).
> > +   */
> > +  if (!full_surface_op) {
> > +
> > + struct isl_extent2d px_dim;
> >
> 
> Would it be better to call this hiz_block_size_px?  That follows the ISL
> naming convention a bit better.
> 
> 

The variable naming corresponds to the table headers below. I'll add
a comment about this.

> > +#if GEN_GEN == 8
> >
> 
> Mind making this <= 8?  I know it's in a gen8+ file, but <= 8 makes it
> clear that the else case is >= 9 and not != 8.
> 
> 

I see your point. I'll change the #else to #elif GEN_GEN >=9 as
I don't 

Re: [Mesa-dev] [PATCH] aubinator: Use less -RS instead of -r for the implicit pager.

2016-10-04 Thread Gandikota, Sirisha

>-Original Message-
>From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf Of
>Kenneth Graunke
>Sent: Monday, October 03, 2016 5:59 PM
>To: mesa-dev@lists.freedesktop.org
>Cc: Kenneth Graunke 
>Subject: [Mesa-dev] [PATCH] aubinator: Use less -RS instead of -r for the 
>implicit
>pager.
>
>From the less man page:
>
>   "Warning: when the -r option is used, less cannot keep track of the
>actual appearance  of  the screen (since this depends  on  how the
>screen responds to each type of control character).  Thus, various
>display problems may result, such as long lines being split in the
>wrong place."
>
>Lines which are too long to fit in the terminal would be word wrapped, but
>unfortunately less would get confused about which line it was on, and text 
>would
>be drawn on top of other text.  The most noticeable case was shader assembly,
>which is frequently too wide for an 80 character terminal, and thus would be
>drawn on top of the following state packets, making them completely
>unreadable.
>
>Using -R instead of -r fixes this problem by only allowing color escape 
>sequences.
>(Notably, Git's implicit pager invocation uses -R.) Unfortunately, it means our
>"clear to the end of the line" hack for extending the blue bar headers won't 
>work
>anymore.
>
>Word wrapping usually isn't terribly readable, anyway, so we also add the -S
>option (chop long lines) to restrict it to the terminal width.
>(You can hit the left and right arrow keys to scroll sideways.)
>
>Then, for a new blue bar hack, we can use a printf specifier to pad the command
>packet names to be 80 characters long (arbitrarily), which extends them "far
>enough" to look good, and doesn't require us to use ioctls to determine the
>terminal width.
>
>Signed-off-by: Kenneth Graunke 
>---
> src/intel/tools/aubinator.c | 7 +++
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
>diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c index
>a31dcb2..5666fe3 100644
>--- a/src/intel/tools/aubinator.c
>+++ b/src/intel/tools/aubinator.c
>@@ -48,7 +48,6 @@
> #define CSI "\e["
> #define HEADER CSI "37;44m"
> #define NORMAL CSI "0m"
>-#define CLEAR_TO_EOL CSI "0K"
>
> /* options */
>
>@@ -721,7 +720,7 @@ parse_commands(struct gen_spec *spec, uint32_t
>*cmds, int size, int engine)
>   }
>   length = gen_group_get_length(inst, p);
>
>-  const char *color, *reset_color = CLEAR_TO_EOL NORMAL;
>+  const char *color, *reset_color = NORMAL;
>   uint64_t offset;
>
>   if (option_full_decode)
>@@ -739,7 +738,7 @@ parse_commands(struct gen_spec *spec, uint32_t
>*cmds, int size, int engine)
>   else
>  offset = 0;
>
>-  printf("%s0x%08lx:  0x%08x:  %s%s\n",
>+  printf("%s0x%08lx:  0x%08x:  %-80s%s\n",
>  color, offset, p[0],
>  gen_group_get_name(inst), reset_color);
>
>@@ -1013,7 +1012,7 @@ setup_pager(void)
>if (pid == 0) {
>   close(fds[1]);
>   dup2(fds[0], 0);
>-  execlp("less", "less", "-rFi", NULL);
>+  execlp("less", "less", "-FRSi", NULL);
>}
>
>close(fds[0]);
>--
>2.10.0
>
[SG]  Works for me
Reviewed-by: Sirisha Gandikota 
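
The padding trick described in the commit message can be tried in isolation. What follows is a minimal sketch, not the aubinator code itself: format_packet_header() is a hypothetical helper, the escape sequences are written with "\x1b" rather than the GNU "\e" extension, and the offset is printed with PRIx64 instead of %lx. It only shows that "%-80s" left-justifies the packet name and pads it with spaces, so the colored bar extends without a "clear to end of line" escape.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define CSI    "\x1b["
#define HEADER CSI "37;44m"   /* white text on a blue background */
#define NORMAL CSI "0m"       /* reset attributes */

/* Hypothetical helper: formats one header line into buf and returns the
 * value snprintf() returns (the length the full line needs).
 */
static int
format_packet_header(char *buf, size_t size, uint64_t offset,
                     uint32_t dword, const char *name)
{
   assert(buf != NULL && name != NULL);

   /* %-80s pads the name on the right to at least 80 columns; names
    * longer than that are printed in full, never truncated.
    */
   return snprintf(buf, size,
                   "%s0x%08" PRIx64 ":  0x%08" PRIx32 ":  %-80s%s",
                   HEADER, offset, dword, name, NORMAL);
}
```

With "less -R" only the color sequences pass through, so the padding spaces inherit the blue background until NORMAL resets it.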


Re: [Mesa-dev] [PATCH] glsl: Don't emit ir_binop_carry during ir_binop_imul_high lowering

2016-10-04 Thread Ilia Mirkin
On Tue, Oct 4, 2016 at 2:08 PM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> st_glsl_to_tgsi only calls lower_instructions once (instead of in a
> loop), so the ir_binop_carry generated would not get lowered.  Fixes
> assertion failure
>
> state_tracker/st_glsl_to_tgsi.cpp:2265: void 
> glsl_to_tgsi_visitor::visit_expression(ir_expression*, st_src_reg*): 
> Assertion `!"Invalid ir opcode in glsl_to_tgsi_visitor::visit()"' failed.
>
> on softpipe in 16 piglit tests:
>
> 
> mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-nonuniform.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-only-msb-nonuniform.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-only-msb.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-nonuniform.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-only-msb-nonuniform.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-only-msb.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-nonuniform.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-only-msb-nonuniform.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-only-msb.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-nonuniform.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-only-msb-nonuniform.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-only-msb.shader_test
> 
> mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended.shader_test
>
> Signed-off-by: Ian Romanick 

Reviewed-by: Ilia Mirkin 

> ---
>  src/compiler/glsl/lower_instructions.cpp | 22 +-
>  1 file changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/src/compiler/glsl/lower_instructions.cpp 
> b/src/compiler/glsl/lower_instructions.cpp
> index a9720f2..372ded1 100644
> --- a/src/compiler/glsl/lower_instructions.cpp
> +++ b/src/compiler/glsl/lower_instructions.cpp
> @@ -166,6 +166,8 @@ private:
> void find_lsb_to_float_cast(ir_expression *ir);
> void find_msb_to_float_cast(ir_expression *ir);
> void imul_high_to_mul(ir_expression *ir);
> +
> +   ir_expression *_carry(operand a, operand b);
>  };
>
>  } /* anonymous namespace */
> @@ -1413,6 +1415,16 @@ 
> lower_instructions_visitor::find_msb_to_float_cast(ir_expression *ir)
> this->progress = true;
>  }
>
> +ir_expression *
> +lower_instructions_visitor::_carry(operand a, operand b)
> +{
> +   if (lowering(CARRY_TO_ARITH))
> +  return i2u(b2i(less(add(a, b),
> +  a.val->clone(ralloc_parent(a.val), NULL))));
> +   else
> +  return carry(a, b);
> +}
> +
>  void
>  lower_instructions_visitor::imul_high_to_mul(ir_expression *ir)
>  {
> @@ -1518,11 +1530,11 @@ 
> lower_instructions_visitor::imul_high_to_mul(ir_expression *ir)
> i.insert_before(assign(t2, mul(src1h, src2l)));
> i.insert_before(assign(hi, mul(src1h, src2h)));
>
> -   i.insert_before(assign(hi, add(hi, carry(lo, lshift(t1, c16->clone(ir, 
> NULL))))));
> -   i.insert_before(assign(lo,   add(lo, lshift(t1, c16->clone(ir, 
> NULL)))));
> +   i.insert_before(assign(hi, add(hi, _carry(lo, lshift(t1, c16->clone(ir, 
> NULL))))));
> +   i.insert_before(assign(lo,add(lo, lshift(t1, c16->clone(ir, 
> NULL)))));
>
> -   i.insert_before(assign(hi, add(hi, carry(lo, lshift(t2, c16->clone(ir, 
> NULL))))));
> -   i.insert_before(assign(lo,   add(lo, lshift(t2, c16->clone(ir, 
> NULL)))));
> +   i.insert_before(assign(hi, add(hi, _carry(lo, lshift(t2, c16->clone(ir, 
> NULL))))));
> +   i.insert_before(assign(lo,add(lo, lshift(t2, c16->clone(ir, 
> NULL)))));
>
> if (different_signs == NULL) {
>assert(ir->operands[0]->type->base_type == GLSL_TYPE_UINT);
> @@ -1547,7 +1559,7 @@ 
> lower_instructions_visitor::imul_high_to_mul(ir_expression *ir)
>
>i.insert_before(neg_hi);
>i.insert_before(assign(neg_hi, add(bit_not(u2i(hi)),
> - u2i(carry(bit_not(lo), c1)))));
> + u2i(_carry(bit_not(lo), c1)))));
>
>ir->operation = ir_triop_csel;
>ir->operands[0] = new(ir) ir_dereference_variable(different_signs);
> --
> 2.5.5
>
> 

Re: [Mesa-dev] [PATCH] intel/blorp: Use documented RECTLIST vertex positions

2016-10-04 Thread Jason Ekstrand
On Tue, Oct 4, 2016 at 10:33 AM, Marek Olšák  wrote:

> On Wed, Sep 21, 2016 at 11:42 PM, Nanley Chery 
> wrote:
> > Use the vertex positions described in the PRMs. This has no effect on
> > rendering but quiets the simulator warnings seen when the vertices
> > appear out of order.
>
> Does it mean that the vertex order doesn't matter for the hardware??


Maybe?  It means that it certainly doesn't matter for blorp but blorp only
uses flat inputs and gl_FragCoord.  It may matter when it comes to things
such as getting the barycentric coordinates right.


[Mesa-dev] [PATCH] glsl: Don't emit ir_binop_carry during ir_binop_imul_high lowering

2016-10-04 Thread Ian Romanick
From: Ian Romanick 

st_glsl_to_tgsi only calls lower_instructions once (instead of in a
loop), so the ir_binop_carry generated would not get lowered.  Fixes
assertion failure

state_tracker/st_glsl_to_tgsi.cpp:2265: void 
glsl_to_tgsi_visitor::visit_expression(ir_expression*, st_src_reg*): Assertion 
`!"Invalid ir opcode in glsl_to_tgsi_visitor::visit()"' failed.

on softpipe in 16 piglit tests:


mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-nonuniform.shader_test

mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-only-msb-nonuniform.shader_test

mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-only-msb.shader_test

mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended.shader_test

mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-nonuniform.shader_test

mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-only-msb-nonuniform.shader_test

mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-only-msb.shader_test

mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended.shader_test

mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-nonuniform.shader_test

mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-only-msb-nonuniform.shader_test

mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-only-msb.shader_test

mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended.shader_test

mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-nonuniform.shader_test

mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-only-msb-nonuniform.shader_test

mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-only-msb.shader_test

mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended.shader_test

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/lower_instructions.cpp | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/src/compiler/glsl/lower_instructions.cpp 
b/src/compiler/glsl/lower_instructions.cpp
index a9720f2..372ded1 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -166,6 +166,8 @@ private:
void find_lsb_to_float_cast(ir_expression *ir);
void find_msb_to_float_cast(ir_expression *ir);
void imul_high_to_mul(ir_expression *ir);
+
+   ir_expression *_carry(operand a, operand b);
 };
 
 } /* anonymous namespace */
@@ -1413,6 +1415,16 @@ 
lower_instructions_visitor::find_msb_to_float_cast(ir_expression *ir)
this->progress = true;
 }
 
+ir_expression *
+lower_instructions_visitor::_carry(operand a, operand b)
+{
+   if (lowering(CARRY_TO_ARITH))
+  return i2u(b2i(less(add(a, b),
+  a.val->clone(ralloc_parent(a.val), NULL))));
+   else
+  return carry(a, b);
+}
+
 void
 lower_instructions_visitor::imul_high_to_mul(ir_expression *ir)
 {
@@ -1518,11 +1530,11 @@ 
lower_instructions_visitor::imul_high_to_mul(ir_expression *ir)
i.insert_before(assign(t2, mul(src1h, src2l)));
i.insert_before(assign(hi, mul(src1h, src2h)));
 
-   i.insert_before(assign(hi, add(hi, carry(lo, lshift(t1, c16->clone(ir, 
NULL))))));
-   i.insert_before(assign(lo,   add(lo, lshift(t1, c16->clone(ir, 
NULL)))));
+   i.insert_before(assign(hi, add(hi, _carry(lo, lshift(t1, c16->clone(ir, 
NULL))))));
+   i.insert_before(assign(lo,add(lo, lshift(t1, c16->clone(ir, 
NULL)))));
 
-   i.insert_before(assign(hi, add(hi, carry(lo, lshift(t2, c16->clone(ir, 
NULL))))));
-   i.insert_before(assign(lo,   add(lo, lshift(t2, c16->clone(ir, 
NULL)))));
+   i.insert_before(assign(hi, add(hi, _carry(lo, lshift(t2, c16->clone(ir, 
NULL))))));
+   i.insert_before(assign(lo,add(lo, lshift(t2, c16->clone(ir, 
NULL)))));
 
if (different_signs == NULL) {
   assert(ir->operands[0]->type->base_type == GLSL_TYPE_UINT);
@@ -1547,7 +1559,7 @@ 
lower_instructions_visitor::imul_high_to_mul(ir_expression *ir)
 
   i.insert_before(neg_hi);
   i.insert_before(assign(neg_hi, add(bit_not(u2i(hi)),
- u2i(carry(bit_not(lo), c1)))));
+ u2i(_carry(bit_not(lo), c1)))));
 
   ir->operation = ir_triop_csel;
   ir->operands[0] = new(ir) ir_dereference_variable(different_signs);
-- 
2.5.5
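
For readers unfamiliar with the lowering, the arithmetic behind _carry() and the 16-bit split can be written out in plain C. This is only a sketch under stated assumptions: carry_arith() and umul_high() are hypothetical names, the operands are unsigned 32-bit values, and the final step that folds in the upper halves of t1 and t2 is not visible in the hunks above and is filled in here from the usual schoolbook decomposition.

```c
#include <assert.h>
#include <stdint.h>

/* Carry-out of a + b without a dedicated carry opcode: an unsigned sum
 * wraps around exactly when it ends up smaller than one of its operands.
 * This mirrors the i2u(b2i(less(add(a, b), a))) form in the patch.
 */
static uint32_t
carry_arith(uint32_t a, uint32_t b)
{
   return (uint32_t)((uint32_t)(a + b) < a);
}

/* High 32 bits of a 32x32 unsigned multiply built from 16-bit halves. */
static uint32_t
umul_high(uint32_t x, uint32_t y)
{
   uint32_t xl = x & 0xffff, xh = x >> 16;
   uint32_t yl = y & 0xffff, yh = y >> 16;

   uint32_t lo = xl * yl;
   uint32_t t1 = xl * yh;
   uint32_t t2 = xh * yl;
   uint32_t hi = xh * yh;

   /* Add the low halves of the cross terms into lo, carrying into hi. */
   hi += carry_arith(lo, t1 << 16);
   lo += t1 << 16;
   hi += carry_arith(lo, t2 << 16);
   lo += t2 << 16;

   /* Assumed final step: the upper halves of the cross terms land in hi. */
   hi += (t1 >> 16) + (t2 >> 16);

   /* Debug cross-check against a 64-bit reference multiply. */
   assert(hi == (uint32_t)(((uint64_t)x * y) >> 32));
   return hi;
}
```

The point of the patch is that the comparison form can be emitted directly when CARRY_TO_ARITH is requested, so no ir_binop_carry is left behind for a driver that runs the lowering pass only once.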



[Mesa-dev] [Bug 98042] Git master fails to build with clang++/libc++ on Linux

2016-10-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98042

Vinson Lee  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #1 from Vinson Lee  ---


*** This bug has been marked as a duplicate of bug 97952 ***

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 97952] /usr/include/string.h:518:12: error: exception specification in declaration does not match previous declaration

2016-10-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97952

Vinson Lee  changed:

   What|Removed |Added

 CC||kre...@email.com

--- Comment #1 from Vinson Lee  ---
*** Bug 98042 has been marked as a duplicate of this bug. ***

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] util: use GCC atomic intrinsics with explicit memory model

2016-10-04 Thread Nicolai Hähnle

On 04.10.2016 17:50, Jan Vesely wrote:

On Tue, 2016-10-04 at 16:14 +0200, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

This is motivated by the fact that p_atomic_read and p_atomic_set may
somewhat surprisingly not do the right thing in the old version:
while
stores and loads are de facto atomic at least on x86,


afaik, this is only true for naturally aligned loads/stores (even for
x86).


Which is all loads and stores in practice, especially in C code 
(unaligned pointers are undefined behavior IIRC).


Nicolai
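
As an illustration of what "explicit memory model" means here, the GCC __atomic builtins take the ordering as an argument. The wrappers below are a sketch only: the example_* names are hypothetical, and the acquire/release orderings are an assumption for illustration, not taken from the patch under discussion. The assertions encode the natural-alignment point raised above.

```c
#include <assert.h>
#include <stdint.h>

/* Load with acquire ordering: later accesses in this thread cannot be
 * reordered before the load.
 */
static inline int32_t
example_atomic_read(int32_t *v)
{
   assert((uintptr_t)v % sizeof *v == 0);   /* naturally aligned */
   return __atomic_load_n(v, __ATOMIC_ACQUIRE);
}

/* Store with release ordering: earlier accesses in this thread cannot be
 * reordered after the store.
 */
static inline void
example_atomic_set(int32_t *v, int32_t i)
{
   assert((uintptr_t)v % sizeof *v == 0);
   __atomic_store_n(v, i, __ATOMIC_RELEASE);
}

/* Atomic increment that returns the new value. */
static inline int32_t
example_atomic_inc_return(int32_t *v)
{
   return __atomic_add_fetch(v, 1, __ATOMIC_ACQ_REL);
}
```

Unlike a plain load or store, these also tell the compiler not to cache, tear, or reorder the access, which is the part that hardware atomicity on x86 does not cover.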


Re: [Mesa-dev] [PATCH] intel/blorp: Use documented RECTLIST vertex positions

2016-10-04 Thread Marek Olšák
On Wed, Sep 21, 2016 at 11:42 PM, Nanley Chery  wrote:
> Use the vertex positions described in the PRMs. This has no effect on
> rendering but quiets the simulator warnings seen when the vertices
> appear out of order.

Does it mean that the vertex order doesn't matter for the hardware??

Marek


Re: [Mesa-dev] [PATCH] aubinator: use the correct format specifier for printing ptrdiff_t.

2016-10-04 Thread Anuj Phogat
On Tue, Oct 4, 2016 at 10:01 AM, Kenneth Graunke  wrote:
> Fixes more warnings in 32-bit builds.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/intel/tools/aubinator.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
> index 9b32e5b..27d7647 100644
> --- a/src/intel/tools/aubinator.c
> +++ b/src/intel/tools/aubinator.c
> @@ -942,7 +942,7 @@ aub_file_decode_batch(struct aub_file *file, struct 
> gen_spec *spec)
>bias = 1;
>break;
> default:
> -  printf("unknown opcode %d at %ld/%ld\n",
> +  printf("unknown opcode %d at %td/%td\n",
>   OPCODE(h), file->cursor - file->map,
>   file->end - file->map);
>file->cursor = file->end;
> --
> 2.10.0
>

Reviewed-by: Anuj Phogat 


[Mesa-dev] [PATCH] aubinator: use the correct format specifier for printing ptrdiff_t.

2016-10-04 Thread Kenneth Graunke
Fixes more warnings in 32-bit builds.

Signed-off-by: Kenneth Graunke 
---
 src/intel/tools/aubinator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 9b32e5b..27d7647 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -942,7 +942,7 @@ aub_file_decode_batch(struct aub_file *file, struct 
gen_spec *spec)
   bias = 1;
   break;
default:
-  printf("unknown opcode %d at %ld/%ld\n",
+  printf("unknown opcode %d at %td/%td\n",
  OPCODE(h), file->cursor - file->map,
  file->end - file->map);
   file->cursor = file->end;
-- 
2.10.0
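
The reason %ld warns on 32-bit builds is that subtracting two pointers yields a ptrdiff_t, which is not long on every ABI; the C99 "t" length modifier matches it everywhere. A small sketch of the same pattern follows, with format_position() as a hypothetical helper rather than anything in aubinator:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: prints "cursor/end" as dword offsets from map. */
static int
format_position(char *buf, size_t size, const uint32_t *map,
                const uint32_t *cursor, const uint32_t *end)
{
   assert(map <= cursor && cursor <= end);

   /* Both differences have type ptrdiff_t, so %td is the matching
    * conversion on 32-bit and 64-bit targets alike.
    */
   return snprintf(buf, size, "%td/%td", cursor - map, end - map);
}
```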



Re: [Mesa-dev] [PATCH 1/2] intel: aubinator: generate a standalone binary

2016-10-04 Thread Ben Widawsky

On 16-10-04 09:50:55, Kenneth Graunke wrote:

On Tuesday, October 4, 2016 9:26:39 AM PDT Ben Widawsky wrote:

On 16-10-04 15:38:52, Lionel Landwerlin wrote:
>Embed the xml files into the binary, so aubinator can be used from any
>location.
>
>Signed-off-by: Lionel Landwerlin 
>Cc: Sirisha Gandikota 
>---
> src/intel/Makefile.am   |  1 +
> src/intel/Makefile.aubinator.am | 36 +++
> src/intel/Makefile.sources  |  7 +++
> src/intel/tools/.gitignore  |  5 +++
> src/intel/tools/aubinator.c | 97 +
> src/intel/tools/decoder.c   | 82 --
> src/intel/tools/decoder.h   |  4 +-
> 7 files changed, 141 insertions(+), 91 deletions(-)
> create mode 100644 src/intel/Makefile.aubinator.am
>
>diff --git a/src/intel/Makefile.am b/src/intel/Makefile.am
>index 9186b5c..c3cb9fb 100644
>--- a/src/intel/Makefile.am
>+++ b/src/intel/Makefile.am
>@@ -52,6 +52,7 @@ BUILT_SOURCES =
> CLEANFILES =
> EXTRA_DIST =
>
>+include Makefile.aubinator.am
> include Makefile.blorp.am
> include Makefile.common.am
> include Makefile.genxml.am
>diff --git a/src/intel/Makefile.aubinator.am b/src/intel/Makefile.aubinator.am
>new file mode 100644
>index 000..9772700
>--- /dev/null
>+++ b/src/intel/Makefile.aubinator.am
>@@ -0,0 +1,36 @@
>+# Copyright © 2016 Intel Corporation
>+#
>+# Permission is hereby granted, free of charge, to any person obtaining a
>+# copy of this software and associated documentation files (the "Software"),
>+# to deal in the Software without restriction, including without limitation
>+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
>+# and/or sell copies of the Software, and to permit persons to whom the
>+# Software is furnished to do so, subject to the following conditions:
>+#
>+# The above copyright notice and this permission notice (including the next
>+# paragraph) shall be included in all copies or substantial portions of the
>+# Software.
>+#
>+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>+# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>+# IN THE SOFTWARE.
>+
>+BUILT_SOURCES += $(AUBINATOR_GENERATED_FILES)
>+
>+SUFFIXES = _aubinator_xml.h .xml
>+
>+tools/gen6_aubinator_xml.h: genxml/gen6.xml
>+tools/gen7_aubinator_xml.h: genxml/gen7.xml
>+tools/gen75_aubinator_xml.h: genxml/gen75.xml
>+tools/gen8_aubinator_xml.h: genxml/gen8.xml
>+tools/gen9_aubinator_xml.h: genxml/gen9.xml
>+
>+$(AUBINATOR_GENERATED_FILES): Makefile
>+
>+%_aubinator_xml.h:
>+   $(MKDIR_GEN)
>+   $(AM_V_GEN) xxd -i $< > $@
>diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
>index 94073d2..a5c2bf0 100644
>--- a/src/intel/Makefile.sources
>+++ b/src/intel/Makefile.sources
>@@ -1,3 +1,10 @@
>+AUBINATOR_GENERATED_FILES = \
>+   tools/gen6_aubinator_xml.h \
>+   tools/gen7_aubinator_xml.h \
>+   tools/gen75_aubinator_xml.h \
>+   tools/gen8_aubinator_xml.h \
>+   tools/gen9_aubinator_xml.h
>+
> BLORP_FILES = \
>blorp/blorp.c \
>blorp/blorp.h \
>diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore
>index 0c80a6f..c4eebde 100644
>--- a/src/intel/tools/.gitignore
>+++ b/src/intel/tools/.gitignore
>@@ -1 +1,6 @@
> /aubinator
>+gen6_aubinator_xml.h
>+gen75_aubinator_xml.h
>+gen7_aubinator_xml.h
>+gen8_aubinator_xml.h
>+gen9_aubinator_xml.h
>diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
>index a31dcb2..83328b5 100644
>--- a/src/intel/tools/aubinator.c
>+++ b/src/intel/tools/aubinator.c
>@@ -35,6 +35,8 @@
> #include 
> #include 
>
>+#include "util/macros.h"
>+
> #include "decoder.h"
> #include "intel_aub.h"
> #include "gen_disasm.h"
>@@ -1059,11 +1061,24 @@ int main(int argc, char *argv[])
> {
>struct gen_spec *spec;
>struct aub_file *file;
>-   int i, pci_id = 0;
>+   int i;
>bool found_arg_gen = false, pager = true;
>-   int gen_major, gen_minor;
>-   const char *value;
>-   char gen_file[256], gen_val[24];
>+   const char *value, *input_file = NULL;
>+   char gen_val[24];
>+   const struct {
>+  const char *name;
>+  int pci_id;
>+   } gens[] = {
>+  { "ivb", 0x0166 }, /* Intel(R) Ivybridge Mobile GT2 */
>+  { "hsw", 0x0416 }, /* Intel(R) Haswell Mobile GT2 */
>+  { "byt", 0x0155 }, /* Intel(R) Bay Trail */
>+  { "bdw", 0x1616 }, /* Intel(R) HD Graphics 5500 (Broadwell GT2) */
>+  { "chv", 0x22B3 }, /* Intel(R) HD Graphics (Cherryview) */
>+  { "skl", 0x1912 }, /* Intel(R) HD Graphics 530 (Skylake GT2) */
>+  { "kbl", 0x591D }, /* Intel(R) Kabylake GT2 */
>+  { "bxt", 0x0A84 } 

Re: [Mesa-dev] [PATCH 1/2] intel: aubinator: generate a standalone binary

2016-10-04 Thread Kenneth Graunke
On Tuesday, October 4, 2016 9:26:39 AM PDT Ben Widawsky wrote:
> On 16-10-04 15:38:52, Lionel Landwerlin wrote:
> >Embed the xml files into the binary, so aubinator can be used from any
> >location.
> >
> >Signed-off-by: Lionel Landwerlin 
> >Cc: Sirisha Gandikota 
> >---
> > src/intel/Makefile.am   |  1 +
> > src/intel/Makefile.aubinator.am | 36 +++
> > src/intel/Makefile.sources  |  7 +++
> > src/intel/tools/.gitignore  |  5 +++
> > src/intel/tools/aubinator.c | 97 
> > +
> > src/intel/tools/decoder.c   | 82 --
> > src/intel/tools/decoder.h   |  4 +-
> > 7 files changed, 141 insertions(+), 91 deletions(-)
> > create mode 100644 src/intel/Makefile.aubinator.am
> >
> >diff --git a/src/intel/Makefile.am b/src/intel/Makefile.am
> >index 9186b5c..c3cb9fb 100644
> >--- a/src/intel/Makefile.am
> >+++ b/src/intel/Makefile.am
> >@@ -52,6 +52,7 @@ BUILT_SOURCES =
> > CLEANFILES =
> > EXTRA_DIST =
> >
> >+include Makefile.aubinator.am
> > include Makefile.blorp.am
> > include Makefile.common.am
> > include Makefile.genxml.am
> >diff --git a/src/intel/Makefile.aubinator.am 
> >b/src/intel/Makefile.aubinator.am
> >new file mode 100644
> >index 000..9772700
> >--- /dev/null
> >+++ b/src/intel/Makefile.aubinator.am
> >@@ -0,0 +1,36 @@
> >+# Copyright © 2016 Intel Corporation
> >+#
> >+# Permission is hereby granted, free of charge, to any person obtaining a
> >+# copy of this software and associated documentation files (the "Software"),
> >+# to deal in the Software without restriction, including without limitation
> >+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
> >+# and/or sell copies of the Software, and to permit persons to whom the
> >+# Software is furnished to do so, subject to the following conditions:
> >+#
> >+# The above copyright notice and this permission notice (including the next
> >+# paragraph) shall be included in all copies or substantial portions of the
> >+# Software.
> >+#
> >+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> >+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> >+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> >+# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> >+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> >+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> >DEALINGS
> >+# IN THE SOFTWARE.
> >+
> >+BUILT_SOURCES += $(AUBINATOR_GENERATED_FILES)
> >+
> >+SUFFIXES = _aubinator_xml.h .xml
> >+
> >+tools/gen6_aubinator_xml.h: genxml/gen6.xml
> >+tools/gen7_aubinator_xml.h: genxml/gen7.xml
> >+tools/gen75_aubinator_xml.h: genxml/gen75.xml
> >+tools/gen8_aubinator_xml.h: genxml/gen8.xml
> >+tools/gen9_aubinator_xml.h: genxml/gen9.xml
> >+
> >+$(AUBINATOR_GENERATED_FILES): Makefile
> >+
> >+%_aubinator_xml.h:
> >+$(MKDIR_GEN)
> >+$(AM_V_GEN) xxd -i $< > $@
> >diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> >index 94073d2..a5c2bf0 100644
> >--- a/src/intel/Makefile.sources
> >+++ b/src/intel/Makefile.sources
> >@@ -1,3 +1,10 @@
> >+AUBINATOR_GENERATED_FILES = \
> >+tools/gen6_aubinator_xml.h \
> >+tools/gen7_aubinator_xml.h \
> >+tools/gen75_aubinator_xml.h \
> >+tools/gen8_aubinator_xml.h \
> >+tools/gen9_aubinator_xml.h
> >+
> > BLORP_FILES = \
> > blorp/blorp.c \
> > blorp/blorp.h \
> >diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore
> >index 0c80a6f..c4eebde 100644
> >--- a/src/intel/tools/.gitignore
> >+++ b/src/intel/tools/.gitignore
> >@@ -1 +1,6 @@
> > /aubinator
> >+gen6_aubinator_xml.h
> >+gen75_aubinator_xml.h
> >+gen7_aubinator_xml.h
> >+gen8_aubinator_xml.h
> >+gen9_aubinator_xml.h
> >diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
> >index a31dcb2..83328b5 100644
> >--- a/src/intel/tools/aubinator.c
> >+++ b/src/intel/tools/aubinator.c
> >@@ -35,6 +35,8 @@
> > #include 
> > #include 
> >
> >+#include "util/macros.h"
> >+
> > #include "decoder.h"
> > #include "intel_aub.h"
> > #include "gen_disasm.h"
> >@@ -1059,11 +1061,24 @@ int main(int argc, char *argv[])
> > {
> >struct gen_spec *spec;
> >struct aub_file *file;
> >-   int i, pci_id = 0;
> >+   int i;
> >bool found_arg_gen = false, pager = true;
> >-   int gen_major, gen_minor;
> >-   const char *value;
> >-   char gen_file[256], gen_val[24];
> >+   const char *value, *input_file = NULL;
> >+   char gen_val[24];
> >+   const struct {
> >+  const char *name;
> >+  int pci_id;
> >+   } gens[] = {
> >+  { "ivb", 0x0166 }, /* Intel(R) Ivybridge Mobile GT2 */
> >+  { "hsw", 0x0416 }, /* Intel(R) Haswell Mobile GT2 */
> >+  { "byt", 0x0155 }, /* Intel(R) Bay Trail */
> >+  { "bdw", 0x1616 }, /* Intel(R) HD Graphics 5500 

Re: [Mesa-dev] [PATCH 2/2] intel: aubinator: enable loading dumps from standard input

2016-10-04 Thread Ben Widawsky

On 16-10-04 15:38:53, Lionel Landwerlin wrote:

In conjunction with an intel_aubdump change, you can now look at your
application's output like this:

$ intel_aubdump -c '/path/to/aubinator --gen=hsw' my_gl_app

Signed-off-by: Lionel Landwerlin 
Cc: Sirisha Gandikota 
Cc: Kristian Høgsberg 
---
src/intel/tools/aubinator.c | 162 +++-
1 file changed, 130 insertions(+), 32 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 83328b5..73e6012 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -834,48 +834,51 @@ handle_trace_block(struct gen_spec *spec, uint32_t *p)
}

struct aub_file {
-   char *filename;
-   int fd;
-   struct stat sb;
+   FILE *stream;
+
   uint32_t *map, *end, *cursor;
+   uint32_t *mem_end;
};

static struct aub_file *
aub_file_open(const char *filename)
{
   struct aub_file *file;
+   struct stat sb;
+   int fd;

-   file = malloc(sizeof *file);
-   file->filename = strdup(filename);
-   file->fd = open(file->filename, O_RDONLY);
-   if (file->fd == -1) {
-  fprintf(stderr, "open %s failed: %s", file->filename, strerror(errno));
+   file = calloc(1, sizeof *file);
+   fd = open(filename, O_RDONLY);
+   if (fd == -1) {
+  fprintf(stderr, "open %s failed: %s", filename, strerror(errno));
  exit(EXIT_FAILURE);
   }

-   if (fstat(file->fd, &file->sb) == -1) {
+   if (fstat(fd, &sb) == -1) {
  fprintf(stderr, "stat failed: %s", strerror(errno));
  exit(EXIT_FAILURE);
   }

-   file->map = mmap(NULL, file->sb.st_size,
-PROT_READ, MAP_SHARED, file->fd, 0);
+   file->map = mmap(NULL, sb.st_size,
+PROT_READ, MAP_SHARED, fd, 0);
   if (file->map == MAP_FAILED) {
  fprintf(stderr, "mmap failed: %s", strerror(errno));
  exit(EXIT_FAILURE);
   }

   file->cursor = file->map;
-   file->end = file->map + file->sb.st_size / 4;
+   file->end = file->map + sb.st_size / 4;

-   /* mmap a terabyte for our gtt space. */
-   gtt_size = 1ul << 40;
-   gtt = mmap(NULL, gtt_size, PROT_READ | PROT_WRITE,
-  MAP_PRIVATE | MAP_ANONYMOUS |  MAP_NORESERVE, -1, 0);
-   if (gtt == MAP_FAILED) {
-  fprintf(stderr, "failed to alloc gtt space: %s", strerror(errno));
-  exit(1);
-   }
+   return file;
+}
+
+static struct aub_file *
+aub_file_stdin(void)
+{
+   struct aub_file *file;
+
+   file = calloc(1, sizeof *file);
+   file->stream = stdin;

   return file;
}
@@ -925,12 +928,21 @@ struct {
   { "bxt", MAKE_GEN(9, 0) }
};

-static void
+enum {
+   AUB_ITEM_DECODE_OK,
+   AUB_ITEM_DECODE_FAILED,
+   AUB_ITEM_DECODE_NEED_MORE_DATA,
+};
+
+static int
aub_file_decode_batch(struct aub_file *file, struct gen_spec *spec)
{
-   uint32_t *p, h, device, data_type;
+   uint32_t *p, h, device, data_type, *new_cursor;
   int header_length, payload_size, bias;

+   if (file->end - file->cursor < 12)
+  return AUB_ITEM_DECODE_NEED_MORE_DATA;
+
   p = file->cursor;
   h = *p;
   header_length = h & 0xffff;
@@ -946,8 +958,7 @@ aub_file_decode_batch(struct aub_file *file, struct 
gen_spec *spec)
  printf("unknown opcode %d at %ld/%ld\n",
 OPCODE(h), file->cursor - file->map,
 file->end - file->map);
-  file->cursor = file->end;
-  return;
+  return AUB_ITEM_DECODE_FAILED;
   }

   payload_size = 0;
@@ -959,9 +970,22 @@ aub_file_decode_batch(struct aub_file *file, struct 
gen_spec *spec)
  payload_size = p[4];
  handle_trace_block(spec, p);
  break;
-   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_BMP):
+   default:
  break;
+   }
+
+   new_cursor = p + header_length + bias + payload_size / 4;
+   if (new_cursor > file->end)
+  return AUB_ITEM_DECODE_NEED_MORE_DATA;

+   switch (h & 0xffff0000) {
+   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_HEADER):
+  break;
+   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_BLOCK):
+  handle_trace_block(spec, p);
+  break;
+   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_BMP):
+  break;
   case MAKE_HEADER(TYPE_AUB, OPCODE_NEW_AUB, SUBOPCODE_VERSION):
  printf("version block: dw1 %08x\n", p[1]);
  device = (p[1] >> 8) & 0xff;
@@ -987,13 +1011,65 @@ aub_file_decode_batch(struct aub_file *file, struct 
gen_spec *spec)
 "subopcode=0x%x (%08x)\n", TYPE(h), OPCODE(h), SUBOPCODE(h), h);
  break;
   }
-   file->cursor = p + header_length + bias + payload_size / 4;
+   file->cursor = new_cursor;
+
+   return AUB_ITEM_DECODE_OK;
}

static int
aub_file_more_stuff(struct aub_file *file)
{
-   return file->cursor < file->end;
+   return file->cursor < file->end || (file->stream && !feof(file->stream));
+}
+
+#define AUB_READ_BUFFER_SIZE (4096)
+#define MAX(a, b) ((a) < (b) ? (b) : (a))
+
+static void
+aub_file_data_grow(struct aub_file *file)
+{
+   size_t old_size = (file->mem_end - file->map) * 4;
+   size_t new_size = MAX(old_size 
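
The control flow this patch introduces (report "need more data" instead of consuming a partial item, and only advance the cursor once a whole item is available) can be sketched in isolation. This is a minimal sketch under stated assumptions, not the aubinator code: the record layout (one header dword whose low 16 bits give the payload length in dwords) and all example_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum example_status {
   EXAMPLE_DECODE_OK,
   EXAMPLE_DECODE_NEED_MORE_DATA,
};

/* Hypothetical stand-in for the cursor/end pair kept in struct aub_file. */
struct example_buf {
   const uint32_t *cursor;
   const uint32_t *end;
};

/* Decode one record: a header dword (low 16 bits = payload length in
 * dwords) followed by the payload. The payload dwords are summed into
 * *sum only so the caller has something observable. */
static enum example_status
example_decode_one(struct example_buf *buf, uint32_t *sum)
{
   size_t avail = (size_t)(buf->end - buf->cursor);

   /* Not even a header yet: ask the caller to read more input. */
   if (avail < 1)
      return EXAMPLE_DECODE_NEED_MORE_DATA;

   size_t len = buf->cursor[0] & 0xffff;

   /* Header present but payload incomplete: leave the cursor untouched
    * so the same record is retried once more data has arrived. */
   if (avail < 1 + len)
      return EXAMPLE_DECODE_NEED_MORE_DATA;

   uint32_t s = 0;
   for (size_t i = 0; i < len; i++)
      s += buf->cursor[1 + i];
   *sum = s;

   /* Advance only after the whole record has been handled. */
   buf->cursor += 1 + len;
   return EXAMPLE_DECODE_OK;
}
```

A caller reading from a stream would loop on this function and, on EXAMPLE_DECODE_NEED_MORE_DATA, grow the buffer and read another chunk before retrying.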

Re: [Mesa-dev] [PATCH 2/2] intel: aubinator: enable loading dumps from standard input

2016-10-04 Thread Ben Widawsky

On 16-10-04 16:03:28, Lionel Landwerlin wrote:

On 04/10/16 16:01, Eero Tamminen wrote:

Hi,

On 04.10.2016 17:38, Lionel Landwerlin wrote:

In conjunction with an intel_aubdump change, you can now look at your
application's output like this:

$ intel_aubdump -c '/path/to/aubinator --gen=hsw' my_gl_app


Maybe you could add also a patch to document this usage?


Thanks, forgot about the print_help().





Signed-off-by: Lionel Landwerlin 
Cc: Sirisha Gandikota 
Cc: Kristian Høgsberg 
---
src/intel/tools/aubinator.c | 162 +++-
1 file changed, 130 insertions(+), 32 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 83328b5..73e6012 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c

...

+   /* mmap a terabyte for our gtt space. */
+   gtt_size = 1ul << 30;


On 32-bit systems, you run out of address space when your process's
total mapping size increases to several GB.




Ooops, sorry about that.
That's remaining debug stuff from running valgrind :/



The comment is also no longer correct.



   - Eero

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
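
One way to address the 32-bit concern raised in this thread is to pick the reservation size from the pointer width and to fall back to smaller sizes if the mapping fails. The following is a minimal sketch under stated assumptions, not what the patch does: example_reserve_gtt() is a hypothetical helper, the 1 TiB / 1 GiB starting sizes and the 1 MiB floor are illustrative choices, and MAP_ANONYMOUS and MAP_NORESERVE are assumed to be available as they are on Linux.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Illustrative starting size: 1 TiB with 64-bit pointers, 1 GiB otherwise. */
#if UINTPTR_MAX > 0xffffffffu
#define EXAMPLE_GTT_START_SIZE ((size_t)1 << 40)
#else
#define EXAMPLE_GTT_START_SIZE ((size_t)1 << 30)
#endif

/* Reserve address space without committing memory. On failure, halve the
 * request until it succeeds or drops below 1 MiB. Returns NULL if nothing
 * could be reserved; *size_out receives the size actually mapped. */
static void *
example_reserve_gtt(size_t *size_out)
{
   size_t size = EXAMPLE_GTT_START_SIZE;

   while (size >= ((size_t)1 << 20)) {
      void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
      if (p != MAP_FAILED) {
         *size_out = size;
         return p;
      }
      size >>= 1;
   }

   *size_out = 0;
   return NULL;
}
```

Whether a smaller reservation is acceptable depends on the addresses found in the dump, so a real tool would still need to reject offsets beyond the size it obtained.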





Re: [Mesa-dev] Mesa 12.1.0 release plan (Was Re: Next Mesa release, anyone?)

2016-10-04 Thread Axel Davy


On 04/10/2016 12:32, Emil Velikov wrote:

On 2 October 2016 at 14:17, Axel Davy  wrote:

Hi,

If I understand correctly, there hasn't yet been a statement on whether the
freeze is on Oct 7 or Oct 14.

Could there be one?

I'd prefer Oct 14 myself, because we have a lot of patches for nine, and
they deserve more cleaning and testing, but if it's Oct 7, we'll try to be on
time.


14th it is. As mentioned before: _don't_ wait for the last week to get
things merged. Once you're reasonably happy, just send the new work for
review and commit it.
The same applies to bugfixes :-)

Thanks
Emil

Thanks. I know we should try to send patches earlier, but some are just
not clean enough yet, and we have one regression to fix. Will send as soon
as possible.


Axel

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Mesa 12.1.0 release plan (Was Re: Next Mesa release, anyone?)

2016-10-04 Thread Emil Velikov
On 4 October 2016 at 15:57, Rob Herring  wrote:
> On Tue, Oct 4, 2016 at 5:26 AM, Emil Velikov  wrote:
>> On 4 October 2016 at 02:05, Rob Clark  wrote:
>>> On Thu, Sep 29, 2016 at 10:56 AM, Emil Velikov  
>>> wrote:
>>>> On 28 September 2016 at 19:53, Marek Olšák  wrote:
>>>>> Hi,
>>>>>
>>>>> It's been almost 4 months since the 12.0 branch was created, and soon
>>>>> it will have been 3 months since Mesa 12.0 was released.
>>>>>
>>>>> Is there any reason we haven't created the stable branch yet?
>>>>>
>>>>> Ideally, we would time the release so that it's 1-2 months before fall
>>>>> distribution releases.
>>>>>
>>>>
>>>> Thanks Marek !
>>>>
>>>> In all honesty I was secretly hoping that we'll get Dave/Bas RADV for
>>>> 12.1. With the topic of which would be 'the default' Vulkan driver for
>>>> ATI/AMD hardware to be considered at a later stage.
>>>
>>> btw, I pushed libdrm release that I think etnaviv was waiting for..
>>> not sure what else is needed before merging etnaviv gallium driver,
>>> but if at all possible, it would be nice to land that before the
>>> branch point too.
>>>
>> Thanks for the libdrm bits Rob. IIRC on the mesa side Christian
>> reworked the render-only parts noticeably, yet as long as there isn't
>> a crazy amount of changes outside of the etnaviv driver I think we're
>> fine with getting it in.
>
> I've been doing some work to get etnaviv working on Android. The
> render-only approach is a bit broken IMO. It may work okay for X11
> which expects to work on a card node, but it doesn't for Android which
> already uses the render node for GL and gralloc and the KMS node for
> HWC. Having the render-only driver also open the KMS node gets us into
> all of the permissions issues. It seems to me we want gralloc to be
> able to open both nodes (that still has some ioctl permission issues)
> and allocate scanout buffers from the control node (thru GBM). Then
> the etnaviv driver has to know to blit to the linear buffer when it
> has an imported scanout buffer. IOW, I don't think the render-only
> driver should internally allocate dumb buffers, but keep that
> allocation external and the etnaviv driver needs to be able to deal
> with external buffers. Maybe we just wait for the grand central
> allocator to solve all this.
>
Hi Rob,

Yes, I've pondered the same thing a while back - (ab)using dumb
buffers might not be the best of ideas. IMHO for the time being, until
the new allocator gets into shape, I think it's the lesser evil that
we can opt for. Merging the latest code, even if it only works on
X/Wayland/other, is better than none, right ;-)

Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] intel/blorp: Use documented RECTLIST vertex positions

2016-10-04 Thread Ben Widawsky

Reviewed-by: Ben Widawsky 

On 16-10-04 08:03:05, Jason Ekstrand wrote:

Reviewed-by: Jason Ekstrand 

On Wed, Sep 21, 2016 at 2:42 PM, Nanley Chery  wrote:


Use the vertex positions described in the PRMs. This has no effect on
rendering but quiets the simulator warnings seen when the vertices
appear out of order.

Signed-off-by: Nanley Chery 
Cc: Jason Ekstrand 
Cc: Marek Olšák 
---
 src/intel/blorp/blorp_genX_exec.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/intel/blorp/blorp_genX_exec.h
b/src/intel/blorp/blorp_genX_exec.h
index eb4a5b9..62f16a3 100644
--- a/src/intel/blorp/blorp_genX_exec.h
+++ b/src/intel/blorp/blorp_genX_exec.h
@@ -171,8 +171,8 @@ blorp_emit_vertex_data(struct blorp_batch *batch,
uint32_t *size)
 {
const float vertices[] = {
-  /* v0 */ (float)params->x0, (float)params->y1,
-  /* v1 */ (float)params->x1, (float)params->y1,
+  /* v0 */ (float)params->x1, (float)params->y1,
+  /* v1 */ (float)params->x0, (float)params->y1,
   /* v2 */ (float)params->x0, (float)params->y0,
};

@@ -287,7 +287,7 @@ blorp_emit_vertex_elements(struct blorp_batch *batch,
 *   v2 -- implied
 *||
 *||
-*   v0 - v1
+*   v1 - v0
 *
 * Since the VS is disabled, the clipper loads each VUE directly from
 * the URB. This is controlled by the 3DSTATE_VERTEX_BUFFERS and
--
2.10.0
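
The vertex order in this patch can be sketched as plain C. This is a minimal sketch, assuming the hardware derives the fourth corner of a RECTLIST rectangle as v0 - v1 + v2 (the "implied" vertex in the comment diagram); the example_* names are hypothetical and not part of blorp.

```c
struct example_vec2 {
   float x, y;
};

/* Fill the three explicit vertices in the order used by the patch:
 * v0 = (x1, y1), v1 = (x0, y1), v2 = (x0, y0). */
static void
example_rectlist_vertices(float x0, float y0, float x1, float y1,
                          struct example_vec2 v[3])
{
   v[0].x = x1; v[0].y = y1;
   v[1].x = x0; v[1].y = y1;
   v[2].x = x0; v[2].y = y0;
}

/* Assumption: the implied fourth vertex completes the parallelogram,
 * v3 = v0 - v1 + v2. */
static struct example_vec2
example_rectlist_implied(const struct example_vec2 v[3])
{
   struct example_vec2 v3;
   v3.x = v[0].x - v[1].x + v[2].x;
   v3.y = v[0].y - v[1].y + v[2].y;
   return v3;
}
```

Under that assumption the implied vertex for the new order is (x1, y0), the one corner not listed explicitly. Swapping v0 and v1, as in the old order, would give (x0, y0) + (x0 - x1, 0) instead, which only describes the same rectangle if the hardware ignores the documented order, consistent with the commit message's note that rendering is unaffected but the simulator warns.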





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev




Re: [Mesa-dev] [PATCH 1/2] intel: aubinator: generate a standalone binary

2016-10-04 Thread Ben Widawsky

On 16-10-04 15:38:52, Lionel Landwerlin wrote:

Embed the xml files into the binary, so aubinator can be used from any
location.

Signed-off-by: Lionel Landwerlin 
Cc: Sirisha Gandikota 
---
src/intel/Makefile.am   |  1 +
src/intel/Makefile.aubinator.am | 36 +++
src/intel/Makefile.sources  |  7 +++
src/intel/tools/.gitignore  |  5 +++
src/intel/tools/aubinator.c | 97 +
src/intel/tools/decoder.c   | 82 --
src/intel/tools/decoder.h   |  4 +-
7 files changed, 141 insertions(+), 91 deletions(-)
create mode 100644 src/intel/Makefile.aubinator.am

diff --git a/src/intel/Makefile.am b/src/intel/Makefile.am
index 9186b5c..c3cb9fb 100644
--- a/src/intel/Makefile.am
+++ b/src/intel/Makefile.am
@@ -52,6 +52,7 @@ BUILT_SOURCES =
CLEANFILES =
EXTRA_DIST =

+include Makefile.aubinator.am
include Makefile.blorp.am
include Makefile.common.am
include Makefile.genxml.am
diff --git a/src/intel/Makefile.aubinator.am b/src/intel/Makefile.aubinator.am
new file mode 100644
index 0000000..9772700
--- /dev/null
+++ b/src/intel/Makefile.aubinator.am
@@ -0,0 +1,36 @@
+# Copyright © 2016 Intel Corporation
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice (including the next
+# paragraph) shall be included in all copies or substantial portions of the
+# Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+# IN THE SOFTWARE.
+
+BUILT_SOURCES += $(AUBINATOR_GENERATED_FILES)
+
+SUFFIXES = _aubinator_xml.h .xml
+
+tools/gen6_aubinator_xml.h: genxml/gen6.xml
+tools/gen7_aubinator_xml.h: genxml/gen7.xml
+tools/gen75_aubinator_xml.h: genxml/gen75.xml
+tools/gen8_aubinator_xml.h: genxml/gen8.xml
+tools/gen9_aubinator_xml.h: genxml/gen9.xml
+
+$(AUBINATOR_GENERATED_FILES): Makefile
+
+%_aubinator_xml.h:
+   $(MKDIR_GEN)
+   $(AM_V_GEN) xxd -i $< > $@
diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 94073d2..a5c2bf0 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -1,3 +1,10 @@
+AUBINATOR_GENERATED_FILES = \
+   tools/gen6_aubinator_xml.h \
+   tools/gen7_aubinator_xml.h \
+   tools/gen75_aubinator_xml.h \
+   tools/gen8_aubinator_xml.h \
+   tools/gen9_aubinator_xml.h
+
BLORP_FILES = \
blorp/blorp.c \
blorp/blorp.h \
diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore
index 0c80a6f..c4eebde 100644
--- a/src/intel/tools/.gitignore
+++ b/src/intel/tools/.gitignore
@@ -1 +1,6 @@
/aubinator
+gen6_aubinator_xml.h
+gen75_aubinator_xml.h
+gen7_aubinator_xml.h
+gen8_aubinator_xml.h
+gen9_aubinator_xml.h
diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index a31dcb2..83328b5 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -35,6 +35,8 @@
#include 
#include 

+#include "util/macros.h"
+
#include "decoder.h"
#include "intel_aub.h"
#include "gen_disasm.h"
@@ -1059,11 +1061,24 @@ int main(int argc, char *argv[])
{
   struct gen_spec *spec;
   struct aub_file *file;
-   int i, pci_id = 0;
+   int i;
   bool found_arg_gen = false, pager = true;
-   int gen_major, gen_minor;
-   const char *value;
-   char gen_file[256], gen_val[24];
+   const char *value, *input_file = NULL;
+   char gen_val[24];
+   const struct {
+  const char *name;
+  int pci_id;
+   } gens[] = {
+  { "ivb", 0x0166 }, /* Intel(R) Ivybridge Mobile GT2 */
+  { "hsw", 0x0416 }, /* Intel(R) Haswell Mobile GT2 */
+  { "byt", 0x0155 }, /* Intel(R) Bay Trail */
+  { "bdw", 0x1616 }, /* Intel(R) HD Graphics 5500 (Broadwell GT2) */
+  { "chv", 0x22B3 }, /* Intel(R) HD Graphics (Cherryview) */
+  { "skl", 0x1912 }, /* Intel(R) HD Graphics 530 (Skylake GT2) */
+  { "kbl", 0x591D }, /* Intel(R) Kabylake GT2 */
+  { "bxt", 0x0A84 }  /* Intel(R) HD Graphics (Broxton) */
+   }, *gen = NULL;
+   struct gen_device_info devinfo;

   if (argc == 1) {
  print_help(argv[0], stderr);
@@ -1081,8 +1096,6 @@ int main(int argc, char *argv[])
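
For readers unfamiliar with `xxd -i`: it turns a file into a C array plus a length variable, with the symbol name derived from the file path. The sketch below shows how such generated data might be consumed. It is an illustration under stated assumptions, not the code from this patch: the array contents are a tiny stand-in rather than a real genxml file, the example_* names are hypothetical, and the arrays are marked static here only to keep the sketch self-contained (xxd itself emits non-static definitions).

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a header produced by `xxd -i`; these bytes spell
 * "<gen/>\n" and are not real genxml content. */
static unsigned char example_gen_xml[] = {
  0x3c, 0x67, 0x65, 0x6e, 0x2f, 0x3e, 0x0a
};
static unsigned int example_gen_xml_len = 7;

struct example_blob {
   const void *data;
   size_t size;
};

/* Hypothetical lookup from a generation number to its embedded XML.
 * Only generation 6 is wired up in this sketch. */
static struct example_blob
example_lookup_xml(int major)
{
   struct example_blob blob = { NULL, 0 };

   if (major == 6) {
      blob.data = example_gen_xml;
      blob.size = example_gen_xml_len;
   }
   return blob;
}
```

Note that the generated array is not NUL-terminated, so whatever parses it has to be given the explicit length rather than treating the data as a C string.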

Re: [Mesa-dev] [PATCH] util: use GCC atomic intrinsics with explicit memory model

2016-10-04 Thread Emil Velikov
On 4 October 2016 at 15:14, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> This is motivated by the fact that p_atomic_read and p_atomic_set may
> somewhat surprisingly not do the right thing in the old version: while
> stores and loads are de facto atomic at least on x86, the compiler may
> apply re-ordering and speculation quite liberally. Basically, the old
> version uses the "relaxed" memory ordering.
>
> The new ordering always uses acquire/release ordering. This is the
> strongest possible memory ordering that doesn't require additional
> fence instructions on x86. (And the only stronger ordering is
> "sequentially consistent", which is usually more than you need anyway.)
>
> I would feel more comfortable if p_atomic_set/read in the old
> implementation were at least using volatile loads and stores, but I
> don't see a way to get there without typeof (which we cannot use here
> since the code is compiled with -std=c99).
>
> Eventually, we should really just move to something that is based on
> the atomics in C11 / C++11.
> ---
>  configure.ac| 11 +++
>  src/util/u_atomic.h | 21 +
>  2 files changed, 32 insertions(+)
>
On the build side the patch looks great, I haven't looked at the
atomic specifics.
Reviewed-by: Emil Velikov 

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
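
The acquire/release ordering described in the commit message above can be sketched with the GCC `__atomic` builtins. This is a minimal sketch, not the contents of the patch: the example_* helpers and the message struct are hypothetical, and the actual change is to the p_atomic_* macros in u_atomic.h.

```c
#include <stdint.h>

/* Store with release ordering: writes made by this thread before the
 * store are visible to a thread that observes the stored value with an
 * acquire load. */
static inline void
example_atomic_set(int32_t *v, int32_t value)
{
   __atomic_store_n(v, value, __ATOMIC_RELEASE);
}

/* Load with acquire ordering: later reads in this thread cannot be
 * reordered before the load. */
static inline int32_t
example_atomic_read(int32_t *v)
{
   return __atomic_load_n(v, __ATOMIC_ACQUIRE);
}

/* Typical use: publish a payload, then raise a flag. */
struct example_msg {
   int32_t payload;
   int32_t ready;
};

static void
example_publish(struct example_msg *m, int32_t payload)
{
   m->payload = payload;             /* plain store */
   example_atomic_set(&m->ready, 1); /* release: payload is visible first */
}

/* Returns 1 and fills *out if a message was published, 0 otherwise. */
static int
example_consume(struct example_msg *m, int32_t *out)
{
   if (!example_atomic_read(&m->ready))
      return 0;
   *out = m->payload;                /* safe: ordered after the acquire load */
   return 1;
}
```

With relaxed ordering instead, the compiler or CPU would be free to reorder the payload access around the flag access, which is the surprise the commit message refers to.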


Re: [Mesa-dev] [PATCH 4/5] intel: aubinator: generate a standalone binary

2016-10-04 Thread Jason Ekstrand
On Tue, Oct 4, 2016 at 8:59 AM, Lionel Landwerlin 
wrote:

> Embed the xml files into the binary, so aubinator can be used from any
> location.
>
> v2: Split generation packing into another patch (Jason)
> Check for xxd (Jason)
>
> Signed-off-by: Lionel Landwerlin 
> Cc: Sirisha Gandikota 
> ---
>  configure.ac|  1 +
>  src/intel/Makefile.am   |  1 +
>  src/intel/Makefile.aubinator.am | 36 ++
>  src/intel/Makefile.sources  |  7 
>  src/intel/tools/.gitignore  |  5 +++
>  src/intel/tools/aubinator.c | 39 ++--
>  src/intel/tools/decoder.c   | 82 +-
> ---
>  src/intel/tools/decoder.h   |  4 +-
>  8 files changed, 122 insertions(+), 53 deletions(-)
>  create mode 100644 src/intel/Makefile.aubinator.am
>
> diff --git a/configure.ac b/configure.ac
> index 1bfac3b..7046349 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -110,6 +110,7 @@ LT_PREREQ([2.2])
>  LT_INIT([disable-static])
>
>  AC_CHECK_PROG(RM, rm, [rm -f])
> +AC_CHECK_PROG(XXD, xxd, [xxd])
>
>  AX_PROG_BISON([],
>AS_IF([test ! -f "$srcdir/src/compiler/glsl/
> glcpp/glcpp-parse.c"],
> diff --git a/src/intel/Makefile.am b/src/intel/Makefile.am
> index 9186b5c..c3cb9fb 100644
> --- a/src/intel/Makefile.am
> +++ b/src/intel/Makefile.am
> @@ -52,6 +52,7 @@ BUILT_SOURCES =
>  CLEANFILES =
>  EXTRA_DIST =
>
> +include Makefile.aubinator.am
>  include Makefile.blorp.am
>  include Makefile.common.am
>  include Makefile.genxml.am
> diff --git a/src/intel/Makefile.aubinator.am b/src/intel/Makefile.
> aubinator.am
> new file mode 100644
> index 0000000..3d05d30
> --- /dev/null
> +++ b/src/intel/Makefile.aubinator.am
> @@ -0,0 +1,36 @@
> +# Copyright © 2016 Intel Corporation
> +#
> +# Permission is hereby granted, free of charge, to any person obtaining a
> +# copy of this software and associated documentation files (the
> "Software"),
> +# to deal in the Software without restriction, including without
> limitation
> +# the rights to use, copy, modify, merge, publish, distribute, sublicense,
> +# and/or sell copies of the Software, and to permit persons to whom the
> +# Software is furnished to do so, subject to the following conditions:
> +#
> +# The above copyright notice and this permission notice (including the
> next
> +# paragraph) shall be included in all copies or substantial portions of
> the
> +# Software.
> +#
> +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> OR
> +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> DEALINGS
> +# IN THE SOFTWARE.
> +
> +BUILT_SOURCES += $(AUBINATOR_GENERATED_FILES)
> +
> +SUFFIXES = _aubinator_xml.h .xml
> +
> +tools/gen6_aubinator_xml.h: genxml/gen6.xml
> +tools/gen7_aubinator_xml.h: genxml/gen7.xml
> +tools/gen75_aubinator_xml.h: genxml/gen75.xml
> +tools/gen8_aubinator_xml.h: genxml/gen8.xml
> +tools/gen9_aubinator_xml.h: genxml/gen9.xml
> +
> +$(AUBINATOR_GENERATED_FILES): Makefile
> +
> +%_aubinator_xml.h:
> +   $(MKDIR_GEN)
> +   $(AM_V_GEN) $(XXD) -i $< > $@
> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> index 94073d2..a5c2bf0 100644
> --- a/src/intel/Makefile.sources
> +++ b/src/intel/Makefile.sources
> @@ -1,3 +1,10 @@
> +AUBINATOR_GENERATED_FILES = \
> +   tools/gen6_aubinator_xml.h \
> +   tools/gen7_aubinator_xml.h \
> +   tools/gen75_aubinator_xml.h \
> +   tools/gen8_aubinator_xml.h \
> +   tools/gen9_aubinator_xml.h
> +
>  BLORP_FILES = \
> blorp/blorp.c \
> blorp/blorp.h \
> diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore
> index 0c80a6f..c4eebde 100644
> --- a/src/intel/tools/.gitignore
> +++ b/src/intel/tools/.gitignore
> @@ -1 +1,6 @@
>  /aubinator
> +gen6_aubinator_xml.h
> +gen75_aubinator_xml.h
> +gen7_aubinator_xml.h
> +gen8_aubinator_xml.h
> +gen9_aubinator_xml.h
> diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
> index 459e3d4..3599867 100644
> --- a/src/intel/tools/aubinator.c
> +++ b/src/intel/tools/aubinator.c
> @@ -1065,22 +1065,21 @@ int main(int argc, char *argv[])
> int i;
> bool found_arg_gen = false, pager = true;
> const char *value, *input_file = NULL;
> -   char gen_file[256], gen_val[24];
> +   char gen_val[24];
> const struct {
>const char *name;
>int pci_id;
> -  int major;
> -  int minor;
> } gens[] = {
> -  { "ivb", 0x0166, 7, 0 }, /* Intel(R) Ivybridge Mobile GT2 */
> -  { "hsw", 0x0416, 7, 5 }, /* Intel(R) Haswell Mobile GT2 */
> -   

Re: [Mesa-dev] [PATCH 1/5] intel: aubinator: pack supported generations into an array

2016-10-04 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Tue, Oct 4, 2016 at 8:59 AM, Lionel Landwerlin 
wrote:

> Signed-off-by: Lionel Landwerlin 
> Cc: Sirisha Gandikota 
> ---
>  src/intel/tools/aubinator.c | 79 +-
> ---
>  1 file changed, 30 insertions(+), 49 deletions(-)
>
> diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
> index 9b32e5b..4e2cafa 100644
> --- a/src/intel/tools/aubinator.c
> +++ b/src/intel/tools/aubinator.c
> @@ -36,6 +36,8 @@
>  #include 
>  #include 
>
> +#include "util/macros.h"
> +
>  #include "decoder.h"
>  #include "intel_aub.h"
>  #include "gen_disasm.h"
> @@ -1060,11 +1062,25 @@ int main(int argc, char *argv[])
>  {
> struct gen_spec *spec;
> struct aub_file *file;
> -   int i, pci_id = 0;
> +   int i;
> bool found_arg_gen = false, pager = true;
> -   int gen_major, gen_minor;
> const char *value;
> char gen_file[256], gen_val[24];
> +   const struct {
> +  const char *name;
> +  int pci_id;
> +  int major;
> +  int minor;
> +   } gens[] = {
> +  { "ivb", 0x0166, 7, 0 }, /* Intel(R) Ivybridge Mobile GT2 */
> +  { "hsw", 0x0416, 7, 5 }, /* Intel(R) Haswell Mobile GT2 */
> +  { "byt", 0x0155, 7, 5 }, /* Intel(R) Bay Trail */
> +  { "bdw", 0x1616, 8, 0 }, /* Intel(R) HD Graphics 5500 (Broadwell
> GT2) */
> +  { "chv", 0x22B3, 8, 0 }, /* Intel(R) HD Graphics (Cherryview) */
> +  { "skl", 0x1912, 9, 0 }, /* Intel(R) HD Graphics 530 (Skylake GT2)
> */
> +  { "kbl", 0x591D, 9, 0 }, /* Intel(R) Kabylake GT2 */
> +  { "bxt", 0x0A84, 9, 0 }  /* Intel(R) HD Graphics (Broxton) */
> +   }, *gen = NULL;
>
> if (argc == 1) {
>print_help(argv[0], stderr);
> @@ -1082,8 +1098,6 @@ int main(int argc, char *argv[])
>  exit(EXIT_FAILURE);
>   }
>   found_arg_gen = true;
> - gen_major = 0;
> - gen_minor = 0;
>   snprintf(gen_val, sizeof(gen_val), "%s", value);
>} else if (strcmp(argv[i], "--headers") == 0) {
>   option_full_decode = false;
> @@ -1115,47 +1129,14 @@ int main(int argc, char *argv[])
>exit(EXIT_FAILURE);
> }
>
> -   if (strstr(gen_val, "ivb") != NULL) {
> -  /* Intel(R) Ivybridge Mobile GT2 */
> -  pci_id = 0x0166;
> -  gen_major = 7;
> -  gen_minor = 0;
> -   } else if (strstr(gen_val, "hsw") != NULL) {
> -  /* Intel(R) Haswell Mobile GT2 */
> -  pci_id = 0x0416;
> -  gen_major = 7;
> -  gen_minor = 5;
> -   } else if (strstr(gen_val, "byt") != NULL) {
> -  /* Intel(R) Bay Trail */
> -  pci_id = 0x0155;
> -  gen_major = 7;
> -  gen_minor = 5;
> -   } else if (strstr(gen_val, "bdw") != NULL) {
> -  /* Intel(R) HD Graphics 5500 (Broadwell GT2) */
> -  pci_id = 0x1616;
> -  gen_major = 8;
> -  gen_minor = 0;
> -   }  else if (strstr(gen_val, "chv") != NULL) {
> -  /* Intel(R) HD Graphics (Cherryview) */
> -  pci_id = 0x22B3;
> -  gen_major = 8;
> -  gen_minor = 0;
> -   } else if (strstr(gen_val, "skl") != NULL) {
> -  /* Intel(R) HD Graphics 530 (Skylake GT2) */
> -  pci_id = 0x1912;
> -  gen_major = 9;
> -  gen_minor = 0;
> -   } else if (strstr(gen_val, "kbl") != NULL) {
> -  /* Intel(R) Kabylake GT2 */
> -  pci_id = 0x591D;
> -  gen_major = 9;
> -  gen_minor = 0;
> -   } else if (strstr(gen_val, "bxt") != NULL) {
> -  /* Intel(R) HD Graphics (Broxton) */
> -  pci_id = 0x0A84;
> -  gen_major = 9;
> -  gen_minor = 0;
> -   } else {
> +   for (i = 0; i < ARRAY_SIZE(gens); i++) {
> +  if (!strcmp(gen_val, gens[i].name)) {
> + gen = &gens[i];
> + break;
> +  }
> +   }
> +
> +   if (gen == NULL) {
>fprintf(stderr, "can't parse gen: %s, expected ivb, byt, hsw, "
>   "bdw, chv, skl, kbl or bxt\n", gen_val);
>exit(EXIT_FAILURE);
> @@ -1168,15 +1149,15 @@ int main(int argc, char *argv[])
> if (isatty(1) && pager)
>setup_pager();
>
> -   if (gen_minor > 0) {
> +   if (gen->minor > 0) {
>snprintf(gen_file, sizeof(gen_file), "../genxml/gen%d%d.xml",
> -   gen_major, gen_minor);
> +   gen->major, gen->minor);
> } else {
> -  snprintf(gen_file, sizeof(gen_file), "../genxml/gen%d.xml",
> gen_major);
> +  snprintf(gen_file, sizeof(gen_file), "../genxml/gen%d.xml",
> gen->major);
> }
>
> spec = gen_spec_load(gen_file);
> -   disasm = gen_disasm_create(pci_id);
> +   disasm = gen_disasm_create(gen->pci_id);
>
> if (argv[i] == NULL) {
> print_help(argv[0], stderr);
> --
> 2.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
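
The table-driven lookup this patch introduces can be sketched on its own. This is a minimal sketch, assuming a reduced table: the example_* names and EXAMPLE_ARRAY_SIZE are hypothetical stand-ins (the patch uses ARRAY_SIZE from util/macros.h), and only three of the eight entries are reproduced.

```c
#include <stddef.h>
#include <string.h>

#define EXAMPLE_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct example_gen {
   const char *name;
   int pci_id;
   int major;
   int minor;
};

/* Subset of the table from the patch. */
static const struct example_gen example_gens[] = {
   { "ivb", 0x0166, 7, 0 }, /* Ivybridge Mobile GT2 */
   { "hsw", 0x0416, 7, 5 }, /* Haswell Mobile GT2 */
   { "skl", 0x1912, 9, 0 }, /* Skylake GT2 */
};

/* Return the entry whose name matches exactly, or NULL if there is none. */
static const struct example_gen *
example_find_gen(const char *name)
{
   for (size_t i = 0; i < EXAMPLE_ARRAY_SIZE(example_gens); i++) {
      if (strcmp(name, example_gens[i].name) == 0)
         return &example_gens[i];
   }
   return NULL;
}
```

One behavioural difference is visible in the diff itself: the removed code matched with strstr(), so any argument containing "hsw" was accepted, while the loop uses strcmp() and requires an exact name.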

Re: [Mesa-dev] [PATCH 4/5] intel: aubinator: generate a standalone binary

2016-10-04 Thread Jason Ekstrand
On Tue, Oct 4, 2016 at 8:59 AM, Lionel Landwerlin 
wrote:

> Embed the xml files into the binary, so aubinator can be used from any
> location.
>
> v2: Split generation packing into another patch (Jason)
> Check for xxd (Jason)
>
> Signed-off-by: Lionel Landwerlin 
> Cc: Sirisha Gandikota 
> ---
>  configure.ac|  1 +
>  src/intel/Makefile.am   |  1 +
>  src/intel/Makefile.aubinator.am | 36 ++
>  src/intel/Makefile.sources  |  7 
>  src/intel/tools/.gitignore  |  5 +++
>  src/intel/tools/aubinator.c | 39 ++--
>  src/intel/tools/decoder.c   | 82 +-
> ---
>  src/intel/tools/decoder.h   |  4 +-
>  8 files changed, 122 insertions(+), 53 deletions(-)
>  create mode 100644 src/intel/Makefile.aubinator.am
>
> diff --git a/configure.ac b/configure.ac
> index 1bfac3b..7046349 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -110,6 +110,7 @@ LT_PREREQ([2.2])
>  LT_INIT([disable-static])
>
>  AC_CHECK_PROG(RM, rm, [rm -f])
> +AC_CHECK_PROG(XXD, xxd, [xxd])
>
>  AX_PROG_BISON([],
>AS_IF([test ! -f "$srcdir/src/compiler/glsl/
> glcpp/glcpp-parse.c"],
> diff --git a/src/intel/Makefile.am b/src/intel/Makefile.am
> index 9186b5c..c3cb9fb 100644
> --- a/src/intel/Makefile.am
> +++ b/src/intel/Makefile.am
> @@ -52,6 +52,7 @@ BUILT_SOURCES =
>  CLEANFILES =
>  EXTRA_DIST =
>
> +include Makefile.aubinator.am
>  include Makefile.blorp.am
>  include Makefile.common.am
>  include Makefile.genxml.am
> diff --git a/src/intel/Makefile.aubinator.am b/src/intel/Makefile.
> aubinator.am
> new file mode 100644
> index 0000000..3d05d30
> --- /dev/null
> +++ b/src/intel/Makefile.aubinator.am
> @@ -0,0 +1,36 @@
> +# Copyright © 2016 Intel Corporation
> +#
> +# Permission is hereby granted, free of charge, to any person obtaining a
> +# copy of this software and associated documentation files (the
> "Software"),
> +# to deal in the Software without restriction, including without
> limitation
> +# the rights to use, copy, modify, merge, publish, distribute, sublicense,
> +# and/or sell copies of the Software, and to permit persons to whom the
> +# Software is furnished to do so, subject to the following conditions:
> +#
> +# The above copyright notice and this permission notice (including the
> next
> +# paragraph) shall be included in all copies or substantial portions of
> the
> +# Software.
> +#
> +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> OR
> +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> DEALINGS
> +# IN THE SOFTWARE.
> +
> +BUILT_SOURCES += $(AUBINATOR_GENERATED_FILES)
> +
> +SUFFIXES = _aubinator_xml.h .xml
> +
> +tools/gen6_aubinator_xml.h: genxml/gen6.xml
> +tools/gen7_aubinator_xml.h: genxml/gen7.xml
> +tools/gen75_aubinator_xml.h: genxml/gen75.xml
> +tools/gen8_aubinator_xml.h: genxml/gen8.xml
> +tools/gen9_aubinator_xml.h: genxml/gen9.xml
> +
> +$(AUBINATOR_GENERATED_FILES): Makefile
> +
> +%_aubinator_xml.h:
> +   $(MKDIR_GEN)
> +   $(AM_V_GEN) $(XXD) -i $< > $@
>

This should go in tools/Makefile.am with the rest of the aubinator build
process.

Also, you should try doing an out-of-tree build which I think will fail
with this change.  You need to add an include path for aubinator that looks
in $(builddir)


> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> index 94073d2..a5c2bf0 100644
> --- a/src/intel/Makefile.sources
> +++ b/src/intel/Makefile.sources
> @@ -1,3 +1,10 @@
> +AUBINATOR_GENERATED_FILES = \
> +   tools/gen6_aubinator_xml.h \
> +   tools/gen7_aubinator_xml.h \
> +   tools/gen75_aubinator_xml.h \
> +   tools/gen8_aubinator_xml.h \
> +   tools/gen9_aubinator_xml.h
> +
>  BLORP_FILES = \
> blorp/blorp.c \
> blorp/blorp.h \
> diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore
> index 0c80a6f..c4eebde 100644
> --- a/src/intel/tools/.gitignore
> +++ b/src/intel/tools/.gitignore
> @@ -1 +1,6 @@
>  /aubinator
> +gen6_aubinator_xml.h
> +gen75_aubinator_xml.h
> +gen7_aubinator_xml.h
> +gen8_aubinator_xml.h
> +gen9_aubinator_xml.h
> diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
> index 459e3d4..3599867 100644
> --- a/src/intel/tools/aubinator.c
> +++ b/src/intel/tools/aubinator.c
> @@ -1065,22 +1065,21 @@ int main(int argc, char *argv[])
> int i;
> bool found_arg_gen = false, pager = true;
> const char *value, *input_file = NULL;
> -   char gen_file[256], gen_val[24];
> +   char gen_val[24];
> const struct {

Re: [Mesa-dev] [PATCH 1/3] i965: solve cubemap negative x/y/z faces buffer offset issue in dEQP.

2016-10-04 Thread Jason Ekstrand
On Tue, Oct 4, 2016 at 8:55 AM, Tapani Pälli  wrote:

> On 10/04/2016 06:09 PM, Jason Ekstrand wrote:
>
> On Thu, Sep 29, 2016 at 11:27 PM, Xu,Randy  wrote:
>
>> Add the miptree level/slice x/y_offset when computing the surface offset
>> in brw_emit_surface_state. The surface offset has two parts: one is
>> from mt->offset, which should be 32-aligned in width/height for tiled
>> buffers; the other is from mt->level[current_level].slice[current_slice].
>> x/y_offset.
>>
>> This fix resolves 12 dEQP failures:
>> dEQP-EGL.functional.image.create.gles2_cubemap_negative_*_texture
>>
>> Signed-off-by: Xu,Randy 
>> ---
>>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> index 61a4b94..3a5c573 100644
>> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> @@ -85,7 +85,8 @@ brw_emit_surface_state(struct brw_context *brw,
>> unsigned read_domains, unsigned write_domains)
>>  {
>> const struct surface_state_info ss_info =
>> surface_state_infos[brw->gen];
>> -   uint32_t tile_x = 0, tile_y = 0;
>> +   uint32_t tile_x = mt->level[0].slice[0].x_offset;
>> +   uint32_t tile_y = mt->level[0].slice[0].y_offset;
>>
>
> This isn't correct.  First off, there are some fairly strict restrictions
> on what we can do with tile_x and tile_y and we can't just shove x/y_offset
> in there.  We need to use intel_miptree_get_tile_offsets to get both a byte
> offset and an intratile offset.  Second, we should already be taking slices
> into account for cube maps via base_array_layer where needed.
>
> Unfortunately, I'm not 100% sure what the correct patch is without a bit
> more information about what the test is doing that causes a problem.
>
>
> I took a brief look. When running the set mentioned above (for example
> with ./deqp-egl
> --deqp-case=*EGL.functional.image.create.gles2_cubemap_negative_*_texture)
> we never reach the part of that function that calls
> intel_miptree_get_tile_offsets, because the surf.dim_layout != dim_layout
> condition does not trigger. This is just what I observed; should we
> simply call intel_miptree_get_tile_offsets() unconditionally then?
>

No.  Very much no.  The intel_miptree_get_tile_offsets() stuff is a hack
that lets us convert non-2D things to 2D things and it comes with piles of
restrictions.


>
> --Jason
>
>
>> uint32_t offset = mt->offset;
>>
>> struct isl_surf surf;
>> --
>> 2.7.4
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>
>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] intel: aubinator: enable loading dumps from standard input

2016-10-04 Thread Lionel Landwerlin
In conjunction with an intel_aubdump change, you can now look at your
application's output like this:

$ intel_aubdump -c '/path/to/aubinator --gen=hsw' my_gl_app

v2: Add print_help() comment about standard input handling (Eero)
Remove the shrunken gtt space debug workaround (Eero)

Signed-off-by: Lionel Landwerlin 
Cc: Sirisha Gandikota 
Cc: Kristian Høgsberg 
---
 src/intel/tools/aubinator.c | 166 +++-
 1 file changed, 132 insertions(+), 34 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 3599867..63ea59e 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -835,48 +835,51 @@ handle_trace_block(struct gen_spec *spec, uint32_t *p)
 }
 
 struct aub_file {
-   char *filename;
-   int fd;
-   struct stat sb;
+   FILE *stream;
+
uint32_t *map, *end, *cursor;
+   uint32_t *mem_end;
 };
 
 static struct aub_file *
 aub_file_open(const char *filename)
 {
struct aub_file *file;
+   struct stat sb;
+   int fd;
 
-   file = malloc(sizeof *file);
-   file->filename = strdup(filename);
-   file->fd = open(file->filename, O_RDONLY);
-   if (file->fd == -1) {
-  fprintf(stderr, "open %s failed: %s\n", file->filename, strerror(errno));
+   file = calloc(1, sizeof *file);
+   fd = open(filename, O_RDONLY);
+   if (fd == -1) {
+  fprintf(stderr, "open %s failed: %s\n", filename, strerror(errno));
   exit(EXIT_FAILURE);
}
 
-   if (fstat(file->fd, &file->sb) == -1) {
+   if (fstat(fd, &sb) == -1) {
   fprintf(stderr, "stat failed: %s\n", strerror(errno));
   exit(EXIT_FAILURE);
}
 
-   file->map = mmap(NULL, file->sb.st_size,
-PROT_READ, MAP_SHARED, file->fd, 0);
+   file->map = mmap(NULL, sb.st_size,
+PROT_READ, MAP_SHARED, fd, 0);
if (file->map == MAP_FAILED) {
   fprintf(stderr, "mmap failed: %s\n", strerror(errno));
   exit(EXIT_FAILURE);
}
 
file->cursor = file->map;
-   file->end = file->map + file->sb.st_size / 4;
+   file->end = file->map + sb.st_size / 4;
 
-   /* mmap a terabyte for our gtt space. */
-   gtt_size = 1ul << 40;
-   gtt = mmap(NULL, gtt_size, PROT_READ | PROT_WRITE,
-  MAP_PRIVATE | MAP_ANONYMOUS |  MAP_NORESERVE, -1, 0);
-   if (gtt == MAP_FAILED) {
-  fprintf(stderr, "failed to alloc gtt space: %s\n", strerror(errno));
-  exit(1);
-   }
+   return file;
+}
+
+static struct aub_file *
+aub_file_stdin(void)
+{
+   struct aub_file *file;
+
+   file = calloc(1, sizeof *file);
+   file->stream = stdin;
 
return file;
 }
@@ -926,12 +929,21 @@ struct {
{ "bxt", MAKE_GEN(9, 0) }
 };
 
-static void
+enum {
+   AUB_ITEM_DECODE_OK,
+   AUB_ITEM_DECODE_FAILED,
+   AUB_ITEM_DECODE_NEED_MORE_DATA,
+};
+
+static int
 aub_file_decode_batch(struct aub_file *file, struct gen_spec *spec)
 {
-   uint32_t *p, h, device, data_type;
+   uint32_t *p, h, device, data_type, *new_cursor;
int header_length, payload_size, bias;
 
+   if (file->end - file->cursor < 12)
+  return AUB_ITEM_DECODE_NEED_MORE_DATA;
+
p = file->cursor;
h = *p;
header_length = h & 0xffff;
@@ -947,8 +959,7 @@ aub_file_decode_batch(struct aub_file *file, struct gen_spec *spec)
   printf("unknown opcode %d at %ld/%ld\n",
  OPCODE(h), file->cursor - file->map,
  file->end - file->map);
-  file->cursor = file->end;
-  return;
+  return AUB_ITEM_DECODE_FAILED;
}
 
payload_size = 0;
@@ -960,9 +971,22 @@ aub_file_decode_batch(struct aub_file *file, struct gen_spec *spec)
   payload_size = p[4];
   handle_trace_block(spec, p);
   break;
-   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_BMP):
+   default:
   break;
+   }
 
+   new_cursor = p + header_length + bias + payload_size / 4;
+   if (new_cursor > file->end)
+  return AUB_ITEM_DECODE_NEED_MORE_DATA;
+
+   switch (h & 0xffff0000) {
+   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_HEADER):
+  break;
+   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_BLOCK):
+  handle_trace_block(spec, p);
+  break;
+   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_BMP):
+  break;
case MAKE_HEADER(TYPE_AUB, OPCODE_NEW_AUB, SUBOPCODE_VERSION):
   printf("version block: dw1 %08x\n", p[1]);
   device = (p[1] >> 8) & 0xff;
@@ -988,13 +1012,65 @@ aub_file_decode_batch(struct aub_file *file, struct gen_spec *spec)
  "subopcode=0x%x (%08x)\n", TYPE(h), OPCODE(h), SUBOPCODE(h), h);
   break;
}
-   file->cursor = p + header_length + bias + payload_size / 4;
+   file->cursor = new_cursor;
+
+   return AUB_ITEM_DECODE_OK;
 }
 
 static int
 aub_file_more_stuff(struct aub_file *file)
 {
-   return file->cursor < file->end;
+   return file->cursor < file->end || (file->stream && !feof(file->stream));
+}
+
+#define AUB_READ_BUFFER_SIZE (4096)
+#define MAX(a, b) ((a) < (b) ? (b) : (a))
+

[Mesa-dev] [PATCH 2/5] intel: aubinator: add missing return characters

2016-10-04 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
Cc: Sirisha Gandikota 
---
 src/intel/tools/aubinator.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 4e2cafa..9939de7 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -808,7 +808,7 @@ handle_trace_block(struct gen_spec *spec, uint32_t *p)
   if (address_space != AUB_TRACE_MEMTYPE_GTT)
  break;
   if (gtt_size < offset + size) {
- fprintf(stderr, "overflow gtt space: %s", strerror(errno));
+ fprintf(stderr, "overflow gtt space: %s\n", strerror(errno));
  exit(EXIT_FAILURE);
   }
   memcpy((char *) gtt + offset, data, size);
@@ -850,19 +850,19 @@ aub_file_open(const char *filename)
file->filename = strdup(filename);
file->fd = open(file->filename, O_RDONLY);
if (file->fd == -1) {
-  fprintf(stderr, "open %s failed: %s", file->filename, strerror(errno));
+  fprintf(stderr, "open %s failed: %s\n", file->filename, strerror(errno));
   exit(EXIT_FAILURE);
}
 
if (fstat(file->fd, &file->sb) == -1) {
-  fprintf(stderr, "stat failed: %s", strerror(errno));
+  fprintf(stderr, "stat failed: %s\n", strerror(errno));
   exit(EXIT_FAILURE);
}
 
file->map = mmap(NULL, file->sb.st_size,
 PROT_READ, MAP_SHARED, file->fd, 0);
if (file->map == MAP_FAILED) {
-  fprintf(stderr, "mmap failed: %s", strerror(errno));
+  fprintf(stderr, "mmap failed: %s\n", strerror(errno));
   exit(EXIT_FAILURE);
}
 
@@ -874,7 +874,7 @@ aub_file_open(const char *filename)
gtt = mmap(NULL, gtt_size, PROT_READ | PROT_WRITE,
   MAP_PRIVATE | MAP_ANONYMOUS |  MAP_NORESERVE, -1, 0);
if (gtt == MAP_FAILED) {
-  fprintf(stderr, "failed to alloc gtt space: %s", strerror(errno));
+  fprintf(stderr, "failed to alloc gtt space: %s\n", strerror(errno));
   exit(1);
}
 
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/5] intel: aubinator: retain input file in its own variable

2016-10-04 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
Cc: Sirisha Gandikota 
---
 src/intel/tools/aubinator.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 9939de7..459e3d4 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -1064,7 +1064,7 @@ int main(int argc, char *argv[])
struct aub_file *file;
int i;
bool found_arg_gen = false, pager = true;
-   const char *value;
+   const char *value, *input_file = NULL;
char gen_file[256], gen_val[24];
const struct {
   const char *name;
@@ -1120,6 +1120,7 @@ int main(int argc, char *argv[])
 fprintf(stderr, "unknown option %s\n", argv[i]);
 exit(EXIT_FAILURE);
  }
+ input_file = argv[i];
  break;
   }
}
@@ -1163,7 +1164,7 @@ int main(int argc, char *argv[])
print_help(argv[0], stderr);
exit(EXIT_FAILURE);
} else {
-   file = aub_file_open(argv[i]);
+   file = aub_file_open(input_file);
}
 
while (aub_file_more_stuff(file))
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/5] intel: aubinator: generate a standalone binary

2016-10-04 Thread Lionel Landwerlin
Embed the xml files into the binary, so aubinator can be used from any
location.

v2: Split generation packing into another patch (Jason)
Check for xxd (Jason)

Signed-off-by: Lionel Landwerlin 
Cc: Sirisha Gandikota 
---
 configure.ac|  1 +
 src/intel/Makefile.am   |  1 +
 src/intel/Makefile.aubinator.am | 36 ++
 src/intel/Makefile.sources  |  7 
 src/intel/tools/.gitignore  |  5 +++
 src/intel/tools/aubinator.c | 39 ++--
 src/intel/tools/decoder.c   | 82 +
 src/intel/tools/decoder.h   |  4 +-
 8 files changed, 122 insertions(+), 53 deletions(-)
 create mode 100644 src/intel/Makefile.aubinator.am

diff --git a/configure.ac b/configure.ac
index 1bfac3b..7046349 100644
--- a/configure.ac
+++ b/configure.ac
@@ -110,6 +110,7 @@ LT_PREREQ([2.2])
 LT_INIT([disable-static])
 
 AC_CHECK_PROG(RM, rm, [rm -f])
+AC_CHECK_PROG(XXD, xxd, [xxd])
 
 AX_PROG_BISON([],
   AS_IF([test ! -f "$srcdir/src/compiler/glsl/glcpp/glcpp-parse.c"],
diff --git a/src/intel/Makefile.am b/src/intel/Makefile.am
index 9186b5c..c3cb9fb 100644
--- a/src/intel/Makefile.am
+++ b/src/intel/Makefile.am
@@ -52,6 +52,7 @@ BUILT_SOURCES =
 CLEANFILES =
 EXTRA_DIST =
 
+include Makefile.aubinator.am
 include Makefile.blorp.am
 include Makefile.common.am
 include Makefile.genxml.am
diff --git a/src/intel/Makefile.aubinator.am b/src/intel/Makefile.aubinator.am
new file mode 100644
index 0000000..3d05d30
--- /dev/null
+++ b/src/intel/Makefile.aubinator.am
@@ -0,0 +1,36 @@
+# Copyright © 2016 Intel Corporation
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice (including the next
+# paragraph) shall be included in all copies or substantial portions of the
+# Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+# IN THE SOFTWARE.
+
+BUILT_SOURCES += $(AUBINATOR_GENERATED_FILES)
+
+SUFFIXES = _aubinator_xml.h .xml
+
+tools/gen6_aubinator_xml.h: genxml/gen6.xml
+tools/gen7_aubinator_xml.h: genxml/gen7.xml
+tools/gen75_aubinator_xml.h: genxml/gen75.xml
+tools/gen8_aubinator_xml.h: genxml/gen8.xml
+tools/gen9_aubinator_xml.h: genxml/gen9.xml
+
+$(AUBINATOR_GENERATED_FILES): Makefile
+
+%_aubinator_xml.h:
+   $(MKDIR_GEN)
+   $(AM_V_GEN) $(XXD) -i $< > $@
diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 94073d2..a5c2bf0 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -1,3 +1,10 @@
+AUBINATOR_GENERATED_FILES = \
+   tools/gen6_aubinator_xml.h \
+   tools/gen7_aubinator_xml.h \
+   tools/gen75_aubinator_xml.h \
+   tools/gen8_aubinator_xml.h \
+   tools/gen9_aubinator_xml.h
+
 BLORP_FILES = \
blorp/blorp.c \
blorp/blorp.h \
diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore
index 0c80a6f..c4eebde 100644
--- a/src/intel/tools/.gitignore
+++ b/src/intel/tools/.gitignore
@@ -1 +1,6 @@
 /aubinator
+gen6_aubinator_xml.h
+gen75_aubinator_xml.h
+gen7_aubinator_xml.h
+gen8_aubinator_xml.h
+gen9_aubinator_xml.h
diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 459e3d4..3599867 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -1065,22 +1065,21 @@ int main(int argc, char *argv[])
int i;
bool found_arg_gen = false, pager = true;
const char *value, *input_file = NULL;
-   char gen_file[256], gen_val[24];
+   char gen_val[24];
const struct {
   const char *name;
   int pci_id;
-  int major;
-  int minor;
} gens[] = {
-  { "ivb", 0x0166, 7, 0 }, /* Intel(R) Ivybridge Mobile GT2 */
-  { "hsw", 0x0416, 7, 5 }, /* Intel(R) Haswell Mobile GT2 */
-  { "byt", 0x0155, 7, 5 }, /* Intel(R) Bay Trail */
-  { "bdw", 0x1616, 8, 0 }, /* Intel(R) HD Graphics 5500 (Broadwell GT2) */
-  { "chv", 0x22B3, 8, 0 }, /* Intel(R) HD Graphics (Cherryview) */
-  { "skl", 0x1912, 9, 0 }, /* Intel(R) HD Graphics 530 (Skylake GT2) */
-  { "kbl", 0x591D, 9, 0 }, /* Intel(R) Kabylake GT2 */
-  { 

[Mesa-dev] [PATCH 1/5] intel: aubinator: pack supported generations into an array

2016-10-04 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
Cc: Sirisha Gandikota 
---
 src/intel/tools/aubinator.c | 79 +
 1 file changed, 30 insertions(+), 49 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 9b32e5b..4e2cafa 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -36,6 +36,8 @@
 #include 
 #include 
 
+#include "util/macros.h"
+
 #include "decoder.h"
 #include "intel_aub.h"
 #include "gen_disasm.h"
@@ -1060,11 +1062,25 @@ int main(int argc, char *argv[])
 {
struct gen_spec *spec;
struct aub_file *file;
-   int i, pci_id = 0;
+   int i;
bool found_arg_gen = false, pager = true;
-   int gen_major, gen_minor;
const char *value;
char gen_file[256], gen_val[24];
+   const struct {
+  const char *name;
+  int pci_id;
+  int major;
+  int minor;
+   } gens[] = {
+  { "ivb", 0x0166, 7, 0 }, /* Intel(R) Ivybridge Mobile GT2 */
+  { "hsw", 0x0416, 7, 5 }, /* Intel(R) Haswell Mobile GT2 */
+  { "byt", 0x0155, 7, 5 }, /* Intel(R) Bay Trail */
+  { "bdw", 0x1616, 8, 0 }, /* Intel(R) HD Graphics 5500 (Broadwell GT2) */
+  { "chv", 0x22B3, 8, 0 }, /* Intel(R) HD Graphics (Cherryview) */
+  { "skl", 0x1912, 9, 0 }, /* Intel(R) HD Graphics 530 (Skylake GT2) */
+  { "kbl", 0x591D, 9, 0 }, /* Intel(R) Kabylake GT2 */
+  { "bxt", 0x0A84, 9, 0 }  /* Intel(R) HD Graphics (Broxton) */
+   }, *gen = NULL;
 
if (argc == 1) {
   print_help(argv[0], stderr);
@@ -1082,8 +1098,6 @@ int main(int argc, char *argv[])
 exit(EXIT_FAILURE);
  }
  found_arg_gen = true;
- gen_major = 0;
- gen_minor = 0;
  snprintf(gen_val, sizeof(gen_val), "%s", value);
   } else if (strcmp(argv[i], "--headers") == 0) {
  option_full_decode = false;
@@ -1115,47 +1129,14 @@ int main(int argc, char *argv[])
   exit(EXIT_FAILURE);
}
 
-   if (strstr(gen_val, "ivb") != NULL) {
-  /* Intel(R) Ivybridge Mobile GT2 */
-  pci_id = 0x0166;
-  gen_major = 7;
-  gen_minor = 0;
-   } else if (strstr(gen_val, "hsw") != NULL) {
-  /* Intel(R) Haswell Mobile GT2 */
-  pci_id = 0x0416;
-  gen_major = 7;
-  gen_minor = 5;
-   } else if (strstr(gen_val, "byt") != NULL) {
-  /* Intel(R) Bay Trail */
-  pci_id = 0x0155;
-  gen_major = 7;
-  gen_minor = 5;
-   } else if (strstr(gen_val, "bdw") != NULL) {
-  /* Intel(R) HD Graphics 5500 (Broadwell GT2) */
-  pci_id = 0x1616;
-  gen_major = 8;
-  gen_minor = 0;
-   }  else if (strstr(gen_val, "chv") != NULL) {
-  /* Intel(R) HD Graphics (Cherryview) */
-  pci_id = 0x22B3;
-  gen_major = 8;
-  gen_minor = 0;
-   } else if (strstr(gen_val, "skl") != NULL) {
-  /* Intel(R) HD Graphics 530 (Skylake GT2) */
-  pci_id = 0x1912;
-  gen_major = 9;
-  gen_minor = 0;
-   } else if (strstr(gen_val, "kbl") != NULL) {
-  /* Intel(R) Kabylake GT2 */
-  pci_id = 0x591D;
-  gen_major = 9;
-  gen_minor = 0;
-   } else if (strstr(gen_val, "bxt") != NULL) {
-  /* Intel(R) HD Graphics (Broxton) */
-  pci_id = 0x0A84;
-  gen_major = 9;
-  gen_minor = 0;
-   } else {
+   for (i = 0; i < ARRAY_SIZE(gens); i++) {
+  if (!strcmp(gen_val, gens[i].name)) {
+ gen = &gens[i];
+ break;
+  }
+   }
+
+   if (gen == NULL) {
   fprintf(stderr, "can't parse gen: %s, expected ivb, byt, hsw, "
  "bdw, chv, skl, kbl or bxt\n", gen_val);
   exit(EXIT_FAILURE);
@@ -1168,15 +1149,15 @@ int main(int argc, char *argv[])
if (isatty(1) && pager)
   setup_pager();
 
-   if (gen_minor > 0) {
+   if (gen->minor > 0) {
   snprintf(gen_file, sizeof(gen_file), "../genxml/gen%d%d.xml",
-   gen_major, gen_minor);
+   gen->major, gen->minor);
} else {
-  snprintf(gen_file, sizeof(gen_file), "../genxml/gen%d.xml", gen_major);
+  snprintf(gen_file, sizeof(gen_file), "../genxml/gen%d.xml", gen->major);
}
 
spec = gen_spec_load(gen_file);
-   disasm = gen_disasm_create(pci_id);
+   disasm = gen_disasm_create(gen->pci_id);
 
if (argv[i] == NULL) {
print_help(argv[0], stderr);
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] i965: solve cubemap negative x/y/z faces buffer offset issue in dEQP.

2016-10-04 Thread Tapani Pälli

On 10/04/2016 06:09 PM, Jason Ekstrand wrote:
On Thu, Sep 29, 2016 at 11:27 PM, Xu,Randy wrote:


Add the miptree level/slice x/y_offset when computing the surface offset
in brw_emit_surface_state. The surface offset has two parts: one is
from mt->offset, which should be 32-aligned in width/height for tiled
buffers; the other is from mt->level[current_level].slice[current_slice].
x/y_offset.

This fix resolves 12 dEQP failures:
dEQP-EGL.functional.image.create.gles2_cubemap_negative_*_texture

Signed-off-by: Xu,Randy
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 61a4b94..3a5c573 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -85,7 +85,8 @@ brw_emit_surface_state(struct brw_context *brw,
unsigned read_domains, unsigned write_domains)
 {
const struct surface_state_info ss_info =
surface_state_infos[brw->gen];
-   uint32_t tile_x = 0, tile_y = 0;
+   uint32_t tile_x = mt->level[0].slice[0].x_offset;
+   uint32_t tile_y = mt->level[0].slice[0].y_offset;


This isn't correct.  First off, there are some fairly strict 
restrictions on what we can do with tile_x and tile_y and we can't 
just shove x/y_offset in there.  We need to use 
intel_miptree_get_tile_offsets to get both a byte offset and an 
intratile offset.  Second, we should already be taking slices into 
account for cube maps via base_array_layer where needed.


Unfortunately, I'm not 100% sure what the correct patch is without a 
bit more information about what the test is doing that causes a problem.




I took a brief look. When running the set mentioned above (for example
with ./deqp-egl
--deqp-case=*EGL.functional.image.create.gles2_cubemap_negative_*_texture)
we never reach the part of that function that calls
intel_miptree_get_tile_offsets, because the surf.dim_layout != dim_layout
condition does not trigger. This is just what I observed; should we
simply call intel_miptree_get_tile_offsets() unconditionally then?




--Jason

uint32_t offset = mt->offset;

struct isl_surf surf;
--
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] util: use GCC atomic intrinsics with explicit memory model

2016-10-04 Thread Jan Vesely
On Tue, 2016-10-04 at 16:14 +0200, Nicolai Hähnle wrote:
> From: Nicolai Hähnle 
> 
> This is motivated by the fact that p_atomic_read and p_atomic_set may
> somewhat surprisingly not do the right thing in the old version:
> while
> stores and loads are de facto atomic at least on x86,

afaik, this is only true for naturally aligned loads/stores (even for
x86).

Jan

>  the compiler may
> apply re-ordering and speculation quite liberally. Basically, the old
> version uses the "relaxed" memory ordering.
> 
> The new ordering always uses acquire/release ordering. This is the
> strongest possible memory ordering that doesn't require additional
> fence instructions on x86. (And the only stronger ordering is
> "sequentially consistent", which is usually more than you need
> anyway.)
> 
> I would feel more comfortable if p_atomic_set/read in the old
> implementation were at least using volatile loads and stores, but I
> don't see a way to get there without typeof (which we cannot use here
> since the code is compiled with -std=c99).
> 
> Eventually, we should really just move to something that is based on
> the atomics in C11 / C++11.
> ---
>  configure.ac| 11 +++
>  src/util/u_atomic.h | 21 +
>  2 files changed, 32 insertions(+)
> 
> diff --git a/configure.ac b/configure.ac
> index 1bfac3b..421f4f3 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -380,20 +380,31 @@ int main () {
>  c = _mm_max_epu32(a, b);
>  return _mm_cvtsi128_si32(c);
>  }]])], SSE41_SUPPORTED=1)
>  CFLAGS="$save_CFLAGS"
>  if test "x$SSE41_SUPPORTED" = x1; then
>  DEFINES="$DEFINES -DUSE_SSE41"
>  fi
>  AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
>  AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
>  
> +dnl Check for new-style atomic builtins
> +AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
> +int main() {
> +int n;
> return __atomic_load_n(&n, __ATOMIC_ACQUIRE);
> +}]])], GCC_ATOMIC_BUILTINS_SUPPORTED=1)
> +if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
> +DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
> +fi
> +AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])
> +
>  dnl Check for Endianness
>  AC_C_BIGENDIAN(
> little_endian=no,
> little_endian=yes,
> little_endian=no,
> little_endian=no
>  )
>  
>  dnl Check for POWER8 Architecture
>  PWR8_CFLAGS="-mpower8-vector"
> diff --git a/src/util/u_atomic.h b/src/util/u_atomic.h
> index 8675903..2a5bbae 100644
> --- a/src/util/u_atomic.h
> +++ b/src/util/u_atomic.h
> @@ -29,28 +29,49 @@
>  #error "Unsupported platform"
>  #endif
>  
>  
>  /* Implementation using GCC-provided synchronization intrinsics
>   */
>  #if defined(PIPE_ATOMIC_GCC_INTRINSIC)
>  
>  #define PIPE_ATOMIC "GCC Sync Intrinsics"
>  
> +#if defined(USE_GCC_ATOMIC_BUILTINS)
> +
> +/* The builtins with explicit memory model are available since GCC 4.7. */
> +#define p_atomic_set(_v, _i) __atomic_store_n((_v), (_i), __ATOMIC_RELEASE)
> +#define p_atomic_read(_v) __atomic_load_n((_v), __ATOMIC_ACQUIRE)
> +#define p_atomic_dec_zero(v) (__atomic_sub_fetch((v), 1, __ATOMIC_ACQ_REL) == 0)
> +#define p_atomic_inc(v) (void) __atomic_add_fetch((v), 1, __ATOMIC_ACQ_REL)
> +#define p_atomic_dec(v) (void) __atomic_sub_fetch((v), 1, __ATOMIC_ACQ_REL)
> +#define p_atomic_add(v, i) (void) __atomic_add_fetch((v), (i), __ATOMIC_ACQ_REL)
> +#define p_atomic_inc_return(v) __atomic_add_fetch((v), 1, __ATOMIC_ACQ_REL)
> +#define p_atomic_dec_return(v) __atomic_sub_fetch((v), 1, __ATOMIC_ACQ_REL)
> +
> +#else
> +
>  #define p_atomic_set(_v, _i) (*(_v) = (_i))
>  #define p_atomic_read(_v) (*(_v))
>  #define p_atomic_dec_zero(v) (__sync_sub_and_fetch((v), 1) == 0)
>  #define p_atomic_inc(v) (void) __sync_add_and_fetch((v), 1)
>  #define p_atomic_dec(v) (void) __sync_sub_and_fetch((v), 1)
>  #define p_atomic_add(v, i) (void) __sync_add_and_fetch((v), (i))
>  #define p_atomic_inc_return(v) __sync_add_and_fetch((v), 1)
>  #define p_atomic_dec_return(v) __sync_sub_and_fetch((v), 1)
> +
> +#endif
> +
> +/* There is no __atomic_* compare and exchange that returns the current value.
> + * Also, GCC 5.4 seems unable to optimize a compound statement expression that
> + * uses an additional stack variable with __atomic_compare_exchange[_n].
> + */
>  #define p_atomic_cmpxchg(v, old, _new) \
> __sync_val_compare_and_swap((v), (old), (_new))
>  
>  #endif
>  
>  
>  
>  /* Unlocked version for single threaded environments, such as some
>   * windows kernel modules.
>   */
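
To make the acquire/release discussion above concrete, here is a small reference-count sketch built directly on the GCC/Clang __atomic builtins that the quoted patch wraps. The struct refcounted type and the ref_init/ref_get/ref_put helpers are hypothetical names for this example; they are not Mesa API. As noted above, the sketch also assumes the counter is a naturally aligned int.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical refcounted object; refcount is a naturally aligned int. */
struct refcounted {
   int refcount;
   int payload;
};

static void
ref_init(struct refcounted *r, int payload)
{
   r->payload = payload;
   /* Release store: the payload write above cannot be reordered after it. */
   __atomic_store_n(&r->refcount, 1, __ATOMIC_RELEASE);
}

static void
ref_get(struct refcounted *r)
{
   (void) __atomic_add_fetch(&r->refcount, 1, __ATOMIC_ACQ_REL);
}

/* Returns true when the caller dropped the last reference. */
static bool
ref_put(struct refcounted *r)
{
   /* Acquire load used only as a sanity check on the precondition. */
   assert(__atomic_load_n(&r->refcount, __ATOMIC_ACQUIRE) > 0);
   return __atomic_sub_fetch(&r->refcount, 1, __ATOMIC_ACQ_REL) == 0;
}
```

ref_put() mirrors the p_atomic_dec_zero() macro from the patch: the thread that sees the count reach zero is the one that may free the object.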

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] i965: solve cubemap negative x/y/z faces buffer offset issue in dEQP.

2016-10-04 Thread Jason Ekstrand
On Thu, Sep 29, 2016 at 11:27 PM, Xu,Randy  wrote:

> Add the miptree level/slice x/y_offset when computing the surface offset
> in brw_emit_surface_state. The surface offset has two parts: one is
> from mt->offset, which should be 32-aligned in width/height for tiled
> buffers; the other is from mt->level[current_level].slice[current_slice].
> x/y_offset.
>
> This fix resolves 12 dEQP failures:
> dEQP-EGL.functional.image.create.gles2_cubemap_negative_*_texture
>
> Signed-off-by: Xu,Randy 
> ---
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index 61a4b94..3a5c573 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -85,7 +85,8 @@ brw_emit_surface_state(struct brw_context *brw,
> unsigned read_domains, unsigned write_domains)
>  {
> const struct surface_state_info ss_info =
> surface_state_infos[brw->gen];
> -   uint32_t tile_x = 0, tile_y = 0;
> +   uint32_t tile_x = mt->level[0].slice[0].x_offset;
> +   uint32_t tile_y = mt->level[0].slice[0].y_offset;
>

This isn't correct.  First off, there are some fairly strict restrictions
on what we can do with tile_x and tile_y and we can't just shove x/y_offset
in there.  We need to use intel_miptree_get_tile_offsets to get both a byte
offset and an intratile offset.  Second, we should already be taking slices
into account for cube maps via base_array_layer where needed.

Unfortunately, I'm not 100% sure what the correct patch is without a bit
more information about what the test is doing that causes a problem.
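
To illustrate what "a byte offset and an intratile offset" means here, the following is a minimal sketch of splitting a pixel position into the two parts. It assumes X-tiling with tiles 512 bytes wide and 8 rows tall (4096 bytes each) and a pitch that is a multiple of the tile width. The struct tile_split type and split_tile_offset() are hypothetical; this is not the body of intel_miptree_get_tile_offsets and it ignores the hardware restrictions mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed X-tile geometry: 512 bytes wide, 8 rows tall, 4096 bytes total. */
#define XTILE_WIDTH_BYTES 512u
#define XTILE_HEIGHT_ROWS 8u
#define XTILE_SIZE_BYTES  4096u

struct tile_split {
   uint32_t byte_offset; /* start of the tile that contains (x, y) */
   uint32_t tile_x;      /* pixel offset inside that tile */
   uint32_t tile_y;      /* row offset inside that tile */
};

/* x, y are in pixels; pitch is in bytes; cpp is bytes per pixel. */
static struct tile_split
split_tile_offset(uint32_t x, uint32_t y, uint32_t pitch, uint32_t cpp)
{
   struct tile_split s;
   uint32_t tile_w_px;

   assert(cpp != 0 && XTILE_WIDTH_BYTES % cpp == 0);
   assert(pitch != 0 && pitch % XTILE_WIDTH_BYTES == 0);

   tile_w_px = XTILE_WIDTH_BYTES / cpp;

   /* Remainders stay inside the tile... */
   s.tile_x = x % tile_w_px;
   s.tile_y = y % XTILE_HEIGHT_ROWS;

   /* ...while the tile-aligned part becomes a byte offset: whole rows of
    * tiles above, plus whole tiles to the left in the same tile row. */
   s.byte_offset = (y - s.tile_y) * pitch +
                   (x / tile_w_px) * XTILE_SIZE_BYTES;
   return s;
}
```

For example, with a 1024-byte pitch and 4 bytes per pixel, pixel (130, 10) lands in the second tile of the second tile row, 2 pixels and 2 rows into that tile.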

--Jason


> uint32_t offset = mt->offset;
>
> struct isl_surf surf;
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97260] R9 290 low performance in Linux 4.7

2016-10-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97260

--- Comment #57 from Alex Deucher  ---
Created attachment 126995
  --> https://bugs.freedesktop.org/attachment.cgi?id=126995&action=edit
disable async flip support

If anyone else is seeing a regression not already fixed, please bisect and open
a new bug with the results.  If you can't bisect, does disabling async flips
help (see attached patch)?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] autoconf: Make header install distinct for various APIs (v2)

2016-10-04 Thread Chuck Atkins
This fixes a problem where GL headers would only get installed if
glx was enabled.  So if osmesa was enabled but not glx, then the
GL headers required by osmesa would be missing from the install.

v2: Dropped unneeded mesa_glinterop.h redundant osmesa.h install

CC: Emil Velikov 
Signed-off-by: Chuck Atkins 
---
 configure.ac |  2 ++
 src/Makefile.am  | 24 
 src/mesa/Makefile.am | 10 --
 3 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/configure.ac b/configure.ac
index 1bfac3b..c7be735 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2641,6 +2641,8 @@ fi
 AM_CONDITIONAL(HAVE_LIBDRM, test "x$have_libdrm" = xyes)
 AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_OSMESA, test "x$enable_gallium_osmesa" = xyes)
+AM_CONDITIONAL(HAVE_COMMON_OSMESA, test "x$enable_osmesa" = xyes -o \
+"x$enable_gallium_osmesa" = xyes)
 
 AM_CONDITIONAL(HAVE_X86_ASM, test "x$asm_arch" = xx86 -o "x$asm_arch" = xx86_64)
 AM_CONDITIONAL(HAVE_X86_64_ASM, test "x$asm_arch" = xx86_64)
diff --git a/src/Makefile.am b/src/Makefile.am
index 551f431..91d6a7a 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -47,6 +47,30 @@ CLEANFILES = $(BUILT_SOURCES)
 
 SUBDIRS = . gtest util mapi/glapi/gen mapi
 
+if HAVE_OPENGL
+gldir = $(includedir)/GL
+gl_HEADERS = \
+  $(top_srcdir)/include/GL/gl.h \
+  $(top_srcdir)/include/GL/glext.h \
+  $(top_srcdir)/include/GL/glcorearb.h \
+  $(top_srcdir)/include/GL/gl_mangle.h
+endif
+
+if HAVE_GLX
+glxdir = $(includedir)/GL
+glx_HEADERS = \
+  $(top_srcdir)/include/GL/glx.h \
+  $(top_srcdir)/include/GL/glxext.h \
+  $(top_srcdir)/include/GL/glx_mangle.h
+pkgconfigdir = $(libdir)/pkgconfig
+pkgconfig_DATA = mesa/gl.pc
+endif
+
+if HAVE_COMMON_OSMESA
+osmesadir = $(includedir)/GL
+osmesa_HEADERS = $(top_srcdir)/include/GL/osmesa.h
+endif
+
 # include only conditionally ?
 SUBDIRS += compiler
 
diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index 037384a..9710c7f 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -33,11 +33,6 @@ if HAVE_OSMESA
 SUBDIRS += drivers/osmesa
 endif
 
-if HAVE_GLX
-gldir = $(includedir)/GL
-gl_HEADERS = $(top_srcdir)/include/GL/*.h
-endif
-
 include Makefile.sources
 
 EXTRA_DIST = \
@@ -161,11 +156,6 @@ libmesa_sse41_la_SOURCES = \
 
 libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) $(SSE41_CFLAGS)
 
-if HAVE_GLX
-pkgconfigdir = $(libdir)/pkgconfig
-pkgconfig_DATA = gl.pc
-endif
-
 MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)
 YACC_GEN = $(AM_V_GEN)$(YACC) $(YFLAGS)
 LEX_GEN = $(AM_V_GEN)$(LEX) $(LFLAGS)
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] intel: aubinator: enable loading dumps from standard input

2016-10-04 Thread Lionel Landwerlin

On 04/10/16 16:01, Eero Tamminen wrote:

Hi,

On 04.10.2016 17:38, Lionel Landwerlin wrote:

In conjunction with an intel_aubdump change, you can now look at your
application's output like this:

$ intel_aubdump -c '/path/to/aubinator --gen=hsw' my_gl_app


Maybe you could add also a patch to document this usage?


Thanks, forgot about the print_help().





Signed-off-by: Lionel Landwerlin 
Cc: Sirisha Gandikota 
Cc: Kristian Høgsberg 
---
 src/intel/tools/aubinator.c | 162 +++-

 1 file changed, 130 insertions(+), 32 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 83328b5..73e6012 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c

...

+   /* mmap a terabyte for our gtt space. */
+   gtt_size = 1ul << 30;


On 32-bit systems, you run out of address space once the total size of 
your process's mappings grows to several GB.




Ooops, sorry about that.
That's remaining debug stuff from running valgrind :/



- Eero




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] intel/blorp: Use documented RECTLIST vertex positions

2016-10-04 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Wed, Sep 21, 2016 at 2:42 PM, Nanley Chery  wrote:

> Use the vertex positions described in the PRMs. This has no effect on
> rendering but quiets the simulator warnings seen when the vertices
> appear out of order.
>
> Signed-off-by: Nanley Chery 
> Cc: Jason Ekstrand 
> Cc: Marek Olšák 
> ---
>  src/intel/blorp/blorp_genX_exec.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/intel/blorp/blorp_genX_exec.h
> b/src/intel/blorp/blorp_genX_exec.h
> index eb4a5b9..62f16a3 100644
> --- a/src/intel/blorp/blorp_genX_exec.h
> +++ b/src/intel/blorp/blorp_genX_exec.h
> @@ -171,8 +171,8 @@ blorp_emit_vertex_data(struct blorp_batch *batch,
> uint32_t *size)
>  {
> const float vertices[] = {
> -  /* v0 */ (float)params->x0, (float)params->y1,
> -  /* v1 */ (float)params->x1, (float)params->y1,
> +  /* v0 */ (float)params->x1, (float)params->y1,
> +  /* v1 */ (float)params->x0, (float)params->y1,
>/* v2 */ (float)params->x0, (float)params->y0,
> };
>
> @@ -287,7 +287,7 @@ blorp_emit_vertex_elements(struct blorp_batch *batch,
>  *   v2 -- implied
>  *||
>  *||
> -*   v0 - v1
> +*   v1 - v0
>  *
>  * Since the VS is disabled, the clipper loads each VUE directly from
>  * the URB. This is controlled by the 3DSTATE_VERTEX_BUFFERS and
> --
> 2.10.0
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] intel: aubinator: enable loading dumps from standard input

2016-10-04 Thread Eero Tamminen

Hi,

On 04.10.2016 17:38, Lionel Landwerlin wrote:

In conjunction with an intel_aubdump change, you can now look at your
application's output like this:

$ intel_aubdump -c '/path/to/aubinator --gen=hsw' my_gl_app


Maybe you could add also a patch to document this usage?



Signed-off-by: Lionel Landwerlin 
Cc: Sirisha Gandikota 
Cc: Kristian Høgsberg 
---
 src/intel/tools/aubinator.c | 162 +++-
 1 file changed, 130 insertions(+), 32 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 83328b5..73e6012 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c

...

+   /* mmap a terabyte for our gtt space. */
+   gtt_size = 1ul << 30;


On 32-bit systems, you run out of address space once the total size of 
your process's mappings grows to several GB.
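
To make the concern concrete, here is a minimal sketch of sizing the reservation by pointer width. It is not what the patch does: GTT_RESERVE_SIZE and reserve_gtt() are hypothetical names, and the 1 GiB fallback for 32-bit is only an assumed value.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS / MAP_NORESERVE under -std=c99 */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical sizing policy: 1 TiB where pointers are 64-bit, 1 GiB
 * otherwise. The choice is made in the preprocessor so that a 32-bit
 * build never evaluates a shift wider than size_t. */
#if UINTPTR_MAX > 0xffffffffu
#define GTT_RESERVE_SIZE ((size_t)1 << 40)
#else
#define GTT_RESERVE_SIZE ((size_t)1 << 30)
#endif

/* Reserve address space without committing memory up front.
 * Returns NULL on failure instead of MAP_FAILED. */
static void *
reserve_gtt(size_t size)
{
   void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
   return p == MAP_FAILED ? NULL : p;
}
```

A caller would use reserve_gtt(GTT_RESERVE_SIZE) and report the failure itself. Whether 1 GiB is enough for real dumps on 32-bit is a separate question that this sketch does not answer.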



- Eero

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] intel: aubinator: generate a standalone binary

2016-10-04 Thread Jason Ekstrand
On Tue, Oct 4, 2016 at 7:38 AM, Lionel Landwerlin 
wrote:

> Embed the xml files into the binary, so aubinator can be used from any
> location.
>

Thank you for doing this!!!  Dynamically loading the xml has a few
benefits, but I think the benefits of not having to find them on disk are
far bigger.


> Signed-off-by: Lionel Landwerlin 
> Cc: Sirisha Gandikota 
> ---
>  src/intel/Makefile.am   |  1 +
>  src/intel/Makefile.aubinator.am | 36 +++
>  src/intel/Makefile.sources  |  7 +++
>  src/intel/tools/.gitignore  |  5 +++
>  src/intel/tools/aubinator.c | 97 +-
> ---
>  src/intel/tools/decoder.c   | 82 --
>  src/intel/tools/decoder.h   |  4 +-
>  7 files changed, 141 insertions(+), 91 deletions(-)
>  create mode 100644 src/intel/Makefile.aubinator.am
>
> diff --git a/src/intel/Makefile.am b/src/intel/Makefile.am
> index 9186b5c..c3cb9fb 100644
> --- a/src/intel/Makefile.am
> +++ b/src/intel/Makefile.am
> @@ -52,6 +52,7 @@ BUILT_SOURCES =
>  CLEANFILES =
>  EXTRA_DIST =
>
> +include Makefile.aubinator.am
>  include Makefile.blorp.am
>  include Makefile.common.am
>  include Makefile.genxml.am
> diff --git a/src/intel/Makefile.aubinator.am b/src/intel/Makefile.aubinator.am
> new file mode 100644
> index 000..9772700
> --- /dev/null
> +++ b/src/intel/Makefile.aubinator.am
> @@ -0,0 +1,36 @@
> +# Copyright © 2016 Intel Corporation
> +#
> +# Permission is hereby granted, free of charge, to any person obtaining a
> +# copy of this software and associated documentation files (the "Software"),
> +# to deal in the Software without restriction, including without limitation
> +# the rights to use, copy, modify, merge, publish, distribute, sublicense,
> +# and/or sell copies of the Software, and to permit persons to whom the
> +# Software is furnished to do so, subject to the following conditions:
> +#
> +# The above copyright notice and this permission notice (including the next
> +# paragraph) shall be included in all copies or substantial portions of the
> +# Software.
> +#
> +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> +# IN THE SOFTWARE.
> +
> +BUILT_SOURCES += $(AUBINATOR_GENERATED_FILES)
> +
> +SUFFIXES = _aubinator_xml.h .xml
> +
> +tools/gen6_aubinator_xml.h: genxml/gen6.xml
> +tools/gen7_aubinator_xml.h: genxml/gen7.xml
> +tools/gen75_aubinator_xml.h: genxml/gen75.xml
> +tools/gen8_aubinator_xml.h: genxml/gen8.xml
> +tools/gen9_aubinator_xml.h: genxml/gen9.xml
> +
> +$(AUBINATOR_GENERATED_FILES): Makefile
> +
> +%_aubinator_xml.h:
> +   $(MKDIR_GEN)
> +   $(AM_V_GEN) xxd -i $< > $@
>

We should probably check for xxd in configure.ac and use $(XXD) here.  I
know that it's not installed by default on Fedora, so just using it is
kind-of mean.
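
For anyone not familiar with it, `xxd -i` writes a C array plus a length variable, both named after the input path with non-alphanumeric characters turned into underscores. A rough sketch of consuming such a header follows; the array contents here are a hand-written stand-in rather than real genxml, and blob_starts_with() is a hypothetical helper, not something in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the output of `xxd -i genxml/gen6.xml`. The bytes spell
 * "<gen/>\n"; the real generated header would hold the whole XML file. */
unsigned char genxml_gen6_xml[] = {
   0x3c, 0x67, 0x65, 0x6e, 0x2f, 0x3e, 0x0a
};
unsigned int genxml_gen6_xml_len = 7;

/* The generated array is not NUL-terminated, so consumers have to carry
 * the length around instead of relying on strlen() of the blob. */
static int
blob_starts_with(const unsigned char *blob, size_t len, const char *prefix)
{
   size_t n = strlen(prefix);
   return len >= n && memcmp(blob, prefix, n) == 0;
}
```

The missing terminator is the main thing to keep in mind when handing the blob to a parser that expects a buffer and a size.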


> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> index 94073d2..a5c2bf0 100644
> --- a/src/intel/Makefile.sources
> +++ b/src/intel/Makefile.sources
> @@ -1,3 +1,10 @@
> +AUBINATOR_GENERATED_FILES = \
> +   tools/gen6_aubinator_xml.h \
> +   tools/gen7_aubinator_xml.h \
> +   tools/gen75_aubinator_xml.h \
> +   tools/gen8_aubinator_xml.h \
> +   tools/gen9_aubinator_xml.h
> +
>  BLORP_FILES = \
> blorp/blorp.c \
> blorp/blorp.h \
> diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore
> index 0c80a6f..c4eebde 100644
> --- a/src/intel/tools/.gitignore
> +++ b/src/intel/tools/.gitignore
> @@ -1 +1,6 @@
>  /aubinator
> +gen6_aubinator_xml.h
> +gen75_aubinator_xml.h
> +gen7_aubinator_xml.h
> +gen8_aubinator_xml.h
> +gen9_aubinator_xml.h
> diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
> index a31dcb2..83328b5 100644
> --- a/src/intel/tools/aubinator.c
> +++ b/src/intel/tools/aubinator.c
> @@ -35,6 +35,8 @@
>  #include 
>  #include 
>
> +#include "util/macros.h"
> +
>  #include "decoder.h"
>  #include "intel_aub.h"
>  #include "gen_disasm.h"
> @@ -1059,11 +1061,24 @@ int main(int argc, char *argv[])
>  {
> struct gen_spec *spec;
> struct aub_file *file;
> -   int i, pci_id = 0;
> +   int i;
> bool found_arg_gen = false, pager = true;
> -   int gen_major, gen_minor;
> -   const char *value;
> -   char gen_file[256], gen_val[24];
> +   const char *value, *input_file = NULL;
> +   char gen_val[24];
> +   const struct {
> +  const char *name;
> +  int pci_id;
> +   } gens[] = {
> +  { "ivb", 0x0166 }, /* Intel(R) Ivybridge 

Re: [Mesa-dev] Mesa 12.1.0 release plan (Was Re: Next Mesa release, anyone?)

2016-10-04 Thread Rob Herring
On Tue, Oct 4, 2016 at 5:26 AM, Emil Velikov  wrote:
> On 4 October 2016 at 02:05, Rob Clark  wrote:
>> On Thu, Sep 29, 2016 at 10:56 AM, Emil Velikov  
>> wrote:
>>> On 28 September 2016 at 19:53, Marek Olšák  wrote:
>>>> Hi,
>>>>
>>>> It's been almost 4 months since the 12.0 branch was created, and soon
>>>> it will have been 3 months since Mesa 12.0 was released.
>>>>
>>>> Is there any reason we haven't created the stable branch yet?
>>>>
>>>> Ideally, we would time the release so that it's 1-2 months before fall
>>>> distribution releases.

>>>
>>> Thanks Marek !
>>>
>>> In all honesty I was secretly hoping that we'll get Dave/Bas RADV for
>>> 12.1. With the topic of which would be 'the default' Vulkan driver for
>>> ATI/AMD hardware to be considered at a later stage.
>>
>> btw, I pushed libdrm release that I think etnaviv was waiting for..
>> not sure what else is needed before merging etnaviv gallium driver,
>> but if at all possible, it would be nice to land that before the
>> branch point too.
>>
> Thanks for the libdrm bits Rob. IIRC on the mesa side Christian
> reworked the render-only parts noticeably, yet as long as there isn't
> a crazy amount of changes outside of the etnaviv driver I think we're
> fine with getting it in.

I've been doing some work to get etnaviv working on Android. The
render-only approach is a bit broken IMO. It may work okay for X11
which expects to work on a card node, but it doesn't for Android which
already uses the render node for GL and gralloc and the KMS node for
HWC. Having the render-only driver also open the KMS node gets us into
all of the permissions issues. It seems to me we want gralloc to be
able to open both nodes (that still has some ioctl permission issues)
and allocate scanout buffers from the control node (thru GBM). Then
the etnaviv driver has to know to blit to the linear buffer when it
has an imported scanout buffer. IOW, I don't think the render-only
driver should internally allocate dumb buffers, but keep that
allocation external and the etnaviv driver needs to be able to deal
with external buffers. Maybe we just wait for the grand central
allocator to solve all this.

Rob
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH mesa] Revert "nir/spirv: add spirv2nir binary to .gitignore"

2016-10-04 Thread Jason Ekstrand
On Sun, Oct 2, 2016 at 4:15 AM, Eric Engestrom  wrote:

> This reverts commit fc03ecfeaf5a10a8b84d366f24f02e74ab03b145.
>
> Chad had already pushed the same change between me posting the patch and
> Jason
> pushing it: 44bcf1ffcced04fd7f2b (".gitignore: Ignore
> src/compiler/spirv2nir")
>
> CC: Chad Versace 
> CC: Jason Ekstrand 
> Signed-off-by: Eric Engestrom 
>

Pushed.  Thanks!


> ---
>
> Jason:
> When I saw Chad's commit, I marked my patch as superseded, and
> I expected pwclient would say something when you'd try to push it.
> Didn't it print a warning or something?
>

I don't use pwclient. :-)  Really, statuses in patchwork don't actually
mean anything.  It'd be better to send a reply to the mailing list.


> That might be something to add to the client, for anyone that sees this
> and knows enough about patchwork :)
>
> ---
>  src/compiler/.gitignore | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/src/compiler/.gitignore b/src/compiler/.gitignore
> index f619567..5d30b4e 100644
> --- a/src/compiler/.gitignore
> +++ b/src/compiler/.gitignore
> @@ -4,4 +4,3 @@ subtest-cr
>  subtest-cr-lf
>  subtest-lf
>  subtest-lf-cr
> -spirv2nir
> --
> Cheers,
>   Eric
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] intel: aubinator: generate a standalone binary

2016-10-04 Thread Lionel Landwerlin
Embed the xml files into the binary, so aubinator can be used from any
location.

Signed-off-by: Lionel Landwerlin 
Cc: Sirisha Gandikota 
---
 src/intel/Makefile.am   |  1 +
 src/intel/Makefile.aubinator.am | 36 +++
 src/intel/Makefile.sources  |  7 +++
 src/intel/tools/.gitignore  |  5 +++
 src/intel/tools/aubinator.c | 97 +
 src/intel/tools/decoder.c   | 82 --
 src/intel/tools/decoder.h   |  4 +-
 7 files changed, 141 insertions(+), 91 deletions(-)
 create mode 100644 src/intel/Makefile.aubinator.am

diff --git a/src/intel/Makefile.am b/src/intel/Makefile.am
index 9186b5c..c3cb9fb 100644
--- a/src/intel/Makefile.am
+++ b/src/intel/Makefile.am
@@ -52,6 +52,7 @@ BUILT_SOURCES =
 CLEANFILES =
 EXTRA_DIST =
 
+include Makefile.aubinator.am
 include Makefile.blorp.am
 include Makefile.common.am
 include Makefile.genxml.am
diff --git a/src/intel/Makefile.aubinator.am b/src/intel/Makefile.aubinator.am
new file mode 100644
index 000..9772700
--- /dev/null
+++ b/src/intel/Makefile.aubinator.am
@@ -0,0 +1,36 @@
+# Copyright © 2016 Intel Corporation
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice (including the next
+# paragraph) shall be included in all copies or substantial portions of the
+# Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+# IN THE SOFTWARE.
+
+BUILT_SOURCES += $(AUBINATOR_GENERATED_FILES)
+
+SUFFIXES = _aubinator_xml.h .xml
+
+tools/gen6_aubinator_xml.h: genxml/gen6.xml
+tools/gen7_aubinator_xml.h: genxml/gen7.xml
+tools/gen75_aubinator_xml.h: genxml/gen75.xml
+tools/gen8_aubinator_xml.h: genxml/gen8.xml
+tools/gen9_aubinator_xml.h: genxml/gen9.xml
+
+$(AUBINATOR_GENERATED_FILES): Makefile
+
+%_aubinator_xml.h:
+   $(MKDIR_GEN)
+   $(AM_V_GEN) xxd -i $< > $@
diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 94073d2..a5c2bf0 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -1,3 +1,10 @@
+AUBINATOR_GENERATED_FILES = \
+   tools/gen6_aubinator_xml.h \
+   tools/gen7_aubinator_xml.h \
+   tools/gen75_aubinator_xml.h \
+   tools/gen8_aubinator_xml.h \
+   tools/gen9_aubinator_xml.h
+
 BLORP_FILES = \
blorp/blorp.c \
blorp/blorp.h \
diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore
index 0c80a6f..c4eebde 100644
--- a/src/intel/tools/.gitignore
+++ b/src/intel/tools/.gitignore
@@ -1 +1,6 @@
 /aubinator
+gen6_aubinator_xml.h
+gen75_aubinator_xml.h
+gen7_aubinator_xml.h
+gen8_aubinator_xml.h
+gen9_aubinator_xml.h
diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index a31dcb2..83328b5 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -35,6 +35,8 @@
 #include 
 #include 
 
+#include "util/macros.h"
+
 #include "decoder.h"
 #include "intel_aub.h"
 #include "gen_disasm.h"
@@ -1059,11 +1061,24 @@ int main(int argc, char *argv[])
 {
struct gen_spec *spec;
struct aub_file *file;
-   int i, pci_id = 0;
+   int i;
bool found_arg_gen = false, pager = true;
-   int gen_major, gen_minor;
-   const char *value;
-   char gen_file[256], gen_val[24];
+   const char *value, *input_file = NULL;
+   char gen_val[24];
+   const struct {
+  const char *name;
+  int pci_id;
+   } gens[] = {
+  { "ivb", 0x0166 }, /* Intel(R) Ivybridge Mobile GT2 */
+  { "hsw", 0x0416 }, /* Intel(R) Haswell Mobile GT2 */
+  { "byt", 0x0155 }, /* Intel(R) Bay Trail */
+  { "bdw", 0x1616 }, /* Intel(R) HD Graphics 5500 (Broadwell GT2) */
+  { "chv", 0x22B3 }, /* Intel(R) HD Graphics (Cherryview) */
+  { "skl", 0x1912 }, /* Intel(R) HD Graphics 530 (Skylake GT2) */
+  { "kbl", 0x591D }, /* Intel(R) Kabylake GT2 */
+  { "bxt", 0x0A84 }  /* Intel(R) HD Graphics (Broxton) */
+   }, *gen = NULL;
+   struct gen_device_info devinfo;
 
if (argc == 1) {
   print_help(argv[0], stderr);
@@ -1081,8 +1096,6 @@ int main(int argc, char *argv[])
 exit(EXIT_FAILURE);
  }

[Mesa-dev] [PATCH 0/2] Enable aubinator to decode a running application

2016-10-04 Thread Lionel Landwerlin
Hi,

Discussing with Kristian about ksim the other week, it came up that it would
be interesting to be able to look at an application's output while it's
running. The end goal being that we could remove some hand written code from
the driver (brw_state_dump.c) and have more complete output.

This series enables aubinator to decode its standard input as it would
a normal aubdump file.
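
The general idea of decoding from a stream can be sketched as follows: an item is only consumed once all of its bytes have arrived, otherwise the decoder asks for more data and leaves the cursor alone. The record layout below (one length dword followed by that many payload dwords) is made up for illustration and is not the AUB format; the names are hypothetical too.

```c
#include <stddef.h>
#include <stdint.h>

enum decode_status {
   DECODE_OK,
   DECODE_NEED_MORE_DATA,
};

struct cursor {
   const uint32_t *cur; /* next undecoded dword */
   const uint32_t *end; /* one past the last dword received so far */
};

/* Decode one record if it is complete, adding its payload dwords into
 * *payload_sum. On DECODE_NEED_MORE_DATA the cursor is left untouched so
 * the caller can append more input and retry. */
static enum decode_status
decode_one(struct cursor *c, uint32_t *payload_sum)
{
   if (c->end - c->cur < 1)
      return DECODE_NEED_MORE_DATA;

   uint32_t n = c->cur[0];
   if ((size_t)(c->end - c->cur - 1) < n)
      return DECODE_NEED_MORE_DATA;

   for (uint32_t i = 0; i < n; i++)
      *payload_sum += c->cur[1 + i];

   c->cur += 1 + n;
   return DECODE_OK;
}
```

A reader loop would call decode_one() until it asks for more data, then read another chunk from stdin, extend end, and try again.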

This change requires a slight modification to intel_aubdump (so it sets up
the communication between the running application and aubinator) :

https://patchwork.freedesktop.org/patch/113618/

Looking forward to comments from Ben and Kenneth who seem to rely a fair bit
on brw_state_dump.

Cheers,

Lionel Landwerlin (2):
  intel: aubinator: generate a standalone binary
  intel: aubinator: enable loading dumps from standard input

 src/intel/Makefile.am   |   1 +
 src/intel/Makefile.aubinator.am |  36 ++
 src/intel/Makefile.sources  |   7 ++
 src/intel/tools/.gitignore  |   5 +
 src/intel/tools/aubinator.c | 253 ++--
 src/intel/tools/decoder.c   |  82 -
 src/intel/tools/decoder.h   |   4 +-
 7 files changed, 268 insertions(+), 120 deletions(-)
 create mode 100644 src/intel/Makefile.aubinator.am

--
2.9.3
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] intel: aubinator: enable loading dumps from standard input

2016-10-04 Thread Lionel Landwerlin
In conjunction with an intel_aubdump change, you can now look at your
application's output like this:

$ intel_aubdump -c '/path/to/aubinator --gen=hsw' my_gl_app

Signed-off-by: Lionel Landwerlin 
Cc: Sirisha Gandikota 
Cc: Kristian Høgsberg 
---
 src/intel/tools/aubinator.c | 162 +++-
 1 file changed, 130 insertions(+), 32 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 83328b5..73e6012 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -834,48 +834,51 @@ handle_trace_block(struct gen_spec *spec, uint32_t *p)
 }
 
 struct aub_file {
-   char *filename;
-   int fd;
-   struct stat sb;
+   FILE *stream;
+
uint32_t *map, *end, *cursor;
+   uint32_t *mem_end;
 };
 
 static struct aub_file *
 aub_file_open(const char *filename)
 {
struct aub_file *file;
+   struct stat sb;
+   int fd;
 
-   file = malloc(sizeof *file);
-   file->filename = strdup(filename);
-   file->fd = open(file->filename, O_RDONLY);
-   if (file->fd == -1) {
-  fprintf(stderr, "open %s failed: %s", file->filename, strerror(errno));
+   file = calloc(1, sizeof *file);
+   fd = open(filename, O_RDONLY);
+   if (fd == -1) {
+  fprintf(stderr, "open %s failed: %s", filename, strerror(errno));
   exit(EXIT_FAILURE);
}
 
-   if (fstat(file->fd, &file->sb) == -1) {
+   if (fstat(fd, &sb) == -1) {
   fprintf(stderr, "stat failed: %s", strerror(errno));
   exit(EXIT_FAILURE);
}
 
-   file->map = mmap(NULL, file->sb.st_size,
-PROT_READ, MAP_SHARED, file->fd, 0);
+   file->map = mmap(NULL, sb.st_size,
+PROT_READ, MAP_SHARED, fd, 0);
if (file->map == MAP_FAILED) {
   fprintf(stderr, "mmap failed: %s", strerror(errno));
   exit(EXIT_FAILURE);
}
 
file->cursor = file->map;
-   file->end = file->map + file->sb.st_size / 4;
+   file->end = file->map + sb.st_size / 4;
 
-   /* mmap a terabyte for our gtt space. */
-   gtt_size = 1ul << 40;
-   gtt = mmap(NULL, gtt_size, PROT_READ | PROT_WRITE,
-  MAP_PRIVATE | MAP_ANONYMOUS |  MAP_NORESERVE, -1, 0);
-   if (gtt == MAP_FAILED) {
-  fprintf(stderr, "failed to alloc gtt space: %s", strerror(errno));
-  exit(1);
-   }
+   return file;
+}
+
+static struct aub_file *
+aub_file_stdin(void)
+{
+   struct aub_file *file;
+
+   file = calloc(1, sizeof *file);
+   file->stream = stdin;
 
return file;
 }
@@ -925,12 +928,21 @@ struct {
{ "bxt", MAKE_GEN(9, 0) }
 };
 
-static void
+enum {
+   AUB_ITEM_DECODE_OK,
+   AUB_ITEM_DECODE_FAILED,
+   AUB_ITEM_DECODE_NEED_MORE_DATA,
+};
+
+static int
 aub_file_decode_batch(struct aub_file *file, struct gen_spec *spec)
 {
-   uint32_t *p, h, device, data_type;
+   uint32_t *p, h, device, data_type, *new_cursor;
int header_length, payload_size, bias;
 
+   if (file->end - file->cursor < 12)
+  return AUB_ITEM_DECODE_NEED_MORE_DATA;
+
p = file->cursor;
h = *p;
header_length = h & 0x;
@@ -946,8 +958,7 @@ aub_file_decode_batch(struct aub_file *file, struct gen_spec *spec)
   printf("unknown opcode %d at %ld/%ld\n",
  OPCODE(h), file->cursor - file->map,
  file->end - file->map);
-  file->cursor = file->end;
-  return;
+  return AUB_ITEM_DECODE_FAILED;
}
 
payload_size = 0;
@@ -959,9 +970,22 @@ aub_file_decode_batch(struct aub_file *file, struct gen_spec *spec)
   payload_size = p[4];
   handle_trace_block(spec, p);
   break;
-   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_BMP):
+   default:
   break;
+   }
+
+   new_cursor = p + header_length + bias + payload_size / 4;
+   if (new_cursor > file->end)
+  return AUB_ITEM_DECODE_NEED_MORE_DATA;
 
+   switch (h & 0x) {
+   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_HEADER):
+  break;
+   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_BLOCK):
+  handle_trace_block(spec, p);
+  break;
+   case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_BMP):
+  break;
case MAKE_HEADER(TYPE_AUB, OPCODE_NEW_AUB, SUBOPCODE_VERSION):
   printf("version block: dw1 %08x\n", p[1]);
   device = (p[1] >> 8) & 0xff;
@@ -987,13 +1011,65 @@ aub_file_decode_batch(struct aub_file *file, struct gen_spec *spec)
  "subopcode=0x%x (%08x)\n", TYPE(h), OPCODE(h), SUBOPCODE(h), h);
   break;
}
-   file->cursor = p + header_length + bias + payload_size / 4;
+   file->cursor = new_cursor;
+
+   return AUB_ITEM_DECODE_OK;
 }
 
 static int
 aub_file_more_stuff(struct aub_file *file)
 {
-   return file->cursor < file->end;
+   return file->cursor < file->end || (file->stream && !feof(file->stream));
+}
+
+#define AUB_READ_BUFFER_SIZE (4096)
+#define MAX(a, b) ((a) < (b) ? (b) : (a))
+
+static void
+aub_file_data_grow(struct aub_file *file)
+{
+   size_t old_size = (file->mem_end - file->map) * 4;
+   size_t new_size = 

Re: [Mesa-dev] [PATCH v2] util/slab: re-design to allow migration between pools (v2)

2016-10-04 Thread Marek Olšák
On Tue, Oct 4, 2016 at 4:30 PM, Nicolai Hähnle  wrote:
> On 30.09.2016 15:47, Marek Olšák wrote:
>>
>> On Fri, Sep 30, 2016 at 3:08 PM, Bas Nieuwenhuizen
>>  wrote:
>>>
>>> On Fri, Sep 30, 2016 at 2:13 PM, Marek Olšák  wrote:

>>>> intptr_t reads and writes aren't atomic. p_atomic_set and
>>>> p_atomic_read functions don't do anything for atomicity. See:
>>>>
>>>> #define p_atomic_set(_v, _i) (*(_v) = (_i))
>>>> #define p_atomic_read(_v) (*(_v))
>>>
>>>
>>> That implementation seems bogus to me, as the compiler sees none of
>>> them as atomic and therefore the compiler can do strange stuff.
>>>
>>> why are intptr_t reads/writes less atomic than int32_t? IIRC on x86_64
>>> aligned 64-bit accesses are atomic, and on x86 intptr_t is just 32
>>> bits.
>
>
> I looked into a number of options for p_atomic_set/read and just sent around
> a patch which (to my understanding) definitely guarantees sufficient
> atomicity and memory ordering on GCC >= 4.7.
>
> Without that patch (and so I suspect also on GCC < 4.7), the memory accesses
> are still de facto atomic as Bas wrote. Furthermore, most of the necessary
> ordering guarantees are established by the calls to mtx_lock/unlock
> functions.
>
> Without that patch (and also with that patch but with GCC < 4.7), there is
> actually still the possibility that the write of page->u.num_remaining in
> slab_destroy_child is moved to after the loop over the page's elements. GCC
> doesn't actually do it in practice, but it is a gap which we probably have
> to live with unless we introduce some ugly workarounds.
>
> Note that this is only ever a problem in the situation where an allocation
> is freed with a different child pool than the one it was allocated from. In
> other words, the new code is (as far as I can see) only buggy in the case
> where the old code was even buggier.
>
> For what it's worth, I'm going to use p_atomic_set also for
> page->u.num_remaining. This is not strictly needed, since the
> acquire/release on elt->owner already establishes the necessary ordering,
> but it should help clarity.
>
> Do you agree with this plan?

Yes, it sounds good to me.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] util/slab: re-design to allow migration between pools (v2)

2016-10-04 Thread Nicolai Hähnle

On 30.09.2016 15:47, Marek Olšák wrote:

On Fri, Sep 30, 2016 at 3:08 PM, Bas Nieuwenhuizen
 wrote:

On Fri, Sep 30, 2016 at 2:13 PM, Marek Olšák  wrote:

intptr_t reads and writes aren't atomic. p_atomic_set and
p_atomic_read functions don't do anything for atomicity. See:

#define p_atomic_set(_v, _i) (*(_v) = (_i))
#define p_atomic_read(_v) (*(_v))


That implementation seems bogus to me, as the compiler sees none of
them as atomic and therefore the compiler can do strange stuff.

why are intptr_t reads/writes less atomic than int32_t? IIRC on x86_64
aligned 64-bit accesses are atomic, and on x86 intptr_t is just 32
bits.


I looked into a number of options for p_atomic_set/read and just sent 
around a patch which (to my understanding) definitely guarantees 
sufficient atomicity and memory ordering on GCC >= 4.7.
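
As a rough illustration of the kind of guarantee meant here, the following sketch uses the GCC __atomic builtins directly with release/acquire ordering. The struct and function names are hypothetical and this is not the slab code; it only shows the publish/consume pattern.

```c
#include <stddef.h>

struct message {
   int payload; /* plain data, written before the flag is published */
   int ready;   /* flag, only accessed through the atomic builtins */
};

/* Writer side: the release store keeps the payload write from being
 * reordered after the flag becomes visible. */
static void
publish(struct message *m, int value)
{
   m->payload = value;
   __atomic_store_n(&m->ready, 1, __ATOMIC_RELEASE);
}

/* Reader side: the acquire load keeps the payload read from being
 * hoisted above the flag check. Returns 1 and fills *out once the
 * message has been published, 0 otherwise. */
static int
try_consume(struct message *m, int *out)
{
   if (!__atomic_load_n(&m->ready, __ATOMIC_ACQUIRE))
      return 0;
   *out = m->payload;
   return 1;
}
```

With plain loads and stores in place of the builtins, the compiler would be free to reorder the payload access around the flag access, which is the gap being discussed.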


Without that patch (and so I suspect also on GCC < 4.7), the memory 
accesses are still de facto atomic as Bas wrote. Furthermore, most of 
the necessary ordering guarantees are established by the calls to 
mtx_lock/unlock functions.


Without that patch (and also with that patch but with GCC < 4.7), there 
is actually still the possibility that the write of 
page->u.num_remaining in slab_destroy_child is moved to after the loop 
over the page's elements. GCC doesn't actually do it in practice, but it 
is a gap which we probably have to live with unless we introduce some 
ugly workarounds.


Note that this is only ever a problem in the situation where an 
allocation is freed with a different child pool than the one it was 
allocated from. In other words, the new code is (as far as I can see) 
only buggy in the case where the old code was even buggier.


For what it's worth, I'm going to use p_atomic_set also for 
page->u.num_remaining. This is not strictly needed, since the 
acquire/release on elt->owner already establishes the necessary ordering, 
but it should help clarity.


Do you agree with this plan?

Thanks,
Nicolai



Really? Thanks, I didn't know that.

Marek


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] util: use GCC atomic intrinsics with explicit memory model

2016-10-04 Thread Marek Olšák
Acked-by: Marek Olšák 

Somebody else should review the configure.ac change.

Marek

On Tue, Oct 4, 2016 at 4:14 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> This is motivated by the fact that p_atomic_read and p_atomic_set may
> somewhat surprisingly not do the right thing in the old version: while
> stores and loads are de facto atomic at least on x86, the compiler may
> apply re-ordering and speculation quite liberally. Basically, the old
> version uses the "relaxed" memory ordering.
>
> The new ordering always uses acquire/release ordering. This is the
> strongest possible memory ordering that doesn't require additional
> fence instructions on x86. (And the only stronger ordering is
> "sequentially consistent", which is usually more than you need anyway.)
>
> I would feel more comfortable if p_atomic_set/read in the old
> implementation were at least using volatile loads and stores, but I
> don't see a way to get there without typeof (which we cannot use here
> since the code is compiled with -std=c99).
>
> Eventually, we should really just move to something that is based on
> the atomics in C11 / C++11.
> ---
>  configure.ac| 11 +++
>  src/util/u_atomic.h | 21 +
>  2 files changed, 32 insertions(+)
>
> diff --git a/configure.ac b/configure.ac
> index 1bfac3b..421f4f3 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -380,20 +380,31 @@ int main () {
>  c = _mm_max_epu32(a, b);
>  return _mm_cvtsi128_si32(c);
>  }]])], SSE41_SUPPORTED=1)
>  CFLAGS="$save_CFLAGS"
>  if test "x$SSE41_SUPPORTED" = x1; then
>  DEFINES="$DEFINES -DUSE_SSE41"
>  fi
>  AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
>  AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
>
> +dnl Check for new-style atomic builtins
> +AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
> +int main() {
> +int n;
> +return __atomic_load_n(&n, __ATOMIC_ACQUIRE);
> +}]])], GCC_ATOMIC_BUILTINS_SUPPORTED=1)
> +if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
> +DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
> +fi
> +AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])
> +
>  dnl Check for Endianness
>  AC_C_BIGENDIAN(
> little_endian=no,
> little_endian=yes,
> little_endian=no,
> little_endian=no
>  )
>
>  dnl Check for POWER8 Architecture
>  PWR8_CFLAGS="-mpower8-vector"
> diff --git a/src/util/u_atomic.h b/src/util/u_atomic.h
> index 8675903..2a5bbae 100644
> --- a/src/util/u_atomic.h
> +++ b/src/util/u_atomic.h
> @@ -29,28 +29,49 @@
>  #error "Unsupported platform"
>  #endif
>
>
>  /* Implementation using GCC-provided synchronization intrinsics
>   */
>  #if defined(PIPE_ATOMIC_GCC_INTRINSIC)
>
>  #define PIPE_ATOMIC "GCC Sync Intrinsics"
>
> +#if defined(USE_GCC_ATOMIC_BUILTINS)
> +
> +/* The builtins with explicit memory model are available since GCC 4.7. */
> +#define p_atomic_set(_v, _i) __atomic_store_n((_v), (_i), __ATOMIC_RELEASE)
> +#define p_atomic_read(_v) __atomic_load_n((_v), __ATOMIC_ACQUIRE)
> +#define p_atomic_dec_zero(v) (__atomic_sub_fetch((v), 1, __ATOMIC_ACQ_REL) == 0)
> +#define p_atomic_inc(v) (void) __atomic_add_fetch((v), 1, __ATOMIC_ACQ_REL)
> +#define p_atomic_dec(v) (void) __atomic_sub_fetch((v), 1, __ATOMIC_ACQ_REL)
> +#define p_atomic_add(v, i) (void) __atomic_add_fetch((v), (i), __ATOMIC_ACQ_REL)
> +#define p_atomic_inc_return(v) __atomic_add_fetch((v), 1, __ATOMIC_ACQ_REL)
> +#define p_atomic_dec_return(v) __atomic_sub_fetch((v), 1, __ATOMIC_ACQ_REL)
> +
> +#else
> +
>  #define p_atomic_set(_v, _i) (*(_v) = (_i))
>  #define p_atomic_read(_v) (*(_v))
>  #define p_atomic_dec_zero(v) (__sync_sub_and_fetch((v), 1) == 0)
>  #define p_atomic_inc(v) (void) __sync_add_and_fetch((v), 1)
>  #define p_atomic_dec(v) (void) __sync_sub_and_fetch((v), 1)
>  #define p_atomic_add(v, i) (void) __sync_add_and_fetch((v), (i))
>  #define p_atomic_inc_return(v) __sync_add_and_fetch((v), 1)
>  #define p_atomic_dec_return(v) __sync_sub_and_fetch((v), 1)
> +
> +#endif
> +
> +/* There is no __atomic_* compare and exchange that returns the current value.
> + * Also, GCC 5.4 seems unable to optimize a compound statement expression that
> + * uses an additional stack variable with __atomic_compare_exchange[_n].
> + */
>  #define p_atomic_cmpxchg(v, old, _new) \
> __sync_val_compare_and_swap((v), (old), (_new))
>
>  #endif
>
>
>
>  /* Unlocked version for single threaded environments, such as some
>   * windows kernel modules.
>   */
> --
> 2.7.4
>


Re: [Mesa-dev] [PATCH] autoconf: Make header install distinct for various APIs

2016-10-04 Thread Chuck Atkins
>
> > +eglinterop_HEADERS = $(top_srcdir)/include/GL/mesa_glinterop.h
> IIRC Marek was pretty clear that this header should not be installed.
> Then again looking at our current wildcard installing ... seems like
> it was.
>
> Please drop this file from the install stage ?
>

Dropped.


> > +if HAVE_COMMON_OSMESA
> > +osmesadir = $(includedir)/GL
> > +osmesa_HEADERS = $(top_srcdir)/include/GL/osmesa.h
> > +endif
> > +
> Why do we have this hunk, considering each target is handled explicitly ?
>
> IMHO we should drop either this or the similar ones in
> src/{mesa,gallium}/Makefile.am. The latter might be better ?
>

The idea was to separate the interface install from the implementation
build in the same way that it's done for GLX.  Mostly to reduce code
duplication.  At the top level the interface headers are installed then at
the deeper nested level the chosen implementation is built.  It does,
however, look like I forgot to drop the header install from the
implementations.  Will do.


[Mesa-dev] [PATCH] util: use GCC atomic intrinsics with explicit memory model

2016-10-04 Thread Nicolai Hähnle
From: Nicolai Hähnle 

This is motivated by the fact that p_atomic_read and p_atomic_set may
somewhat surprisingly not do the right thing in the old version: while
stores and loads are de facto atomic at least on x86, the compiler may
apply re-ordering and speculation quite liberally. Basically, the old
version uses the "relaxed" memory ordering.

The new version always uses acquire/release ordering. This is the
strongest possible memory ordering that doesn't require additional
fence instructions on x86. (And the only stronger ordering is
"sequentially consistent", which is usually more than you need anyway.)

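To make the acquire/release pairing concrete, here is a minimal sketch
of the usual publish/consume pattern written directly against the
builtins. The names payload, ready, publish and try_consume are made up
for illustration and are not part of the patch; in real use the two
functions would run on different threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; none of this is in the patch. */
static int payload;  /* plain data, written before the flag is set */
static int ready;    /* flag, accessed only through the builtins */

/* Producer side: write the data, then publish it with a release store. */
static void publish(int value)
{
   payload = value;
   __atomic_store_n(&ready, 1, __ATOMIC_RELEASE);
}

/* Consumer side: returns true and fills *out once the flag is visible. */
static bool try_consume(int *out)
{
   assert(out != NULL);
   if (!__atomic_load_n(&ready, __ATOMIC_ACQUIRE))
      return false;
   *out = payload;
   return true;
}
```

The release store keeps the write to payload from being moved after the
flag update, and the acquire load keeps the read of payload from being
moved before the flag check.
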
I would feel more comfortable if p_atomic_set/read in the old
implementation were at least using volatile loads and stores, but I
don't see a way to get there without typeof (which we cannot use here
since the code is compiled with -std=c99).
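
For reference, a sketch of what the volatile variant could look like if
the GNU spelling __typeof__ were acceptable. GCC accepts that alternate
keyword even with -std=c99, but whether every compiler that takes the
PIPE_ATOMIC_GCC_INTRINSIC path does is an assumption here. The
_volatile macro names and the helper are hypothetical.

```c
#include <assert.h>

/* Hypothetical names; __typeof__ is the GNU alternate keyword for typeof. */
#define p_atomic_read_volatile(_v) \
   (*(volatile __typeof__(*(_v)) *)(_v))
#define p_atomic_set_volatile(_v, _i) \
   (*(volatile __typeof__(*(_v)) *)(_v) = (_i))

/* Small usage example on a plain int: store, then read back. */
static int volatile_roundtrip(int value)
{
   int x = 0;
   p_atomic_set_volatile(&x, value);
   assert(p_atomic_read_volatile(&x) == value);
   return p_atomic_read_volatile(&x);
}
```

Note that volatile only stops the compiler from caching or eliding the
access itself; it does not order it against surrounding non-volatile
accesses, so this would still be weaker than the acquire/release
version.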

Eventually, we should really just move to something that is based on
the atomics in C11 / C++11.
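
As a rough idea of what a C11-based version might look like, here is a
sketch using <stdatomic.h> with the same acquire/release choices. The
c11_ names are hypothetical and int-only for brevity; this is not a
proposal for the actual interface.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical C11 counterparts; the c11_ names are not Mesa API. */
static inline void c11_atomic_set(atomic_int *v, int i)
{
   atomic_store_explicit(v, i, memory_order_release);
}

static inline int c11_atomic_read(atomic_int *v)
{
   return atomic_load_explicit(v, memory_order_acquire);
}

static inline int c11_atomic_inc_return(atomic_int *v)
{
   /* atomic_fetch_add returns the old value, so add 1 for the new one. */
   return atomic_fetch_add_explicit(v, 1, memory_order_acq_rel) + 1;
}

static inline bool c11_atomic_dec_zero(atomic_int *v)
{
   int old = atomic_fetch_sub_explicit(v, 1, memory_order_acq_rel);
   assert(old > 0);  /* a reference count should never underflow */
   return old == 1;
}
```

Unlike the __atomic_* builtins, the C11 functions require the object to
be declared with an atomic type, so call sites would have to change as
well.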
---
 configure.ac        | 11 +++++++++++
 src/util/u_atomic.h | 21 +++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/configure.ac b/configure.ac
index 1bfac3b..421f4f3 100644
--- a/configure.ac
+++ b/configure.ac
@@ -380,20 +380,31 @@ int main () {
 c = _mm_max_epu32(a, b);
 return _mm_cvtsi128_si32(c);
 }]])], SSE41_SUPPORTED=1)
 CFLAGS="$save_CFLAGS"
 if test "x$SSE41_SUPPORTED" = x1; then
 DEFINES="$DEFINES -DUSE_SSE41"
 fi
 AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
 AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
 
+dnl Check for new-style atomic builtins
+AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
+int main() {
+int n;
+return __atomic_load_n(&n, __ATOMIC_ACQUIRE);
+}]])], GCC_ATOMIC_BUILTINS_SUPPORTED=1)
+if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
+DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
+fi
+AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])
+
 dnl Check for Endianness
 AC_C_BIGENDIAN(
little_endian=no,
little_endian=yes,
little_endian=no,
little_endian=no
 )
 
 dnl Check for POWER8 Architecture
 PWR8_CFLAGS="-mpower8-vector"
diff --git a/src/util/u_atomic.h b/src/util/u_atomic.h
index 8675903..2a5bbae 100644
--- a/src/util/u_atomic.h
+++ b/src/util/u_atomic.h
@@ -29,28 +29,49 @@
 #error "Unsupported platform"
 #endif
 
 
 /* Implementation using GCC-provided synchronization intrinsics
  */
 #if defined(PIPE_ATOMIC_GCC_INTRINSIC)
 
 #define PIPE_ATOMIC "GCC Sync Intrinsics"
 
+#if defined(USE_GCC_ATOMIC_BUILTINS)
+
+/* The builtins with explicit memory model are available since GCC 4.7. */
+#define p_atomic_set(_v, _i) __atomic_store_n((_v), (_i), __ATOMIC_RELEASE)
+#define p_atomic_read(_v) __atomic_load_n((_v), __ATOMIC_ACQUIRE)
+#define p_atomic_dec_zero(v) (__atomic_sub_fetch((v), 1, __ATOMIC_ACQ_REL) == 0)
+#define p_atomic_inc(v) (void) __atomic_add_fetch((v), 1, __ATOMIC_ACQ_REL)
+#define p_atomic_dec(v) (void) __atomic_sub_fetch((v), 1, __ATOMIC_ACQ_REL)
+#define p_atomic_add(v, i) (void) __atomic_add_fetch((v), (i), __ATOMIC_ACQ_REL)
+#define p_atomic_inc_return(v) __atomic_add_fetch((v), 1, __ATOMIC_ACQ_REL)
+#define p_atomic_dec_return(v) __atomic_sub_fetch((v), 1, __ATOMIC_ACQ_REL)
+
+#else
+
 #define p_atomic_set(_v, _i) (*(_v) = (_i))
 #define p_atomic_read(_v) (*(_v))
 #define p_atomic_dec_zero(v) (__sync_sub_and_fetch((v), 1) == 0)
 #define p_atomic_inc(v) (void) __sync_add_and_fetch((v), 1)
 #define p_atomic_dec(v) (void) __sync_sub_and_fetch((v), 1)
 #define p_atomic_add(v, i) (void) __sync_add_and_fetch((v), (i))
 #define p_atomic_inc_return(v) __sync_add_and_fetch((v), 1)
 #define p_atomic_dec_return(v) __sync_sub_and_fetch((v), 1)
+
+#endif
+
+/* There is no __atomic_* compare and exchange that returns the current value.
+ * Also, GCC 5.4 seems unable to optimize a compound statement expression that
+ * uses an additional stack variable with __atomic_compare_exchange[_n].
+ */
 #define p_atomic_cmpxchg(v, old, _new) \
__sync_val_compare_and_swap((v), (old), (_new))
 
 #endif
 
 
 
 /* Unlocked version for single threaded environments, such as some
  * windows kernel modules.
  */
-- 
2.7.4
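
The comment on p_atomic_cmpxchg in the patch above refers to a wrapper
of roughly the following shape. This is only a sketch: cmpxchg_int is a
hypothetical, int-only helper written as an inline function rather than
a statement expression, and it is not what the patch uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper; returns the value *v held before the operation. */
static inline int cmpxchg_int(int *v, int old, int desired)
{
   int expected = old;  /* the extra stack variable the comment mentions */

   /* On failure the builtin stores the current value of *v in 'expected'.
    * On success 'expected' is left equal to 'old', which was the current
    * value, so either way 'expected' ends up holding the previous value. */
   __atomic_compare_exchange_n(v, &expected, desired, false,
                               __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
   return expected;
}

/* Usage example: one successful and one failed exchange. */
static void cmpxchg_example(void)
{
   int x = 5;
   assert(cmpxchg_int(&x, 5, 9) == 5 && x == 9);  /* matched, swapped */
   assert(cmpxchg_int(&x, 5, 1) == 9 && x == 9);  /* mismatch, unchanged */
}
```

This gives the same return convention as __sync_val_compare_and_swap,
which is why the patch can keep using the older builtin for
p_atomic_cmpxchg.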
