Re: [Mesa-dev] [PATCH v4 0/6] nouveau: add support for vaapi

2015-12-04 Thread Julien Isorce
Hi Ilia,

Please give another try with these patches (i.e. 1, 2, 3 and 5 from this
patch series. 4 and 6 are already merged since I have resubmit them
afterward and split in several series). I rebased these four patches on top
of latest master here:
https://github.com/CapOM/mesa/commits/st_va_nouveau_chunk_decoding  (there
was no conflict during rebase)

h264 is now working properly :), problem was not from these patches but
from st/va. Fixes been merged recently (see the 2 latest patches in st/va)

Let me know if you see anything wrong when playing video compared to vdpau.

Cheers
Julien


On 10 November 2015 at 09:28, Ilia Mirkin  wrote:

> On Tue, Nov 10, 2015 at 4:09 AM, Julien Isorce 
> wrote:
> > Hi,
> >
> > I found some difference in the content of dec->bsp_bo[i], for h264 when
> > using st/vdpau (ok) and st/vaapi (ko).
> >
> > In src/gallium/state_trackers/va/picture.c, at least the reference frames
> > are not set. At minimum it is missing something like the following though
> > still not enough:
> >
> > @@ -193,7 +194,23 @@ handlePictureParameterBuffer(vlVaDriver *drv,
> > vlVaContext *context, vlVaBuffer *
> >context->desc.h264.pps->redundant_pic_cnt_present_flag =
> >   h264->pic_fields.bits.redundant_pic_cnt_present_flag;
> >/*reference_pic_flag*/
> > +  context->desc.h264.is_reference =
> > h264->pic_fields.bits.reference_pic_flag;
> >context->desc.h264.frame_num = h264->frame_num;
> > +
> > +  for (i = 0; i < 16; ++i) {
> > + if ((h264->ReferenceFrames[i].flags & VA_PICTURE_H264_INVALID)
> ||
> > + (h264->ReferenceFrames[i].picture_id ==
> VA_INVALID_SURFACE))
> > +break;
> > +
> > + getReferenceFrame(drv, h264->ReferenceFrames[i].picture_id,
> > >desc.h264.ref[i]);
> > +
> > + context->desc.h264.field_order_cnt_list[i][0] =
> > h264->ReferenceFrames[i].TopFieldOrderCnt;
> > + context->desc.h264.field_order_cnt_list[i][1] =
> > h264->ReferenceFrames[i].BottomFieldOrderCnt;
> > + context->desc.h264.frame_num_list[i] =
> > h264->ReferenceFrames[i].frame_idx;
> > +  }
> >
> > I am surprised that "getReferenceFrame" is not called at all in the h264
> > case.
>
> Reference frames are a huge part of h264 (all the MPEG codecs in
> fact), so I'm guessing there's some different way we're supposed to be
> retrieving the information. Perhaps look at where uvd gets it from?
>
>   -ilia
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 0/1] Do not require all components to apply opt_vector_float()

2015-12-04 Thread Juan A. Suarez Romero
On Thu, 2015-12-03 at 11:04 -0800, Matt Turner wrote:
> The GAINED: 3 is almost certainly because of a sporadic failure in
> shader-db (or Mesa...?) during the baseline shader-db run. When you
> ran shader-db with your patch applied, the same failure didn't occur,
> so report.py thinks this means 3 programs were gained. When I see
> things like that, I manually rerun that specific shader and append
> its
> results to the appropriate file and rerun report.py.
> 

Thanks for the info. I'll take in account it next time.

> 
> I sent it ("[PATCH] i965: Don't mark dead instructions' sources
> live.") Nov 25 (and Cc'd you). Ken reviewed it, and I pushed it two
> days ago as commit 48b4e88.


Oh, right. I saw it now. Thanks!


J.A.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92850] Segfault loading War Thunder

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92850

--- Comment #45 from har...@gmx.de ---
Created attachment 120336
  --> https://bugs.freedesktop.org/attachment.cgi?id=120336=edit
backtrace with activated -finline-small-functions

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92850] Segfault loading War Thunder

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92850

--- Comment #46 from har...@gmx.de ---
here is the code from ir.h where it fails:


   /**
* Determine the number of operands used by an expression
*/
   static unsigned int get_num_operands(ir_expression_operation);

   /**
* Determine the number of operands used by an expression
*/
   unsigned int get_num_operands() const
   {
  return (this->operation == ir_quadop_vector)
 ? this->type->vector_elements : get_num_operands(operation);
   }


Is it strange enough to fail if compiled inline in this code below?


   for (operand = 0; operand < ir->get_num_operands(); operand++) {
  this->result.file = PROGRAM_UNDEFINED;
  ir->operands[operand]->accept(this);
  if (this->result.file == PROGRAM_UNDEFINED) {
 printf("Failed to get tree for expression operand:\n");
 ir->operands[operand]->print();
 printf("\n");
 exit(1);
  }
  op[operand] = this->result;

  /* Matrix expression operands should have been broken down to vector
   * operations already.
   */
  assert(!ir->operands[operand]->type->is_matrix());
   }

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92850] Segfault loading War Thunder

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92850

--- Comment #44 from har...@gmx.de ---
(In reply to bellamorte42 from comment #40)
> compiling with -O3 -fno-inline-small-functions works.

yep, i can confirm that '-finline-small-functions' is the black sheep.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC 2/3] gallium: Move nv50 clear_texture impl down to util_surface

2015-12-04 Thread Marek Olšák
On Fri, Dec 4, 2015 at 1:37 AM, Ilia Mirkin  wrote:
> On Thu, Dec 3, 2015 at 7:25 PM, Roland Scheidegger  wrote:
>> Am 03.12.2015 um 23:48 schrieb Ilia Mirkin:
>>> On Thu, Dec 3, 2015 at 4:44 AM, Edward O'Callaghan
>>>  wrote:
 ARB_clear_texture is reasonably generic enough that it should
 be moved down to become part of the fallback mechanism of
 pipe->clear_texture.

 Signed-off-by: Edward O'Callaghan 
 ---
  src/gallium/auxiliary/util/u_surface.c  | 83 
 +
  src/gallium/auxiliary/util/u_surface.h  |  6 ++
  src/gallium/drivers/nouveau/nv50/nv50_surface.c | 67 +---
  3 files changed, 90 insertions(+), 66 deletions(-)

 diff --git a/src/gallium/auxiliary/util/u_surface.c 
 b/src/gallium/auxiliary/util/u_surface.c
 index 6aa44f9..e7ab175 100644
 --- a/src/gallium/auxiliary/util/u_surface.c
 +++ b/src/gallium/auxiliary/util/u_surface.c
 @@ -36,6 +36,7 @@
  #include "pipe/p_screen.h"
  #include "pipe/p_state.h"

 +#include "util/u_math.h"
  #include "util/u_format.h"
  #include "util/u_inlines.h"
  #include "util/u_rect.h"
 @@ -547,6 +548,88 @@ util_clear_depth_stencil(struct pipe_context *pipe,
 }
  }

 +/**
 + * Fallback for pipe->clear_texture() function.
 + * clears a non-PIPE_BUFFER resource's specified level
 + * and bounding box with a clear value provided in that
 + * resource's native format.
 + *
 + * XXX sf->format = .. is problematic as hw need
 + * not nessarily support the format.
 + */
 +void
 +util_surface_clear_texture(struct pipe_context *pipe,
 +   struct pipe_resource *res,
 +   unsigned level,
 +   const struct pipe_box *box,
 +   const void *data)
 +{
 +   struct pipe_surface tmpl = {{0}}, *sf;
 +
 +   tmpl.format = res->format;
 +   tmpl.u.tex.first_layer = box->z;
 +   tmpl.u.tex.last_layer = box->z + box->depth - 1;
 +   tmpl.u.tex.level = level;
 +   sf = pipe->create_surface(pipe, res, );
 +   if (!sf)
 +  return;
 +
 +   if (util_format_is_depth_or_stencil(res->format)) {
 +  float depth = 0;
 +  uint8_t stencil = 0;
 +  unsigned clear = 0;
 +  const struct util_format_description *desc =
 + util_format_description(res->format);
 +
 +  if (util_format_has_depth(desc)) {
 + clear |= PIPE_CLEAR_DEPTH;
 + desc->unpack_z_float(, 0, data, 0, 1, 1);
 +  }
 +  if (util_format_has_stencil(desc)) {
 + clear |= PIPE_CLEAR_STENCIL;
 + desc->unpack_s_8uint(, 0, data, 0, 1, 1);
 +  }
 +  pipe->clear_depth_stencil(pipe, sf, clear, depth, stencil,
 +box->x, box->y, box->width, box->height);
 +   } else {
 +  union pipe_color_union color;
 +
 +  switch (util_format_get_blocksizebits(res->format)) {
 +  case 128:
 + sf->format = PIPE_FORMAT_R32G32B32A32_UINT;
 + memcpy(, data, 128 / 8);
 + break;
 +  case 64:
 + sf->format = PIPE_FORMAT_R32G32_UINT;
 + memcpy(, data, 64 / 8);
 + memset([2], 0, 64 / 8);
 + break;
 +  case 32:
 + sf->format = PIPE_FORMAT_R32_UINT;
 + memcpy(, data, 32 / 8);
 + memset([1], 0, 96 / 8);
 + break;
 +  case 16:
 + sf->format = PIPE_FORMAT_R16_UINT;
 + color.ui[0] = util_cpu_to_le32(
 +util_le16_to_cpu(*(unsigned short *)data));
 + memset([1], 0, 96 / 8);
 + break;
 +  case 8:
 + sf->format = PIPE_FORMAT_R8_UINT;
 + color.ui[0] = util_cpu_to_le32(*(unsigned char *)data);
 + memset([1], 0, 96 / 8);
 + break;
 +  default:
 + assert(!"Unknown texel element size");
 + return;
 +  }
 +
 +  pipe->clear_render_target(pipe, sf, ,
 +box->x, box->y, box->width, box->height);
>>>
>>> I only recently realized this, but I'm fairly sure this is wrong --
>>> needs to be divided by util_format_blockwidth/height, otherwise we'll
>>> go way out of bounds for compressed formats. [And yes,
>>> nv50_clear_texture has the same problem.]
>> You can't clear compressed textures with ARB_clear_texture, so I guess
>> this shouldn't be a problem.
>
> Ah yes, indeed. That's why I didn't get any weird failurs :)
>
>> Even if you think it should work in gallium, with the format illegally
>> being changed after-the-fact (after creating the surface) it's not even
>> 

[Mesa-dev] [PATCH] st/va: WIP remove nonesense HEVC picture id handling

2015-12-04 Thread Christian König
From: Christian König 

The picture id in this case is a VA-API surface handle, checking
for a certain value is can't be correct.

Does anybody have any idea what the original intention was and how
to fix it correctly?

Signed-off-by: Christian König 
---
 src/gallium/state_trackers/va/picture_hevc.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture_hevc.c 
b/src/gallium/state_trackers/va/picture_hevc.c
index dc66b0f..28743ee 100644
--- a/src/gallium/state_trackers/va/picture_hevc.c
+++ b/src/gallium/state_trackers/va/picture_hevc.c
@@ -159,11 +159,6 @@ void vlVaHandlePictureParameterBufferHEVC(vlVaDriver *drv, 
vlVaContext *context,
for (i = 0 ; i < 15 ; i++) {
   context->desc.h265.PicOrderCntVal[i] = 
hevc->ReferenceFrames[i].pic_order_cnt;
 
-  unsigned int index = hevc->ReferenceFrames[i].picture_id & 0x7F;
-
-  if (index == 0x7F)
- continue;
-
   vlVaGetReferenceFrame(drv, hevc->ReferenceFrames[i].picture_id, 
>desc.h265.ref[i]);
 
   if ((hevc->ReferenceFrames[i].flags & 
VA_PICTURE_HEVC_RPS_ST_CURR_BEFORE) && (iBefore < 8)) {
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] gallium/winsys: Make use of ARRAY_SIZE macro

2015-12-04 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Fri, Dec 4, 2015 at 7:50 AM, Edward O'Callaghan
 wrote:
> Signed-off-by: Edward O'Callaghan 
> ---
>  src/gallium/winsys/amdgpu/drm/amdgpu_surface.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c 
> b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
> index 3006bd1..4c837a8 100644
> --- a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
> +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
> @@ -145,11 +145,9 @@ ADDR_HANDLE amdgpu_addr_create(struct amdgpu_winsys *ws)
>
> regValue.backendDisables = ws->amdinfo.backend_disable[0];
> regValue.pTileConfig = ws->amdinfo.gb_tile_mode;
> -   regValue.noOfEntries = sizeof(ws->amdinfo.gb_tile_mode) /
> -  sizeof(ws->amdinfo.gb_tile_mode[0]);
> +   regValue.noOfEntries = ARRAY_SIZE(ws->amdinfo.gb_tile_mode);
> regValue.pMacroTileConfig = ws->amdinfo.gb_macro_tile_mode;
> -   regValue.noOfMacroEntries = sizeof(ws->amdinfo.gb_macro_tile_mode) /
> -   sizeof(ws->amdinfo.gb_macro_tile_mode[0]);
> +   regValue.noOfMacroEntries = ARRAY_SIZE(ws->amdinfo.gb_macro_tile_mode);
>
> createFlags.value = 0;
> createFlags.useTileIndex = 1;
> --
> 2.5.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] st/va: move HEVC functions into separate file v2

2015-12-04 Thread Christian König
From: Christian König 

v2: actually copy all of it

Signed-off-by: Christian König 
---
 src/gallium/state_trackers/va/Makefile.sources |   1 +
 src/gallium/state_trackers/va/picture.c| 172 +---
 src/gallium/state_trackers/va/picture_hevc.c   | 207 +
 src/gallium/state_trackers/va/va_private.h |   3 +
 4 files changed, 215 insertions(+), 168 deletions(-)
 create mode 100644 src/gallium/state_trackers/va/picture_hevc.c

diff --git a/src/gallium/state_trackers/va/Makefile.sources 
b/src/gallium/state_trackers/va/Makefile.sources
index 74cc34e..daebf01 100644
--- a/src/gallium/state_trackers/va/Makefile.sources
+++ b/src/gallium/state_trackers/va/Makefile.sources
@@ -8,6 +8,7 @@ C_SOURCES := \
picture_mpeg12.c \
picture_mpeg4.c \
picture_h264.c \
+   picture_hevc.c \
picture_vc1.c \
postproc.c \
subpicture.c \
diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index c7c377a..8623139 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -92,7 +92,6 @@ vlVaGetReferenceFrame(vlVaDriver *drv, VASurfaceID surface_id,
 static VAStatus
 handlePictureParameterBuffer(vlVaDriver *drv, vlVaContext *context, vlVaBuffer 
*buf)
 {
-   VAPictureParameterBufferHEVC *hevc;
unsigned int i;
VAStatus vaStatus = VA_STATUS_SUCCESS;
 
@@ -114,154 +113,7 @@ handlePictureParameterBuffer(vlVaDriver *drv, vlVaContext 
*context, vlVaBuffer *
   break;
 
   case PIPE_VIDEO_FORMAT_HEVC:
-  assert(buf->size >= sizeof(VAPictureParameterBufferHEVC) && 
buf->num_elements == 1);
-  hevc = buf->data;
-  context->desc.h265.pps->sps->chroma_format_idc = 
hevc->pic_fields.bits.chroma_format_idc;
-  context->desc.h265.pps->sps->separate_colour_plane_flag =
- hevc->pic_fields.bits.separate_colour_plane_flag;
-  context->desc.h265.pps->sps->pic_width_in_luma_samples = 
hevc->pic_width_in_luma_samples;
-  context->desc.h265.pps->sps->pic_height_in_luma_samples = 
hevc->pic_height_in_luma_samples;
-  context->desc.h265.pps->sps->bit_depth_luma_minus8 = 
hevc->bit_depth_luma_minus8;
-  context->desc.h265.pps->sps->bit_depth_chroma_minus8 = 
hevc->bit_depth_chroma_minus8;
-  context->desc.h265.pps->sps->log2_max_pic_order_cnt_lsb_minus4 =
- hevc->log2_max_pic_order_cnt_lsb_minus4;
-  context->desc.h265.pps->sps->sps_max_dec_pic_buffering_minus1 =
- hevc->sps_max_dec_pic_buffering_minus1;
-  context->desc.h265.pps->sps->log2_min_luma_coding_block_size_minus3 =
- hevc->log2_min_luma_coding_block_size_minus3;
-  context->desc.h265.pps->sps->log2_diff_max_min_luma_coding_block_size =
- hevc->log2_diff_max_min_luma_coding_block_size;
-  context->desc.h265.pps->sps->log2_min_transform_block_size_minus2 =
- hevc->log2_min_transform_block_size_minus2;
-  context->desc.h265.pps->sps->log2_diff_max_min_transform_block_size =
- hevc->log2_diff_max_min_transform_block_size;
-  context->desc.h265.pps->sps->max_transform_hierarchy_depth_inter =
- hevc->max_transform_hierarchy_depth_inter;
-  context->desc.h265.pps->sps->max_transform_hierarchy_depth_intra =
- hevc->max_transform_hierarchy_depth_intra;
-  context->desc.h265.pps->sps->scaling_list_enabled_flag =
- hevc->pic_fields.bits.scaling_list_enabled_flag;
-  context->desc.h265.pps->sps->amp_enabled_flag = 
hevc->pic_fields.bits.amp_enabled_flag;
-  context->desc.h265.pps->sps->sample_adaptive_offset_enabled_flag =
- hevc->slice_parsing_fields.bits.sample_adaptive_offset_enabled_flag;
-  context->desc.h265.pps->sps->pcm_enabled_flag = 
hevc->pic_fields.bits.pcm_enabled_flag;
-  if (hevc->pic_fields.bits.pcm_enabled_flag == 1) {
- context->desc.h265.pps->sps->pcm_sample_bit_depth_luma_minus1 =
-hevc->pcm_sample_bit_depth_luma_minus1;
- context->desc.h265.pps->sps->pcm_sample_bit_depth_chroma_minus1 =
-hevc->pcm_sample_bit_depth_chroma_minus1;
- 
context->desc.h265.pps->sps->log2_min_pcm_luma_coding_block_size_minus3 =
-hevc->log2_min_pcm_luma_coding_block_size_minus3;
- 
context->desc.h265.pps->sps->log2_diff_max_min_pcm_luma_coding_block_size =
-hevc->log2_diff_max_min_pcm_luma_coding_block_size;
- context->desc.h265.pps->sps->pcm_loop_filter_disabled_flag =
-hevc->pic_fields.bits.pcm_loop_filter_disabled_flag;
-  }
-  context->desc.h265.pps->sps->num_short_term_ref_pic_sets = 
hevc->num_short_term_ref_pic_sets;
-  context->desc.h265.pps->sps->long_term_ref_pics_present_flag =
- hevc->slice_parsing_fields.bits.long_term_ref_pics_present_flag;
-  context->desc.h265.pps->sps->num_long_term_ref_pics_sps = 
hevc->num_long_term_ref_pic_sps;
-  

[Mesa-dev] [PATCH 2/2] st/va: disable MPEG4 by default v2

2015-12-04 Thread Christian König
From: Christian König 

The workarounds are too hacky to enable them by default
and otherwise MPEG4 doesn't work reliably.

v2: add docs/envvars.html, CC stable and fix typos

Signed-off-by: Christian König 
Reviewed-by: Emil Velikov  (v1)
Reviewed-by: Ilia Mirkin  (v1)
Cc: "11.1.0" 
---
 docs/envvars.html  |  6 ++
 src/gallium/state_trackers/va/config.c | 10 +-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/docs/envvars.html b/docs/envvars.html
index 1b2c03e..5bb7b1e 100644
--- a/docs/envvars.html
+++ b/docs/envvars.html
@@ -238,6 +238,12 @@ for details.
 
 
 
+VA-API state tracker environment variables
+
+VAAPI_MPEG4_ENABLED - enable MPEG4 for VA-API, disabled by default.
+
+
+
 
 Other Gallium drivers have their own environment variables.  These may change
 frequently so the source code should be consulted for details.
diff --git a/src/gallium/state_trackers/va/config.c 
b/src/gallium/state_trackers/va/config.c
index a545a18..9ca0aa8 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -28,10 +28,14 @@
 
 #include "pipe/p_screen.h"
 
+#include "util/u_video.h"
+
 #include "vl/vl_winsys.h"
 
 #include "va_private.h"
 
+DEBUG_GET_ONCE_BOOL_OPTION(mpeg4, "VAAPI_MPEG4_ENABLED", false)
+
 VAStatus
 vlVaQueryConfigProfiles(VADriverContextP ctx, VAProfile *profile_list, int 
*num_profiles)
 {
@@ -45,12 +49,16 @@ vlVaQueryConfigProfiles(VADriverContextP ctx, VAProfile 
*profile_list, int *num_
*num_profiles = 0;
 
pscreen = VL_VA_PSCREEN(ctx);
-   for (p = PIPE_VIDEO_PROFILE_MPEG2_SIMPLE; p <= 
PIPE_VIDEO_PROFILE_HEVC_MAIN_444; ++p)
+   for (p = PIPE_VIDEO_PROFILE_MPEG2_SIMPLE; p <= 
PIPE_VIDEO_PROFILE_HEVC_MAIN_444; ++p) {
+  if (u_reduce_video_profile(p) == PIPE_VIDEO_FORMAT_MPEG4 && 
!debug_get_option_mpeg4())
+ continue;
+
   if (pscreen->get_video_param(pscreen, p, 
PIPE_VIDEO_ENTRYPOINT_BITSTREAM, PIPE_VIDEO_CAP_SUPPORTED)) {
  vap = PipeToProfile(p);
  if (vap != VAProfileNone)
 profile_list[(*num_profiles)++] = vap;
   }
+   }
 
/* Support postprocessing through vl_compositor */
profile_list[(*num_profiles)++] = VAProfileNone;
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa/tests: add KHR_debug GLES glGetPointervKHR entry points

2015-12-04 Thread Michel Dänzer
On 04.12.2015 10:06, Emil Velikov wrote:
> On 4 December 2015 at 00:45, Matt Turner  wrote:
>> On Thu, Dec 3, 2015 at 4:37 PM, Emil Velikov  
>> wrote:
 FWIW, make check still fails for me even with this patch.
>>> Do you have a log that I can take a look ? I've `make clean'ed and
>>> rebuild a couple of times just in case and things seems to pass here.
>>
>> Attached is the output of ./main-test without (p.txt) and with (q.txt)
>> your patch.
> Interesting... upon closer look the build produces two tests -
> lt-main-test and main-test. The former is executed on my system
> (either via the top level make check or the one in
> src/mesa/main/tests) and passes, while the former produces the same
> issues as you've spotted.

I was getting the same failures as Matt with your v2 patch (using
lt-main-test), make clean seems to have resolved it.

Tested-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 1/2] intel: 48b ppgtt support (EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag)

2015-12-04 Thread Michel Thierry

On 11/18/2015 10:53 PM, Kristian Høgsberg wrote:

On Wed, Oct 14, 2015 at 5:11 AM, Michel Thierry
 wrote:

On 10/14/2015 8:19 AM, Daniel Vetter wrote:


On Tue, Oct 13, 2015 at 02:51:36PM -0700, Kristian Høgsberg wrote:


On Tue, Oct 13, 2015 at 7:55 AM, Michel Thierry
 wrote:


On 10/13/2015 3:13 PM, Emil Velikov wrote:



On 13 October 2015 at 13:16, Michel Thierry 
wrote:



On 10/6/2015 2:12 PM, Michel Thierry wrote:




On 10/5/2015 7:06 PM, Kristian Høgsberg wrote:




On Mon, Oct 5, 2015 at 7:03 AM, Michel Thierry
 wrote:




On 9/14/2015 2:54 PM, Michał Winiarski wrote:





On Thu, Sep 03, 2015 at 03:23:58PM +0100, Michel Thierry wrote:





Gen8+ supports 48-bit virtual addresses, but some objects must
always be
allocated inside the 32-bit address range.

In specific, any resource used with flat/heapless
(0x-0xf000)
General State Heap (GSH) or Instruction State Heap (ISH) must be
in
a
32-bit range, because the General State Offset and Instruction
State
Offset
are limited to 32-bits.

The i915 driver has been modified to provide a flag to set when
the
4GB
limit is not necessary in a given bo
(EXEC_OBJECT_SUPPORTS_48B_ADDRESS).
48-bit range will only be used when explicitly requested.

Callers to the existing drm_intel_bo_emit_reloc function should
set
the
use_48b_address_range flag beforehand, in order to use full ppgtt
range.

v2: Make set/clear functions nops on pre-gen8 platforms, and use
them
 internally in emit_reloc functions (Ben)
 s/48BADDRESS/48B_ADDRESS/ (Dave)
v3: Keep set/clear functions internal, no-one needs to use them
directly.
v4: Don't set 48bit-support flag in emit reloc, check for ppgtt
type
 before enabling set/clear function, print full offsets in
debug
 statements, using port of lower_32_bits and upper_32_bits
from linux
 kernel (Michał)

References:


http://lists.freedesktop.org/archives/intel-gfx/2015-July/072612.html
Cc: Ben Widawsky 
Cc: Michał Winiarski 






+Kristian

LGTM.
Acked-by: Michał Winiarski 


Signed-off-by: Michel Thierry 







Hi Kristian,

More comments on this?
I've resent the mesa patch with the last changes



(http://lists.freedesktop.org/archives/dri-devel/2015-October/091752.html).


Michał has agreed with the interface and will use it for his work.
If mesa doesn't plan to use this for now, it's ok. The kernel
changes
have
been done to prevent any regressions when the support 48-bit flag
is
not
set.





I didn't get any replies to my last comments on this:


http://lists.freedesktop.org/archives/mesa-dev/2015-August/091398.html

Basically, I tried explicitly placing buffers above 8G and didn't
see
the HW problem described (ie it all worked fine).  I still think
there's some confusion as to what the W/A is about.

Here's my take: the W/A is a SW-only workaround to handle the cases
where a driver uses heapless and 48-bit ppgtt. The problem is that
the
heaps are limited to 4G but with 48bit ppgtt a buffer can be placed
anywhere it the 48 bit address space. If that happens it's
consideredd
out-of-bounds for the heap and access fails. To prevent this we need
to make sure the bos we address in a heapless fashion are placed
below
4g.

For mesa, we only configure general state base as heap-less, which
affects scratch bos. What this boils down to is that we need the 4G
restriction only for the scratch bos set up on 3DSTATE_VS,
3DSTATE_GS
and 3DSTATE_PS (and tesselation stage eventually). Look for the
OUT_RELOC64 for stage->scratch_bo in gen8_vs_state.c,
gen8_gs_state.c
and gen8_ps_state.c





I think it also affects _3DSTATE_VIEWPORT_STATE_POINTERS_CC, maybe it
isn't exclusive to the scratch bos (the error state points to that
instruction).




Hi,

Anymore inputs about this patch?
AFAIK, if changes are required based on further comments from
Kristian,
these will be in the mesa patch[1], not libdrm. Also, Michał will use
this
flag in another project.


The comment seems quite brief and I'm not sure it fully addresses
Kristian's concern. I'd suspect that providing reference to the HW
documentation (confirming your assumption) might be beneficial.



Sure, attached is the hang I get if I don't set the restriction in
gen8_misc_state.c and try to use the full 48-bit range
(i915_error_state_nowa.txt). This is just running gfxbench t-rex.

I see the same hang signature when it is only applied to the scratch bos
(in
gen8_vs_state.c, gen8_gs_state.c and gen8_ps_state.c -
i915_error_state_scratchbo.txt).

3DSTATE_VIEWPORT_STATE_POINTERS_CC (0x7823) is defined here:

https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-bdw-vol02a-commandreference-instructions_0.pdf
(page 256)



I applied your mesa and libdrm patches and then backed out the 4G
restriction in the STATE_BASE_ADDRESS 

Re: [Mesa-dev] [PATCH 00/11] do not wrap header inclusion in extern "C"

2015-12-04 Thread Emil Velikov
On 24 November 2015 at 16:29, Emil Velikov  wrote:
> Hi all,
>
> As pointed out by Jose a while back adding extern C guard around header
> inclusion is a bad idea.
>
> As I noticed a few in the glsl/nir code I went on and grepped through
> mesa.
>
> I have build tested the series on Linux only, yet patch 10/11 contains
> some windows/haiku specific code. Regardless, the lot is quite trivial.
>
> The series is in branch no-extern-c at https://github.com/evelikov/Mesa
>
Humble ping... patch 7 was dropped while 10 and 11 are the only ones reviewed.

I'm inclined to just push the lot, although I'd give it another week
for people to take comment/review.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/10] vc4/nir: Use the new unified io intrinsics

2015-12-04 Thread Emil Velikov
Hi Eric,

On 4 December 2015 at 02:03, Eric Anholt  wrote:
> Jason Ekstrand  writes:
>
>> On Dec 3, 2015 3:47 PM, "Eric Anholt"  wrote:
>>>
>>> Jason Ekstrand  writes:
>>>
>>> > Cc: Eric Anholt 
>>>
>>> OK, I've pushed a branch of partial fixes for this series to nir-loads
>>> of my fdo tree, but it's still super broken.  I've spent too much of
>>> today on it, and this series was not ready for review.  There are still
>>> const_index[0] references all over that are obviously from intrinsics
>>> you've removed the const_index on.
>>
>> I'm going to start looking at this again tomorrow.  I've had other things
>> on my plate this week.  Rob already pointed me at some crashing shaders for
>> ir3 and his little standalone compiler. I'll start there.  Is there
>> something similar I can use to debug vc4?
>
> Right now from configure.ac I've got either targeting-hardware mode, or
> targeting-simulator mode.  What I need to add is something in between,
> where we use the ioctls from simulator mode so that people can run the
> driver on i965 systems, and then just don't make the simulator calls at
> the very end of the drawing pipeline since you don't have the simulator.
> Then people could even run shader-db without having the actual hardware,
> and that would get a lot of coverage of compiler changes.
>
> You could knock this together with some #ifdefs around all 7 mentions of
> "simpenrose", but I'll try to get a clean patch sent out.  I'm just
> trying to decide whether this should be a --enable argument or "oh, hey,
> you're not on arm, you must want either simulator or
> everything-but-simulator mode"
>
> I'd also like something a bit nicer than my symlink trick for loading
> the driver (right now I link the built vc4_dri.so to i965_dri.so in a
> subdirectory, and LIBGL_DRIVERS_PATH it). Maybe if I getenv() a name
> From loader.c for computing driver name, I could also skip a bunch of
> the i965 symbols I have to add.
>
What would it take to have things a bit like virgl -> ie. single vc4
gallium/driver but two winsys (vc4/drm and vc4/sim) ? Afaict one can
(should be able to) get things doing with this approach and toggle the
simulator "mode" via an env variable. I might have some time to add
LIBGL_DRIVER_NAME next week, although I'd rather avoid having such a
"hack".

On the standalone compiler thing if you shuffle things ala winsys
style as suggested you should be able to get a nice clean standalone
compiler like nouveau and freedreno.
I'll be glad to help, mostly on the ugly pipe-loader and autohell side
of things.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/1] i965: add opportunistic behaviour to opt_vector_float()

2015-12-04 Thread Juan A. Suarez Romero
On Thu, 2015-12-03 at 11:04 -0800, Matt Turner wrote:
> > The improvement obtained regarding current upstream (56aff6bb4eaf)
> is:
> >
> > total instructions in shared programs: 6819484 -> 6811698 (-0.11%)
> > instructions in affected programs: 387245 -> 379459 (-2.01%)
> > total loops in shared programs:    1971 -> 1971 (0.00%)
> > helped:    3980
> > HURT:  0
> > GAINED:    3
> 
> Run the shader-db on just the program that had the problem here, add
> it to your before-results, and rerun report.py.


I've rebased against new master, applied the changes you suggested, and
run again the tests.

That has been enough to get rid of the GAINED shaders.

New results are:

total instructions in shared programs: 6819828 -> 6812042 (-0.11%)
instructions in affected programs: 387245 -> 379459 (-2.01%)
total loops in shared programs:1971 -> 1971 (0.00%)
helped:3980
HURT:  0
GAINED:0
LOST:  0


I'll send a second version of the patch.


J.A.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 1/1] i965: add opportunistic behaviour to opt_vector_float()

2015-12-04 Thread Juan A. Suarez Romero
opt_vector_float() transforms several scalar MOV operations to a single
vectorial MOV.

This is done when those MOV covers all the components of the destination
register. So something like:

mov vgrf3.0.xy:D, 0D
mov vgrf3.0.w:D, 1065353216D
mov vgrf3.0.z:D, 0D

is transformed in:

mov vgrf3.0:F, [0F, 0F, 0F, 1F]

But there are cases where not all the components are written. For
example, in:

mov vgrf2.0.x:D, 1073741824D
mov vgrf3.0.xy:D, 0D
mov vgrf3.0.w:D, 1065353216D
mov vgrf4.0.xy:D, 1065353216D
mov vgrf4.0.w:D, 0D
mov vgrf6.0:UD, u4.xyzw:UD

Nor vgrf3 nor vgrf4 .z components are written, so the optimization is
not applied.

But it could be applied anyway with the components covered, using a
writemask to select the ones written. So we could transform it in:

mov vgrf2.0.x:D, 1073741824D
mov vgrf3.0.xyw:F, [0F, 0F, 0F, 1F]
mov vgrf4.0.xyw:F, [1F, 1F, 0F, 0F]
mov vgrf6.0:UD, u4.xyzw:UD

This commit does precisely that: opportunistically apply
opt_vector_float() when possible.

The improvement obtained regarding current upstream
(11.1-branchpoint-310-gd566382) is:

total instructions in shared programs: 6819828 -> 6812042 (-0.11%)
instructions in affected programs: 387245 -> 379459 (-2.01%)
total loops in shared programs:1971 -> 1971 (0.00%)
helped:3980
HURT:  0
GAINED:0
LOST:  0

v2: change vectorize_mov() signature (Matt).

Signed-off-by: Juan A. Suarez Romero 
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 61 ++
 src/mesa/drivers/dri/i965/brw_vec4.h   |  4 +++
 2 files changed, 43 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index a697bdf..104df0f 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -309,6 +309,29 @@ src_reg::equals(const src_reg ) const
 }
 
 bool
+vec4_visitor::vectorize_mov(bblock_t *block, vec4_instruction *inst, uint8_t 
imm[4],
+vec4_instruction *imm_inst[4], int inst_count,
+unsigned writemask)
+{
+   if (inst_count < 2) {
+  return false;
+   }
+
+   unsigned vf;
+   memcpy(, imm, sizeof(vf));
+   vec4_instruction *mov = MOV(imm_inst[0]->dst, brw_imm_vf(vf));
+   mov->dst.type = BRW_REGISTER_TYPE_F;
+   mov->dst.writemask = writemask;
+   inst->insert_before(block, mov);
+
+   for (int i = 0; i < inst_count; i++) {
+  imm_inst[i]->remove(block);
+   }
+
+   return true;}
+
+
+bool
 vec4_visitor::opt_vector_float()
 {
bool progress = false;
@@ -316,27 +339,36 @@ vec4_visitor::opt_vector_float()
int last_reg = -1, last_reg_offset = -1;
enum brw_reg_file last_reg_file = BAD_FILE;
 
-   int remaining_channels = 0;
-   uint8_t imm[4];
+   uint8_t imm[4] = { 0 };
int inst_count = 0;
vec4_instruction *imm_inst[4];
+   unsigned writemask = 0;
 
foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
   if (last_reg != inst->dst.nr ||
   last_reg_offset != inst->dst.reg_offset ||
   last_reg_file != inst->dst.file) {
+
+ progress |= vectorize_mov(block, inst, imm, imm_inst, inst_count, 
writemask);
+
+ inst_count = 0;
+ writemask = 0;
  last_reg = inst->dst.nr;
  last_reg_offset = inst->dst.reg_offset;
  last_reg_file = inst->dst.file;
- remaining_channels = WRITEMASK_XYZW;
-
- inst_count = 0;
+ for (int i = 0; i < 4; i++) {
+imm[i] = 0;
+ }
   }
 
   if (inst->opcode != BRW_OPCODE_MOV ||
   inst->dst.writemask == WRITEMASK_XYZW ||
-  inst->src[0].file != IMM)
+  inst->src[0].file != IMM) {
+ progress |= vectorize_mov(block, inst, imm, imm_inst, inst_count, 
writemask);
+ inst_count = 0;
+ last_reg = -1;
  continue;
+  }
 
   int vf = brw_float_to_vf(inst->src[0].f);
   if (vf == -1)
@@ -351,23 +383,8 @@ vec4_visitor::opt_vector_float()
   if ((inst->dst.writemask & WRITEMASK_W) != 0)
  imm[3] = vf;
 
+  writemask |= inst->dst.writemask;
   imm_inst[inst_count++] = inst;
-
-  remaining_channels &= ~inst->dst.writemask;
-  if (remaining_channels == 0) {
- unsigned vf;
- memcpy(, imm, sizeof(vf));
- vec4_instruction *mov = MOV(inst->dst, brw_imm_vf(vf));
- mov->dst.type = BRW_REGISTER_TYPE_F;
- mov->dst.writemask = WRITEMASK_XYZW;
- inst->insert_after(block, mov);
- last_reg = -1;
-
- for (int i = 0; i < inst_count; i++) {
-imm_inst[i]->remove(block);
- }
- progress = true;
-  }
}
 
if (progress)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index ae5bf69..6476d3b 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ 

Re: [Mesa-dev] [PATCH] gallium/drivers/nouveau: Clean up clear_texture()

2015-12-04 Thread Ilia Mirkin
This will break depth/stencil clears.

On Fri, Dec 4, 2015 at 9:18 AM, Edward O'Callaghan
 wrote:
> We should translate and set the format before dispatching
> the call to pipe->create_surface(), instead of creating
> a surface with the given format and changing it after.
>
> Signed-off-by: Edward O'Callaghan 
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_surface.c | 72 
> -
>  1 file changed, 35 insertions(+), 37 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_surface.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_surface.c
> index 86be1b4..6364aa3 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_surface.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_surface.c
> @@ -447,8 +447,42 @@ nv50_clear_texture(struct pipe_context *pipe,
> const void *data)
>  {
> struct pipe_surface tmpl = {{0}}, *sf;
> +   union pipe_color_union color;
> +   float depth = 0;
> +   uint8_t stencil = 0;
> +   unsigned clear = 0;
> +
> +   switch (util_format_get_blocksizebits(res->format)) {
> +   case 128:
> +  tmpl.format = PIPE_FORMAT_R32G32B32A32_UINT;
> +  memcpy(, data, 128 / 8);
> +  break;
> +   case 64:
> +  tmpl.format = PIPE_FORMAT_R32G32_UINT;
> +  memcpy(, data, 64 / 8);
> +  memset([2], 0, 64 / 8);
> +  break;
> +   case 32:
> +  tmpl.format = PIPE_FORMAT_R32_UINT;
> +  memcpy(, data, 32 / 8);
> +  memset([1], 0, 96 / 8);
> +  break;
> +   case 16:
> +  tmpl.format = PIPE_FORMAT_R16_UINT;
> +  color.ui[0] = util_cpu_to_le32(
> + util_le16_to_cpu(*(unsigned short *)data));
> +  memset([1], 0, 96 / 8);
> +  break;
> +   case 8:
> +  tmpl.format = PIPE_FORMAT_R8_UINT;
> +  color.ui[0] = util_cpu_to_le32(*(unsigned char *)data);
> +  memset([1], 0, 96 / 8);
> +  break;
> +   default:
> +  assert(!"Unknown texel element size");
> +  return;
> +   }
>
> -   tmpl.format = res->format;
> tmpl.u.tex.first_layer = box->z;
> tmpl.u.tex.last_layer = box->z + box->depth - 1;
> tmpl.u.tex.level = level;
> @@ -457,9 +491,6 @@ nv50_clear_texture(struct pipe_context *pipe,
>return;
>
> if (util_format_is_depth_or_stencil(res->format)) {
> -  float depth = 0;
> -  uint8_t stencil = 0;
> -  unsigned clear = 0;
>const struct util_format_description *desc =
>   util_format_description(res->format);
>
> @@ -474,39 +505,6 @@ nv50_clear_texture(struct pipe_context *pipe,
>pipe->clear_depth_stencil(pipe, sf, clear, depth, stencil,
>  box->x, box->y, box->width, box->height);
> } else {
> -  union pipe_color_union color;
> -
> -  switch (util_format_get_blocksizebits(res->format)) {
> -  case 128:
> - sf->format = PIPE_FORMAT_R32G32B32A32_UINT;
> - memcpy(, data, 128 / 8);
> - break;
> -  case 64:
> - sf->format = PIPE_FORMAT_R32G32_UINT;
> - memcpy(, data, 64 / 8);
> - memset([2], 0, 64 / 8);
> - break;
> -  case 32:
> - sf->format = PIPE_FORMAT_R32_UINT;
> - memcpy(, data, 32 / 8);
> - memset([1], 0, 96 / 8);
> - break;
> -  case 16:
> - sf->format = PIPE_FORMAT_R16_UINT;
> - color.ui[0] = util_cpu_to_le32(
> -util_le16_to_cpu(*(unsigned short *)data));
> - memset([1], 0, 96 / 8);
> - break;
> -  case 8:
> - sf->format = PIPE_FORMAT_R8_UINT;
> - color.ui[0] = util_cpu_to_le32(*(unsigned char *)data);
> - memset([1], 0, 96 / 8);
> - break;
> -  default:
> - assert(!"Unknown texel element size");
> - return;
> -  }
> -
>pipe->clear_render_target(pipe, sf, ,
>  box->x, box->y, box->width, box->height);
> }
> --
> 2.5.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3v2] xvmc: force assertion in XvMC tests

2015-12-04 Thread Emil Velikov
Hi Giuseppe,

On 29 November 2015 at 17:19, Giuseppe Bilotta
 wrote:
> This follows the src/util/u_atomic_test.c model of undefining NDEBUG
> unconditionally throughouth the XvMC tests, to force asserts regardless
> of debug mode.
>
> The comment on u_atomic_test.c is also fixed (read 'debug' where it
> should have been 'release').
>
I've split that one to a separate patch, but for the future please do
so during submission.

Also can you take a look at the list of patches you've submitted [1]
and let me know if any of those are outstanding, so that I can mark
the remainder as accepted/superseded.

Thanks
Emil

[1] 
http://patchwork.freedesktop.org/project/mesa/patches/?submitter=299
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gallium/drivers/nouveau: Clean up clear_texture()

2015-12-04 Thread Edward O'Callaghan
We should translate and set the format before dispatching
the call to pipe->create_surface(), instead of creating
a surface with the given format and changing it after.

Signed-off-by: Edward O'Callaghan 
---
 src/gallium/drivers/nouveau/nv50/nv50_surface.c | 72 -
 1 file changed, 35 insertions(+), 37 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_surface.c 
b/src/gallium/drivers/nouveau/nv50/nv50_surface.c
index 86be1b4..6364aa3 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_surface.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_surface.c
@@ -447,8 +447,42 @@ nv50_clear_texture(struct pipe_context *pipe,
const void *data)
 {
struct pipe_surface tmpl = {{0}}, *sf;
+   union pipe_color_union color;
+   float depth = 0;
+   uint8_t stencil = 0;
+   unsigned clear = 0;
+
+   switch (util_format_get_blocksizebits(res->format)) {
+   case 128:
+  tmpl.format = PIPE_FORMAT_R32G32B32A32_UINT;
+  memcpy(, data, 128 / 8);
+  break;
+   case 64:
+  tmpl.format = PIPE_FORMAT_R32G32_UINT;
+  memcpy(, data, 64 / 8);
+  memset([2], 0, 64 / 8);
+  break;
+   case 32:
+  tmpl.format = PIPE_FORMAT_R32_UINT;
+  memcpy(, data, 32 / 8);
+  memset([1], 0, 96 / 8);
+  break;
+   case 16:
+  tmpl.format = PIPE_FORMAT_R16_UINT;
+  color.ui[0] = util_cpu_to_le32(
+ util_le16_to_cpu(*(unsigned short *)data));
+  memset([1], 0, 96 / 8);
+  break;
+   case 8:
+  tmpl.format = PIPE_FORMAT_R8_UINT;
+  color.ui[0] = util_cpu_to_le32(*(unsigned char *)data);
+  memset([1], 0, 96 / 8);
+  break;
+   default:
+  assert(!"Unknown texel element size");
+  return;
+   }
 
-   tmpl.format = res->format;
tmpl.u.tex.first_layer = box->z;
tmpl.u.tex.last_layer = box->z + box->depth - 1;
tmpl.u.tex.level = level;
@@ -457,9 +491,6 @@ nv50_clear_texture(struct pipe_context *pipe,
   return;
 
if (util_format_is_depth_or_stencil(res->format)) {
-  float depth = 0;
-  uint8_t stencil = 0;
-  unsigned clear = 0;
   const struct util_format_description *desc =
  util_format_description(res->format);
 
@@ -474,39 +505,6 @@ nv50_clear_texture(struct pipe_context *pipe,
   pipe->clear_depth_stencil(pipe, sf, clear, depth, stencil,
 box->x, box->y, box->width, box->height);
} else {
-  union pipe_color_union color;
-
-  switch (util_format_get_blocksizebits(res->format)) {
-  case 128:
- sf->format = PIPE_FORMAT_R32G32B32A32_UINT;
- memcpy(, data, 128 / 8);
- break;
-  case 64:
- sf->format = PIPE_FORMAT_R32G32_UINT;
- memcpy(, data, 64 / 8);
- memset([2], 0, 64 / 8);
- break;
-  case 32:
- sf->format = PIPE_FORMAT_R32_UINT;
- memcpy(, data, 32 / 8);
- memset([1], 0, 96 / 8);
- break;
-  case 16:
- sf->format = PIPE_FORMAT_R16_UINT;
- color.ui[0] = util_cpu_to_le32(
-util_le16_to_cpu(*(unsigned short *)data));
- memset([1], 0, 96 / 8);
- break;
-  case 8:
- sf->format = PIPE_FORMAT_R8_UINT;
- color.ui[0] = util_cpu_to_le32(*(unsigned char *)data);
- memset([1], 0, 96 / 8);
- break;
-  default:
- assert(!"Unknown texel element size");
- return;
-  }
-
   pipe->clear_render_target(pipe, sf, ,
 box->x, box->y, box->width, box->height);
}
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] gallium/drivers/nouveau: Clean up clear_texture()

2015-12-04 Thread Edward O'Callaghan
I don't actually have access to Nvid ASIC's at the moment so
this is untested for regressions in ARB_clear_texture piglits.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] gallium/winsys: Make use of ARRAY_SIZE macro

2015-12-04 Thread Emil Velikov
Hi Edward,

A few trivial suggestions (for future patches) and a humble request

On 4 December 2015 at 06:50, Edward O'Callaghan
 wrote:
> Signed-off-by: Edward O'Callaghan 
> ---
>  src/gallium/winsys/amdgpu/drm/amdgpu_surface.c | 6 ++
For the patches do shorten up the prefix - here we'd want
"winsys/amdgpu" while on the driver side just "nouveau" or "llvmpipe"
should suffice. When in doubt git log is your friend

You mentioned that you've done these with a Coccinelle schematic patch
(script) - mind if we get that (actually start collecting) in tree ...
perhaps in bin/ or bin/cocci/ ?

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 13/44] i965: Implement L3 state atom.

2015-12-04 Thread Francisco Jerez
Kenneth Graunke  writes:

> On Tuesday, December 01, 2015 12:19:31 AM Jordan Justen wrote:
>> From: Francisco Jerez 
>> 
>> The L3 state atom calculates the target L3 partition weights when the
>> program bound to some shader stage is modified, and in case they are
>> far enough from the current partitioning it makes sure that the L3
>> state is re-emitted.
>> 
>> v3: Fix for inconsistent units the context URB size is expressed in.
>> Clamp URB size to 1008 KB on SKL due to FF hardware limitation.
>> ---
>>  src/mesa/drivers/dri/i965/brw_context.h   |  6 +++
>>  src/mesa/drivers/dri/i965/brw_state.h |  1 +
>>  src/mesa/drivers/dri/i965/gen7_l3_state.c | 81 
>> +++
>>  3 files changed, 88 insertions(+)
>> 
>> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
>> b/src/mesa/drivers/dri/i965/brw_context.h
>> index 0b91147..48024c6 100644
>> --- a/src/mesa/drivers/dri/i965/brw_context.h
>> +++ b/src/mesa/drivers/dri/i965/brw_context.h
>> @@ -672,6 +672,8 @@ enum brw_predicate_state {
>>  
>>  struct shader_times;
>>  
>> +struct brw_l3_config;
>> +
>>  /**
>>   * brw_context is derived from gl_context.
>>   */
>> @@ -1214,6 +1216,10 @@ struct brw_context
>> int basevertex;
>>  
>> struct {
>> +  const struct brw_l3_config *config;
>> +   } l3;
>> +
>> +   struct {
>>drm_intel_bo *bo;
>>const char **names;
>>int *ids;
>> diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
>> b/src/mesa/drivers/dri/i965/brw_state.h
>> index 94734ba..49f301a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_state.h
>> +++ b/src/mesa/drivers/dri/i965/brw_state.h
>> @@ -129,6 +129,7 @@ extern const struct brw_tracked_state gen7_depthbuffer;
>>  extern const struct brw_tracked_state gen7_clip_state;
>>  extern const struct brw_tracked_state gen7_disable_stages;
>>  extern const struct brw_tracked_state gen7_gs_state;
>> +extern const struct brw_tracked_state gen7_l3_state;
>>  extern const struct brw_tracked_state gen7_ps_state;
>>  extern const struct brw_tracked_state gen7_push_constant_space;
>>  extern const struct brw_tracked_state gen7_sbe_state;
>> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c 
>> b/src/mesa/drivers/dri/i965/gen7_l3_state.c
>> index 0b5e231..a895723 100644
>> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
>> @@ -418,3 +418,84 @@ setup_l3_config(struct brw_context *brw, const struct 
>> brw_l3_config *cfg)
>>}
>> }
>>  }
>> +
>> +/**
>> + * Return the unit brw_context::urb::size is expressed in, in KB.  \sa
>> + * brw_device_info::urb::size.
>> + */
>> +static unsigned
>> +get_urb_size_scale(const struct brw_device_info *devinfo)
>> +{
>> +   return (devinfo->gen >= 8 ? devinfo->num_slices : 1);
>> +}
>> +
>> +/**
>> + * Update the URB size in the context state for the specified L3
>> + * configuration.
>> + */
>> +static void
>> +update_urb_size(struct brw_context *brw, const struct brw_l3_config *cfg)
>> +{
>> +   const struct brw_device_info *devinfo = brw->intelScreen->devinfo;
>> +   /* From the SKL "L3 Allocation and Programming" documentation:
>> +*
>> +* "URB is limited to 1008KB due to programming restrictions.  This is 
>> not
>> +* a restriction of the L3 implementation, but of the FF and other 
>> clients.
>> +* Therefore, in a GT4 implementation it is possible for the programmed
>> +* allocation of the L3 data array to provide 3*384KB=1152KB for URB, but
>> +* only 1008KB of this will be used."
>> +*/
>> +   const unsigned max = (devinfo->gen == 9 ? 1008 : ~0);
>
> I think this would be clearer as devinfo->gt == 4, since it applies to
> all currently known GT4 parts, and doesn't apply to GT1-3.

The reason is that only Gen9 has this limitation.  Gen10 will have a
different limit on the URB space addressable by fixed function clients.

> Presumably GT1-3 just don't have enough URB to hit that limit, so it's
> probably moot...*shrug*.
>
Yeah, in theory this limitation applies to all Gen9 hardware but only
GT4 devices have a large enough L3 to hit it regularly.  In principle
you could hit the limit on GT3 too if you tried out some non-validated
configuration so it seems sensible to apply it to Gen9 in general even
if only GT4 parts are going to hit it in practice.  It doesn't hurt
anyway.

> Acked-by: Kenneth Graunke 
>
>> +   const unsigned sz =
>> +  MIN2(max, cfg->n[L3P_URB] * get_l3_way_size(devinfo)) /
>> +  get_urb_size_scale(devinfo);
>> +
>> +   if (brw->urb.size != sz) {
>> +  brw->urb.size = sz;
>> +  brw->ctx.NewDriverState |= BRW_NEW_URB_SIZE;
>> +   }
>> +}
>> +
>> +static void
>> +emit_l3_state(struct brw_context *brw)
>> +{
>> +   const struct brw_l3_weights w = get_pipeline_state_l3_weights(brw);
>> +   const float dw = diff_l3_weights(w, 
>> get_config_l3_weights(brw->l3.config));
>> +   /* The distance between any two compatible 

Re: [Mesa-dev] [PATCH v3 07/44] i965: Import tables enumerating the set of validated L3 configurations.

2015-12-04 Thread Francisco Jerez
Kenneth Graunke  writes:

> On Tuesday, December 01, 2015 12:19:25 AM Jordan Justen wrote:
>> From: Francisco Jerez 
>> 
>> It should be possible to use additional L3 configurations other than
>> the ones listed in the tables of validated allocations ("BSpec »
>> 3D-Media-GPGPU Engine » L3 Cache and URB [IVB+] » L3 Cache and URB [*]
>> » L3 Allocation and Programming"), but it seems sensible for now to
>> hard-code the tables in order to stick to the hardware docs.  Instead
>> of setting up the arbitrary L3 partitioning given as input, the
>> closest validated L3 configuration will be looked up in these tables
>> and used to program the hardware.
>> 
>> The included tables should work for Gen7-9.  Note that the quantities
>> are specified in ways rather than in KB, this is because the L3
>> control registers expect the value in ways, and because by doing that
>> we can re-use a single table for all GT variants of the same
>> generation (and in the case of IVB/HSW and CHV/SKL across different
>> generations) which generally have different L3 way sizes but allow the
>> same combinations of way allocations.
>> 
>> v3: Use slice count from the devinfo structure instead of the gt
>> number to implement get_l3_way_size().
>> 
>> Reviewed-by: Samuel Iglesias Gonsálvez 
>> ---
>>  src/mesa/drivers/dri/i965/Makefile.sources |   1 +
>>  src/mesa/drivers/dri/i965/gen7_l3_state.c  | 163 
>> +
>>  2 files changed, 164 insertions(+)
>>  create mode 100644 src/mesa/drivers/dri/i965/gen7_l3_state.c
>> 
>> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
>> b/src/mesa/drivers/dri/i965/Makefile.sources
>> index 5e805fa..b413f73 100644
>> --- a/src/mesa/drivers/dri/i965/Makefile.sources
>> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
>> @@ -182,6 +182,7 @@ i965_FILES = \
>>  gen7_cs_state.c \
>>  gen7_disable.c \
>>  gen7_gs_state.c \
>> +gen7_l3_state.c \
>>  gen7_misc_state.c \
>>  gen7_sf_state.c \
>>  gen7_sol_state.c \
>> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c 
>> b/src/mesa/drivers/dri/i965/gen7_l3_state.c
>> new file mode 100644
>> index 000..8765b11
>> --- /dev/null
>> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
>> @@ -0,0 +1,163 @@
>> +/*
>> + * Copyright (c) 2015 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the 
>> "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>> OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
>> DEALINGS
>> + * IN THE SOFTWARE.
>> + */
>> +
>> +#include "brw_context.h"
>> +#include "brw_defines.h"
>> +#include "brw_state.h"
>> +#include "intel_batchbuffer.h"
>> +
>> +/**
>> + * Chunk of L3 cache reserved for some specific purpose.
>> + */
>> +enum brw_l3_partition {
>> +   /** Shared local memory. */
>> +   L3P_SLM = 0,
>> +   /** Unified return buffer. */
>> +   L3P_URB,
>> +   /** Union of DC and RO. */
>> +   L3P_ALL,
>> +   /** Data cluster RW partition. */
>> +   L3P_DC,
>> +   /** Union of IS, C and T. */
>> +   L3P_RO,
>> +   /** Instruction and state cache. */
>> +   L3P_IS,
>> +   /** Constant cache. */
>> +   L3P_C,
>> +   /** Texture cache. */
>> +   L3P_T,
>> +   /** Number of supported L3 partitions. */
>> +   NUM_L3P
>> +};
>> +
>> +/**
>> + * L3 configuration represented as the number of ways allocated for each
>> + * partition.  \sa get_l3_way_size().
>> + */
>> +struct brw_l3_config {
>> +   unsigned n[NUM_L3P];
>> +};
>> +
>> +/**
>> + * IVB/HSW validated L3 configurations.
>> + */
>> +static const struct brw_l3_config ivb_l3_configs[] = {
>
> It might be nice to add comments above the columns, similar to
> brw_surface_formats.c:
>
>/* SLM  URB  ALL   DC   RO   ISCT */
>{{   0,  32,   0,   0,  32,   0,   0,   0 }},
>

Sure, fixed locally.

> Either way,
> Acked-by: Kenneth Graunke 
>
Thanks. :)

> Thanks Curro!
>
>> +   {{  0, 32,  0,  0, 32,  0,  0,  0 }},
>> +   

Re: [Mesa-dev] [PATCH 0/3] misc janitorial

2015-12-04 Thread Giuseppe Bilotta
On Tue, Dec 1, 2015 at 6:04 PM, Emil Velikov  wrote:

> With the small comment in patch 3 addressed the series is
> Reviewed-by: Emil Velikov 

And also:

> Reviewed-by: Ian Romanick 

for patch 1 and 2. Who should I ping for to get them pushed? ;-)

>>> The long version:
>>>  1. Grep for RADEON_R{1,2}00 and look at the guarded code
>>>  1.1 Some guards can just be dropped (CHIP_FAMILY_*, get_chip_family_name),
>>>  1.2 Other hunks of code can be split/moved into the respective
>>> r100/r200 codebase (radeonTexBufferExtension and
>>> r200TexBufferExtension).
>>>  1.3 Or folded into the vtbl dispatch - get_depth_z32/16 or the
>>> respective callers, depending on the overhead (neither one is likely
>>> to be noticeable).
>>>  2. Copy/pasta an existing radeon Makefile to radeon_common. Tweak to
>>> produce an empty library and get it into the final mega_dri_drivers.so
>>>  2.1 With most/all of the RADEON_R*00 guards gone, one should be able
>>> to just git mv the files into the new location. (alongside dropping
>>> the symlinks)
>>>  2.2 Remove the symbol redefinition hack.

> Three reasons - symlinks in git is frowned upon, binary is "bloated"
> and nasty symbol redefinition(hack).

Well, much of the “bloat” would probably be there even after the
cleanup, since I'm not sure (from the cursory look I've given) how
many of the functions would manage to go into a radeon-common. I
suspect there'll end up being quite some code duplication to get rid
of the symlinks and symbol redefinition hack. This is one of those
contexts in which things could be done marginally more elegantly with
C++. I am proficient enough with C macros and xincludes that I could
probably hack together something that avoids the symlinks and the code
duplication, but whether or not this would be considered an
improvement is very much up to personal taste.

-- 
Giuseppe "Oblomov" Bilotta
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] misc janitorial

2015-12-04 Thread Emil Velikov
On 4 December 2015 at 13:27, Giuseppe Bilotta
 wrote:
> On Tue, Dec 1, 2015 at 6:04 PM, Emil Velikov  wrote:
>
>> With the small comment in patch 3 addressed the series is
>> Reviewed-by: Emil Velikov 
>
> And also:
>
>> Reviewed-by: Ian Romanick 
>
> for patch 1 and 2. Who should I ping for to get them pushed? ;-)
>
Hmm I thought I pushed those in a couple of days ago. Doing it now, thanks !

 The long version:
  1. Grep for RADEON_R{1,2}00 and look at the guarded code
  1.1 Some guards can just be dropped (CHIP_FAMILY_*, get_chip_family_name),
  1.2 Other hunks of code can be split/moved into the respective
 r100/r200 codebase (radeonTexBufferExtension and
 r200TexBufferExtension).
  1.3 Or folded into the vtbl dispatch - get_depth_z32/16 or the
 respective callers, depending on the overhead (neither one is likely
 to be noticeable).
  2. Copy/pasta an existing radeon Makefile to radeon_common. Tweak to
 produce an empty library and get it into the final mega_dri_drivers.so
  2.1 With most/all of the RADEON_R*00 guards gone, one should be able
 to just git mv the files into the new location. (alongside dropping
 the symlinks)
  2.2 Remove the symbol redefinition hack.
>
>> Three reasons - symlinks in git is frowned upon, binary is "bloated"
>> and nasty symbol redefinition(hack).
>
> Well, much of the “bloat” would probably be there even after the
> cleanup, since I'm not sure (from the cursory look I've given) how
> many of the functions would manage to go into a radeon-common. I
> suspect there'll end up being quite some code duplication to get rid
> of the symlinks and symbol redefinition hack.
Not quite ... currently ~50% of the "r200" sources are (almost)
identical to the radeon ones (ls -la src/mesa/drivers/dri/r200/).

> This is one of those
> contexts in which things could be done marginally more elegantly with
> C++. I am proficient enough with C macros and xincludes that I could
> probably hack together something that avoids the symlinks and the code
> duplication, but whether or not this would be considered an
> improvement is very much up to personal taste.
>
Before you start moving things around please fish around for
RADEON_R(12)00. Once things are free of those (either the macros are
just removed or the functions are split into r100 and r200 variant,
all the movement comes into play. The last one can be tedious if
you're unfamiliar with autotools, so just ping me on IRC (not there
today) and I can guide you around.

Cheers.
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79688

--- Comment #21 from Tobias Klausmann  ---
(In reply to Axel Davy from comment #20)
> Should this bug report be closed ?
> DRI3 has complete DRI_PRIME support now.
> 

As far as it concerns me, DRI3 Support is fine and this bug can be marked as
"resolved fixed". 


> Also I'm not sure why you do post these benchmarks here.
> Is there some hidden message about something not working right ?

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3v2] xvmc: force assertion in XvMC tests

2015-12-04 Thread Giuseppe Bilotta
Hello Emil,

>> The comment on u_atomic_test.c is also fixed (read 'debug' where it
>> should have been 'release').

> I've split that one to a separate patch, but for the future please do
> so during submission.

Doh, sorry.

> Also can you take a look at the list of patches you've submitted [1]
> and let me know if any of those are outstanding, so that I can mark
> the remainder as accepted/superseded.

> [1] 
> http://patchwork.freedesktop.org/project/mesa/patches/?submitter=299

All except for the oldest one ("clover: always return number of
devices in clGetDeviceIDs") are iterations of a patchset that has been
merged as commits 9280f17e823cbdbddf30a4ae5e2de9f2d327d33c to
7932b30892ef898ec4c74ac1f972cb462f19962b. For the oldest one I
remember it being discussed and rejected (since in case of errors user
input shouldn't be touched), but I can't find the discussion about it
in my archive.

-- 
Giuseppe "Oblomov" Bilotta
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 93089] mesa fails to check for gcc atomic primitives before using them

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93089

--- Comment #8 from Jonathan Gray  ---
There is no way to backport builtins for architectures that lack atomics at the
ISA level.

On arm/hppa/sh linux has a syscall which the gcc documentation claims can only
do 64 bit values on arm:
https://gcc.gnu.org/wiki/Atomic

Even on Linux with the latest toolchains this should be checked as there are
cases where some or none of the atomics may be present.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] gallium/winsys: Make use of ARRAY_SIZE macro

2015-12-04 Thread eocallaghan

On 2015-12-05 01:33, Emil Velikov wrote:

Hi Edward,

A few trivial suggestions (for future patches) and a humble request

On 4 December 2015 at 06:50, Edward O'Callaghan
 wrote:

Signed-off-by: Edward O'Callaghan 
---
 src/gallium/winsys/amdgpu/drm/amdgpu_surface.c | 6 ++

For the patches do shorten up the prefix - here we'd want
"winsys/amdgpu" while on the driver side just "nouveau" or "llvmpipe"
should suffice. When in doubt git log is your friend


Sure no worries, will do on future patches. Thanks.



You mentioned that you've done these with a Coccinelle schematic patch
(script) - mind if we get that (actually start collecting) in tree ...
perhaps in bin/ or bin/cocci/ ?


Definitely! This is in fact part of my proposed plan after I get some of
the major churn out the way and see what passes review and hence which
.cocci scripts are suitable for mesa.



Thanks
Emil


Kind Regards,
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91806] configure does not test whether assembler supports sse4.1

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91806

--- Comment #2 from Jonathan Gray  ---
gcc returns 1.  I tried various changes to configure.ac but didn't get
anywhere.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 93089] mesa fails to check for gcc atomic primitives before using them

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93089

--- Comment #7 from Emil Velikov  ---
What Matt is saying is that surely Mesa is not the only project out there which
relies on atomics, thus adding (backporting) those in GCC wouldn't be that bad
of an idea.

That aside I'm thinking that porting the libdrm solution wouldn't be that bad
of an idea.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 93188] "nir/nir.h", line 552: Error: Unexpected type name "nir_src" encountered.

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93188

--- Comment #5 from Emil Velikov  ---
(In reply to Ian Romanick from comment #4)
> I guess the question is whether or not this builds with Visual Studio's C++
> compiler.  If not, maybe we should reopen with changes to the summary?

Afaict MSVC was the first one to use anonymous struct/unions. After all GCC
allowed/implemented it as -fms-extensions way before C99 came along.

That aside I did explicitly pointed out the lack of portability as this was
introduced, yet was greeted by hostility (~ish) and absolutely no support on
the topic. Oh well :-\

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91806] configure does not test whether assembler supports sse4.1

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91806

--- Comment #1 from Emil Velikov  ---
In the failing case does gcc return a non zero value ?

I'm leaning that this is some sort of GCC/autotools bug as to the best of my
knowledge we're doing things the "right way" (almost identical hunk is being
used in pixman and other projects).

That aside if you have some ideas how we can improve things please let us know.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] nv50/ir: flip shl(add, imm) into add(shl, imm)

2015-12-04 Thread Ilia Mirkin
This works when the add also has an immediate. This often happens in
address calculations. These addresses can then be inlined as well.

On code targeted to SM35:

total instructions in shared programs : 6223346 -> 6206257 (-0.27%)
total gprs used in shared programs: 911075 -> 911045 (-0.00%)
total local used in shared programs   : 39072 -> 39072 (0.00%)

localgpr   inst  bytes
helped   0 11936643664
  hurt   0  74  15  15

Signed-off-by: Ilia Mirkin 
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 38 +++---
 1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 5f47887..539edaf 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1132,13 +1132,43 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue 
, int s)
  break;
   // try to concatenate shifts
   Instruction *si = i->getSrc(0)->getInsn();
-  if (!si || si->op != OP_SHL)
+  if (!si)
  break;
   ImmediateValue imm1;
-  if (si->src(1).getImmediate(imm1)) {
+  switch (si->op) {
+  case OP_SHL:
+ if (si->src(1).getImmediate(imm1)) {
+bld.setPosition(i, false);
+i->setSrc(0, si->getSrc(0));
+i->setSrc(1, bld.loadImm(NULL, imm0.reg.data.u32 + 
imm1.reg.data.u32));
+ }
+ break;
+  case OP_SUB:
+  case OP_ADD:
+ int adds;
+ if (isFloatType(si->dType))
+return;
+ if (si->op != OP_SUB && si->src(0).getImmediate(imm1))
+adds = 0;
+ else if (si->src(1).getImmediate(imm1))
+adds = 1;
+ else
+return;
+ // SHL(ADD(x, y), z) = ADD(SHL(x, z), SHL(y, z))
+
+ // This is more operations, but if one of x, y is an immediate, then
+ // we can get a situation where (a) we can use ISCADD, or (b)
+ // propagate the add bit into an indirect load.
  bld.setPosition(i, false);
- i->setSrc(0, si->getSrc(0));
- i->setSrc(1, bld.loadImm(NULL, imm0.reg.data.u32 + 
imm1.reg.data.u32));
+ i->op = si->op;
+ i->setSrc(adds, bld.loadImm(NULL, imm1.reg.data.u32 << 
imm0.reg.data.u32));
+ i->setSrc(!adds, bld.mkOp2v(OP_SHL, i->dType,
+ bld.getSSA(i->def(0).getSize(), 
i->def(0).getFile()),
+ si->getSrc(!adds),
+ bld.mkImm(imm0.reg.data.u32)));
+ break;
+  default:
+ return;
   }
}
   break;
-- 
2.4.10

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] nv50/ir: propagate indirect loads into instructions

2015-12-04 Thread Ilia Mirkin
This way $r1 = $r0 + 4; c1[$r1] becomes c1[$r0+4].

On SM35:

total instructions in shared programs : 6206257 -> 6185058 (-0.34%)
total gprs used in shared programs: 911045 -> 910722 (-0.04%)
total local used in shared programs   : 39072 -> 39072 (0.00%)

localgpr   inst  bytes
helped   0 41741954195
  hurt   0 280   0   0

Signed-off-by: Ilia Mirkin 
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 49 ++
 1 file changed, 49 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 539edaf..6dec208 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -265,6 +265,54 @@ LoadPropagation::visit(BasicBlock *bb)
 
 // 
=
 
+class IndirectPropagation : public Pass
+{
+private:
+   virtual bool visit(BasicBlock *);
+};
+
+bool
+IndirectPropagation::visit(BasicBlock *bb)
+{
+   const Target *targ = prog->getTarget();
+   Instruction *next;
+
+   for (Instruction *i = bb->getEntry(); i; i = next) {
+  next = i->next;
+
+  for (int s = 0; i->srcExists(s); ++s) {
+ Instruction *insn;
+ ImmediateValue imm;
+ if (!i->src(s).isIndirect(0))
+continue;
+ insn = i->getIndirect(s, 0)->getInsn();
+ if (!insn)
+continue;
+ if (insn->op == OP_ADD && !isFloatType(insn->dType)) {
+if (insn->src(0).getFile() != targ->nativeFile(FILE_ADDRESS) ||
+!insn->src(1).getImmediate(imm))
+   continue;
+i->setIndirect(s, 0, insn->getSrc(0));
+i->src(s).get()->reg.data.offset += imm.reg.data.u32;
+ } else if (insn->op == OP_SUB && !isFloatType(insn->dType)) {
+if (insn->src(0).getFile() != targ->nativeFile(FILE_ADDRESS) ||
+!insn->src(1).getImmediate(imm))
+   continue;
+i->setIndirect(s, 0, insn->getSrc(0));
+i->src(s).get()->reg.data.offset -= imm.reg.data.u32;
+ } else if (insn->op == OP_MOV) {
+if (!insn->src(0).getImmediate(imm))
+   continue;
+i->setIndirect(s, 0, NULL);
+i->src(s).get()->reg.data.offset += imm.reg.data.u32;
+ }
+  }
+   }
+   return true;
+}
+
+// 
=
+
 // Evaluate constant expressions.
 class ConstantFolding : public Pass
 {
@@ -3135,6 +3183,7 @@ Program::optimizeSSA(int level)
RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks
RUN_PASS(1, ConstantFolding, foldAll);
RUN_PASS(1, LoadPropagation, run);
+   RUN_PASS(1, IndirectPropagation, run);
RUN_PASS(2, MemoryOpt, run);
RUN_PASS(2, LocalCSE, run);
RUN_PASS(0, DeadCodeElim, buryAll);
-- 
2.4.10

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Jose Fonseca

On 04/12/15 20:04, Ilia Mirkin wrote:

On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul  wrote:

On 12/04/2015 12:46 PM, Ilia Mirkin wrote:


On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul  wrote:


+   if (templ->poly_smooth && svga->debug.callback.debug_message) {
+  /* note: we always need a % something in the message string */



Why? Did I mess something up?



If I write it without the dummy %s I get:

In file included from
../../../../src/gallium/auxiliary/util/u_inlines.h:36:0,
  from svga_pipe_rasterizer.c:29:
svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state':
../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected
expression before ')' token
 fmt, __VA_ARGS__); \
 ^
svga_pipe_rasterizer.c:357:7: note: in expansion of macro
'pipe_debug_message'
pipe_debug_message(>debug.callback, CONFORMANCE,
^



H... so that'd be a "yes" to me messing it up :) I'll have a look
to see if there's something simple I can do here. Off-hand it feels
like it should be #__VA_ARGS__? Been a while since I looked at the
specifics of all this.





BTW, right now this callback is only set for debug contexts. Perhaps
that was not an ideal decision... just an FYI though.



apitrace creates a debug context so that's the main thing for me right now.
But it might be nice if setting MESA_DEBUG would cause the debug/info
messages to be printed to stderr for those apps that don't use
GL_ARB_debug_output.


Yeah, we should probably also throw in Enable/Disable hooks into
st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever
the enum is).

BTW, note that apitrace cuts messages off after 100 of them which is
pretty annoying -- I opened an issue at
https://github.com/apitrace/apitrace/issues/395 a while back.


Not just "100", but 100 *per ID*.

Which is very different.

Why is it so annoying?  A patch for an option is welcome anyway BTW.

Jose

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Brian Paul
Use the new debug callback hook to report conformance, performance
and fallbacks to the state tracker.  The state tracker, in turn can
report this issues to the user via the GL_ARB_debug_output extension.

More issues can be reported in the future; this is just a start.
---
 src/gallium/drivers/svga/svga_context.h  |  3 +++
 src/gallium/drivers/svga/svga_draw_arrays.c  |  7 +++
 src/gallium/drivers/svga/svga_pipe_blend.c   | 11 +++
 src/gallium/drivers/svga/svga_pipe_misc.c| 17 +
 src/gallium/drivers/svga/svga_pipe_rasterizer.c  |  6 ++
 src/gallium/drivers/svga/svga_state_need_swtnl.c | 23 +++
 6 files changed, 67 insertions(+)

diff --git a/src/gallium/drivers/svga/svga_context.h 
b/src/gallium/drivers/svga/svga_context.h
index 6a4f9d8..c4284cc 100644
--- a/src/gallium/drivers/svga/svga_context.h
+++ b/src/gallium/drivers/svga/svga_context.h
@@ -392,6 +392,9 @@ struct svga_context
 
   boolean no_line_width;
   boolean force_hw_line_stipple;
+
+  /** To report perf/conformance/etc issues to the state tracker */
+  struct pipe_debug_callback callback;
} debug;
 
struct {
diff --git a/src/gallium/drivers/svga/svga_draw_arrays.c 
b/src/gallium/drivers/svga/svga_draw_arrays.c
index acb2e95..f6956b5 100644
--- a/src/gallium/drivers/svga/svga_draw_arrays.c
+++ b/src/gallium/drivers/svga/svga_draw_arrays.c
@@ -26,6 +26,7 @@
 #include "svga_cmd.h"
 
 #include "util/u_inlines.h"
+#include "util/u_prim.h"
 #include "indices/u_indices.h"
 
 #include "svga_hw_reg.h"
@@ -277,6 +278,12 @@ svga_hwtnl_draw_arrays(struct svga_hwtnl *hwtnl,
   if (ret != PIPE_OK)
  goto done;
 
+  if (svga->debug.callback.debug_message) {
+ pipe_debug_message(>debug.callback, PERF_INFO,
+"generating temporary index buffer for drawing %s",
+u_prim_name(prim));
+  }
+
   ret = svga_hwtnl_simple_draw_range_elements(hwtnl,
   gen_buf,
   gen_size,
diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c 
b/src/gallium/drivers/svga/svga_pipe_blend.c
index 0c9d612..5538193 100644
--- a/src/gallium/drivers/svga/svga_pipe_blend.c
+++ b/src/gallium/drivers/svga/svga_pipe_blend.c
@@ -243,6 +243,17 @@ svga_create_blend_state(struct pipe_context *pipe,
  blend->rt[i].srcblend_alpha = blend->rt[i].srcblend;
  blend->rt[i].dstblend_alpha = blend->rt[i].dstblend;
  blend->rt[i].blendeq_alpha = blend->rt[i].blendeq;
+
+ if (svga->debug.callback.debug_message) {
+if (templ->logicop_func == PIPE_LOGICOP_XOR) {
+   pipe_debug_message(>debug.callback, CONFORMANCE,
+  "XOR logicop mode has limited support%s", 
"");
+}
+else if (templ->logicop_func != PIPE_LOGICOP_COPY) {
+   pipe_debug_message(>debug.callback, CONFORMANCE,
+  "general logicops are not supported%s", "");
+}
+ }
   }
   else {
  /* Note: the vgpu10 device does not yet support independent
diff --git a/src/gallium/drivers/svga/svga_pipe_misc.c 
b/src/gallium/drivers/svga/svga_pipe_misc.c
index c8020da..af9356d 100644
--- a/src/gallium/drivers/svga/svga_pipe_misc.c
+++ b/src/gallium/drivers/svga/svga_pipe_misc.c
@@ -244,6 +244,22 @@ static void svga_set_viewport_states( struct pipe_context 
*pipe,
 }
 
 
+/**
+ * Called by state tracker to specify a callback function the driver
+ * can use to report info back to the state tracker.
+ */
+static void
+svga_set_debug_callback(struct pipe_context *pipe,
+const struct pipe_debug_callback *cb)
+{
+   struct svga_context *svga = svga_context(pipe);
+
+   if (cb)
+  svga->debug.callback = *cb;
+   else
+  memset(>debug.callback, 0, sizeof(svga->debug.callback));
+}
+
 
 void svga_init_misc_functions( struct svga_context *svga )
 {
@@ -252,6 +268,7 @@ void svga_init_misc_functions( struct svga_context *svga )
svga->pipe.set_framebuffer_state = svga_set_framebuffer_state;
svga->pipe.set_clip_state = svga_set_clip_state;
svga->pipe.set_viewport_states = svga_set_viewport_states;
+   svga->pipe.set_debug_callback = svga_set_debug_callback;
 }
 
 
diff --git a/src/gallium/drivers/svga/svga_pipe_rasterizer.c 
b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
index 6310b7a..c6215e6 100644
--- a/src/gallium/drivers/svga/svga_pipe_rasterizer.c
+++ b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
@@ -352,6 +352,12 @@ svga_create_rasterizer_state(struct pipe_context *pipe,
   define_rasterizer_object(svga, rast);
}
 
+   if (templ->poly_smooth && svga->debug.callback.debug_message) {
+  /* note: we always need a % something in the message string */
+  pipe_debug_message(>debug.callback, CONFORMANCE,
+ 

Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Ilia Mirkin
On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul  wrote:
> +   if (templ->poly_smooth && svga->debug.callback.debug_message) {
> +  /* note: we always need a % something in the message string */

Why? Did I mess something up?

BTW, right now this callback is only set for debug contexts. Perhaps
that was not an ideal decision... just an FYI though.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Brian Paul

On 12/04/2015 12:46 PM, Ilia Mirkin wrote:

On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul  wrote:

+   if (templ->poly_smooth && svga->debug.callback.debug_message) {
+  /* note: we always need a % something in the message string */


Why? Did I mess something up?


If I write it without the dummy %s I get:

In file included from 
../../../../src/gallium/auxiliary/util/u_inlines.h:36:0,

 from svga_pipe_rasterizer.c:29:
svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state':
../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected 
expression before ')' token

fmt, __VA_ARGS__); \
^
svga_pipe_rasterizer.c:357:7: note: in expansion of macro 
'pipe_debug_message'

   pipe_debug_message(>debug.callback, CONFORMANCE,
   ^




BTW, right now this callback is only set for debug contexts. Perhaps
that was not an ideal decision... just an FYI though.


apitrace creates a debug context so that's the main thing for me right 
now.  But it might be nice if setting MESA_DEBUG would cause the 
debug/info messages to be printed to stderr for those apps that don't 
use GL_ARB_debug_output.


-Brian


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Ilia Mirkin
On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul  wrote:
> On 12/04/2015 12:46 PM, Ilia Mirkin wrote:
>>
>> On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul  wrote:
>>>
>>> +   if (templ->poly_smooth && svga->debug.callback.debug_message) {
>>> +  /* note: we always need a % something in the message string */
>>
>>
>> Why? Did I mess something up?
>
>
> If I write it without the dummy %s I get:
>
> In file included from
> ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0,
>  from svga_pipe_rasterizer.c:29:
> svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state':
> ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected
> expression before ')' token
> fmt, __VA_ARGS__); \
> ^
> svga_pipe_rasterizer.c:357:7: note: in expansion of macro
> 'pipe_debug_message'
>pipe_debug_message(>debug.callback, CONFORMANCE,
>^
>

H... so that'd be a "yes" to me messing it up :) I'll have a look
to see if there's something simple I can do here. Off-hand it feels
like it should be #__VA_ARGS__? Been a while since I looked at the
specifics of all this.

>
>>
>> BTW, right now this callback is only set for debug contexts. Perhaps
>> that was not an ideal decision... just an FYI though.
>
>
> apitrace creates a debug context so that's the main thing for me right now.
> But it might be nice if setting MESA_DEBUG would cause the debug/info
> messages to be printed to stderr for those apps that don't use
> GL_ARB_debug_output.

Yeah, we should probably also throw in Enable/Disable hooks into
st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever
the enum is).

BTW, note that apitrace cuts messages off after 100 of them which is
pretty annoying -- I opened an issue at
https://github.com/apitrace/apitrace/issues/395 a while back.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nv50/ir: fold shl + mul with immediates

2015-12-04 Thread Ilia Mirkin
On SM35 this gives:

total instructions in shared programs : 6185058 -> 6181090 (-0.06%)
total gprs used in shared programs: 910722 -> 910722 (0.00%)
total local used in shared programs   : 39072 -> 39072 (0.00%)

localgpr   inst  bytes
helped   0   021372137
  hurt   0   0   0   0

Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 6dec208..9760128 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1191,6 +1191,14 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue 
, int s)
 i->setSrc(1, bld.loadImm(NULL, imm0.reg.data.u32 + 
imm1.reg.data.u32));
  }
  break;
+  case OP_MUL:
+ if (!isFloatType(si->dType) && si->src(1).getImmediate(imm1)) {
+bld.setPosition(i, false);
+i->op = OP_MUL;
+i->setSrc(0, si->getSrc(0));
+i->setSrc(1, bld.loadImm(NULL, (1 << imm0.reg.data.u32) * 
imm1.reg.data.u32));
+ }
+ break;
   case OP_SUB:
   case OP_ADD:
  int adds;
-- 
2.4.10

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] radeonsi: apply the streamout workaround to Fiji as well

2015-12-04 Thread Marek Olšák
From: Marek Olšák 

Cc: 11.0 11.1 
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 771d206..ee84a1f 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -872,7 +872,9 @@ void si_draw_vbo(struct pipe_context *ctx, const struct 
pipe_draw_info *info)
 
/* Workaround for a VGT hang when streamout is enabled.
 * It must be done after drawing. */
-   if ((sctx->b.family == CHIP_HAWAII || sctx->b.family == CHIP_TONGA) &&
+   if ((sctx->b.family == CHIP_HAWAII ||
+sctx->b.family == CHIP_TONGA ||
+sctx->b.family == CHIP_FIJI) &&
(sctx->b.streamout.streamout_enabled ||
 sctx->b.streamout.prims_gen_query_enabled)) {
sctx->b.flags |= SI_CONTEXT_VGT_STREAMOUT_SYNC;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] radeonsi: don't use the CP DMA workaround on Fiji and newer

2015-12-04 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_cp_dma.c | 36 ++--
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c 
b/src/gallium/drivers/radeonsi/si_cp_dma.c
index 0bf85a0..a5e3d79 100644
--- a/src/gallium/drivers/radeonsi/si_cp_dma.c
+++ b/src/gallium/drivers/radeonsi/si_cp_dma.c
@@ -273,22 +273,26 @@ void si_copy_buffer(struct si_context *sctx,
dst_offset += r600_resource(dst)->gpu_address;
src_offset += r600_resource(src)->gpu_address;
 
-   /* If the size is not aligned, we must add a dummy copy at the end
-* just to align the internal counter. Otherwise, the DMA engine
-* would slow down by an order of magnitude for following copies.
-*/
-   if (size % CP_DMA_ALIGNMENT)
-   realign_size = CP_DMA_ALIGNMENT - (size % CP_DMA_ALIGNMENT);
-
-   /* If the copy begins unaligned, we must start copying from the next
-* aligned block and the skipped part should be copied after everything
-* else has been copied. Only the src alignment matters, not dst.
-*/
-   if (src_offset % CP_DMA_ALIGNMENT) {
-   skipped_size = CP_DMA_ALIGNMENT - (src_offset % 
CP_DMA_ALIGNMENT);
-   /* The main part will be skipped if the size is too small. */
-   skipped_size = MIN2(skipped_size, size);
-   size -= skipped_size;
+   /* The workarounds aren't needed on Fiji and beyond. */
+   if (sctx->b.family <= CHIP_CARRIZO ||
+   sctx->b.family == CHIP_STONEY) {
+   /* If the size is not aligned, we must add a dummy copy at the 
end
+* just to align the internal counter. Otherwise, the DMA engine
+* would slow down by an order of magnitude for following 
copies.
+*/
+   if (size % CP_DMA_ALIGNMENT)
+   realign_size = CP_DMA_ALIGNMENT - (size % 
CP_DMA_ALIGNMENT);
+
+   /* If the copy begins unaligned, we must start copying from the 
next
+* aligned block and the skipped part should be copied after 
everything
+* else has been copied. Only the src alignment matters, not 
dst.
+*/
+   if (src_offset % CP_DMA_ALIGNMENT) {
+   skipped_size = CP_DMA_ALIGNMENT - (src_offset % 
CP_DMA_ALIGNMENT);
+   /* The main part will be skipped if the size is too 
small. */
+   skipped_size = MIN2(skipped_size, size);
+   size -= skipped_size;
+   }
}
 
/* Flush the caches. */
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] radeonsi: apply the streamout workaround to Fiji as well

2015-12-04 Thread Alex Deucher
On Fri, Dec 4, 2015 at 3:27 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> Cc: 11.0 11.1 

For the series:
Reviewed-by: Alex Deucher 

> ---
>  src/gallium/drivers/radeonsi/si_state_draw.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
> b/src/gallium/drivers/radeonsi/si_state_draw.c
> index 771d206..ee84a1f 100644
> --- a/src/gallium/drivers/radeonsi/si_state_draw.c
> +++ b/src/gallium/drivers/radeonsi/si_state_draw.c
> @@ -872,7 +872,9 @@ void si_draw_vbo(struct pipe_context *ctx, const struct 
> pipe_draw_info *info)
>
> /* Workaround for a VGT hang when streamout is enabled.
>  * It must be done after drawing. */
> -   if ((sctx->b.family == CHIP_HAWAII || sctx->b.family == CHIP_TONGA) &&
> +   if ((sctx->b.family == CHIP_HAWAII ||
> +sctx->b.family == CHIP_TONGA ||
> +sctx->b.family == CHIP_FIJI) &&
> (sctx->b.streamout.streamout_enabled ||
>  sctx->b.streamout.prims_gen_query_enabled)) {
> sctx->b.flags |= SI_CONTEXT_VGT_STREAMOUT_SYNC;
> --
> 2.1.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Ilia Mirkin
On Fri, Dec 4, 2015 at 5:40 PM, Jose Fonseca  wrote:
> On 04/12/15 20:04, Ilia Mirkin wrote:
>>
>> On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul  wrote:
>>>
>>> On 12/04/2015 12:46 PM, Ilia Mirkin wrote:


 On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul  wrote:
>
>
> +   if (templ->poly_smooth && svga->debug.callback.debug_message) {
> +  /* note: we always need a % something in the message string */



 Why? Did I mess something up?
>>>
>>>
>>>
>>> If I write it without the dummy %s I get:
>>>
>>> In file included from
>>> ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0,
>>>   from svga_pipe_rasterizer.c:29:
>>> svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state':
>>> ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected
>>> expression before ')' token
>>>  fmt, __VA_ARGS__); \
>>>  ^
>>> svga_pipe_rasterizer.c:357:7: note: in expansion of macro
>>> 'pipe_debug_message'
>>> pipe_debug_message(>debug.callback, CONFORMANCE,
>>> ^
>>>
>>
>> H... so that'd be a "yes" to me messing it up :) I'll have a look
>> to see if there's something simple I can do here. Off-hand it feels
>> like it should be #__VA_ARGS__? Been a while since I looked at the
>> specifics of all this.
>>
>>>

 BTW, right now this callback is only set for debug contexts. Perhaps
 that was not an ideal decision... just an FYI though.
>>>
>>>
>>>
>>> apitrace creates a debug context so that's the main thing for me right
>>> now.
>>> But it might be nice if setting MESA_DEBUG would cause the debug/info
>>> messages to be printed to stderr for those apps that don't use
>>> GL_ARB_debug_output.
>>
>>
>> Yeah, we should probably also throw in Enable/Disable hooks into
>> st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever
>> the enum is).
>>
>> BTW, note that apitrace cuts messages off after 100 of them which is
>> pretty annoying -- I opened an issue at
>> https://github.com/apitrace/apitrace/issues/395 a while back.
>
>
> Not just "100", but 100 *per ID*.
>
> Which is very different.
>
> Why is it so annoying?  A patch for an option is welcome anyway BTW.

Right, but but an ID is a message. So let's say the message is
"falling back to XYZ", why would you only want to know the first 100
times? Or let's say the message is "shader compiles to XYZ
instructions" -- again, why the first 100 times? In fact, *given* that
you enable these messages, I don't see why it would ever be desirable
to limit them.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Jose Fonseca

On 04/12/15 23:00, Jose Fonseca wrote:

On 04/12/15 22:45, Ilia Mirkin wrote:

On Fri, Dec 4, 2015 at 5:40 PM, Jose Fonseca  wrote:

On 04/12/15 20:04, Ilia Mirkin wrote:


On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul  wrote:


On 12/04/2015 12:46 PM, Ilia Mirkin wrote:



On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul  wrote:



+   if (templ->poly_smooth && svga->debug.callback.debug_message) {
+  /* note: we always need a % something in the message
string */




Why? Did I mess something up?




If I write it without the dummy %s I get:

In file included from
../../../../src/gallium/auxiliary/util/u_inlines.h:36:0,
   from svga_pipe_rasterizer.c:29:
svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state':
../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error:
expected
expression before ')' token
  fmt, __VA_ARGS__); \
  ^
svga_pipe_rasterizer.c:357:7: note: in expansion of macro
'pipe_debug_message'
 pipe_debug_message(>debug.callback, CONFORMANCE,
 ^



H... so that'd be a "yes" to me messing it up :) I'll have a look
to see if there's something simple I can do here. Off-hand it feels
like it should be #__VA_ARGS__? Been a while since I looked at the
specifics of all this.





BTW, right now this callback is only set for debug contexts. Perhaps
that was not an ideal decision... just an FYI though.




apitrace creates a debug context so that's the main thing for me right
now.
But it might be nice if setting MESA_DEBUG would cause the debug/info
messages to be printed to stderr for those apps that don't use
GL_ARB_debug_output.



Yeah, we should probably also throw in Enable/Disable hooks into
st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever
the enum is).

BTW, note that apitrace cuts messages off after 100 of them which is
pretty annoying -- I opened an issue at
https://github.com/apitrace/apitrace/issues/395 a while back.



Not just "100", but 100 *per ID*.

Which is very different.

Why is it so annoying?  A patch for an option is welcome anyway BTW.


Right, but but an ID is a message. So let's say the message is
"falling back to XYZ", why would you only want to know the first 100
times? Or let's say the message is "shader compiles to XYZ
instructions" -- again, why the first 100 times? In fact, *given* that
you enable these messages, I don't see why it would ever be desirable
to limit them.


Isn't it obvious?


OpenGL applications are repetitive by nature.  You're bound to have the
same error over and over again.

Futhermore, some IHVs have messages so frequent that completely overwel
everything else.

So, when you take a trace of an arbitrary app, and replay e.g., on
NVIDIA, you'll get lots and performance hints.  Nevertheless, you'll
want to be able to spot rare and more important issues among all that.
limiting to a fix number, e.g., 100, means that after a while, they all
silence, so that any new issue will stand out.

Soemthing that happens all the time is invariably interesting.


BTW, this is nothing novel -- most logging systems have mechanism in 
place to rate limit messages.


Something bad/unexpected happening 100 times or 10 times is all the 
same -- it happened too many times, and needs to be fixed, which will 
bring the count to zero, or doesn't need to be fixed and it can be 
ignored.   Eitherway, limiting message prevents it from flooding and 
hiding other potentially important issues.


Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Jose Fonseca

On 04/12/15 19:42, Brian Paul wrote:

Use the new debug callback hook to report conformance, performance
and fallbacks to the state tracker.  The state tracker, in turn can
report this issues to the user via the GL_ARB_debug_output extension.

More issues can be reported in the future; this is just a start.
---
  src/gallium/drivers/svga/svga_context.h  |  3 +++
  src/gallium/drivers/svga/svga_draw_arrays.c  |  7 +++
  src/gallium/drivers/svga/svga_pipe_blend.c   | 11 +++
  src/gallium/drivers/svga/svga_pipe_misc.c| 17 +
  src/gallium/drivers/svga/svga_pipe_rasterizer.c  |  6 ++
  src/gallium/drivers/svga/svga_state_need_swtnl.c | 23 +++
  6 files changed, 67 insertions(+)

diff --git a/src/gallium/drivers/svga/svga_context.h 
b/src/gallium/drivers/svga/svga_context.h
index 6a4f9d8..c4284cc 100644
--- a/src/gallium/drivers/svga/svga_context.h
+++ b/src/gallium/drivers/svga/svga_context.h
@@ -392,6 +392,9 @@ struct svga_context

boolean no_line_width;
boolean force_hw_line_stipple;
+
+  /** To report perf/conformance/etc issues to the state tracker */
+  struct pipe_debug_callback callback;
 } debug;

 struct {
diff --git a/src/gallium/drivers/svga/svga_draw_arrays.c 
b/src/gallium/drivers/svga/svga_draw_arrays.c
index acb2e95..f6956b5 100644
--- a/src/gallium/drivers/svga/svga_draw_arrays.c
+++ b/src/gallium/drivers/svga/svga_draw_arrays.c
@@ -26,6 +26,7 @@
  #include "svga_cmd.h"

  #include "util/u_inlines.h"
+#include "util/u_prim.h"
  #include "indices/u_indices.h"

  #include "svga_hw_reg.h"
@@ -277,6 +278,12 @@ svga_hwtnl_draw_arrays(struct svga_hwtnl *hwtnl,
if (ret != PIPE_OK)
   goto done;

+  if (svga->debug.callback.debug_message) {
+ pipe_debug_message(>debug.callback, PERF_INFO,
+"generating temporary index buffer for drawing %s",
+u_prim_name(prim));
+  }
+
ret = svga_hwtnl_simple_draw_range_elements(hwtnl,
gen_buf,
gen_size,
diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c 
b/src/gallium/drivers/svga/svga_pipe_blend.c
index 0c9d612..5538193 100644
--- a/src/gallium/drivers/svga/svga_pipe_blend.c
+++ b/src/gallium/drivers/svga/svga_pipe_blend.c
@@ -243,6 +243,17 @@ svga_create_blend_state(struct pipe_context *pipe,
   blend->rt[i].srcblend_alpha = blend->rt[i].srcblend;
   blend->rt[i].dstblend_alpha = blend->rt[i].dstblend;
   blend->rt[i].blendeq_alpha = blend->rt[i].blendeq;
+
+ if (svga->debug.callback.debug_message) {
+if (templ->logicop_func == PIPE_LOGICOP_XOR) {
+   pipe_debug_message(>debug.callback, CONFORMANCE,
+  "XOR logicop mode has limited support%s", 
"");
+}
+else if (templ->logicop_func != PIPE_LOGICOP_COPY) {
+   pipe_debug_message(>debug.callback, CONFORMANCE,
+  "general logicops are not supported%s", "");
+}
+ }
}
else {
   /* Note: the vgpu10 device does not yet support independent
diff --git a/src/gallium/drivers/svga/svga_pipe_misc.c 
b/src/gallium/drivers/svga/svga_pipe_misc.c
index c8020da..af9356d 100644
--- a/src/gallium/drivers/svga/svga_pipe_misc.c
+++ b/src/gallium/drivers/svga/svga_pipe_misc.c
@@ -244,6 +244,22 @@ static void svga_set_viewport_states( struct pipe_context 
*pipe,
  }


+/**
+ * Called by state tracker to specify a callback function the driver
+ * can use to report info back to the state tracker.
+ */
+static void
+svga_set_debug_callback(struct pipe_context *pipe,
+const struct pipe_debug_callback *cb)
+{
+   struct svga_context *svga = svga_context(pipe);
+
+   if (cb)
+  svga->debug.callback = *cb;
+   else
+  memset(>debug.callback, 0, sizeof(svga->debug.callback));
+}
+

  void svga_init_misc_functions( struct svga_context *svga )
  {
@@ -252,6 +268,7 @@ void svga_init_misc_functions( struct svga_context *svga )
 svga->pipe.set_framebuffer_state = svga_set_framebuffer_state;
 svga->pipe.set_clip_state = svga_set_clip_state;
 svga->pipe.set_viewport_states = svga_set_viewport_states;
+   svga->pipe.set_debug_callback = svga_set_debug_callback;
  }


diff --git a/src/gallium/drivers/svga/svga_pipe_rasterizer.c 
b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
index 6310b7a..c6215e6 100644
--- a/src/gallium/drivers/svga/svga_pipe_rasterizer.c
+++ b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
@@ -352,6 +352,12 @@ svga_create_rasterizer_state(struct pipe_context *pipe,
define_rasterizer_object(svga, rast);
 }

+   if (templ->poly_smooth && svga->debug.callback.debug_message) {
+  /* note: we always need a % something in the message string */
+  

Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Ilia Mirkin
On Fri, Dec 4, 2015 at 6:25 PM, Jose Fonseca  wrote:
> On 04/12/15 19:42, Brian Paul wrote:
>>
>> Use the new debug callback hook to report conformance, performance
>> and fallbacks to the state tracker.  The state tracker, in turn can
>> report this issues to the user via the GL_ARB_debug_output extension.
>>
>> More issues can be reported in the future; this is just a start.
>> ---
>>   src/gallium/drivers/svga/svga_context.h  |  3 +++
>>   src/gallium/drivers/svga/svga_draw_arrays.c  |  7 +++
>>   src/gallium/drivers/svga/svga_pipe_blend.c   | 11 +++
>>   src/gallium/drivers/svga/svga_pipe_misc.c| 17 +
>>   src/gallium/drivers/svga/svga_pipe_rasterizer.c  |  6 ++
>>   src/gallium/drivers/svga/svga_state_need_swtnl.c | 23
>> +++
>>   6 files changed, 67 insertions(+)
>>
>> diff --git a/src/gallium/drivers/svga/svga_context.h
>> b/src/gallium/drivers/svga/svga_context.h
>> index 6a4f9d8..c4284cc 100644
>> --- a/src/gallium/drivers/svga/svga_context.h
>> +++ b/src/gallium/drivers/svga/svga_context.h
>> @@ -392,6 +392,9 @@ struct svga_context
>>
>> boolean no_line_width;
>> boolean force_hw_line_stipple;
>> +
>> +  /** To report perf/conformance/etc issues to the state tracker */
>> +  struct pipe_debug_callback callback;
>>  } debug;
>>
>>  struct {
>> diff --git a/src/gallium/drivers/svga/svga_draw_arrays.c
>> b/src/gallium/drivers/svga/svga_draw_arrays.c
>> index acb2e95..f6956b5 100644
>> --- a/src/gallium/drivers/svga/svga_draw_arrays.c
>> +++ b/src/gallium/drivers/svga/svga_draw_arrays.c
>> @@ -26,6 +26,7 @@
>>   #include "svga_cmd.h"
>>
>>   #include "util/u_inlines.h"
>> +#include "util/u_prim.h"
>>   #include "indices/u_indices.h"
>>
>>   #include "svga_hw_reg.h"
>> @@ -277,6 +278,12 @@ svga_hwtnl_draw_arrays(struct svga_hwtnl *hwtnl,
>> if (ret != PIPE_OK)
>>goto done;
>>
>> +  if (svga->debug.callback.debug_message) {
>> + pipe_debug_message(>debug.callback, PERF_INFO,
>> +"generating temporary index buffer for
>> drawing %s",
>> +u_prim_name(prim));
>> +  }
>> +
>> ret = svga_hwtnl_simple_draw_range_elements(hwtnl,
>> gen_buf,
>> gen_size,
>> diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c
>> b/src/gallium/drivers/svga/svga_pipe_blend.c
>> index 0c9d612..5538193 100644
>> --- a/src/gallium/drivers/svga/svga_pipe_blend.c
>> +++ b/src/gallium/drivers/svga/svga_pipe_blend.c
>> @@ -243,6 +243,17 @@ svga_create_blend_state(struct pipe_context *pipe,
>>blend->rt[i].srcblend_alpha = blend->rt[i].srcblend;
>>blend->rt[i].dstblend_alpha = blend->rt[i].dstblend;
>>blend->rt[i].blendeq_alpha = blend->rt[i].blendeq;
>> +
>> + if (svga->debug.callback.debug_message) {
>> +if (templ->logicop_func == PIPE_LOGICOP_XOR) {
>> +   pipe_debug_message(>debug.callback, CONFORMANCE,
>> +  "XOR logicop mode has limited
>> support%s", "");
>> +}
>> +else if (templ->logicop_func != PIPE_LOGICOP_COPY) {
>> +   pipe_debug_message(>debug.callback, CONFORMANCE,
>> +  "general logicops are not supported%s",
>> "");
>> +}
>> + }
>> }
>> else {
>>/* Note: the vgpu10 device does not yet support independent
>> diff --git a/src/gallium/drivers/svga/svga_pipe_misc.c
>> b/src/gallium/drivers/svga/svga_pipe_misc.c
>> index c8020da..af9356d 100644
>> --- a/src/gallium/drivers/svga/svga_pipe_misc.c
>> +++ b/src/gallium/drivers/svga/svga_pipe_misc.c
>> @@ -244,6 +244,22 @@ static void svga_set_viewport_states( struct
>> pipe_context *pipe,
>>   }
>>
>>
>> +/**
>> + * Called by state tracker to specify a callback function the driver
>> + * can use to report info back to the state tracker.
>> + */
>> +static void
>> +svga_set_debug_callback(struct pipe_context *pipe,
>> +const struct pipe_debug_callback *cb)
>> +{
>> +   struct svga_context *svga = svga_context(pipe);
>> +
>> +   if (cb)
>> +  svga->debug.callback = *cb;
>> +   else
>> +  memset(>debug.callback, 0, sizeof(svga->debug.callback));
>> +}
>> +
>>
>>   void svga_init_misc_functions( struct svga_context *svga )
>>   {
>> @@ -252,6 +268,7 @@ void svga_init_misc_functions( struct svga_context
>> *svga )
>>  svga->pipe.set_framebuffer_state = svga_set_framebuffer_state;
>>  svga->pipe.set_clip_state = svga_set_clip_state;
>>  svga->pipe.set_viewport_states = svga_set_viewport_states;
>> +   svga->pipe.set_debug_callback = svga_set_debug_callback;
>>   }
>>
>>
>> diff --git a/src/gallium/drivers/svga/svga_pipe_rasterizer.c
>> b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
>> index 

Re: [Mesa-dev] [PATCH] radeonsi: disable DCC on Stoney

2015-12-04 Thread Alex Deucher
On Fri, Dec 4, 2015 at 2:35 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> Cc: 11.0 11.1 

Reviewed-by: Alex Deucher 

> ---
>  src/gallium/drivers/radeon/r600_texture.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/gallium/drivers/radeon/r600_texture.c 
> b/src/gallium/drivers/radeon/r600_texture.c
> index e2947d9..e7b2aea 100644
> --- a/src/gallium/drivers/radeon/r600_texture.c
> +++ b/src/gallium/drivers/radeon/r600_texture.c
> @@ -497,6 +497,10 @@ static void vi_texture_alloc_dcc_separate(struct 
> r600_common_screen *rscreen,
> if (rscreen->debug_flags & DBG_NO_DCC)
> return;
>
> +   /* TODO: DCC is broken on Stoney */
> +   if (rscreen->family == CHIP_STONEY)
> +   return;
> +
> rtex->dcc_buffer = (struct r600_resource *)
> r600_aligned_buffer_create(>b, PIPE_BIND_CUSTOM,
>PIPE_USAGE_DEFAULT, 
> rtex->surface.dcc_size, rtex->surface.dcc_alignment);
> --
> 2.1.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 1/9] vc4: Do all uniform loads with byte offsets

2015-12-04 Thread Jason Ekstrand
Previously, we lowered direct uniform loads to dword offsets and indirect
loads to byte offsets in vc4_nir_lower_io.  However, it simplifies things a
bit if we just use byte offsets for everything and then divide by 4 when we
handle the direct uniform load.

Cc: Eric Anholt 
---
 src/gallium/drivers/vc4/vc4_nir_lower_io.c | 17 +
 src/gallium/drivers/vc4/vc4_program.c  | 11 +++
 2 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/vc4/vc4_nir_lower_io.c 
b/src/gallium/drivers/vc4/vc4_nir_lower_io.c
index 1afe52a..e2581c8 100644
--- a/src/gallium/drivers/vc4/vc4_nir_lower_io.c
+++ b/src/gallium/drivers/vc4/vc4_nir_lower_io.c
@@ -338,8 +338,8 @@ static void
 vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder *b,
   nir_intrinsic_instr *intr)
 {
-/* All TGSI-to-NIR uniform loads are vec4, but we may create dword
- * loads in our lowering passes.
+/* All TGSI-to-NIR uniform loads are vec4, but we need byte offsets
+ * byte offsets in the backend.
  */
 if (intr->num_components == 1)
 return;
@@ -355,6 +355,9 @@ vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder *b,
 intr_comp->num_components = 1;
 nir_ssa_dest_init(_comp->instr, _comp->dest, 1, 
NULL);
 
+/* Convert the base offset to bytes and add the component */
+intr_comp->const_index[0] = (intr->const_index[0] * 16 + i * 
4);
+
 if (intr->intrinsic == nir_intrinsic_load_uniform_indirect) {
 /* Convert the variable TGSI register index to a byte
  * offset.
@@ -363,16 +366,6 @@ vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder 
*b,
 nir_src_for_ssa(nir_ishl(b,
  intr->src[0].ssa,
  nir_imm_int(b, 4)));
-
-/* Convert the offset to be a byte index, too. */
-intr_comp->const_index[0] = (intr->const_index[0] * 16 
+
- i * 4);
-} else {
-/* We want a dword index for non-indirect uniform
- * loads.
- */
-intr_comp->const_index[0] = (intr->const_index[0] * 4 +
- i);
 }
 
 dests[i] = _comp->dest.ssa;
diff --git a/src/gallium/drivers/vc4/vc4_program.c 
b/src/gallium/drivers/vc4/vc4_program.c
index 197577b..1a45bac 100644
--- a/src/gallium/drivers/vc4/vc4_program.c
+++ b/src/gallium/drivers/vc4/vc4_program.c
@@ -1433,6 +1433,7 @@ static void
 ntq_emit_intrinsic(struct vc4_compile *c, nir_intrinsic_instr *instr)
 {
 const nir_intrinsic_info *info = 
_intrinsic_infos[instr->intrinsic];
+int dword_offset;
 struct qreg *dest = NULL;
 
 if (info->has_dest) {
@@ -1442,11 +1443,13 @@ ntq_emit_intrinsic(struct vc4_compile *c, 
nir_intrinsic_instr *instr)
 switch (instr->intrinsic) {
 case nir_intrinsic_load_uniform:
 assert(instr->num_components == 1);
-if (instr->const_index[0] < VC4_NIR_STATE_UNIFORM_OFFSET) {
-*dest = qir_uniform(c, QUNIFORM_UNIFORM,
-instr->const_index[0]);
+/* The offset is in bytes, but we need dwords */
+assert(instr->const_index[0] % 4 == 0);
+dword_offset = instr->const_index[0] / 4;
+if (dword_offset < VC4_NIR_STATE_UNIFORM_OFFSET) {
+*dest = qir_uniform(c, QUNIFORM_UNIFORM, dword_offset);
 } else {
-*dest = qir_uniform(c, instr->const_index[0] -
+*dest = qir_uniform(c, dword_offset -
 VC4_NIR_STATE_UNIFORM_OFFSET,
 0);
 }
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 3/9] nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect

2015-12-04 Thread Jason Ekstrand
---
 src/glsl/nir/glsl_to_nir.cpp | 47 +++-
 1 file changed, 7 insertions(+), 40 deletions(-)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 45d045c..ba23b91 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -853,24 +853,12 @@ nir_visitor::visit(ir_call *ir)
  ir_constant *write_mask = ((ir_instruction *)param)->as_constant();
  assert(write_mask);
 
- /* Check if we need the indirect version */
- ir_constant *const_offset = offset->as_constant();
- if (!const_offset) {
-op = nir_intrinsic_store_ssbo_indirect;
-ralloc_free(instr);
-instr = nir_intrinsic_instr_create(shader, op);
-instr->src[2] = nir_src_for_ssa(evaluate_rvalue(offset));
-instr->const_index[0] = 0;
- } else {
-instr->const_index[0] = const_offset->value.u[0];
- }
-
- instr->const_index[1] = write_mask->value.u[0];
-
  instr->src[0] = nir_src_for_ssa(evaluate_rvalue(val));
+ instr->src[1] = nir_src_for_ssa(evaluate_rvalue(block));
+ instr->src[2] = nir_src_for_ssa(evaluate_rvalue(offset));
+ instr->const_index[0] = write_mask->value.u[0];
  instr->num_components = val->type->vector_elements;
 
- instr->src[1] = nir_src_for_ssa(evaluate_rvalue(block));
  nir_builder_instr_insert(, >instr);
  break;
   }
@@ -881,20 +869,8 @@ nir_visitor::visit(ir_call *ir)
  param = param->get_next();
  ir_rvalue *offset = ((ir_instruction *)param)->as_rvalue();
 
- /* Check if we need the indirect version */
- ir_constant *const_offset = offset->as_constant();
- if (!const_offset) {
-op = nir_intrinsic_load_ssbo_indirect;
-ralloc_free(instr);
-instr = nir_intrinsic_instr_create(shader, op);
-instr->src[1] = nir_src_for_ssa(evaluate_rvalue(offset));
-instr->const_index[0] = 0;
-dest = >dest;
- } else {
-instr->const_index[0] = const_offset->value.u[0];
- }
-
  instr->src[0] = nir_src_for_ssa(evaluate_rvalue(block));
+ instr->src[1] = nir_src_for_ssa(evaluate_rvalue(offset));
 
  const glsl_type *type = ir->return_deref->var->type;
  instr->num_components = type->vector_elements;
@@ -1174,20 +1150,11 @@ nir_visitor::visit(ir_expression *ir)
/* Some special cases */
switch (ir->operation) {
case ir_binop_ubo_load: {
-  ir_constant *const_index = ir->operands[1]->as_constant();
-
-  nir_intrinsic_op op;
-  if (const_index) {
- op = nir_intrinsic_load_ubo;
-  } else {
- op = nir_intrinsic_load_ubo_indirect;
-  }
-  nir_intrinsic_instr *load = nir_intrinsic_instr_create(this->shader, op);
+  nir_intrinsic_instr *load =
+ nir_intrinsic_instr_create(this->shader, nir_intrinsic_load_ubo);
   load->num_components = ir->type->vector_elements;
-  load->const_index[0] = const_index ? const_index->value.u[0] : 0; /* 
base offset */
   load->src[0] = nir_src_for_ssa(evaluate_rvalue(ir->operands[0]));
-  if (!const_index)
- load->src[1] = nir_src_for_ssa(evaluate_rvalue(ir->operands[1]));
+  load->src[1] = nir_src_for_ssa(evaluate_rvalue(ir->operands[1]));
   add_instr(>instr, ir->type->vector_elements);
 
   /*
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 6/9] i965/vec4: Get rid of load/store_foo_indirect

2015-12-04 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp |  7 +--
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 91 +++
 2 files changed, 40 insertions(+), 58 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp
index e51ef4b..6f66978 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp
@@ -60,19 +60,19 @@ vec4_gs_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
src_reg src;
 
switch (instr->intrinsic) {
-   case nir_intrinsic_load_per_vertex_input_indirect:
-  assert(!"EmitNoIndirectInput should prevent this.");
case nir_intrinsic_load_per_vertex_input: {
   /* The EmitNoIndirectInput flag guarantees our vertex index will
* be constant.  We should handle indirects someday.
*/
   nir_const_value *vertex = nir_src_as_const_value(instr->src[0]);
+  nir_const_value *offset = nir_src_as_const_value(instr->src[1]);
 
   /* Make up a type...we have no way of knowing... */
   const glsl_type *const type = glsl_type::ivec(instr->num_components);
 
   src = src_reg(ATTR, BRW_VARYING_SLOT_COUNT * vertex->u[0] +
-  instr->const_index[0], type);
+  instr->const_index[0] + offset->u[0],
+type);
   dest = get_nir_dest(instr->dest, src.type);
   dest.writemask = brw_writemask_for_size(instr->num_components);
   emit(MOV(dest, src));
@@ -80,7 +80,6 @@ vec4_gs_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
}
 
case nir_intrinsic_load_input:
-   case nir_intrinsic_load_input_indirect:
   unreachable("nir_lower_io should have produced per_vertex intrinsics");
 
case nir_intrinsic_emit_vertex_with_counter: {
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 1e03022..dd2d7e7 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -369,22 +369,17 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
dst_reg dest;
src_reg src;
 
-   bool has_indirect = false;
-
switch (instr->intrinsic) {
 
-   case nir_intrinsic_load_input_indirect:
-  has_indirect = true;
-  /* fallthrough */
case nir_intrinsic_load_input: {
-  int offset = instr->const_index[0];
-  src = src_reg(ATTR, offset, glsl_type::uvec4_type);
+  nir_const_value *const_offset = nir_src_as_const_value(instr->src[0]);
+
+  /* We set EmitNoIndirectInput for VS */
+  assert(const_offset);
+
+  src = src_reg(ATTR, instr->const_index[0] + const_offset->u[0],
+glsl_type::uvec4_type);
 
-  if (has_indirect) {
- dest.reladdr = new(mem_ctx) src_reg(get_nir_src(instr->src[0],
- BRW_REGISTER_TYPE_D,
- 1));
-  }
   dest = get_nir_dest(instr->dest, src.type);
   dest.writemask = brw_writemask_for_size(instr->num_components);
 
@@ -392,11 +387,11 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
   break;
}
 
-   case nir_intrinsic_store_output_indirect:
-  unreachable("nir_lower_outputs_to_temporaries should prevent this");
-
case nir_intrinsic_store_output: {
-  int varying = instr->const_index[0];
+  nir_const_value *const_offset = nir_src_as_const_value(instr->src[1]);
+  assert(const_offset);
+
+  int varying = instr->const_index[0] + const_offset->u[0];
 
   src = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_F,
 instr->num_components);
@@ -431,9 +426,6 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
   break;
}
 
-   case nir_intrinsic_store_ssbo_indirect:
-  has_indirect = true;
-  /* fallthrough */
case nir_intrinsic_store_ssbo: {
   assert(devinfo->gen >= 7);
 
@@ -458,20 +450,19 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
   }
 
   /* Offset */
-  src_reg offset_reg = src_reg(this, glsl_type::uint_type);
-  unsigned const_offset_bytes = 0;
-  if (has_indirect) {
- emit(MOV(dst_reg(offset_reg), get_nir_src(instr->src[2], 1)));
+  src_reg offset_reg;
+  nir_const_value *const_offset = nir_src_as_const_value(instr->src[2]);
+  if (const_offset) {
+ offset_reg = brw_imm_ud(const_offset->u[0]);
   } else {
- const_offset_bytes = instr->const_index[0];
- emit(MOV(dst_reg(offset_reg), brw_imm_ud(const_offset_bytes)));
+ offset_reg = get_nir_src(instr->src[2], 1);
   }
 
   /* Value */
   src_reg val_reg = get_nir_src(instr->src[0], 4);
 
   /* Writemask */
-  unsigned write_mask = instr->const_index[1];
+  unsigned write_mask = instr->const_index[0];
 
   /* IvyBridge does not have a native SIMD4x2 untyped write 

[Mesa-dev] [PATCH v2 9/9] vc4/nir: Use the new unified io intrinsics

2015-12-04 Thread Jason Ekstrand
Cc: Eric Anholt 
---
 src/gallium/drivers/vc4/vc4_nir_lower_io.c | 16 ++-
 src/gallium/drivers/vc4/vc4_program.c  | 31 --
 2 files changed, 23 insertions(+), 24 deletions(-)

diff --git a/src/gallium/drivers/vc4/vc4_nir_lower_io.c 
b/src/gallium/drivers/vc4/vc4_nir_lower_io.c
index e2581c8..ba05797 100644
--- a/src/gallium/drivers/vc4/vc4_nir_lower_io.c
+++ b/src/gallium/drivers/vc4/vc4_nir_lower_io.c
@@ -358,15 +358,12 @@ vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder 
*b,
 /* Convert the base offset to bytes and add the component */
 intr_comp->const_index[0] = (intr->const_index[0] * 16 + i * 
4);
 
-if (intr->intrinsic == nir_intrinsic_load_uniform_indirect) {
-/* Convert the variable TGSI register index to a byte
- * offset.
- */
-intr_comp->src[0] =
-nir_src_for_ssa(nir_ishl(b,
- intr->src[0].ssa,
- nir_imm_int(b, 4)));
-}
+/* Convert the offset to bytes.  If it happens to be a
+ * constant, constant-folding will clean up the shift for us.
+ */
+intr_comp->src[0] =
+nir_src_for_ssa(nir_ishl(b, intr->src[0].ssa,
+ nir_imm_int(b, 4)));
 
 dests[i] = _comp->dest.ssa;
 
@@ -397,7 +394,6 @@ vc4_nir_lower_io_instr(struct vc4_compile *c, nir_builder 
*b,
 break;
 
 case nir_intrinsic_load_uniform:
-case nir_intrinsic_load_uniform_indirect:
 case nir_intrinsic_load_user_clip_plane:
 vc4_nir_lower_uniform(c, b, intr);
 break;
diff --git a/src/gallium/drivers/vc4/vc4_program.c 
b/src/gallium/drivers/vc4/vc4_program.c
index 1a45bac..1423d6e 100644
--- a/src/gallium/drivers/vc4/vc4_program.c
+++ b/src/gallium/drivers/vc4/vc4_program.c
@@ -1433,7 +1433,7 @@ static void
 ntq_emit_intrinsic(struct vc4_compile *c, nir_intrinsic_instr *instr)
 {
 const nir_intrinsic_info *info = 
_intrinsic_infos[instr->intrinsic];
-int dword_offset;
+nir_const_value *const_offset;
 struct qreg *dest = NULL;
 
 if (info->has_dest) {
@@ -1443,23 +1443,26 @@ ntq_emit_intrinsic(struct vc4_compile *c, 
nir_intrinsic_instr *instr)
 switch (instr->intrinsic) {
 case nir_intrinsic_load_uniform:
 assert(instr->num_components == 1);
-/* The offset is in bytes, but we need dwords */
-assert(instr->const_index[0] % 4 == 0);
-dword_offset = instr->const_index[0] / 4;
-if (dword_offset < VC4_NIR_STATE_UNIFORM_OFFSET) {
-*dest = qir_uniform(c, QUNIFORM_UNIFORM, dword_offset);
+const_offset = nir_src_as_const_value(instr->src[0]);
+if (const_offset) {
+unsigned offset = const_offset->u[0] +
+  instr->const_index[0];
+assert(offset % 4 == 0);
+/* We need dwords */
+offset = offset / 4;
+if (offset < VC4_NIR_STATE_UNIFORM_OFFSET) {
+*dest = qir_uniform(c, QUNIFORM_UNIFORM,
+offset);
+} else {
+*dest = qir_uniform(c, offset -
+
VC4_NIR_STATE_UNIFORM_OFFSET,
+0);
+}
 } else {
-*dest = qir_uniform(c, dword_offset -
-VC4_NIR_STATE_UNIFORM_OFFSET,
-0);
+*dest = indirect_uniform_load(c, instr);
 }
 break;
 
-case nir_intrinsic_load_uniform_indirect:
-*dest = indirect_uniform_load(c, instr);
-
-break;
-
 case nir_intrinsic_load_user_clip_plane:
 *dest = qir_uniform(c, QUNIFORM_USER_CLIP_PLANE,
 instr->const_index[0]);
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 4/9] nir/lower_io: Get rid of load/store_foo_indirect

2015-12-04 Thread Jason Ekstrand
---
 src/glsl/nir/nir.h  |   2 +-
 src/glsl/nir/nir_lower_io.c | 111 +---
 2 files changed, 44 insertions(+), 69 deletions(-)

diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index e161b70..2e72e66 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1969,7 +1969,7 @@ void nir_assign_var_locations(struct exec_list *var_list,
 void nir_lower_io(nir_shader *shader,
   nir_variable_mode mode,
   int (*type_size)(const struct glsl_type *));
-nir_src *nir_get_io_indirect_src(nir_intrinsic_instr *instr);
+nir_src *nir_get_io_offset_src(nir_intrinsic_instr *instr);
 nir_src *nir_get_io_vertex_index_src(nir_intrinsic_instr *instr);
 
 void nir_lower_vars_to_ssa(nir_shader *shader);
diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
index f64ac69..a2723d5 100644
--- a/src/glsl/nir/nir_lower_io.c
+++ b/src/glsl/nir/nir_lower_io.c
@@ -86,10 +86,9 @@ is_per_vertex_output(struct lower_io_state *state, 
nir_variable *var)
   stage == MESA_SHADER_TESS_CTRL;
 }
 
-static unsigned
+static nir_ssa_def *
 get_io_offset(nir_builder *b, nir_deref_var *deref,
   nir_ssa_def **vertex_index,
-  nir_ssa_def **out_indirect,
   int (*type_size)(const struct glsl_type *))
 {
nir_deref *tail = >deref;
@@ -109,8 +108,8 @@ get_io_offset(nir_builder *b, nir_deref_var *deref,
   *vertex_index = vtx;
}
 
-   nir_ssa_def *indirect = NULL;
-   unsigned base_offset = 0;
+   /* Just emit code and let constant-folding go to town */
+   nir_ssa_def *offset = nir_imm_int(b, 0);
 
while (tail->child != NULL) {
   const struct glsl_type *parent_type = tail->type;
@@ -120,55 +119,46 @@ get_io_offset(nir_builder *b, nir_deref_var *deref,
  nir_deref_array *deref_array = nir_deref_as_array(tail);
  unsigned size = type_size(tail->type);
 
- base_offset += size * deref_array->base_offset;
+ offset = nir_iadd(b, offset,
+   nir_imm_int(b, size * deref_array->base_offset));
 
  if (deref_array->deref_array_type == nir_deref_array_type_indirect) {
 nir_ssa_def *mul =
nir_imul(b, nir_imm_int(b, size),
 nir_ssa_for_src(b, deref_array->indirect, 1));
 
-indirect = indirect ? nir_iadd(b, indirect, mul) : mul;
+offset = nir_iadd(b, offset, mul);
  }
   } else if (tail->deref_type == nir_deref_type_struct) {
  nir_deref_struct *deref_struct = nir_deref_as_struct(tail);
 
+ unsigned field_offset = 0;
  for (unsigned i = 0; i < deref_struct->index; i++) {
-base_offset += type_size(glsl_get_struct_field(parent_type, i));
+field_offset += type_size(glsl_get_struct_field(parent_type, i));
  }
+ offset = nir_iadd(b, offset, nir_imm_int(b, field_offset));
   }
}
 
-   *out_indirect = indirect;
-   return base_offset;
+   return offset;
 }
 
 static nir_intrinsic_op
 load_op(struct lower_io_state *state,
-nir_variable_mode mode, bool per_vertex, bool has_indirect)
+nir_variable_mode mode, bool per_vertex)
 {
nir_intrinsic_op op;
switch (mode) {
case nir_var_shader_in:
-  if (per_vertex) {
- op = has_indirect ? nir_intrinsic_load_per_vertex_input_indirect :
- nir_intrinsic_load_per_vertex_input;
-  } else {
- op = has_indirect ? nir_intrinsic_load_input_indirect :
- nir_intrinsic_load_input;
-  }
+  op = per_vertex ? nir_intrinsic_load_per_vertex_input :
+nir_intrinsic_load_input;
   break;
case nir_var_shader_out:
-  if (per_vertex) {
- op = has_indirect ? nir_intrinsic_load_per_vertex_output_indirect :
- nir_intrinsic_load_per_vertex_output;
-  } else {
- op = has_indirect ? nir_intrinsic_load_output_indirect :
- nir_intrinsic_load_output;
-  }
+  op = per_vertex ? nir_intrinsic_load_per_vertex_output :
+nir_intrinsic_load_output;
   break;
case nir_var_uniform:
-  op = has_indirect ? nir_intrinsic_load_uniform_indirect :
-  nir_intrinsic_load_uniform;
+  op = nir_intrinsic_load_uniform;
   break;
default:
   unreachable("Unknown variable mode");
@@ -211,32 +201,25 @@ nir_lower_io_block(nir_block *block, void *void_state)
 is_per_vertex_input(state, intrin->variables[0]->var) ||
 is_per_vertex_output(state, intrin->variables[0]->var);
 
- nir_ssa_def *indirect;
+ nir_ssa_def *offset;
  nir_ssa_def *vertex_index;
 
- unsigned offset = get_io_offset(b, intrin->variables[0],
- per_vertex ? _index : NULL,
- , state->type_size);
+ 

[Mesa-dev] [PATCH v2 8/9] ir3/nir: Use the new unified io intrinsics

2015-12-04 Thread Jason Ekstrand
Cc: Rob Clark 
---
 .../drivers/freedreno/ir3/ir3_compiler_nir.c   | 79 +-
 1 file changed, 47 insertions(+), 32 deletions(-)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index 8617704..99b4264 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -1215,6 +1215,7 @@ emit_intrinsic_load_ubo(struct ir3_compile *ctx, 
nir_intrinsic_instr *intr,
 {
struct ir3_block *b = ctx->block;
struct ir3_instruction *addr, *src0, *src1;
+   nir_const_value *const_offset;
/* UBO addresses are the first driver params: */
unsigned ubo = regid(ctx->so->first_driver_param + IR3_UBOS_OFF, 0);
unsigned off = intr->const_index[0];
@@ -1228,7 +1229,10 @@ emit_intrinsic_load_ubo(struct ir3_compile *ctx, 
nir_intrinsic_instr *intr,
addr = create_uniform_indirect(ctx, ubo, get_addr(ctx, src0));
}
 
-   if (intr->intrinsic == nir_intrinsic_load_ubo_indirect) {
+   const_offset = nir_src_as_const_value(intr->src[1]);
+   if (const_offset) {
+   off += const_offset->u[0];
+   } else {
/* For load_ubo_indirect, second src is indirect offset: */
src1 = get_src(ctx, >src[1])[0];
 
@@ -1391,6 +1395,7 @@ emit_intrinisic(struct ir3_compile *ctx, 
nir_intrinsic_instr *intr)
struct ir3_instruction **dst, **src;
struct ir3_block *b = ctx->block;
unsigned idx = intr->const_index[0];
+   nir_const_value *const_offset;
 
if (info->has_dest) {
dst = get_dst(ctx, >dest, intr->num_components);
@@ -1400,43 +1405,49 @@ emit_intrinisic(struct ir3_compile *ctx, 
nir_intrinsic_instr *intr)
 
switch (intr->intrinsic) {
case nir_intrinsic_load_uniform:
-   for (int i = 0; i < intr->num_components; i++) {
-   unsigned n = idx * 4 + i;
-   dst[i] = create_uniform(ctx, n);
-   }
-   break;
-   case nir_intrinsic_load_uniform_indirect:
-   src = get_src(ctx, >src[0]);
-   for (int i = 0; i < intr->num_components; i++) {
-   unsigned n = idx * 4 + i;
-   dst[i] = create_uniform_indirect(ctx, n,
-   get_addr(ctx, src[0]));
+   const_offset = nir_src_as_const_value(intr->src[0]);
+   if (const_offset) {
+   idx += const_offset->u[0];
+   for (int i = 0; i < intr->num_components; i++) {
+   unsigned n = idx * 4 + i;
+   dst[i] = create_uniform(ctx, n);
+   }
+   } else {
+   src = get_src(ctx, >src[0]);
+   for (int i = 0; i < intr->num_components; i++) {
+   unsigned n = idx * 4 + i;
+   dst[i] = create_uniform_indirect(ctx, n,
+   get_addr(ctx, src[0]));
+   }
+   /* NOTE: if relative addressing is used, we set
+* constlen in the compiler (to worst-case value)
+* since we don't know in the assembler what the max
+* addr reg value can be:
+*/
+   ctx->so->constlen = ctx->s->num_uniforms;
}
-   /* NOTE: if relative addressing is used, we set constlen in
-* the compiler (to worst-case value) since we don't know in
-* the assembler what the max addr reg value can be:
-*/
-   ctx->so->constlen = ctx->s->num_uniforms;
break;
case nir_intrinsic_load_ubo:
-   case nir_intrinsic_load_ubo_indirect:
emit_intrinsic_load_ubo(ctx, intr, dst);
break;
case nir_intrinsic_load_input:
-   for (int i = 0; i < intr->num_components; i++) {
-   unsigned n = idx * 4 + i;
-   dst[i] = ctx->ir->inputs[n];
-   }
-   break;
-   case nir_intrinsic_load_input_indirect:
-   src = get_src(ctx, >src[0]);
-   struct ir3_instruction *collect =
-   create_collect(b, ctx->ir->inputs, 
ctx->ir->ninputs);
-   struct ir3_instruction *addr = get_addr(ctx, src[0]);
-   for (int i = 0; i < intr->num_components; i++) {
-   unsigned n = idx * 4 + i;
-   dst[i] = create_indirect_load(ctx, ctx->ir->ninputs,
-   n, addr, collect);
+   const_offset = nir_src_as_const_value(intr->src[0]);
+   if 

[Mesa-dev] [PATCH v2 5/9] i965/fs: Get rid of load/store_foo_indirect

2015-12-04 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.h   |   2 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 153 ++-
 src/mesa/drivers/dri/i965/brw_nir.c  |  45 +++--
 3 files changed, 105 insertions(+), 95 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index bca4589..b55589f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -278,7 +278,7 @@ public:
unsigned stream_id);
void emit_gs_thread_end();
void emit_gs_input_load(const fs_reg , const nir_src _src,
-   const fs_reg _offset, unsigned imm_offset,
+   unsigned base_offset, const nir_src _src,
unsigned num_components);
void emit_cs_terminate();
fs_reg *emit_cs_local_invocation_id_setup();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 9b50e4e..a7a565b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1621,29 +1621,31 @@ fs_visitor::emit_gs_vertex(const nir_src 
_count_nir_src,
 void
 fs_visitor::emit_gs_input_load(const fs_reg ,
const nir_src _src,
-   const fs_reg _offset,
-   unsigned imm_offset,
+   unsigned base_offset,
+   const nir_src _src,
unsigned num_components)
 {
struct brw_gs_prog_data *gs_prog_data = (struct brw_gs_prog_data *) 
prog_data;
 
+   nir_const_value *vertex_const = nir_src_as_const_value(vertex_src);
+   nir_const_value *offset_const = nir_src_as_const_value(offset_src);
+   const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8;
+
/* Offset 0 is the VUE header, which contains VARYING_SLOT_LAYER [.y],
 * VARYING_SLOT_VIEWPORT [.z], and VARYING_SLOT_PSIZ [.w].  Only
 * gl_PointSize is available as a GS input, however, so it must be that.
 */
-   const bool is_point_size =
-  indirect_offset.file == BAD_FILE && imm_offset == 0;
-
-   nir_const_value *vertex_const = nir_src_as_const_value(vertex_src);
-   const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8;
+   const bool is_point_size = (base_offset == 0);
 
-   if (indirect_offset.file == BAD_FILE && vertex_const != NULL &&
-   4 * imm_offset < push_reg_count) {
-  imm_offset = 4 * imm_offset + vertex_const->u[0] * push_reg_count;
+   if (offset_const != NULL && vertex_const != NULL &&
+   4 * offset_const->u[0] < push_reg_count) {
+  int imm_offset = (base_offset + offset_const->u[0]) * 4 +
+   vertex_const->u[0] * push_reg_count;
   /* This input was pushed into registers. */
   if (is_point_size) {
  /* gl_PointSize comes in .w */
- bld.MOV(dst, fs_reg(ATTR, imm_offset + 3, dst.type));
+ assert(imm_offset == 0);
+ bld.MOV(dst, fs_reg(ATTR, 3, dst.type));
   } else {
  for (unsigned i = 0; i < num_components; i++) {
 bld.MOV(offset(dst, bld, i),
@@ -1701,21 +1703,21 @@ fs_visitor::emit_gs_input_load(const fs_reg ,
   }
 
   fs_inst *inst;
-  if (indirect_offset.file == BAD_FILE) {
+  if (offset_const) {
  /* Constant indexing - use global offset. */
  inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst, icp_handle);
- inst->offset = imm_offset;
+ inst->offset = base_offset + offset_const->u[0];
  inst->base_mrf = -1;
  inst->mlen = 1;
  inst->regs_written = num_components;
   } else {
  /* Indirect indexing - use per-slot offsets as well. */
- const fs_reg srcs[] = { icp_handle, indirect_offset };
+ const fs_reg srcs[] = { icp_handle, get_nir_src(offset_src) };
  fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, 2);
  bld.LOAD_PAYLOAD(payload, srcs, ARRAY_SIZE(srcs), 0);
 
  inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dst, payload);
- inst->offset = imm_offset;
+ inst->offset = base_offset;
  inst->base_mrf = -1;
  inst->mlen = 2;
  inst->regs_written = num_components;
@@ -1781,17 +1783,12 @@ fs_visitor::nir_emit_gs_intrinsic(const fs_builder ,
   retype(fs_reg(brw_vec8_grf(2, 0)), BRW_REGISTER_TYPE_UD));
   break;
 
-   case nir_intrinsic_load_input_indirect:
case nir_intrinsic_load_input:
   unreachable("load_input intrinsics are invalid for the GS stage");
 
-   case nir_intrinsic_load_per_vertex_input_indirect:
-  indirect_offset = retype(get_nir_src(instr->src[1]), 
BRW_REGISTER_TYPE_D);
-  /* fallthrough */
case nir_intrinsic_load_per_vertex_input:
-  emit_gs_input_load(dest, instr->src[0],
- indirect_offset, instr->const_index[0],
- 

[Mesa-dev] [PATCH v2 0/9] nir: Remove the *_indirect form of input/output

2015-12-04 Thread Jason Ekstrand
This is a second crack at my attempt to get rid of the *_indirect for of
the input/output intrinsics.  The primary change is that we decided to keep
the const_index[0] base offset for all of the load/store intrinsics that
operate on opaque memory.  Along with this, the base offset is now much
better defined:  it is always the driver_location of the actual variable.
If, for instance, you access the second column of a mat4, const_index[0] is
the location of the whole mat4 and the offset source is load_const(2).

Thanks to keeping the base offset arround, this series is a lot more
straightforward than the first version.  I think the chances are
substantially higher that I didn't break vc4 or ir3.

Cc: Eric Anholt 
Cc: Rob Clark 

Jason Ekstrand (9):
  vc4: Do all uniform loads with byte offsets
  nir: Get rid of *_indirect variants of input/output load/store
intrinsics
  nir/glsl: Stop handling UBO/SSBO load/stores differently depending on
indirect
  nir/lower_io: Get rid of load/store_foo_indirect
  i965/fs: Get rid of load/store_foo_indirect
  i965/vec4: Get rid of load/store_foo_indirect
  tgsi_to_nir: Get rid of load/store_foo_indirect
  ir3/nir: Use the new unified io intrinsics
  vc4/nir: Use the new unified io intrinsics

 src/gallium/auxiliary/nir/tgsi_to_nir.c|  51 +++
 .../drivers/freedreno/ir3/ir3_compiler_nir.c   |  79 ++-
 src/gallium/drivers/vc4/vc4_nir_lower_io.c |  33 ++---
 src/gallium/drivers/vc4/vc4_program.c  |  28 ++--
 src/glsl/nir/glsl_to_nir.cpp   |  47 +--
 src/glsl/nir/nir.h |   2 +-
 src/glsl/nir/nir_intrinsics.h  |  81 +--
 src/glsl/nir/nir_lower_io.c| 111 ++-
 src/glsl/nir/nir_lower_phis_to_scalar.c|   4 -
 src/glsl/nir/nir_print.c   |   9 +-
 src/mesa/drivers/dri/i965/brw_fs.h |   2 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 153 ++---
 src/mesa/drivers/dri/i965/brw_nir.c|  45 --
 src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp  |   7 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp |  91 +---
 15 files changed, 340 insertions(+), 403 deletions(-)

-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 2/9] nir: Get rid of *_indirect variants of input/output load/store intrinsics

2015-12-04 Thread Jason Ekstrand
There is some special-casing needed in a competent back-end.  However, they
can do their special-casing easily enough based on whether or not the
offset is a constant.  In the mean time, having the *_indirect variants
adds special cases a number of places where they don't need to be and, in
general, only complicates things.  To complicate matters, NIR had no way to
convdert an indirect load/store to a direct one in the case that the
indirect was a constant so we would still not really get what the back-ends
wanted.  The best solution seems to be to get rid of the *_indirect
variants entirely.
---
 src/glsl/nir/nir_intrinsics.h   | 81 +
 src/glsl/nir/nir_lower_phis_to_scalar.c |  4 --
 src/glsl/nir/nir_print.c|  9 +---
 3 files changed, 42 insertions(+), 52 deletions(-)

diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
index b2565c5..63df21e 100644
--- a/src/glsl/nir/nir_intrinsics.h
+++ b/src/glsl/nir/nir_intrinsics.h
@@ -228,54 +228,55 @@ SYSTEM_VALUE(num_work_groups, 3, 0)
 SYSTEM_VALUE(helper_invocation, 1, 0)
 
 /*
- * The format of the indices depends on the type of the load.  For uniforms,
- * the first index is the base address and the second index is an offset that
- * should be added to the base address.  (This way you can determine in the
- * back-end which variable is being accessed even in an array.)  For inputs,
- * the one and only index corresponds to the attribute slot.  UBO loads also
- * have a single index which is the base address to load from.
+ * Load operations pull data from some piece of GPU memory.  All load
+ * operations operate in terms of offsets into some piece of theoretical
+ * memory.  Loads from externally visible memory (UBO and SSBO) simply take a
+ * byte offset as a source.  Loads from opaque memory (uniforms, inputs, etc.)
+ * take a base+offset pair where the base (const_index[0]) gives the location
+ * of the start of the variable being loaded and and the offset source is a
+ * offset into that variable.
  *
- * UBO loads have a (possibly constant) source which is the UBO buffer index.
- * For each type of load, the _indirect variant has one additional source
- * (the second in the case of UBO's) that is the is an indirect to be added to
- * the constant address or base offset to compute the final offset.
+ * Some load operations such as UBO/SSBO load and per_vertex loads take an
+ * additional source to specify which UBO/SSBO/vertex to load from.
  *
- * For vector backends, the address is in terms of one vec4, and so each array
- * element is +4 scalar components from the previous array element. For scalar
- * backends, the address is in terms of a single 4-byte float/int and arrays
- * elements begin immediately after the previous array element.
+ * The exact address type depends on the lowering pass that generates the
+ * load/store intrinsics.  Typically, this is vec4 units for things such as
+ * varying slots and float units for fragment shader inputs.  UBO and SSBO
+ * offsets are always in bytes.
  */
 
-#define LOAD(name, extra_srcs, indices, flags) \
-   INTRINSIC(load_##name, extra_srcs, ARR(1), true, 0, 0, indices, flags) \
-   INTRINSIC(load_##name##_indirect, extra_srcs + 1, ARR(1, 1), \
- true, 0, 0, indices, flags)
+#define LOAD(name, srcs, indices, flags) \
+   INTRINSIC(load_##name, srcs, ARR(1, 1, 1, 1), true, 0, 0, indices, flags)
 
-LOAD(uniform, 0, 2, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
-LOAD(ubo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
-LOAD(input, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
-LOAD(per_vertex_input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | 
NIR_INTRINSIC_CAN_REORDER)
-LOAD(ssbo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE)
-LOAD(output, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE)
-LOAD(per_vertex_output, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE)
+/* src[] = { offset }. const_index[] = { base } */
+LOAD(uniform, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
+/* src[] = { buffer_index, offset }. No const_index */
+LOAD(ubo, 2, 0, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
+/* src[] = { offset }. const_index[0] = { base } */
+LOAD(input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
+/* src[] = { vertex, offset }. const_index[] = { base } */
+LOAD(per_vertex_input, 2, 1, NIR_INTRINSIC_CAN_ELIMINATE | 
NIR_INTRINSIC_CAN_REORDER)
+/* src[] = { buffer_index, offset }. No const_index */
+LOAD(ssbo, 2, 0, NIR_INTRINSIC_CAN_ELIMINATE)
+/* src[] = { offset }. const_index[] = { base } */
+LOAD(output, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE)
+/* src[] = { vertex, offset }. const_index[] = { base } */
+LOAD(per_vertex_output, 2, 1, NIR_INTRINSIC_CAN_ELIMINATE)
 
 /*
- * Stores work the same way as loads, except now the first register input is
- * the value or array to store and the optional second input is the indirect
- * offset. SSBO stores are similar, but they accept 

[Mesa-dev] [PATCH v2 7/9] tgsi_to_nir: Get rid of load/store_foo_indirect

2015-12-04 Thread Jason Ekstrand
Cc: Eric Anholt 
Cc: Rob Clark 
---
 src/gallium/auxiliary/nir/tgsi_to_nir.c | 51 +
 1 file changed, 27 insertions(+), 24 deletions(-)

diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c 
b/src/gallium/auxiliary/nir/tgsi_to_nir.c
index 5fef542..480c0b9 100644
--- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
+++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
@@ -468,7 +468,7 @@ ttn_emit_immediate(struct ttn_compile *c)
nir_builder_instr_insert(b, _const->instr);
 }
 
-static nir_src
+static nir_ssa_def *
 ttn_src_for_indirect(struct ttn_compile *c, struct tgsi_ind_register 
*indirect);
 
 /* generate either a constant or indirect deref chain for accessing an
@@ -487,7 +487,7 @@ ttn_array_deref(struct ttn_compile *c, nir_intrinsic_instr 
*instr,
 
if (indirect) {
   arr->deref_array_type = nir_deref_array_type_indirect;
-  arr->indirect = ttn_src_for_indirect(c, indirect);
+  arr->indirect = nir_src_for_ssa(ttn_src_for_indirect(c, indirect));
} else {
   arr->deref_array_type = nir_deref_array_type_direct;
}
@@ -586,19 +586,14 @@ ttn_src_for_file_and_index(struct ttn_compile *c, 
unsigned file, unsigned index,
 
   switch (file) {
   case TGSI_FILE_INPUT:
- op = indirect ? nir_intrinsic_load_input_indirect :
- nir_intrinsic_load_input;
+ op = nir_intrinsic_load_input;
  assert(!dim);
  break;
   case TGSI_FILE_CONSTANT:
  if (dim) {
-op = indirect ? nir_intrinsic_load_ubo_indirect :
-nir_intrinsic_load_ubo;
-/* convert index from vec4 to byte: */
-index *= 16;
+op = nir_intrinsic_load_ubo;
  } else {
-op = indirect ? nir_intrinsic_load_uniform_indirect :
-nir_intrinsic_load_uniform;
+op = nir_intrinsic_load_uniform;
  }
  break;
   default:
@@ -609,7 +604,6 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned 
file, unsigned index,
   load = nir_intrinsic_instr_create(b->shader, op);
 
   load->num_components = 4;
-  load->const_index[0] = index;
   if (dim) {
  if (dimind) {
 load->src[srcn] =
@@ -622,17 +616,25 @@ ttn_src_for_file_and_index(struct ttn_compile *c, 
unsigned file, unsigned index,
  }
  srcn++;
   }
-  if (indirect) {
- load->src[srcn] = ttn_src_for_indirect(c, indirect);
- if (dim) {
-assert(load->src[srcn].is_ssa);
-/* we also need to covert vec4 to byte here too: */
-load->src[srcn] =
-   nir_src_for_ssa(nir_ishl(b, load->src[srcn].ssa,
-nir_imm_int(b, 4)));
+
+  nir_ssa_def *offset;
+  if (dim) {
+ /* UBO loads don't have a const_index[0] base offset. */
+ offset = nir_imm_int(b, index);
+ if (indirect) {
+offset = nir_iadd(b, offset, ttn_src_for_indirect(c, indirect));
+offset = nir_ishl(b, offset, nir_imm_int(b, 4));
+ }
+  } else {
+ load->const_index[0] = index;
+ if (indirect) {
+offset = ttn_src_for_indirect(c, indirect);
+ } else {
+offset = nir_imm_int(b, 0);
  }
- srcn++;
   }
+  load->src[srcn++] = nir_src_for_ssa(offset);
+
   nir_ssa_dest_init(>instr, >dest, 4, NULL);
   nir_builder_instr_insert(b, >instr);
 
@@ -648,7 +650,7 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned 
file, unsigned index,
return src;
 }
 
-static nir_src
+static nir_ssa_def *
 ttn_src_for_indirect(struct ttn_compile *c, struct tgsi_ind_register *indirect)
 {
nir_builder *b = >build;
@@ -660,7 +662,7 @@ ttn_src_for_indirect(struct ttn_compile *c, struct 
tgsi_ind_register *indirect)
 indirect->File,
 indirect->Index,
 NULL, NULL, NULL);
-   return nir_src_for_ssa(nir_imov_alu(b, src, 1));
+   return nir_imov_alu(b, src, 1);
 }
 
 static nir_alu_dest
@@ -729,7 +731,7 @@ ttn_get_dest(struct ttn_compile *c, struct 
tgsi_full_dst_register *tgsi_fdst)
 
if (tgsi_dst->Indirect && (tgsi_dst->File != TGSI_FILE_TEMPORARY)) {
   nir_src *indirect = ralloc(c->build.shader, nir_src);
-  *indirect = ttn_src_for_indirect(c, _fdst->Indirect);
+  *indirect = nir_src_for_ssa(ttn_src_for_indirect(c, 
_fdst->Indirect));
   dest.dest.reg.indirect = indirect;
}
 
@@ -1927,9 +1929,10 @@ ttn_add_output_stores(struct ttn_compile *c)
 nir_intrinsic_instr_create(b->shader, nir_intrinsic_store_output);
  unsigned loc = var->data.driver_location + i;
  store->num_components = 4;
- store->const_index[0] = loc;
  store->src[0].reg.reg = c->output_regs[loc].reg;
  

Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Jose Fonseca

On 04/12/15 23:08, Ilia Mirkin wrote:

On Fri, Dec 4, 2015 at 6:00 PM, Jose Fonseca  wrote:

On 04/12/15 22:45, Ilia Mirkin wrote:


On Fri, Dec 4, 2015 at 5:40 PM, Jose Fonseca  wrote:


On 04/12/15 20:04, Ilia Mirkin wrote:



On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul  wrote:



On 12/04/2015 12:46 PM, Ilia Mirkin wrote:




On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul  wrote:




+   if (templ->poly_smooth && svga->debug.callback.debug_message) {
+  /* note: we always need a % something in the message string */





Why? Did I mess something up?





If I write it without the dummy %s I get:

In file included from
../../../../src/gallium/auxiliary/util/u_inlines.h:36:0,
from svga_pipe_rasterizer.c:29:
svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state':
../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error:
expected
expression before ')' token
   fmt, __VA_ARGS__); \
   ^
svga_pipe_rasterizer.c:357:7: note: in expansion of macro
'pipe_debug_message'
  pipe_debug_message(>debug.callback, CONFORMANCE,
  ^



H... so that'd be a "yes" to me messing it up :) I'll have a look
to see if there's something simple I can do here. Off-hand it feels
like it should be #__VA_ARGS__? Been a while since I looked at the
specifics of all this.





BTW, right now this callback is only set for debug contexts. Perhaps
that was not an ideal decision... just an FYI though.





apitrace creates a debug context so that's the main thing for me right
now.
But it might be nice if setting MESA_DEBUG would cause the debug/info
messages to be printed to stderr for those apps that don't use
GL_ARB_debug_output.




Yeah, we should probably also throw in Enable/Disable hooks into
st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever
the enum is).

BTW, note that apitrace cuts messages off after 100 of them which is
pretty annoying -- I opened an issue at
https://github.com/apitrace/apitrace/issues/395 a while back.




Not just "100", but 100 *per ID*.

Which is very different.

Why is it so annoying?  A patch for an option is welcome anyway BTW.



Right, but but an ID is a message. So let's say the message is
"falling back to XYZ", why would you only want to know the first 100
times? Or let's say the message is "shader compiles to XYZ
instructions" -- again, why the first 100 times? In fact, *given* that
you enable these messages, I don't see why it would ever be desirable
to limit them.



Isn't it obvious?


OpenGL applications are repetitive by nature.  You're bound to have the same
error over and over again.

Futhermore, some IHVs have messages so frequent that completely overwel
everything else.

So, when you take a trace of an arbitrary app, and replay e.g., on NVIDIA,
you'll get lots and performance hints.  Nevertheless, you'll want to be able
to spot rare and more important issues among all that. limiting to a fix
number, e.g., 100, means that after a while, they all silence, so that any
new issue will stand out.

Soemthing that happens all the time is invariably interesting.


I doubt we're going to change each others' mind on this.


I'm not trying to change your mind. You said you couldn't see any reason 
to ever limit, and I gave you an example otherwise.


> I've

contemplated throwing prints into mesa, but haven't been sufficiently
motivated to worry about it.

Like you say, GL applications tend to be repetitive, but they tend to
have a "startup" phase, and a "do repetitive thing" phase. If the
message gets triggered > 100x during the startup phase, I don't get to
see it when I'm actually interested in the results.


Yes, I agree there might be situations where you might want to see all. 
That's why I already agreed to have the apitrace option to cofigure an 
arbitraryly/infinite limit.



But it's OK, with Matt's run from shader-db, it all works, which was
the primary motivation for piping this thing through to gallium.


Super.

Jose

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Jose Fonseca

On 04/12/15 23:20, Ilia Mirkin wrote:

On Fri, Dec 4, 2015 at 6:14 PM, Jose Fonseca  wrote:

On 04/12/15 22:40, Jose Fonseca wrote:


On 04/12/15 20:04, Ilia Mirkin wrote:


On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul  wrote:


On 12/04/2015 12:46 PM, Ilia Mirkin wrote:



On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul  wrote:



+   if (templ->poly_smooth && svga->debug.callback.debug_message) {
+  /* note: we always need a % something in the message string */




Why? Did I mess something up?




If I write it without the dummy %s I get:

In file included from
../../../../src/gallium/auxiliary/util/u_inlines.h:36:0,
   from svga_pipe_rasterizer.c:29:
svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state':
../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected
expression before ')' token
  fmt, __VA_ARGS__); \
  ^
svga_pipe_rasterizer.c:357:7: note: in expansion of macro
'pipe_debug_message'
 pipe_debug_message(>debug.callback, CONFORMANCE,
 ^



H... so that'd be a "yes" to me messing it up :) I'll have a look
to see if there's something simple I can do here. Off-hand it feels
like it should be #__VA_ARGS__? Been a while since I looked at the
specifics of all this.





BTW, right now this callback is only set for debug contexts. Perhaps
that was not an ideal decision... just an FYI though.




apitrace creates a debug context so that's the main thing for me
right now.
But it might be nice if setting MESA_DEBUG would cause the debug/info
messages to be printed to stderr for those apps that don't use
GL_ARB_debug_output.



Yeah, we should probably also throw in Enable/Disable hooks into
st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever
the enum is).

BTW, note that apitrace cuts messages off after 100 of them which is
pretty annoying -- I opened an issue at
https://github.com/apitrace/apitrace/issues/395 a while back.



Not just "100", but 100 *per ID*.

Which is very different.

Why is it so annoying?  A patch for an option is welcome anyway BTW.



I noticed that pipe_debug_message() doesn't take a message ID.


It's a little clever... this is why pipe_debug_message is a macro.
Every separate callsite gets its own id. It's initialized to 0, and
passed in as a pointer... the mesa function actually assigns it a new
id if the passed in id is 0. It's a bit racey and definitely a weird
API, but... it was easy :)


Ah, I see it now.  It's not reproducible, but at least different message 
sites have different id.


Jose

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Jose Fonseca

On 04/12/15 22:40, Jose Fonseca wrote:

On 04/12/15 20:04, Ilia Mirkin wrote:

On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul  wrote:

On 12/04/2015 12:46 PM, Ilia Mirkin wrote:


On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul  wrote:


+   if (templ->poly_smooth && svga->debug.callback.debug_message) {
+  /* note: we always need a % something in the message string */



Why? Did I mess something up?



If I write it without the dummy %s I get:

In file included from
../../../../src/gallium/auxiliary/util/u_inlines.h:36:0,
  from svga_pipe_rasterizer.c:29:
svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state':
../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected
expression before ')' token
 fmt, __VA_ARGS__); \
 ^
svga_pipe_rasterizer.c:357:7: note: in expansion of macro
'pipe_debug_message'
pipe_debug_message(>debug.callback, CONFORMANCE,
^



H... so that'd be a "yes" to me messing it up :) I'll have a look
to see if there's something simple I can do here. Off-hand it feels
like it should be #__VA_ARGS__? Been a while since I looked at the
specifics of all this.





BTW, right now this callback is only set for debug contexts. Perhaps
that was not an ideal decision... just an FYI though.



apitrace creates a debug context so that's the main thing for me
right now.
But it might be nice if setting MESA_DEBUG would cause the debug/info
messages to be printed to stderr for those apps that don't use
GL_ARB_debug_output.


Yeah, we should probably also throw in Enable/Disable hooks into
st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever
the enum is).

BTW, note that apitrace cuts messages off after 100 of them which is
pretty annoying -- I opened an issue at
https://github.com/apitrace/apitrace/issues/395 a while back.


Not just "100", but 100 *per ID*.

Which is very different.

Why is it so annoying?  A patch for an option is welcome anyway BTW.



I noticed that pipe_debug_message() doesn't take a message ID.

It looks  like the same id is passed for every context:


/**
   * Handles generating a GL_ARB_debug_output message ID generated by 
the GL or

   * GLSL compiler.
   *
   * The GL API has this "ID" mechanism, where the intention is to allow a
   * client to filter in/out messages based on source, type, and ID. 
Of course,
   * building a giant enum list of all debug output messages that Mesa 
might
   * generate is ridiculous, so instead we have our caller pass us a 
pointer to

   * static storage where the ID should get stored.  This ID will be shared
   * across all contexts for that message (which seems like a desirable
   * property, even if it's not expected by the spec), but note that it 
won't be
   * the same between executions if messages aren't generated in the 
same order.

   */


This is far from ideal.

I think at east Mesa should use ID namespaces: each Mesa driver / pipe 
driver should be assigned an unique driver ID (high word), then use low 
word internally.


For the low word, it doesn't need to be an enum per message, but it 
should be more fined grained than one ID per context.


If nothing else, you could use the address or hash of the printf-like 
format string to make an id.



Jose

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Ilia Mirkin
On Fri, Dec 4, 2015 at 6:14 PM, Jose Fonseca  wrote:
> On 04/12/15 22:40, Jose Fonseca wrote:
>>
>> On 04/12/15 20:04, Ilia Mirkin wrote:
>>>
>>> On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul  wrote:

 On 12/04/2015 12:46 PM, Ilia Mirkin wrote:
>
>
> On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul  wrote:
>>
>>
>> +   if (templ->poly_smooth && svga->debug.callback.debug_message) {
>> +  /* note: we always need a % something in the message string */
>
>
>
> Why? Did I mess something up?



 If I write it without the dummy %s I get:

 In file included from
 ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0,
   from svga_pipe_rasterizer.c:29:
 svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state':
 ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected
 expression before ')' token
  fmt, __VA_ARGS__); \
  ^
 svga_pipe_rasterizer.c:357:7: note: in expansion of macro
 'pipe_debug_message'
 pipe_debug_message(>debug.callback, CONFORMANCE,
 ^

>>>
>>> H... so that'd be a "yes" to me messing it up :) I'll have a look
>>> to see if there's something simple I can do here. Off-hand it feels
>>> like it should be #__VA_ARGS__? Been a while since I looked at the
>>> specifics of all this.
>>>

>
> BTW, right now this callback is only set for debug contexts. Perhaps
> that was not an ideal decision... just an FYI though.



 apitrace creates a debug context so that's the main thing for me
 right now.
 But it might be nice if setting MESA_DEBUG would cause the debug/info
 messages to be printed to stderr for those apps that don't use
 GL_ARB_debug_output.
>>>
>>>
>>> Yeah, we should probably also throw in Enable/Disable hooks into
>>> st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever
>>> the enum is).
>>>
>>> BTW, note that apitrace cuts messages off after 100 of them which is
>>> pretty annoying -- I opened an issue at
>>> https://github.com/apitrace/apitrace/issues/395 a while back.
>>
>>
>> Not just "100", but 100 *per ID*.
>>
>> Which is very different.
>>
>> Why is it so annoying?  A patch for an option is welcome anyway BTW.
>>
>
> I noticed that pipe_debug_message() doesn't take a message ID.

It's a little clever... this is why pipe_debug_message is a macro.
Every separate callsite gets its own id. It's initialized to 0, and
passed in as a pointer... the mesa function actually assigns it a new
id if the passed in id is 0. It's a bit racey and definitely a weird
API, but... it was easy :)

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Jose Fonseca

On 04/12/15 23:27, Ilia Mirkin wrote:

On Fri, Dec 4, 2015 at 6:25 PM, Jose Fonseca  wrote:

On 04/12/15 19:42, Brian Paul wrote:


Use the new debug callback hook to report conformance, performance
and fallbacks to the state tracker.  The state tracker, in turn can
report this issues to the user via the GL_ARB_debug_output extension.

More issues can be reported in the future; this is just a start.
---
   src/gallium/drivers/svga/svga_context.h  |  3 +++
   src/gallium/drivers/svga/svga_draw_arrays.c  |  7 +++
   src/gallium/drivers/svga/svga_pipe_blend.c   | 11 +++
   src/gallium/drivers/svga/svga_pipe_misc.c| 17 +
   src/gallium/drivers/svga/svga_pipe_rasterizer.c  |  6 ++
   src/gallium/drivers/svga/svga_state_need_swtnl.c | 23
+++
   6 files changed, 67 insertions(+)

diff --git a/src/gallium/drivers/svga/svga_context.h
b/src/gallium/drivers/svga/svga_context.h
index 6a4f9d8..c4284cc 100644
--- a/src/gallium/drivers/svga/svga_context.h
+++ b/src/gallium/drivers/svga/svga_context.h
@@ -392,6 +392,9 @@ struct svga_context

 boolean no_line_width;
 boolean force_hw_line_stipple;
+
+  /** To report perf/conformance/etc issues to the state tracker */
+  struct pipe_debug_callback callback;
  } debug;

  struct {
diff --git a/src/gallium/drivers/svga/svga_draw_arrays.c
b/src/gallium/drivers/svga/svga_draw_arrays.c
index acb2e95..f6956b5 100644
--- a/src/gallium/drivers/svga/svga_draw_arrays.c
+++ b/src/gallium/drivers/svga/svga_draw_arrays.c
@@ -26,6 +26,7 @@
   #include "svga_cmd.h"

   #include "util/u_inlines.h"
+#include "util/u_prim.h"
   #include "indices/u_indices.h"

   #include "svga_hw_reg.h"
@@ -277,6 +278,12 @@ svga_hwtnl_draw_arrays(struct svga_hwtnl *hwtnl,
 if (ret != PIPE_OK)
goto done;

+  if (svga->debug.callback.debug_message) {
+ pipe_debug_message(>debug.callback, PERF_INFO,
+"generating temporary index buffer for
drawing %s",
+u_prim_name(prim));
+  }
+
 ret = svga_hwtnl_simple_draw_range_elements(hwtnl,
 gen_buf,
 gen_size,
diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c
b/src/gallium/drivers/svga/svga_pipe_blend.c
index 0c9d612..5538193 100644
--- a/src/gallium/drivers/svga/svga_pipe_blend.c
+++ b/src/gallium/drivers/svga/svga_pipe_blend.c
@@ -243,6 +243,17 @@ svga_create_blend_state(struct pipe_context *pipe,
blend->rt[i].srcblend_alpha = blend->rt[i].srcblend;
blend->rt[i].dstblend_alpha = blend->rt[i].dstblend;
blend->rt[i].blendeq_alpha = blend->rt[i].blendeq;
+
+ if (svga->debug.callback.debug_message) {
+if (templ->logicop_func == PIPE_LOGICOP_XOR) {
+   pipe_debug_message(>debug.callback, CONFORMANCE,
+  "XOR logicop mode has limited
support%s", "");
+}
+else if (templ->logicop_func != PIPE_LOGICOP_COPY) {
+   pipe_debug_message(>debug.callback, CONFORMANCE,
+  "general logicops are not supported%s",
"");
+}
+ }
 }
 else {
/* Note: the vgpu10 device does not yet support independent
diff --git a/src/gallium/drivers/svga/svga_pipe_misc.c
b/src/gallium/drivers/svga/svga_pipe_misc.c
index c8020da..af9356d 100644
--- a/src/gallium/drivers/svga/svga_pipe_misc.c
+++ b/src/gallium/drivers/svga/svga_pipe_misc.c
@@ -244,6 +244,22 @@ static void svga_set_viewport_states( struct
pipe_context *pipe,
   }


+/**
+ * Called by state tracker to specify a callback function the driver
+ * can use to report info back to the state tracker.
+ */
+static void
+svga_set_debug_callback(struct pipe_context *pipe,
+const struct pipe_debug_callback *cb)
+{
+   struct svga_context *svga = svga_context(pipe);
+
+   if (cb)
+  svga->debug.callback = *cb;
+   else
+  memset(>debug.callback, 0, sizeof(svga->debug.callback));
+}
+

   void svga_init_misc_functions( struct svga_context *svga )
   {
@@ -252,6 +268,7 @@ void svga_init_misc_functions( struct svga_context
*svga )
  svga->pipe.set_framebuffer_state = svga_set_framebuffer_state;
  svga->pipe.set_clip_state = svga_set_clip_state;
  svga->pipe.set_viewport_states = svga_set_viewport_states;
+   svga->pipe.set_debug_callback = svga_set_debug_callback;
   }


diff --git a/src/gallium/drivers/svga/svga_pipe_rasterizer.c
b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
index 6310b7a..c6215e6 100644
--- a/src/gallium/drivers/svga/svga_pipe_rasterizer.c
+++ b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
@@ -352,6 +352,12 @@ svga_create_rasterizer_state(struct pipe_context
*pipe,
 define_rasterizer_object(svga, rast);
  }

+   if (templ->poly_smooth 

[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79688

--- Comment #24 from Tobias Klausmann  ---
(In reply to poma from comment #22)
> (In reply to Tobias Klausmann from comment #21)
> > (In reply to Axel Davy from comment #20)
> > > Should this bug report be closed ?
> > > DRI3 has complete DRI_PRIME support now.
> > > 
> > 
> > As far as it concerns me, DRI3 Support is fine and this bug can be marked as
> > "resolved fixed". 
> > 
> 
> Do you refer to DRI3 in general, or is it DRI_PRIME?

I was referring to RI_PRIME, as this bug was about DRI_PRIME

> 
> Tobias, while DRI3 enabled, can you run:
> $ ssh -Y [user@]hostname firefox
> 
> What is the result?

If you mean a running firefox process without a window showing up, i would
guess this is not DRI3's fault, as other applications are showing up just fine,
others don't. As far as i could elaborate in under 5 minutes, newer frameworks
(gtk3, kde5 (qt5)) do not work, older ones do work just fine...if you have more
information about this beeing DRI3, please open a bugreport about this specific
problem!

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gallium/util: fix pipe_debug_message macro to allow 0 args

2015-12-04 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/util/u_debug.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_debug.h 
b/src/gallium/auxiliary/util/u_debug.h
index aaf223c..9900703 100644
--- a/src/gallium/auxiliary/util/u_debug.h
+++ b/src/gallium/auxiliary/util/u_debug.h
@@ -270,7 +270,7 @@ void _debug_assert_fail(const char *expr,
static unsigned id = 0; \
_pipe_debug_message(cb, , \
PIPE_DEBUG_TYPE_ ## type, \
-   fmt, __VA_ARGS__); \
+   fmt, ##__VA_ARGS__); \
 } while (0)
 
 struct pipe_debug_callback;
-- 
2.4.10

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] r600g: evergreen/cayman tessellation support

2015-12-04 Thread Dieter Nützel

Am 04.12.2015 08:48, schrieb eocallag...@alterapraxis.com:

On 2015-12-04 14:03, Dieter Nützel wrote:

Am 03.12.2015 19:57, schrieb Dave Airlie:

On 4 Dec 2015 03:01, "Aaron Watry"  wrote:


Hi Dave (and others),

I cloned your fdo r600g-tess-submit branch and gave it a spin on

CEDAR (Radeon 5400, kernel 4.3.0) with Heaven, and ran into a few
issues.

Just grab r600g-tess-staging

-submit only worked on cayman, but i d posted it here so didn't want
to change it during review.

Dave.


Hello to all,

r600g-tess-staging WORKS
on Turks XT (6670) with poor 2 GB, RAM width 128bits DDR, but slow.
Time for R9 290 or Tonga (R9 380X).


Tonga already works, Tonga is radeonsi not r600g, this series only
applies to r600g.
The radeonsi gallium mesa driver has supported tess for a while now.


You didn't got it:
It is time for ME for R9 xxx (including Tonga)...
I'm with Radeon since the beginning on Linux ;-)


Only issue so far:
Switching tessellation (with mouse) and/or wireframe (F2) in 
Heaven-4.0

let system hang. But NOT hard.
SysRq Key + S | U | B works
But haven't had time to hunt for a useful log.

GREAT WORK!

Dieter

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 93188] "nir/nir.h", line 552: Error: Unexpected type name "nir_src" encountered.

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93188

--- Comment #6 from Jason Ekstrand  ---
I had a patch at one point to help C++ out here.  I ended up not sending it out
because it only fixed warnings that only came up if you had extra warnings
turned on.  basically it boiled down to

#ifdef __cplusplus
#  define NIR_SRC_INIT nir_src()
#else
#  define NIR_SRC_INIT (nir_src) { { NULL } }
#endif

And the same thing for NIR_DEST_INIT.  Can you try that?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Jose Fonseca

On 04/12/15 22:45, Ilia Mirkin wrote:

On Fri, Dec 4, 2015 at 5:40 PM, Jose Fonseca  wrote:

On 04/12/15 20:04, Ilia Mirkin wrote:


On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul  wrote:


On 12/04/2015 12:46 PM, Ilia Mirkin wrote:



On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul  wrote:



+   if (templ->poly_smooth && svga->debug.callback.debug_message) {
+  /* note: we always need a % something in the message string */




Why? Did I mess something up?




If I write it without the dummy %s I get:

In file included from
../../../../src/gallium/auxiliary/util/u_inlines.h:36:0,
   from svga_pipe_rasterizer.c:29:
svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state':
../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected
expression before ')' token
  fmt, __VA_ARGS__); \
  ^
svga_pipe_rasterizer.c:357:7: note: in expansion of macro
'pipe_debug_message'
 pipe_debug_message(>debug.callback, CONFORMANCE,
 ^



H... so that'd be a "yes" to me messing it up :) I'll have a look
to see if there's something simple I can do here. Off-hand it feels
like it should be #__VA_ARGS__? Been a while since I looked at the
specifics of all this.





BTW, right now this callback is only set for debug contexts. Perhaps
that was not an ideal decision... just an FYI though.




apitrace creates a debug context so that's the main thing for me right
now.
But it might be nice if setting MESA_DEBUG would cause the debug/info
messages to be printed to stderr for those apps that don't use
GL_ARB_debug_output.



Yeah, we should probably also throw in Enable/Disable hooks into
st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever
the enum is).

BTW, note that apitrace cuts messages off after 100 of them which is
pretty annoying -- I opened an issue at
https://github.com/apitrace/apitrace/issues/395 a while back.



Not just "100", but 100 *per ID*.

Which is very different.

Why is it so annoying?  A patch for an option is welcome anyway BTW.


Right, but but an ID is a message. So let's say the message is
"falling back to XYZ", why would you only want to know the first 100
times? Or let's say the message is "shader compiles to XYZ
instructions" -- again, why the first 100 times? In fact, *given* that
you enable these messages, I don't see why it would ever be desirable
to limit them.


Isn't it obvious?


OpenGL applications are repetitive by nature.  You're bound to have the 
same error over and over again.


Futhermore, some IHVs have messages so frequent that completely overwel 
everything else.


So, when you take a trace of an arbitrary app, and replay e.g., on 
NVIDIA, you'll get lots and performance hints.  Nevertheless, you'll 
want to be able to spot rare and more important issues among all that. 
limiting to a fix number, e.g., 100, means that after a while, they all 
silence, so that any new issue will stand out.


Soemthing that happens all the time is invariably interesting.

Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Ilia Mirkin
On Fri, Dec 4, 2015 at 6:00 PM, Jose Fonseca  wrote:
> On 04/12/15 22:45, Ilia Mirkin wrote:
>>
>> On Fri, Dec 4, 2015 at 5:40 PM, Jose Fonseca  wrote:
>>>
>>> On 04/12/15 20:04, Ilia Mirkin wrote:


 On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul  wrote:
>
>
> On 12/04/2015 12:46 PM, Ilia Mirkin wrote:
>>
>>
>>
>> On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul  wrote:
>>>
>>>
>>>
>>> +   if (templ->poly_smooth && svga->debug.callback.debug_message) {
>>> +  /* note: we always need a % something in the message string */
>>
>>
>>
>>
>> Why? Did I mess something up?
>
>
>
>
> If I write it without the dummy %s I get:
>
> In file included from
> ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0,
>from svga_pipe_rasterizer.c:29:
> svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state':
> ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error:
> expected
> expression before ')' token
>   fmt, __VA_ARGS__); \
>   ^
> svga_pipe_rasterizer.c:357:7: note: in expansion of macro
> 'pipe_debug_message'
>  pipe_debug_message(>debug.callback, CONFORMANCE,
>  ^
>

 H... so that'd be a "yes" to me messing it up :) I'll have a look
 to see if there's something simple I can do here. Off-hand it feels
 like it should be #__VA_ARGS__? Been a while since I looked at the
 specifics of all this.

>
>>
>> BTW, right now this callback is only set for debug contexts. Perhaps
>> that was not an ideal decision... just an FYI though.
>
>
>
>
> apitrace creates a debug context so that's the main thing for me right
> now.
> But it might be nice if setting MESA_DEBUG would cause the debug/info
> messages to be printed to stderr for those apps that don't use
> GL_ARB_debug_output.



 Yeah, we should probably also throw in Enable/Disable hooks into
 st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever
 the enum is).

 BTW, note that apitrace cuts messages off after 100 of them which is
 pretty annoying -- I opened an issue at
 https://github.com/apitrace/apitrace/issues/395 a while back.
>>>
>>>
>>>
>>> Not just "100", but 100 *per ID*.
>>>
>>> Which is very different.
>>>
>>> Why is it so annoying?  A patch for an option is welcome anyway BTW.
>>
>>
>> Right, but but an ID is a message. So let's say the message is
>> "falling back to XYZ", why would you only want to know the first 100
>> times? Or let's say the message is "shader compiles to XYZ
>> instructions" -- again, why the first 100 times? In fact, *given* that
>> you enable these messages, I don't see why it would ever be desirable
>> to limit them.
>
>
> Isn't it obvious?
>
>
> OpenGL applications are repetitive by nature.  You're bound to have the same
> error over and over again.
>
> Futhermore, some IHVs have messages so frequent that completely overwel
> everything else.
>
> So, when you take a trace of an arbitrary app, and replay e.g., on NVIDIA,
> you'll get lots and performance hints.  Nevertheless, you'll want to be able
> to spot rare and more important issues among all that. limiting to a fix
> number, e.g., 100, means that after a while, they all silence, so that any
> new issue will stand out.
>
> Soemthing that happens all the time is invariably interesting.

I doubt we're going to change each others' mind on this. I've
contemplated throwing prints into mesa, but haven't been sufficiently
motivated to worry about it.

Like you say, GL applications tend to be repetitive, but they tend to
have a "startup" phase, and a "do repetitive thing" phase. If the
message gets triggered > 100x during the startup phase, I don't get to
see it when I'm actually interested in the results.

But it's OK, with Matt's run from shader-db, it all works, which was
the primary motivation for piping this thing through to gallium.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium/util: fix pipe_debug_message macro to allow 0 args

2015-12-04 Thread Brian Paul

On 12/04/2015 01:09 PM, Ilia Mirkin wrote:

Signed-off-by: Ilia Mirkin 
---
  src/gallium/auxiliary/util/u_debug.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_debug.h 
b/src/gallium/auxiliary/util/u_debug.h
index aaf223c..9900703 100644
--- a/src/gallium/auxiliary/util/u_debug.h
+++ b/src/gallium/auxiliary/util/u_debug.h
@@ -270,7 +270,7 @@ void _debug_assert_fail(const char *expr,
 static unsigned id = 0; \
 _pipe_debug_message(cb, , \
 PIPE_DEBUG_TYPE_ ## type, \
-   fmt, __VA_ARGS__); \
+   fmt, ##__VA_ARGS__); \
  } while (0)

  struct pipe_debug_callback;



Reviewed-by: Brian Paul 
Tested-by: Brian Paul 

Thanks.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Brian Paul
Use the new debug callback hook to report conformance, performance
and fallbacks to the state tracker.  The state tracker, in turn can
report this issues to the user via the GL_ARB_debug_output extension.

More issues can be reported in the future; this is just a start.

v2: remove conditionals around pipe_debug_message() calls since the
check is now done in the macro itself.
---
 src/gallium/drivers/svga/svga_context.h  |  3 +++
 src/gallium/drivers/svga/svga_draw_arrays.c  |  5 +
 src/gallium/drivers/svga/svga_pipe_blend.c   |  9 +
 src/gallium/drivers/svga/svga_pipe_misc.c| 17 +
 src/gallium/drivers/svga/svga_pipe_rasterizer.c  |  5 +
 src/gallium/drivers/svga/svga_state_need_swtnl.c | 23 +++
 6 files changed, 62 insertions(+)

diff --git a/src/gallium/drivers/svga/svga_context.h 
b/src/gallium/drivers/svga/svga_context.h
index 6a4f9d8..c4284cc 100644
--- a/src/gallium/drivers/svga/svga_context.h
+++ b/src/gallium/drivers/svga/svga_context.h
@@ -392,6 +392,9 @@ struct svga_context
 
   boolean no_line_width;
   boolean force_hw_line_stipple;
+
+  /** To report perf/conformance/etc issues to the state tracker */
+  struct pipe_debug_callback callback;
} debug;
 
struct {
diff --git a/src/gallium/drivers/svga/svga_draw_arrays.c 
b/src/gallium/drivers/svga/svga_draw_arrays.c
index acb2e95..b02adeb 100644
--- a/src/gallium/drivers/svga/svga_draw_arrays.c
+++ b/src/gallium/drivers/svga/svga_draw_arrays.c
@@ -26,6 +26,7 @@
 #include "svga_cmd.h"
 
 #include "util/u_inlines.h"
+#include "util/u_prim.h"
 #include "indices/u_indices.h"
 
 #include "svga_hw_reg.h"
@@ -277,6 +278,10 @@ svga_hwtnl_draw_arrays(struct svga_hwtnl *hwtnl,
   if (ret != PIPE_OK)
  goto done;
 
+  pipe_debug_message(>debug.callback, PERF_INFO,
+ "generating temporary index buffer for drawing %s",
+ u_prim_name(prim));
+
   ret = svga_hwtnl_simple_draw_range_elements(hwtnl,
   gen_buf,
   gen_size,
diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c 
b/src/gallium/drivers/svga/svga_pipe_blend.c
index 0c9d612..b485703 100644
--- a/src/gallium/drivers/svga/svga_pipe_blend.c
+++ b/src/gallium/drivers/svga/svga_pipe_blend.c
@@ -243,6 +243,15 @@ svga_create_blend_state(struct pipe_context *pipe,
  blend->rt[i].srcblend_alpha = blend->rt[i].srcblend;
  blend->rt[i].dstblend_alpha = blend->rt[i].dstblend;
  blend->rt[i].blendeq_alpha = blend->rt[i].blendeq;
+
+ if (templ->logicop_func == PIPE_LOGICOP_XOR) {
+pipe_debug_message(>debug.callback, CONFORMANCE,
+   "XOR logicop mode has limited support%s", "");
+ }
+ else if (templ->logicop_func != PIPE_LOGICOP_COPY) {
+pipe_debug_message(>debug.callback, CONFORMANCE,
+   "general logicops are not supported%s", "");
+ }
   }
   else {
  /* Note: the vgpu10 device does not yet support independent
diff --git a/src/gallium/drivers/svga/svga_pipe_misc.c 
b/src/gallium/drivers/svga/svga_pipe_misc.c
index c8020da..af9356d 100644
--- a/src/gallium/drivers/svga/svga_pipe_misc.c
+++ b/src/gallium/drivers/svga/svga_pipe_misc.c
@@ -244,6 +244,22 @@ static void svga_set_viewport_states( struct pipe_context 
*pipe,
 }
 
 
+/**
+ * Called by state tracker to specify a callback function the driver
+ * can use to report info back to the state tracker.
+ */
+static void
+svga_set_debug_callback(struct pipe_context *pipe,
+const struct pipe_debug_callback *cb)
+{
+   struct svga_context *svga = svga_context(pipe);
+
+   if (cb)
+  svga->debug.callback = *cb;
+   else
+  memset(>debug.callback, 0, sizeof(svga->debug.callback));
+}
+
 
 void svga_init_misc_functions( struct svga_context *svga )
 {
@@ -252,6 +268,7 @@ void svga_init_misc_functions( struct svga_context *svga )
svga->pipe.set_framebuffer_state = svga_set_framebuffer_state;
svga->pipe.set_clip_state = svga_set_clip_state;
svga->pipe.set_viewport_states = svga_set_viewport_states;
+   svga->pipe.set_debug_callback = svga_set_debug_callback;
 }
 
 
diff --git a/src/gallium/drivers/svga/svga_pipe_rasterizer.c 
b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
index 6310b7a..fa1744f 100644
--- a/src/gallium/drivers/svga/svga_pipe_rasterizer.c
+++ b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
@@ -352,6 +352,11 @@ svga_create_rasterizer_state(struct pipe_context *pipe,
   define_rasterizer_object(svga, rast);
}
 
+   if (templ->poly_smooth) {
+  pipe_debug_message(>debug.callback, CONFORMANCE,
+ "GL_POLYGON_SMOOTH not supported");
+   }
+
svga->hud.num_state_objects++;
 
return rast;
diff --git a/src/gallium/drivers/svga/svga_state_need_swtnl.c 

Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Ilia Mirkin
On Fri, Dec 4, 2015 at 6:32 PM, Jose Fonseca  wrote:
> On 04/12/15 23:27, Ilia Mirkin wrote:
>>
>> On Fri, Dec 4, 2015 at 6:25 PM, Jose Fonseca  wrote:
>>>
>>> On 04/12/15 19:42, Brian Paul wrote:


 Use the new debug callback hook to report conformance, performance
 and fallbacks to the state tracker.  The state tracker, in turn can
 report this issues to the user via the GL_ARB_debug_output extension.

 More issues can be reported in the future; this is just a start.
 ---
src/gallium/drivers/svga/svga_context.h  |  3 +++
src/gallium/drivers/svga/svga_draw_arrays.c  |  7 +++
src/gallium/drivers/svga/svga_pipe_blend.c   | 11 +++
src/gallium/drivers/svga/svga_pipe_misc.c| 17
 +
src/gallium/drivers/svga/svga_pipe_rasterizer.c  |  6 ++
src/gallium/drivers/svga/svga_state_need_swtnl.c | 23
 +++
6 files changed, 67 insertions(+)

 diff --git a/src/gallium/drivers/svga/svga_context.h
 b/src/gallium/drivers/svga/svga_context.h
 index 6a4f9d8..c4284cc 100644
 --- a/src/gallium/drivers/svga/svga_context.h
 +++ b/src/gallium/drivers/svga/svga_context.h
 @@ -392,6 +392,9 @@ struct svga_context

  boolean no_line_width;
  boolean force_hw_line_stipple;
 +
 +  /** To report perf/conformance/etc issues to the state tracker */
 +  struct pipe_debug_callback callback;
   } debug;

   struct {
 diff --git a/src/gallium/drivers/svga/svga_draw_arrays.c
 b/src/gallium/drivers/svga/svga_draw_arrays.c
 index acb2e95..f6956b5 100644
 --- a/src/gallium/drivers/svga/svga_draw_arrays.c
 +++ b/src/gallium/drivers/svga/svga_draw_arrays.c
 @@ -26,6 +26,7 @@
#include "svga_cmd.h"

#include "util/u_inlines.h"
 +#include "util/u_prim.h"
#include "indices/u_indices.h"

#include "svga_hw_reg.h"
 @@ -277,6 +278,12 @@ svga_hwtnl_draw_arrays(struct svga_hwtnl *hwtnl,
  if (ret != PIPE_OK)
 goto done;

 +  if (svga->debug.callback.debug_message) {
 + pipe_debug_message(>debug.callback, PERF_INFO,
 +"generating temporary index buffer for
 drawing %s",
 +u_prim_name(prim));
 +  }
 +
  ret = svga_hwtnl_simple_draw_range_elements(hwtnl,
  gen_buf,
  gen_size,
 diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c
 b/src/gallium/drivers/svga/svga_pipe_blend.c
 index 0c9d612..5538193 100644
 --- a/src/gallium/drivers/svga/svga_pipe_blend.c
 +++ b/src/gallium/drivers/svga/svga_pipe_blend.c
 @@ -243,6 +243,17 @@ svga_create_blend_state(struct pipe_context *pipe,
 blend->rt[i].srcblend_alpha = blend->rt[i].srcblend;
 blend->rt[i].dstblend_alpha = blend->rt[i].dstblend;
 blend->rt[i].blendeq_alpha = blend->rt[i].blendeq;
 +
 + if (svga->debug.callback.debug_message) {
 +if (templ->logicop_func == PIPE_LOGICOP_XOR) {
 +   pipe_debug_message(>debug.callback, CONFORMANCE,
 +  "XOR logicop mode has limited
 support%s", "");
 +}
 +else if (templ->logicop_func != PIPE_LOGICOP_COPY) {
 +   pipe_debug_message(>debug.callback, CONFORMANCE,
 +  "general logicops are not
 supported%s",
 "");
 +}
 + }
  }
  else {
 /* Note: the vgpu10 device does not yet support independent
 diff --git a/src/gallium/drivers/svga/svga_pipe_misc.c
 b/src/gallium/drivers/svga/svga_pipe_misc.c
 index c8020da..af9356d 100644
 --- a/src/gallium/drivers/svga/svga_pipe_misc.c
 +++ b/src/gallium/drivers/svga/svga_pipe_misc.c
 @@ -244,6 +244,22 @@ static void svga_set_viewport_states( struct
 pipe_context *pipe,
}


 +/**
 + * Called by state tracker to specify a callback function the driver
 + * can use to report info back to the state tracker.
 + */
 +static void
 +svga_set_debug_callback(struct pipe_context *pipe,
 +const struct pipe_debug_callback *cb)
 +{
 +   struct svga_context *svga = svga_context(pipe);
 +
 +   if (cb)
 +  svga->debug.callback = *cb;
 +   else
 +  memset(>debug.callback, 0, sizeof(svga->debug.callback));
 +}
 +

void svga_init_misc_functions( struct svga_context *svga )
{
 @@ -252,6 +268,7 @@ void svga_init_misc_functions( struct svga_context
 *svga )

[Mesa-dev] [Bug 93257] [SKL, bisected] ASTC dEQP tests segfault

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93257

Nanley Chery  changed:

   What|Removed |Added

  Component|Mesa core   |Drivers/DRI/i965
 QA Contact|mesa-dev@lists.freedesktop. |intel-3d-bugs@lists.freedes
   |org |ktop.org

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/26] nir: Add a writemask to store intrinsics.

2015-12-04 Thread Jason Ekstrand
On Wed, Dec 2, 2015 at 4:15 PM, Kenneth Graunke  wrote:
> Tessellation control shaders need to be careful when writing outputs.
> Because multiple threads can concurrently write the same output
> variables, we need to only write the exact components we were told.
>
> Traditionally, for sub-vector writes, we've read the whole vector,
> updated the temporary, and written the whole vector back.  This breaks
> down with concurrent access.
>
> This patch prepares the way for a solution by adding a writemask field
> to store_var intrinsics, as well as the other store intrinsics.  It then
> updates all produces to emit a writemask of "all channels enabled".  It
> updates nir_lower_io to copy the writemask to output store intrinsics.
>
> Finally, it updates nir_lower_vars_to_ssa to handle partial writemasks
> by doing a read-modify-write cycle (which is safe, because local
> variables are specific to a single thread).
>
> This should have no functional change, since no one actually emits
> partial writemasks yet.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/gallium/auxiliary/nir/tgsi_to_nir.c|  1 +
>  .../drivers/freedreno/ir3/ir3_compiler_nir.c   |  6 +++
>  src/glsl/nir/glsl_to_nir.cpp   |  2 +
>  src/glsl/nir/nir_builder.h |  4 +-
>  src/glsl/nir/nir_intrinsics.h  | 14 +++
>  src/glsl/nir/nir_lower_gs_intrinsics.c |  3 +-
>  src/glsl/nir/nir_lower_io.c|  2 +
>  src/glsl/nir/nir_lower_locals_to_regs.c|  2 +-
>  src/glsl/nir/nir_lower_var_copies.c|  1 +
>  src/glsl/nir/nir_lower_vars_to_ssa.c   | 46 
> --
>  src/glsl/nir/nir_validate.c|  1 +
>  src/mesa/program/prog_to_nir.c |  2 +
>  12 files changed, 63 insertions(+), 21 deletions(-)
>
> Technically, I don't think I need the ir3 changes here - it should work
> without any changes.  I think I was just overzealously preparing things
> to handle writemasks before I realized there shouldn't be any.
>
> But, it might be worth doing anyway.  Not sure.
>
> Jason and I have already talked about the fact that we're conflicting.
> I like what he's doing, and he suggested this; we just need to figure out
> what lands first.  I decided to send these out anyway for people to look
> at and start reviewing, since there's a lot to do there.  We'll sort it
> out in v2.
>
> diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c 
> b/src/gallium/auxiliary/nir/tgsi_to_nir.c
> index 86c2ffa..f5547c8 100644
> --- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
> +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
> @@ -1894,6 +1894,7 @@ ttn_emit_instruction(struct ttn_compile *c)
> _dst->Indirect : NULL;
>
>store->num_components = 4;
> +  store->const_index[0] = 0xf;
>store->variables[0] = ttn_array_deref(c, store, var, offset, indirect);
>store->src[0] = nir_src_for_reg(dest.dest.reg.reg);
>
> diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
> b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
> index 8617704..3beed0e 100644
> --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
> +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
> @@ -1314,6 +1314,9 @@ emit_intrinisic_store_var(struct ir3_compile *ctx, 
> nir_intrinsic_instr *intr)
> case nir_deref_array_type_direct:
> /* direct access does not require anything special: */
> for (int i = 0; i < intr->num_components; i++) {
> +   if (!(intr->const_index[0] & (1 << i)))
> +   continue;
> +
> unsigned n = darr->base_offset * 4 + i;
> compile_assert(ctx, n < arr->length);
> arr->arr[n] = src[i];
> @@ -1326,6 +1329,9 @@ emit_intrinisic_store_var(struct ir3_compile *ctx, 
> nir_intrinsic_instr *intr)
> struct ir3_instruction *addr =
> get_addr(ctx, get_src(ctx, 
> >indirect)[0]);
> for (int i = 0; i < intr->num_components; i++) {
> +   if (!(intr->const_index[0] & (1 << i)))
> +   continue;
> +
> struct ir3_instruction *store;
> unsigned n = darr->base_offset * 4 + i;
> compile_assert(ctx, n < arr->length);
> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
> index 45d045c..5c84b0d 100644
> --- a/src/glsl/nir/glsl_to_nir.cpp
> +++ b/src/glsl/nir/glsl_to_nir.cpp
> @@ -982,6 +982,7 @@ nir_visitor::visit(ir_call *ir)
>   nir_intrinsic_instr *store_instr =
>  nir_intrinsic_instr_create(shader, nir_intrinsic_store_var);
>   store_instr->num_components = 
> 

[Mesa-dev] [PATCH 1/2] gallium/util: move conditional from _pipe_debug_message() to macro

2015-12-04 Thread Brian Paul
To avoid function calls when the pointer is null.

Also, assert that the pipe_debug_callback object is non-null.  It looks
like all the drivers which use this feature never pass a non-null pointer.
---
 src/gallium/auxiliary/util/u_debug.c | 3 +--
 src/gallium/auxiliary/util/u_debug.h | 9 ++---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_debug.c 
b/src/gallium/auxiliary/util/u_debug.c
index 7029536..c42505d 100644
--- a/src/gallium/auxiliary/util/u_debug.c
+++ b/src/gallium/auxiliary/util/u_debug.c
@@ -79,8 +79,7 @@ _pipe_debug_message(
 {
va_list args;
va_start(args, fmt);
-   if (cb && cb->debug_message)
-  cb->debug_message(cb->data, id, type, fmt, args);
+   cb->debug_message(cb->data, id, type, fmt, args);
va_end(args);
 }
 
diff --git a/src/gallium/auxiliary/util/u_debug.h 
b/src/gallium/auxiliary/util/u_debug.h
index 9900703..4cc3d63 100644
--- a/src/gallium/auxiliary/util/u_debug.h
+++ b/src/gallium/auxiliary/util/u_debug.h
@@ -268,9 +268,12 @@ void _debug_assert_fail(const char *expr,
  */
 #define pipe_debug_message(cb, type, fmt, ...) do { \
static unsigned id = 0; \
-   _pipe_debug_message(cb, , \
-   PIPE_DEBUG_TYPE_ ## type, \
-   fmt, ##__VA_ARGS__); \
+   assert(cb); \
+   if ((cb)->debug_message) { \
+  _pipe_debug_message(cb, , \
+  PIPE_DEBUG_TYPE_ ## type, \
+  fmt, ##__VA_ARGS__); \
+   } \
 } while (0)
 
 struct pipe_debug_callback;
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH] glsl: always re-validate program pipeline

2015-12-04 Thread Kenneth Graunke
On Wednesday, December 02, 2015 10:45:02 AM Timothy Arceri wrote:
> On Tue, 2015-12-01 at 13:46 +0200, Tapani Pälli wrote:
> > On 12/01/2015 02:13 AM, Timothy Arceri wrote:
> > > Just because the validation passed the last time is was called
> > > doesn't
> > > automatically mean it will pass again the next time its called.
> > 
> > This is a rather large hammer though :/ Maybe we should look at 
> > invalidating the pipeline when samplers are changed instead. Was
> > there 
> > other issues this resolves/fixes than sampler validation?
> 
> There are no other tests that this fixes however I believe that's just
> because the tests use ValidateProgramPipeline to check for problems
> rather than issuing a draw call. For example this patch is the
> equivalent of doing this in ValidateProgramPipeline
> 
> if (!ctx->_Shader->Validated)
>_mesa_validate_program_pipeline(ctx, pipe, false);
> 
> And this causes the CTS pipeline test to fail. The interesting thing is
> that the piglit test this fixes also fails on Nvidia.
> 
> The spec is somewhat confusing in its wording:
> 
> "Therefore validation is done when the first rendering command which
> triggers shader invocations is issued, to determine if the set of
> active program objects can be executed."
> 
> This seem like its saying we only need to check it on the first call
> however even then ctx->_Shader->Validated may have been set by
> ValidateProgramPipeline rather than validation triggered a rendering
> command.
> 
> Also the spec goes on to say:
> 
> "An INVALID_OPERATION error is generated by any command that trans-fers
> vertices to the GL or launches compute work"
> 
> Which seems to conflict with the first statement.
> 
> It seems to me the intention is to call validation each time as
> relinking or changing samplers could cause validation to fail.
> 
> Maybe we should have a test which does a draw call, then relinks with
> SEPARABLE to false, then calls draw again and checks for a failure.
> 
> A less hammer type approach may be to reset ctx->_Shader->Validated
> after linking and when samplers are changed but we really need more
> tests to make sure we get this right.

I agree with Tapani, performing shader validation on every draw call is
a pretty big hammer, and will almost certainly impact performance - we
really want less draw-time validation, where possible.

Setting Validated = false when re-linking, changing program stages, or
otherwise altering the pipeline sounds like a good idea.  That way we
validate it on the first draw following the change, but can avoid it
after that.

--Ken


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 93257] [SKL, bisected] ASTC dEQP tests segfault

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93257

Bug ID: 93257
   Summary: [SKL, bisected] ASTC dEQP tests segfault
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: ASSIGNED
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: b...@bwidawsk.net
  Reporter: nanleych...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org
CC: chad.vers...@intel.com, n...@linux.intel.com

Test: dEQP-GLES3.functional.texture.compressed.astc.void_extent_ldr.5x4 
Expected result: Test fails due to comparison errors.
Actual result:   Test segfaults.

The segfaults seem to occur for all ASTC tests with block size dimensions >=
5x4.
More tests may be affected. This issue does not occur on Jenkins, but it does
occur
locally on the given configuration:

HW: SKL
dEQP: https://github.com/chadversary/deqp chadv
  CMAKE_BUILD_TYPE Debug 
  DEQP_TARGET  drm
Mesa:
commit 0288f92e7b0ce5f0d821f2d0ddef522a23776ecb
Author: Ben Widawsky 
Date:   Tue Oct 13 20:50:27 2015 -0700

i965/gen9: Support fast clears for 32b float

SKL supports the ability to do fast clears and resolves of 32b RGBA as both
integer and floats. This patch only enables float color clears because we
haven't yet enabled integer color clears, (HW support for that was added in
BDW).

v2: Remove LUMINANCE16F and INTENSITY16F special cases since they are now
handled by Neil's patch to disable MSAA fast clears.

Signed-off-by: Ben Widawsky 
Reviewed-by: Neil Roberts 
Reviewed-by: Chad Versace 


Core dump info:
Signal: 11 (SEGV)
  Command Line: ./modules/gles3/deqp-gles3
--deqp-case=dEQP-GLES3.functional.texture.compressed.astc.void_extent_ldr.5x4
--deqp-surface-type=fbo --deqp-log-images=disable --deqp-surface-width=64
--deqp-surface-height=64
   Stack trace:
   #0  0x7fa3556059b9 dri2InvalidateDrawable (i965_dri.so)
   #1  0x7fa3556222dc intel_viewport (i965_dri.so)
   #2  0x7fa355372ace _mesa_set_viewport (i965_dri.so)
   #3  0x7fa35547e7a8 _mesa_meta_end (i965_dri.so)
   #4  0x7fa35562fa40 brw_meta_resolve_color (i965_dri.so)
   #5  0x7fa355680bdd intel_miptree_resolve_color (i965_dri.so)
   #6  0x7fa355685ee2 intel_readpixels_tiled_memcpy (i965_dri.so)
   #7  0x7fa35568637b intelReadPixels (i965_dri.so)
   #8  0x7fa355317716 _mesa_ReadnPixelsARB (i965_dri.so)
   #9  0x7fa355317767 _mesa_ReadPixels (i965_dri.so)
   #10 0x7fa358127640 shared_dispatch_stub_256 (libglapi.so.0)
   #11 0x010343eb
_ZN3glu10readPixelsERKNS_13RenderContextEiiRKN3tcu17PixelBufferAccessE
(deqp-gles3)
   #12 0x00c6772f
_ZN4deqp5gles310Functional29ASTCDecompressionCaseInternal14ASTCRenderer2D6renderERN3tcu7SurfaceES6_RKN3glu9Texture2DERKNS4_13TextureFormatE
(deqp-gles3)
   #13 0x00c6839f _ZN4deqp5gles310Functional15ASTCBlockCase2D7iterateEv
(deqp-gles3)
   #14 0x00feab6b
_ZN3tcu15TestCaseWrapper15iterateTestCaseEPNS_8TestCaseE (deqp-gles3)
   #15 0x00aacb9f
_ZN4deqp5gles315TestCaseWrapper15iterateTestCaseEPN3tcu8TestCaseE (deqp-gles3)
   #16 0x00fec6cd _ZN3tcu12TestExecutor7iterateEv (deqp-gles3)
   #17 0x00fc31e7 _ZN3tcu3App7iterateEv (deqp-gles3)
   #18 0x00aa92d9 main (deqp-gles3)
   #19 0x7fa3583a0610 __libc_start_main (libc.so.6)
   #20 0x00aa90c9 _start (deqp-gles3)

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 02/26] nir: Use writemasked store_vars in glsl_to_nir.

2015-12-04 Thread Jason Ekstrand
Yeah, I like this.

On Wed, Dec 2, 2015 at 4:15 PM, Kenneth Graunke  wrote:
> Instead of performing the read-modify-write cycle in glsl->nir, we can
> simply emit a partial writemask.  For locals, nir_lower_vars_to_ssa will
> do the equivalent read-modify-write cycle for us, so we continue to get
> the same SSA values we had before.
>
> Because glsl_to_nir calls nir_lower_outputs_to_temporaries, all outputs
> are shadowed with temporary values, and written out as whole vectors at
> the end of the shader.  So, most consumers will still not see partial
> writemasks.
>
> However, nir_lower_outputs_to_temporaries bails for tessellation control
> shader outputs.  So those remain actual variables, and stores to those
> variables now get a writemask.  nir_lower_io passes that through.  This
> means that TCS outputs should actually work now.
>
> This is a functional change for tessellation control shaders.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/glsl/nir/glsl_to_nir.cpp | 39 +--
>  1 file changed, 9 insertions(+), 30 deletions(-)
>
> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
> index 5c84b0d..0d33eaa 100644
> --- a/src/glsl/nir/glsl_to_nir.cpp
> +++ b/src/glsl/nir/glsl_to_nir.cpp
> @@ -1044,44 +1044,23 @@ nir_visitor::visit(ir_assignment *ir)
> nir_ssa_def *src = evaluate_rvalue(ir->rhs);
>
> if (ir->write_mask != (1 << num_components) - 1 && ir->write_mask != 0) {
> -  /*
> -   * We have no good way to update only part of a variable, so just load
> -   * the LHS and do a vec operation to combine the old with the new, and
> -   * then store it
> -   * back into the LHS. Copy propagation should get rid of the mess.
> +  /* GLSL IR will give us the input to the write-masked assignment in a
> +   * single packed vector.  So, for example, if the writemask is xzw, 
> then
> +   * we have to swizzle x -> x, y -> z, and z -> w and get the y 
> component
> +   * from the load.
> */
> -
> -  nir_intrinsic_instr *load =
> - nir_intrinsic_instr_create(this->shader, nir_intrinsic_load_var);
> -  load->num_components = ir->lhs->type->vector_elements;
> -  nir_ssa_dest_init(>instr, >dest, num_components, NULL);
> -  load->variables[0] = lhs_deref;
> -  ralloc_steal(load, load->variables[0]);
> -  nir_builder_instr_insert(, >instr);
> -
> -  nir_ssa_def *srcs[4];
> -
> +  unsigned swiz[4];
>unsigned component = 0;
> -  for (unsigned i = 0; i < ir->lhs->type->vector_elements; i++) {
> - if (ir->write_mask & (1 << i)) {
> -/* GLSL IR will give us the input to the write-masked assignment
> - * in a single packed vector.  So, for example, if the
> - * writemask is xzw, then we have to swizzle x -> x, y -> z,
> - * and z -> w and get the y component from the load.
> - */
> -srcs[i] = nir_channel(, src, component++);
> - } else {
> -srcs[i] = nir_channel(, >dest.ssa, i);
> - }
> +  for (unsigned i = 0; i < 4; i++) {
> + swiz[i] = ir->write_mask & (1 << i) ? component++ : 0;
>}
> -
> -  src = nir_vec(, srcs, ir->lhs->type->vector_elements);
> +  src = nir_swizzle(, src, swiz, num_components, !supports_ints);
> }
>
> nir_intrinsic_instr *store =
>nir_intrinsic_instr_create(this->shader, nir_intrinsic_store_var);
> store->num_components = ir->lhs->type->vector_elements;
> -   store->const_index[0] = (1 << store->num_components) - 1;
> +   store->const_index[0] = ir->write_mask;
> nir_deref *store_deref = nir_copy_deref(store, _deref->deref);
> store->variables[0] = nir_deref_as_var(store_deref);
> store->src[0] = nir_src_for_ssa(src);
> --
> 2.6.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/26] nir: Add a writemask to store intrinsics.

2015-12-04 Thread Kenneth Graunke
On Friday, December 04, 2015 04:45:03 PM Jason Ekstrand wrote:
> On Wed, Dec 2, 2015 at 4:15 PM, Kenneth Graunke  wrote:
> > Tessellation control shaders need to be careful when writing outputs.
> > Because multiple threads can concurrently write the same output
> > variables, we need to only write the exact components we were told.
> >
> > Traditionally, for sub-vector writes, we've read the whole vector,
> > updated the temporary, and written the whole vector back.  This breaks
> > down with concurrent access.
> >
> > This patch prepares the way for a solution by adding a writemask field
> > to store_var intrinsics, as well as the other store intrinsics.  It then
> > updates all produces to emit a writemask of "all channels enabled".  It
> > updates nir_lower_io to copy the writemask to output store intrinsics.
> >
> > Finally, it updates nir_lower_vars_to_ssa to handle partial writemasks
> > by doing a read-modify-write cycle (which is safe, because local
> > variables are specific to a single thread).
> >
> > This should have no functional change, since no one actually emits
> > partial writemasks yet.
> >
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  src/gallium/auxiliary/nir/tgsi_to_nir.c|  1 +
> >  .../drivers/freedreno/ir3/ir3_compiler_nir.c   |  6 +++
> >  src/glsl/nir/glsl_to_nir.cpp   |  2 +
> >  src/glsl/nir/nir_builder.h |  4 +-
> >  src/glsl/nir/nir_intrinsics.h  | 14 +++
> >  src/glsl/nir/nir_lower_gs_intrinsics.c |  3 +-
> >  src/glsl/nir/nir_lower_io.c|  2 +
> >  src/glsl/nir/nir_lower_locals_to_regs.c|  2 +-
> >  src/glsl/nir/nir_lower_var_copies.c|  1 +
> >  src/glsl/nir/nir_lower_vars_to_ssa.c   | 46 
> > --
> >  src/glsl/nir/nir_validate.c|  1 +
> >  src/mesa/program/prog_to_nir.c |  2 +
> >  12 files changed, 63 insertions(+), 21 deletions(-)
> >
> > Technically, I don't think I need the ir3 changes here - it should work
> > without any changes.  I think I was just overzealously preparing things
> > to handle writemasks before I realized there shouldn't be any.
> >
> > But, it might be worth doing anyway.  Not sure.
> >
> > Jason and I have already talked about the fact that we're conflicting.
> > I like what he's doing, and he suggested this; we just need to figure out
> > what lands first.  I decided to send these out anyway for people to look
> > at and start reviewing, since there's a lot to do there.  We'll sort it
> > out in v2.
> >
> > diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c 
> > b/src/gallium/auxiliary/nir/tgsi_to_nir.c
> > index 86c2ffa..f5547c8 100644
> > --- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
> > +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
> > @@ -1894,6 +1894,7 @@ ttn_emit_instruction(struct ttn_compile *c)
> > _dst->Indirect : NULL;
> >
> >store->num_components = 4;
> > +  store->const_index[0] = 0xf;
> >store->variables[0] = ttn_array_deref(c, store, var, offset, 
> > indirect);
> >store->src[0] = nir_src_for_reg(dest.dest.reg.reg);
> >
> > diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
> > b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
> > index 8617704..3beed0e 100644
> > --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
> > +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
> > @@ -1314,6 +1314,9 @@ emit_intrinisic_store_var(struct ir3_compile *ctx, 
> > nir_intrinsic_instr *intr)
> > case nir_deref_array_type_direct:
> > /* direct access does not require anything special: */
> > for (int i = 0; i < intr->num_components; i++) {
> > +   if (!(intr->const_index[0] & (1 << i)))
> > +   continue;
> > +
> > unsigned n = darr->base_offset * 4 + i;
> > compile_assert(ctx, n < arr->length);
> > arr->arr[n] = src[i];
> > @@ -1326,6 +1329,9 @@ emit_intrinisic_store_var(struct ir3_compile *ctx, 
> > nir_intrinsic_instr *intr)
> > struct ir3_instruction *addr =
> > get_addr(ctx, get_src(ctx, 
> > >indirect)[0]);
> > for (int i = 0; i < intr->num_components; i++) {
> > +   if (!(intr->const_index[0] & (1 << i)))
> > +   continue;
> > +
> > struct ir3_instruction *store;
> > unsigned n = darr->base_offset * 4 + i;
> > compile_assert(ctx, n < arr->length);
> > diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
> > index 45d045c..5c84b0d 100644
> > --- a/src/glsl/nir/glsl_to_nir.cpp
> > +++ 

Re: [Mesa-dev] [PATCH 2/2] svga: use the debug callback to report issues to the state tracker

2015-12-04 Thread Ilia Mirkin
On Fri, Dec 4, 2015 at 7:52 PM, Brian Paul  wrote:
> Use the new debug callback hook to report conformance, performance
> and fallbacks to the state tracker.  The state tracker, in turn can
> report this issues to the user via the GL_ARB_debug_output extension.
>
> More issues can be reported in the future; this is just a start.
>
> v2: remove conditionals around pipe_debug_message() calls since the
> check is now done in the macro itself.
> ---
>  src/gallium/drivers/svga/svga_context.h  |  3 +++
>  src/gallium/drivers/svga/svga_draw_arrays.c  |  5 +
>  src/gallium/drivers/svga/svga_pipe_blend.c   |  9 +
>  src/gallium/drivers/svga/svga_pipe_misc.c| 17 +
>  src/gallium/drivers/svga/svga_pipe_rasterizer.c  |  5 +
>  src/gallium/drivers/svga/svga_state_need_swtnl.c | 23 +++
>  6 files changed, 62 insertions(+)
>
> diff --git a/src/gallium/drivers/svga/svga_context.h 
> b/src/gallium/drivers/svga/svga_context.h
> index 6a4f9d8..c4284cc 100644
> --- a/src/gallium/drivers/svga/svga_context.h
> +++ b/src/gallium/drivers/svga/svga_context.h
> @@ -392,6 +392,9 @@ struct svga_context
>
>boolean no_line_width;
>boolean force_hw_line_stipple;
> +
> +  /** To report perf/conformance/etc issues to the state tracker */
> +  struct pipe_debug_callback callback;
> } debug;
>
> struct {
> diff --git a/src/gallium/drivers/svga/svga_draw_arrays.c 
> b/src/gallium/drivers/svga/svga_draw_arrays.c
> index acb2e95..b02adeb 100644
> --- a/src/gallium/drivers/svga/svga_draw_arrays.c
> +++ b/src/gallium/drivers/svga/svga_draw_arrays.c
> @@ -26,6 +26,7 @@
>  #include "svga_cmd.h"
>
>  #include "util/u_inlines.h"
> +#include "util/u_prim.h"
>  #include "indices/u_indices.h"
>
>  #include "svga_hw_reg.h"
> @@ -277,6 +278,10 @@ svga_hwtnl_draw_arrays(struct svga_hwtnl *hwtnl,
>if (ret != PIPE_OK)
>   goto done;
>
> +  pipe_debug_message(>debug.callback, PERF_INFO,
> + "generating temporary index buffer for drawing %s",
> + u_prim_name(prim));
> +
>ret = svga_hwtnl_simple_draw_range_elements(hwtnl,
>gen_buf,
>gen_size,
> diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c 
> b/src/gallium/drivers/svga/svga_pipe_blend.c
> index 0c9d612..b485703 100644
> --- a/src/gallium/drivers/svga/svga_pipe_blend.c
> +++ b/src/gallium/drivers/svga/svga_pipe_blend.c
> @@ -243,6 +243,15 @@ svga_create_blend_state(struct pipe_context *pipe,
>   blend->rt[i].srcblend_alpha = blend->rt[i].srcblend;
>   blend->rt[i].dstblend_alpha = blend->rt[i].dstblend;
>   blend->rt[i].blendeq_alpha = blend->rt[i].blendeq;
> +
> + if (templ->logicop_func == PIPE_LOGICOP_XOR) {
> +pipe_debug_message(>debug.callback, CONFORMANCE,
> +   "XOR logicop mode has limited support%s", "");
> + }
> + else if (templ->logicop_func != PIPE_LOGICOP_COPY) {
> +pipe_debug_message(>debug.callback, CONFORMANCE,
> +   "general logicops are not supported%s", "");

These %s's should no longer be necessary after my ##__VA_ARGS__ fix right?

> + }
>}
>else {
>   /* Note: the vgpu10 device does not yet support independent
> diff --git a/src/gallium/drivers/svga/svga_pipe_misc.c 
> b/src/gallium/drivers/svga/svga_pipe_misc.c
> index c8020da..af9356d 100644
> --- a/src/gallium/drivers/svga/svga_pipe_misc.c
> +++ b/src/gallium/drivers/svga/svga_pipe_misc.c
> @@ -244,6 +244,22 @@ static void svga_set_viewport_states( struct 
> pipe_context *pipe,
>  }
>
>
> +/**
> + * Called by state tracker to specify a callback function the driver
> + * can use to report info back to the state tracker.
> + */
> +static void
> +svga_set_debug_callback(struct pipe_context *pipe,
> +const struct pipe_debug_callback *cb)
> +{
> +   struct svga_context *svga = svga_context(pipe);
> +
> +   if (cb)
> +  svga->debug.callback = *cb;
> +   else
> +  memset(>debug.callback, 0, sizeof(svga->debug.callback));
> +}
> +
>
>  void svga_init_misc_functions( struct svga_context *svga )
>  {
> @@ -252,6 +268,7 @@ void svga_init_misc_functions( struct svga_context *svga )
> svga->pipe.set_framebuffer_state = svga_set_framebuffer_state;
> svga->pipe.set_clip_state = svga_set_clip_state;
> svga->pipe.set_viewport_states = svga_set_viewport_states;
> +   svga->pipe.set_debug_callback = svga_set_debug_callback;
>  }
>
>
> diff --git a/src/gallium/drivers/svga/svga_pipe_rasterizer.c 
> b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
> index 6310b7a..fa1744f 100644
> --- a/src/gallium/drivers/svga/svga_pipe_rasterizer.c
> +++ b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
> @@ -352,6 +352,11 @@ 

Re: [Mesa-dev] [PATCH v3 10/44] i965: Define and use REG_MASK macro to make masked MMIO writes slightly more readable.

2015-12-04 Thread Kristian Høgsberg
On Tue, Dec 01, 2015 at 12:19:28AM -0800, Jordan Justen wrote:
> From: Francisco Jerez 
> 
> Reviewed-by: Samuel Iglesias Gonsálvez 
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h  | 6 ++
>  src/mesa/drivers/dri/i965/brw_state_upload.c | 2 +-
>  src/mesa/drivers/dri/i965/gen7_l3_state.c| 2 +-
>  src/mesa/drivers/dri/i965/intel_reg.h| 2 +-
>  4 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index a511d5c..94f878b 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -41,6 +41,12 @@
>  #define GET_BITS(data, high, low) ((data & INTEL_MASK((high), (low))) >> 
> (low))
>  #define GET_FIELD(word, field) (((word)  & field ## _MASK) >> field ## 
> _SHIFT)
>  
> +/**
> + * For use with masked MMIO registers where the upper 16 bits control which
> + * of the lower bits are committed to the register.
> + */
> +#define REG_MASK(value) ((value) << 16)

Might nice to also take the value of the bit and do

  #define REG_MASK(bit, enable) ( ((bit) << 16) | (enable ? (bit) : 0) )

> +
>  #ifndef BRW_DEFINES_H
>  #define BRW_DEFINES_H
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
> b/src/mesa/drivers/dri/i965/brw_state_upload.c
> index aab5c91..3d14d65 100644
> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> @@ -382,7 +382,7 @@ brw_upload_initial_gpu_state(struct brw_context *brw)
>BEGIN_BATCH(3);
>OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
>OUT_BATCH(GEN7_CACHE_MODE_1);
> -  OUT_BATCH((GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC << 16) |
> +  OUT_BATCH(REG_MASK(GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC) |
>  GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC);

and then this becomes

  OUT_BATCH(REG_MASK(GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC, true));

which avoids having to type GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC twice.

> }
> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c 
> b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> index 9aad563..2c692be 100644
> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> @@ -264,7 +264,7 @@ setup_l3_config(struct brw_context *brw, const struct 
> brw_l3_config *cfg)
>   OUT_BATCH(HSW_SCRATCH1);
>   OUT_BATCH(has_dc ? 0 : HSW_SCRATCH1_L3_ATOMIC_DISABLE);
>   OUT_BATCH(HSW_ROW_CHICKEN3);
> - OUT_BATCH(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16 |
> + OUT_BATCH(REG_MASK(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE) |
> (has_dc ? 0 : HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE));

and this is even nicer:

  OUT_BATCH(REG_MASK(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE, has_dc));

Also, multipe REG_MASK()s can be bitwise or-ed together to set or
clear multiple bits.

>   ADVANCE_BATCH();
>}
> diff --git a/src/mesa/drivers/dri/i965/intel_reg.h 
> b/src/mesa/drivers/dri/i965/intel_reg.h
> index 0b167d5..d6f 100644
> --- a/src/mesa/drivers/dri/i965/intel_reg.h
> +++ b/src/mesa/drivers/dri/i965/intel_reg.h
> @@ -183,7 +183,7 @@
>  # define GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE (1 << 13)
>  # define GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC (1 << 1)
>  # define GEN8_HIZ_PMA_MASK_BITS \
> -   ((GEN8_HIZ_NP_PMA_FIX_ENABLE | GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE) << 16)
> +   REG_MASK(GEN8_HIZ_NP_PMA_FIX_ENABLE | GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE)
>  
>  /* Predicate registers */
>  #define MI_PREDICATE_SRC0   0x2400
> -- 
> 2.6.2
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] gallium/util: move conditional from _pipe_debug_message() to macro

2015-12-04 Thread Ilia Mirkin
Actually nouveau can, in some situations, pass in a null debug object.
I have a thing in the fence wait to say how long it waited, which can
be invoked without a pipe_context, and hence no debug object. Other
(most) times it will be invoked with a pipe_context in hand and be
able to log the messages.

Oh, but perhaps for other reasons nouveau actually checks if it's null
before calling pipe_debug_message. Either way, I think it's a nice
property that the object is allowed to be null... I'd rather keep that
option open.

  -ilia


On Fri, Dec 4, 2015 at 7:52 PM, Brian Paul  wrote:
> To avoid function calls when the pointer is null.
>
> Also, assert that the pipe_debug_callback object is non-null.  It looks
> like all the drivers which use this feature never pass a non-null pointer.
> ---
>  src/gallium/auxiliary/util/u_debug.c | 3 +--
>  src/gallium/auxiliary/util/u_debug.h | 9 ++---
>  2 files changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_debug.c 
> b/src/gallium/auxiliary/util/u_debug.c
> index 7029536..c42505d 100644
> --- a/src/gallium/auxiliary/util/u_debug.c
> +++ b/src/gallium/auxiliary/util/u_debug.c
> @@ -79,8 +79,7 @@ _pipe_debug_message(
>  {
> va_list args;
> va_start(args, fmt);
> -   if (cb && cb->debug_message)
> -  cb->debug_message(cb->data, id, type, fmt, args);
> +   cb->debug_message(cb->data, id, type, fmt, args);
> va_end(args);
>  }
>
> diff --git a/src/gallium/auxiliary/util/u_debug.h 
> b/src/gallium/auxiliary/util/u_debug.h
> index 9900703..4cc3d63 100644
> --- a/src/gallium/auxiliary/util/u_debug.h
> +++ b/src/gallium/auxiliary/util/u_debug.h
> @@ -268,9 +268,12 @@ void _debug_assert_fail(const char *expr,
>   */
>  #define pipe_debug_message(cb, type, fmt, ...) do { \
> static unsigned id = 0; \
> -   _pipe_debug_message(cb, , \
> -   PIPE_DEBUG_TYPE_ ## type, \
> -   fmt, ##__VA_ARGS__); \
> +   assert(cb); \
> +   if ((cb)->debug_message) { \
> +  _pipe_debug_message(cb, , \
> +  PIPE_DEBUG_TYPE_ ## type, \
> +  fmt, ##__VA_ARGS__); \
> +   } \
>  } while (0)
>
>  struct pipe_debug_callback;
> --
> 1.9.1
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] gallium/util: move conditional from _pipe_debug_message() to macro

2015-12-04 Thread Jose Fonseca
I suspect that adding the check in the macro amight cause compilers to 
warn that "taking the address of cb can never be false".



Either way, series is

Reviewed-by: Jose Fonseca 



On 05/12/15 02:17, Ilia Mirkin wrote:

Actually nouveau can, in some situations, pass in a null debug object.
I have a thing in the fence wait to say how long it waited, which can
be invoked without a pipe_context, and hence no debug object. Other
(most) times it will be invoked with a pipe_context in hand and be
able to log the messages.

Oh, but perhaps for other reasons nouveau actually checks if it's null
before calling pipe_debug_message. Either way, I think it's a nice
property that the object is allowed to be null... I'd rather keep that
option open.

   -ilia


On Fri, Dec 4, 2015 at 7:52 PM, Brian Paul  wrote:

To avoid function calls when the pointer is null.

Also, assert that the pipe_debug_callback object is non-null.  It looks
like all the drivers which use this feature never pass a non-null pointer.
---
  src/gallium/auxiliary/util/u_debug.c | 3 +--
  src/gallium/auxiliary/util/u_debug.h | 9 ++---
  2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_debug.c 
b/src/gallium/auxiliary/util/u_debug.c
index 7029536..c42505d 100644
--- a/src/gallium/auxiliary/util/u_debug.c
+++ b/src/gallium/auxiliary/util/u_debug.c
@@ -79,8 +79,7 @@ _pipe_debug_message(
  {
 va_list args;
 va_start(args, fmt);
-   if (cb && cb->debug_message)
-  cb->debug_message(cb->data, id, type, fmt, args);
+   cb->debug_message(cb->data, id, type, fmt, args);
 va_end(args);
  }

diff --git a/src/gallium/auxiliary/util/u_debug.h 
b/src/gallium/auxiliary/util/u_debug.h
index 9900703..4cc3d63 100644
--- a/src/gallium/auxiliary/util/u_debug.h
+++ b/src/gallium/auxiliary/util/u_debug.h
@@ -268,9 +268,12 @@ void _debug_assert_fail(const char *expr,
   */
  #define pipe_debug_message(cb, type, fmt, ...) do { \
 static unsigned id = 0; \
-   _pipe_debug_message(cb, , \
-   PIPE_DEBUG_TYPE_ ## type, \
-   fmt, ##__VA_ARGS__); \
+   assert(cb); \
+   if ((cb)->debug_message) { \
+  _pipe_debug_message(cb, , \
+  PIPE_DEBUG_TYPE_ ## type, \
+  fmt, ##__VA_ARGS__); \
+   } \
  } while (0)

  struct pipe_debug_callback;
--
1.9.1



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] misc janitorial

2015-12-04 Thread Giuseppe Bilotta
> Hmm I thought I pushed those in a couple of days ago. Doing it now, thanks !

No, thank you 8-)

>> This is one of those
>> contexts in which things could be done marginally more elegantly with
>> C++. I am proficient enough with C macros and xincludes that I could
>> probably hack together something that avoids the symlinks and the code
>> duplication, but whether or not this would be considered an
>> improvement is very much up to personal taste.
>>
> Before you start moving things around please fish around for
> RADEON_R(12)00. Once things are free of those (either the macros are
> just removed or the functions are split into r100 and r200 variant,
> all the movement comes into play. The last one can be tedious if
> you're unfamiliar with autotools, so just ping me on IRC (not there
> today) and I can guide you around.

I've started looking into the defines and it's not just a matter of
tediousness. The problem with splitting out the conditional functions
is with the callers. Consider for example
radeon_map_renderbuffer_s8z24, which uses get_depth_z32, which has
different forms for R100 and R200. The caller could go in the common
library, since the code is exactly the same, but it would need to know
which get_depth_z32 incarnation to invoke. What would be a nice way to
approach this? Function pointers? Runtime switch? Code duplication of
the entire call chain? Other?

-- 
Giuseppe "Oblomov" Bilotta
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Optimize useless comparisons against true/false.

2015-12-04 Thread Neil Roberts
Matt Turner  writes:

>  # Written in the form (, ) where  is an expression
>  # and  is either an expression or a value.  An expression is
>  # defined as a tuple of the form (, , , , )
> @@ -94,6 +97,8 @@ optimizations = [
> (('inot', ('ige', a, b)), ('ilt', a, b)),
> (('inot', ('ieq', a, b)), ('ine', a, b)),
> (('inot', ('ine', a, b)), ('ieq', a, b)),
> +   (('ieq', 'a@bool', true), a),
> +   (('ine', 'a@bool', false), a),

I think this second one is already in the file on line 187 here:

   # Boolean simplifications
   (('ine', 'a@bool', 0), 'a'),
   (('ieq', 'a@bool', 0), ('inot', 'a')),

Maybe you could add the first one near these two? It could be good to
add a line like this as well to complete the set:

   (('ine', 'a@bool', true), ('inot', a)),

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 06/10] i965/fs: Get rid of load/store_foo_indirect

2015-12-04 Thread Jason Ekstrand
On Wed, Dec 2, 2015 at 12:33 AM, Kenneth Graunke  wrote:
> On Wednesday, November 25, 2015 09:00:09 PM Jason Ekstrand wrote:
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.h   |   3 +-
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 145 
>> ++-
>>  src/mesa/drivers/dri/i965/brw_nir.c  |  36 ++--
>>  3 files changed, 92 insertions(+), 92 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
>> b/src/mesa/drivers/dri/i965/brw_fs.h
>> index 2d408b2..b8891dd 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.h
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
>> @@ -296,8 +296,7 @@ public:
>> unsigned stream_id);
>> void emit_gs_thread_end();
>> void emit_gs_input_load(const fs_reg , const nir_src _src,
>> -   const fs_reg _offset, unsigned 
>> imm_offset,
>> -   unsigned num_components);
>> +   const nir_src _src, unsigned 
>> num_components);
>> void emit_cs_terminate();
>> fs_reg *emit_cs_local_invocation_id_setup();
>> fs_reg *emit_cs_work_group_id_setup();
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index c439da2..b2e6ee8 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -1601,25 +1601,26 @@ fs_visitor::emit_gs_vertex(const nir_src 
>> _count_nir_src,
>>  void
>>  fs_visitor::emit_gs_input_load(const fs_reg ,
>> const nir_src _src,
>> -   const fs_reg _offset,
>> -   unsigned imm_offset,
>> +   const nir_src _src,
>> unsigned num_components)
>>  {
>> struct brw_gs_prog_data *gs_prog_data = (struct brw_gs_prog_data *) 
>> prog_data;
>>
>> +   nir_const_value *vertex_const = nir_src_as_const_value(vertex_src);
>> +   nir_const_value *offset_const = nir_src_as_const_value(offset_src);
>> +   const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8;
>> +
>> /* Offset 0 is the VUE header, which contains VARYING_SLOT_LAYER [.y],
>>  * VARYING_SLOT_VIEWPORT [.z], and VARYING_SLOT_PSIZ [.w].  Only
>>  * gl_PointSize is available as a GS input, however, so it must be that.
>>  */
>> const bool is_point_size =
>> -  indirect_offset.file == BAD_FILE && imm_offset == 0;
>> -
>> -   nir_const_value *vertex_const = nir_src_as_const_value(vertex_src);
>> -   const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8;
>> +  offset_const != NULL && offset_const->u[0] == 0;
>>
>> -   if (indirect_offset.file == BAD_FILE && vertex_const != NULL &&
>> -   4 * imm_offset < push_reg_count) {
>> -  imm_offset = 4 * imm_offset + vertex_const->u[0] * push_reg_count;
>> +   if (offset_const != NULL && vertex_const != NULL &&
>> +   4 * offset_const->u[0] < push_reg_count) {
>
> Hmm, I thought you were planning to write a NIR helper that looked if an
> expression was ( + ) and return both pieces?  As is,
> this will generate different code...I used to use the global offset for
> the constant part, and the per-slot offsets for only the non-constant
> part.  I think that can save some ADD instructions.

I'll be sending out a V2 that puts the base_offset back in but in a
well-defined and sane manner.  Eric and rob both want to keep the
base+offset pair for opaque storage.

>> +  int imm_offset = offset_const->u[0] * 4 +
>> +   vertex_const->u[0] * push_reg_count;
>>/* This input was pushed into registers. */
>>if (is_point_size) {
>>   /* gl_PointSize comes in .w */
>> @@ -1681,21 +1682,21 @@ fs_visitor::emit_gs_input_load(const fs_reg ,
>>}
>>
>>fs_inst *inst;
>> -  if (indirect_offset.file == BAD_FILE) {
>> +  if (offset_const) {
>>   /* Constant indexing - use global offset. */
>>   inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst, icp_handle);
>> - inst->offset = imm_offset;
>> + inst->offset = offset_const->u[0];
>>   inst->base_mrf = -1;
>>   inst->mlen = 1;
>>   inst->regs_written = num_components;
>>} else {
>>   /* Indirect indexing - use per-slot offsets as well. */
>> - const fs_reg srcs[] = { icp_handle, indirect_offset };
>> + const fs_reg srcs[] = { icp_handle, get_nir_src(offset_src) };
>>   fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, 2);
>>   bld.LOAD_PAYLOAD(payload, srcs, ARRAY_SIZE(srcs), 0);
>>
>>   inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dst, 
>> payload);
>> - inst->offset = imm_offset;
>> + inst->offset = 0;
>>   inst->base_mrf = -1;
>>   inst->mlen = 2;
>>   inst->regs_written = num_components;
>> @@ -1761,16 +1762,11 @@ 

[Mesa-dev] [PATCH 2/3] winsys/amdgpu: addrlib - port Checks mip 0 for czDispCompatible

2015-12-04 Thread Marek Olšák
From: Sonny Jiang 

Change-Id: I3a6ae6b76b13aa95103f68ca3a2a8d03d5d0aeb1
Signed-off-by: Sonny Jiang 

Reviewed-by: Alex Deucher 
---
 src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp | 4 +++-
 src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.h   | 3 ++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp 
b/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp
index 110e3d0..088b645 100644
--- a/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp
+++ b/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp
@@ -352,6 +352,7 @@ BOOL_32 EgBasedAddrLib::ComputeSurfaceInfoMicroTiled(
 ComputeSurfaceAlignmentsMicroTiled(expTileMode,
pIn->bpp,
pIn->flags,
+   pIn->mipLevel,
numSamples,
>baseAlign,
>pitchAlign,
@@ -647,6 +648,7 @@ BOOL_32 EgBasedAddrLib::ComputeSurfaceAlignmentsMicroTiled(
 AddrTileModetileMode,  ///< [in] tile mode
 UINT_32 bpp,   ///< [in] bits per pixel
 ADDR_SURFACE_FLAGS  flags, ///< [in] surface flags
+UINT_32 mipLevel,  ///< [in] mip level
 UINT_32 numSamples,///< [in] number of samples
 UINT_32*pBaseAlign,///< [out] base address alignment 
in bytes
 UINT_32*pPitchAlign,   ///< [out] pitch alignment in pixels
@@ -669,7 +671,7 @@ BOOL_32 EgBasedAddrLib::ComputeSurfaceAlignmentsMicroTiled(
 // ECR#393489
 // Workaround 2 for 1D tiling -  There is HW bug for Carrizo
 // where it requires the following alignments for 1D tiling.
-if (flags.czDispCompatible)
+if (flags.czDispCompatible && (mipLevel == 0))
 {
 *pBaseAlign  = PowTwoAlign(*pBaseAlign, 4096); 
//Base address MOD 4096 = 0
 *pPitchAlign = PowTwoAlign(*pPitchAlign, 512 / (BITS_TO_BYTES(bpp))); 
//(8 lines * pitch * bytes per pixel) MOD 4096 = 0
diff --git a/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.h 
b/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.h
index 84adb66..25e3896 100644
--- a/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.h
+++ b/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.h
@@ -315,7 +315,8 @@ private:
 UINT_32* pBaseAlign, UINT_32* pPitchAlign, UINT_32* pHeightAlign) 
const;
 
 BOOL_32 ComputeSurfaceAlignmentsMicroTiled(
-AddrTileMode tileMode, UINT_32 bpp, ADDR_SURFACE_FLAGS flags, UINT_32 
numSamples,
+AddrTileMode tileMode, UINT_32 bpp, ADDR_SURFACE_FLAGS flags,
+UINT_32 mipLevel, UINT_32 numSamples,
 UINT_32* pBaseAlign, UINT_32* pPitchAlign, UINT_32* pHeightAlign) 
const;
 
 BOOL_32 ComputeSurfaceAlignmentsMacroTiled(
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] winsys/amdgpu: addrlib - port fix error for workaround for 1D tiling

2015-12-04 Thread Marek Olšák
From: Sonny Jiang 

Change-Id: I1383d2b2670ed02bc6a6761531ee20f27fd6492f
Signed-off-by: Sonny Jiang 

Reviewed-by: Alex Deucher 
---
 src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp 
b/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp
index b1e008b..110e3d0 100644
--- a/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp
+++ b/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp
@@ -672,7 +672,7 @@ BOOL_32 EgBasedAddrLib::ComputeSurfaceAlignmentsMicroTiled(
 if (flags.czDispCompatible)
 {
 *pBaseAlign  = PowTwoAlign(*pBaseAlign, 4096); 
//Base address MOD 4096 = 0
-*pPitchAlign = PowTwoAlign(*pPitchAlign, 512 >> (BITS_TO_BYTES(bpp))); 
//(8 lines * pitch * bytes per pixel) MOD 4096 = 0
+*pPitchAlign = PowTwoAlign(*pPitchAlign, 512 / (BITS_TO_BYTES(bpp))); 
//(8 lines * pitch * bytes per pixel) MOD 4096 = 0
 }
 // end Carrizo workaround for 1D tilling
 
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] winsys/amdgpu: addrlib - port a Fiji bug fix

2015-12-04 Thread Marek Olšák
From: Sonny Jiang 

Fiji: Fixed tiled resource failures

Change-Id: Idbad4c09ded6dd5cb28d6342f9b04a3345377c82
Signed-off-by: Sonny Jiang 

Reviewed-by: Alex Deucher 
---
 .../winsys/amdgpu/drm/addrlib/r800/ciaddrlib.cpp   | 45 +-
 .../winsys/amdgpu/drm/addrlib/r800/ciaddrlib.h |  2 +
 2 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.cpp 
b/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.cpp
index 7393953..995f276 100644
--- a/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.cpp
+++ b/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.cpp
@@ -896,6 +896,49 @@ BOOL_32 CIAddrLib::HwlOverrideTileMode(
 
 /**
 
***
+*   CiAddrLib::GetPrtSwitchP4Threshold
+*
+*   @brief
+*   Return the threshold of switching to P4_* instead of P16_* for PRT 
resources
+***
+*/
+UINT_32 CiAddrLib::GetPrtSwitchP4Threshold() const
+{
+UINT_32 threshold;
+
+switch (m_pipes)
+{
+case 8:
+threshold = 32;
+break;
+case 16:
+if (m_settings.isFiji)
+{
+threshold = 16;
+}
+else if (m_settings.isHawaii)
+{
+threshold = 8;
+}
+else
+{
+///@todo add for possible new ASICs.
+ADDR_ASSERT_ALWAYS();
+threshold = 16;
+}
+break;
+default:
+///@todo add for possible new ASICs.
+ADDR_ASSERT_ALWAYS();
+threshold = 32;
+break;
+}
+
+return threshold;
+}
+
+/**
+***
 *   CIAddrLib::HwlSetupTileInfo
 *
 *   @brief
@@ -1123,7 +1166,7 @@ VOID CIAddrLib::HwlSetupTileInfo(
 {
 UINT_32 bytesXSamples = bpp * numSamples / 8;
 UINT_32 bytesXThickness = bpp * thickness / 8;
-UINT_32 switchP4Threshold = (m_pipes == 16) ? 8 : 32;
+UINT_32 switchP4Threshold = GetPrtSwitchP4Threshold();
 
 if ((bytesXSamples > switchP4Threshold) || (bytesXThickness > 
switchP4Threshold))
 {
diff --git a/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.h 
b/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.h
index 4515086..4cbe970 100644
--- a/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.h
+++ b/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.h
@@ -167,6 +167,8 @@ private:
 VOID ReadGbMacroTileCfg(
 UINT_32 regValue, ADDR_TILEINFO* pCfg) const;
 
+UINT_32 GetPrtSwitchP4Threshold() const;
+
 BOOL_32 InitTileSettingTable(
 const UINT_32 *pSetting, UINT_32 noOfEntries);
 
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79688

--- Comment #22 from poma  ---
(In reply to Tobias Klausmann from comment #21)
> (In reply to Axel Davy from comment #20)
> > Should this bug report be closed ?
> > DRI3 has complete DRI_PRIME support now.
> > 
> 
> As far as it concerns me, DRI3 Support is fine and this bug can be marked as
> "resolved fixed". 
> 

Do you refer to DRI3 in general, or is it DRI_PRIME?

Tobias, while DRI3 enabled, can you run:
$ ssh -Y [user@]hostname firefox

What is the result?

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU

2015-12-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79688

--- Comment #23 from poma  ---
(In reply to Axel Davy from comment #20)
> Should this bug report be closed ?
> DRI3 has complete DRI_PRIME support now.
> 

Actually, I have some questions, but if you closing, bye bye.

> Also I'm not sure why you do post these benchmarks here.
> Is there some hidden message about something not working right ?

For me these are ordinary tests, no benchmarks, to see is it working, working
properly, or not.

phoronix.com has some benchmarks.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radeonsi: disable DCC on Stoney

2015-12-04 Thread Marek Olšák
From: Marek Olšák 

Cc: 11.0 11.1 
---
 src/gallium/drivers/radeon/r600_texture.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index e2947d9..e7b2aea 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -497,6 +497,10 @@ static void vi_texture_alloc_dcc_separate(struct 
r600_common_screen *rscreen,
if (rscreen->debug_flags & DBG_NO_DCC)
return;
 
+   /* TODO: DCC is broken on Stoney */
+   if (rscreen->family == CHIP_STONEY)
+   return;
+
rtex->dcc_buffer = (struct r600_resource *)
r600_aligned_buffer_create(>b, PIPE_BIND_CUSTOM,
   PIPE_USAGE_DEFAULT, rtex->surface.dcc_size, 
rtex->surface.dcc_alignment);
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev