Re: [Mesa-dev] [PATCH v4 0/6] nouveau: add support for vaapi
Hi Ilia, Please give another try with these patches (i.e. 1, 2, 3 and 5 from this patch series. 4 and 6 are already merged since I have resubmit them afterward and split in several series). I rebased these four patches on top of latest master here: https://github.com/CapOM/mesa/commits/st_va_nouveau_chunk_decoding (there was no conflict during rebase) h264 is now working properly :), problem was not from these patches but from st/va. Fixes been merged recently (see the 2 latest patches in st/va) Let me know if you see anything wrong when playing video compared to vdpau. Cheers Julien On 10 November 2015 at 09:28, Ilia Mirkinwrote: > On Tue, Nov 10, 2015 at 4:09 AM, Julien Isorce > wrote: > > Hi, > > > > I found some difference in the content of dec->bsp_bo[i], for h264 when > > using st/vdpau (ok) and st/vaapi (ko). > > > > In src/gallium/state_trackers/va/picture.c, at least the reference frames > > are not set. At minimum it is missing something like the following though > > still not enough: > > > > @@ -193,7 +194,23 @@ handlePictureParameterBuffer(vlVaDriver *drv, > > vlVaContext *context, vlVaBuffer * > >context->desc.h264.pps->redundant_pic_cnt_present_flag = > > h264->pic_fields.bits.redundant_pic_cnt_present_flag; > >/*reference_pic_flag*/ > > + context->desc.h264.is_reference = > > h264->pic_fields.bits.reference_pic_flag; > >context->desc.h264.frame_num = h264->frame_num; > > + > > + for (i = 0; i < 16; ++i) { > > + if ((h264->ReferenceFrames[i].flags & VA_PICTURE_H264_INVALID) > || > > + (h264->ReferenceFrames[i].picture_id == > VA_INVALID_SURFACE)) > > +break; > > + > > + getReferenceFrame(drv, h264->ReferenceFrames[i].picture_id, > > >desc.h264.ref[i]); > > + > > + context->desc.h264.field_order_cnt_list[i][0] = > > h264->ReferenceFrames[i].TopFieldOrderCnt; > > + context->desc.h264.field_order_cnt_list[i][1] = > > h264->ReferenceFrames[i].BottomFieldOrderCnt; > > + context->desc.h264.frame_num_list[i] = > > h264->ReferenceFrames[i].frame_idx; > > + } > > > > I am surprised that "getReferenceFrame" is not called at all in the h264 > > case. > > Reference frames are a huge part of h264 (all the MPEG codecs in > fact), so I'm guessing there's some different way we're supposed to be > retrieving the information. Perhaps look at where uvd gets it from? > > -ilia > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 0/1] Do not require all components to apply opt_vector_float()
On Thu, 2015-12-03 at 11:04 -0800, Matt Turner wrote: > The GAINED: 3 is almost certainly because of a sporadic failure in > shader-db (or Mesa...?) during the baseline shader-db run. When you > ran shader-db with your patch applied, the same failure didn't occur, > so report.py thinks this means 3 programs were gained. When I see > things like that, I manually rerun that specific shader and append > its > results to the appropriate file and rerun report.py. > Thanks for the info. I'll take in account it next time. > > I sent it ("[PATCH] i965: Don't mark dead instructions' sources > live.") Nov 25 (and Cc'd you). Ken reviewed it, and I pushed it two > days ago as commit 48b4e88. Oh, right. I saw it now. Thanks! J.A. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 92850] Segfault loading War Thunder
https://bugs.freedesktop.org/show_bug.cgi?id=92850 --- Comment #45 from har...@gmx.de --- Created attachment 120336 --> https://bugs.freedesktop.org/attachment.cgi?id=120336=edit backtrace with activated -finline-small-functions -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 92850] Segfault loading War Thunder
https://bugs.freedesktop.org/show_bug.cgi?id=92850 --- Comment #46 from har...@gmx.de --- here is the code from ir.h where it fails: /** * Determine the number of operands used by an expression */ static unsigned int get_num_operands(ir_expression_operation); /** * Determine the number of operands used by an expression */ unsigned int get_num_operands() const { return (this->operation == ir_quadop_vector) ? this->type->vector_elements : get_num_operands(operation); } Is it strange enough to fail if compiled inline in this code below? for (operand = 0; operand < ir->get_num_operands(); operand++) { this->result.file = PROGRAM_UNDEFINED; ir->operands[operand]->accept(this); if (this->result.file == PROGRAM_UNDEFINED) { printf("Failed to get tree for expression operand:\n"); ir->operands[operand]->print(); printf("\n"); exit(1); } op[operand] = this->result; /* Matrix expression operands should have been broken down to vector * operations already. */ assert(!ir->operands[operand]->type->is_matrix()); } -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 92850] Segfault loading War Thunder
https://bugs.freedesktop.org/show_bug.cgi?id=92850 --- Comment #44 from har...@gmx.de --- (In reply to bellamorte42 from comment #40) > compiling with -O3 -fno-inline-small-functions works. yep, i can confirm that '-finline-small-functions' is the black sheep. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC 2/3] gallium: Move nv50 clear_texture impl down to util_surface
On Fri, Dec 4, 2015 at 1:37 AM, Ilia Mirkinwrote: > On Thu, Dec 3, 2015 at 7:25 PM, Roland Scheidegger wrote: >> Am 03.12.2015 um 23:48 schrieb Ilia Mirkin: >>> On Thu, Dec 3, 2015 at 4:44 AM, Edward O'Callaghan >>> wrote: ARB_clear_texture is reasonably generic enough that it should be moved down to become part of the fallback mechanism of pipe->clear_texture. Signed-off-by: Edward O'Callaghan --- src/gallium/auxiliary/util/u_surface.c | 83 + src/gallium/auxiliary/util/u_surface.h | 6 ++ src/gallium/drivers/nouveau/nv50/nv50_surface.c | 67 +--- 3 files changed, 90 insertions(+), 66 deletions(-) diff --git a/src/gallium/auxiliary/util/u_surface.c b/src/gallium/auxiliary/util/u_surface.c index 6aa44f9..e7ab175 100644 --- a/src/gallium/auxiliary/util/u_surface.c +++ b/src/gallium/auxiliary/util/u_surface.c @@ -36,6 +36,7 @@ #include "pipe/p_screen.h" #include "pipe/p_state.h" +#include "util/u_math.h" #include "util/u_format.h" #include "util/u_inlines.h" #include "util/u_rect.h" @@ -547,6 +548,88 @@ util_clear_depth_stencil(struct pipe_context *pipe, } } +/** + * Fallback for pipe->clear_texture() function. + * clears a non-PIPE_BUFFER resource's specified level + * and bounding box with a clear value provided in that + * resource's native format. + * + * XXX sf->format = .. is problematic as hw need + * not nessarily support the format. + */ +void +util_surface_clear_texture(struct pipe_context *pipe, + struct pipe_resource *res, + unsigned level, + const struct pipe_box *box, + const void *data) +{ + struct pipe_surface tmpl = {{0}}, *sf; + + tmpl.format = res->format; + tmpl.u.tex.first_layer = box->z; + tmpl.u.tex.last_layer = box->z + box->depth - 1; + tmpl.u.tex.level = level; + sf = pipe->create_surface(pipe, res, ); + if (!sf) + return; + + if (util_format_is_depth_or_stencil(res->format)) { + float depth = 0; + uint8_t stencil = 0; + unsigned clear = 0; + const struct util_format_description *desc = + util_format_description(res->format); + + if (util_format_has_depth(desc)) { + clear |= PIPE_CLEAR_DEPTH; + desc->unpack_z_float(, 0, data, 0, 1, 1); + } + if (util_format_has_stencil(desc)) { + clear |= PIPE_CLEAR_STENCIL; + desc->unpack_s_8uint(, 0, data, 0, 1, 1); + } + pipe->clear_depth_stencil(pipe, sf, clear, depth, stencil, +box->x, box->y, box->width, box->height); + } else { + union pipe_color_union color; + + switch (util_format_get_blocksizebits(res->format)) { + case 128: + sf->format = PIPE_FORMAT_R32G32B32A32_UINT; + memcpy(, data, 128 / 8); + break; + case 64: + sf->format = PIPE_FORMAT_R32G32_UINT; + memcpy(, data, 64 / 8); + memset([2], 0, 64 / 8); + break; + case 32: + sf->format = PIPE_FORMAT_R32_UINT; + memcpy(, data, 32 / 8); + memset([1], 0, 96 / 8); + break; + case 16: + sf->format = PIPE_FORMAT_R16_UINT; + color.ui[0] = util_cpu_to_le32( +util_le16_to_cpu(*(unsigned short *)data)); + memset([1], 0, 96 / 8); + break; + case 8: + sf->format = PIPE_FORMAT_R8_UINT; + color.ui[0] = util_cpu_to_le32(*(unsigned char *)data); + memset([1], 0, 96 / 8); + break; + default: + assert(!"Unknown texel element size"); + return; + } + + pipe->clear_render_target(pipe, sf, , +box->x, box->y, box->width, box->height); >>> >>> I only recently realized this, but I'm fairly sure this is wrong -- >>> needs to be divided by util_format_blockwidth/height, otherwise we'll >>> go way out of bounds for compressed formats. [And yes, >>> nv50_clear_texture has the same problem.] >> You can't clear compressed textures with ARB_clear_texture, so I guess >> this shouldn't be a problem. > > Ah yes, indeed. That's why I didn't get any weird failurs :) > >> Even if you think it should work in gallium, with the format illegally >> being changed after-the-fact (after creating the surface) it's not even >>
[Mesa-dev] [PATCH] st/va: WIP remove nonesense HEVC picture id handling
From: Christian KönigThe picture id in this case is a VA-API surface handle, checking for a certain value is can't be correct. Does anybody have any idea what the original intention was and how to fix it correctly? Signed-off-by: Christian König --- src/gallium/state_trackers/va/picture_hevc.c | 5 - 1 file changed, 5 deletions(-) diff --git a/src/gallium/state_trackers/va/picture_hevc.c b/src/gallium/state_trackers/va/picture_hevc.c index dc66b0f..28743ee 100644 --- a/src/gallium/state_trackers/va/picture_hevc.c +++ b/src/gallium/state_trackers/va/picture_hevc.c @@ -159,11 +159,6 @@ void vlVaHandlePictureParameterBufferHEVC(vlVaDriver *drv, vlVaContext *context, for (i = 0 ; i < 15 ; i++) { context->desc.h265.PicOrderCntVal[i] = hevc->ReferenceFrames[i].pic_order_cnt; - unsigned int index = hevc->ReferenceFrames[i].picture_id & 0x7F; - - if (index == 0x7F) - continue; - vlVaGetReferenceFrame(drv, hevc->ReferenceFrames[i].picture_id, >desc.h265.ref[i]); if ((hevc->ReferenceFrames[i].flags & VA_PICTURE_HEVC_RPS_ST_CURR_BEFORE) && (iBefore < 8)) { -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/7] gallium/winsys: Make use of ARRAY_SIZE macro
Reviewed-by: Marek OlšákMarek On Fri, Dec 4, 2015 at 7:50 AM, Edward O'Callaghan wrote: > Signed-off-by: Edward O'Callaghan > --- > src/gallium/winsys/amdgpu/drm/amdgpu_surface.c | 6 ++ > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c > b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c > index 3006bd1..4c837a8 100644 > --- a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c > +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c > @@ -145,11 +145,9 @@ ADDR_HANDLE amdgpu_addr_create(struct amdgpu_winsys *ws) > > regValue.backendDisables = ws->amdinfo.backend_disable[0]; > regValue.pTileConfig = ws->amdinfo.gb_tile_mode; > - regValue.noOfEntries = sizeof(ws->amdinfo.gb_tile_mode) / > - sizeof(ws->amdinfo.gb_tile_mode[0]); > + regValue.noOfEntries = ARRAY_SIZE(ws->amdinfo.gb_tile_mode); > regValue.pMacroTileConfig = ws->amdinfo.gb_macro_tile_mode; > - regValue.noOfMacroEntries = sizeof(ws->amdinfo.gb_macro_tile_mode) / > - sizeof(ws->amdinfo.gb_macro_tile_mode[0]); > + regValue.noOfMacroEntries = ARRAY_SIZE(ws->amdinfo.gb_macro_tile_mode); > > createFlags.value = 0; > createFlags.useTileIndex = 1; > -- > 2.5.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] st/va: move HEVC functions into separate file v2
From: Christian Königv2: actually copy all of it Signed-off-by: Christian König --- src/gallium/state_trackers/va/Makefile.sources | 1 + src/gallium/state_trackers/va/picture.c| 172 +--- src/gallium/state_trackers/va/picture_hevc.c | 207 + src/gallium/state_trackers/va/va_private.h | 3 + 4 files changed, 215 insertions(+), 168 deletions(-) create mode 100644 src/gallium/state_trackers/va/picture_hevc.c diff --git a/src/gallium/state_trackers/va/Makefile.sources b/src/gallium/state_trackers/va/Makefile.sources index 74cc34e..daebf01 100644 --- a/src/gallium/state_trackers/va/Makefile.sources +++ b/src/gallium/state_trackers/va/Makefile.sources @@ -8,6 +8,7 @@ C_SOURCES := \ picture_mpeg12.c \ picture_mpeg4.c \ picture_h264.c \ + picture_hevc.c \ picture_vc1.c \ postproc.c \ subpicture.c \ diff --git a/src/gallium/state_trackers/va/picture.c b/src/gallium/state_trackers/va/picture.c index c7c377a..8623139 100644 --- a/src/gallium/state_trackers/va/picture.c +++ b/src/gallium/state_trackers/va/picture.c @@ -92,7 +92,6 @@ vlVaGetReferenceFrame(vlVaDriver *drv, VASurfaceID surface_id, static VAStatus handlePictureParameterBuffer(vlVaDriver *drv, vlVaContext *context, vlVaBuffer *buf) { - VAPictureParameterBufferHEVC *hevc; unsigned int i; VAStatus vaStatus = VA_STATUS_SUCCESS; @@ -114,154 +113,7 @@ handlePictureParameterBuffer(vlVaDriver *drv, vlVaContext *context, vlVaBuffer * break; case PIPE_VIDEO_FORMAT_HEVC: - assert(buf->size >= sizeof(VAPictureParameterBufferHEVC) && buf->num_elements == 1); - hevc = buf->data; - context->desc.h265.pps->sps->chroma_format_idc = hevc->pic_fields.bits.chroma_format_idc; - context->desc.h265.pps->sps->separate_colour_plane_flag = - hevc->pic_fields.bits.separate_colour_plane_flag; - context->desc.h265.pps->sps->pic_width_in_luma_samples = hevc->pic_width_in_luma_samples; - context->desc.h265.pps->sps->pic_height_in_luma_samples = hevc->pic_height_in_luma_samples; - context->desc.h265.pps->sps->bit_depth_luma_minus8 = hevc->bit_depth_luma_minus8; - context->desc.h265.pps->sps->bit_depth_chroma_minus8 = hevc->bit_depth_chroma_minus8; - context->desc.h265.pps->sps->log2_max_pic_order_cnt_lsb_minus4 = - hevc->log2_max_pic_order_cnt_lsb_minus4; - context->desc.h265.pps->sps->sps_max_dec_pic_buffering_minus1 = - hevc->sps_max_dec_pic_buffering_minus1; - context->desc.h265.pps->sps->log2_min_luma_coding_block_size_minus3 = - hevc->log2_min_luma_coding_block_size_minus3; - context->desc.h265.pps->sps->log2_diff_max_min_luma_coding_block_size = - hevc->log2_diff_max_min_luma_coding_block_size; - context->desc.h265.pps->sps->log2_min_transform_block_size_minus2 = - hevc->log2_min_transform_block_size_minus2; - context->desc.h265.pps->sps->log2_diff_max_min_transform_block_size = - hevc->log2_diff_max_min_transform_block_size; - context->desc.h265.pps->sps->max_transform_hierarchy_depth_inter = - hevc->max_transform_hierarchy_depth_inter; - context->desc.h265.pps->sps->max_transform_hierarchy_depth_intra = - hevc->max_transform_hierarchy_depth_intra; - context->desc.h265.pps->sps->scaling_list_enabled_flag = - hevc->pic_fields.bits.scaling_list_enabled_flag; - context->desc.h265.pps->sps->amp_enabled_flag = hevc->pic_fields.bits.amp_enabled_flag; - context->desc.h265.pps->sps->sample_adaptive_offset_enabled_flag = - hevc->slice_parsing_fields.bits.sample_adaptive_offset_enabled_flag; - context->desc.h265.pps->sps->pcm_enabled_flag = hevc->pic_fields.bits.pcm_enabled_flag; - if (hevc->pic_fields.bits.pcm_enabled_flag == 1) { - context->desc.h265.pps->sps->pcm_sample_bit_depth_luma_minus1 = -hevc->pcm_sample_bit_depth_luma_minus1; - context->desc.h265.pps->sps->pcm_sample_bit_depth_chroma_minus1 = -hevc->pcm_sample_bit_depth_chroma_minus1; - context->desc.h265.pps->sps->log2_min_pcm_luma_coding_block_size_minus3 = -hevc->log2_min_pcm_luma_coding_block_size_minus3; - context->desc.h265.pps->sps->log2_diff_max_min_pcm_luma_coding_block_size = -hevc->log2_diff_max_min_pcm_luma_coding_block_size; - context->desc.h265.pps->sps->pcm_loop_filter_disabled_flag = -hevc->pic_fields.bits.pcm_loop_filter_disabled_flag; - } - context->desc.h265.pps->sps->num_short_term_ref_pic_sets = hevc->num_short_term_ref_pic_sets; - context->desc.h265.pps->sps->long_term_ref_pics_present_flag = - hevc->slice_parsing_fields.bits.long_term_ref_pics_present_flag; - context->desc.h265.pps->sps->num_long_term_ref_pics_sps = hevc->num_long_term_ref_pic_sps; -
[Mesa-dev] [PATCH 2/2] st/va: disable MPEG4 by default v2
From: Christian KönigThe workarounds are too hacky to enable them by default and otherwise MPEG4 doesn't work reliably. v2: add docs/envvars.html, CC stable and fix typos Signed-off-by: Christian König Reviewed-by: Emil Velikov (v1) Reviewed-by: Ilia Mirkin (v1) Cc: "11.1.0" --- docs/envvars.html | 6 ++ src/gallium/state_trackers/va/config.c | 10 +- 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/docs/envvars.html b/docs/envvars.html index 1b2c03e..5bb7b1e 100644 --- a/docs/envvars.html +++ b/docs/envvars.html @@ -238,6 +238,12 @@ for details. +VA-API state tracker environment variables + +VAAPI_MPEG4_ENABLED - enable MPEG4 for VA-API, disabled by default. + + + Other Gallium drivers have their own environment variables. These may change frequently so the source code should be consulted for details. diff --git a/src/gallium/state_trackers/va/config.c b/src/gallium/state_trackers/va/config.c index a545a18..9ca0aa8 100644 --- a/src/gallium/state_trackers/va/config.c +++ b/src/gallium/state_trackers/va/config.c @@ -28,10 +28,14 @@ #include "pipe/p_screen.h" +#include "util/u_video.h" + #include "vl/vl_winsys.h" #include "va_private.h" +DEBUG_GET_ONCE_BOOL_OPTION(mpeg4, "VAAPI_MPEG4_ENABLED", false) + VAStatus vlVaQueryConfigProfiles(VADriverContextP ctx, VAProfile *profile_list, int *num_profiles) { @@ -45,12 +49,16 @@ vlVaQueryConfigProfiles(VADriverContextP ctx, VAProfile *profile_list, int *num_ *num_profiles = 0; pscreen = VL_VA_PSCREEN(ctx); - for (p = PIPE_VIDEO_PROFILE_MPEG2_SIMPLE; p <= PIPE_VIDEO_PROFILE_HEVC_MAIN_444; ++p) + for (p = PIPE_VIDEO_PROFILE_MPEG2_SIMPLE; p <= PIPE_VIDEO_PROFILE_HEVC_MAIN_444; ++p) { + if (u_reduce_video_profile(p) == PIPE_VIDEO_FORMAT_MPEG4 && !debug_get_option_mpeg4()) + continue; + if (pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_BITSTREAM, PIPE_VIDEO_CAP_SUPPORTED)) { vap = PipeToProfile(p); if (vap != VAProfileNone) profile_list[(*num_profiles)++] = vap; } + } /* Support postprocessing through vl_compositor */ profile_list[(*num_profiles)++] = VAProfileNone; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa/tests: add KHR_debug GLES glGetPointervKHR entry points
On 04.12.2015 10:06, Emil Velikov wrote: > On 4 December 2015 at 00:45, Matt Turnerwrote: >> On Thu, Dec 3, 2015 at 4:37 PM, Emil Velikov >> wrote: FWIW, make check still fails for me even with this patch. >>> Do you have a log that I can take a look ? I've `make clean'ed and >>> rebuild a couple of times just in case and things seems to pass here. >> >> Attached is the output of ./main-test without (p.txt) and with (q.txt) >> your patch. > Interesting... upon closer look the build produces two tests - > lt-main-test and main-test. The former is executed on my system > (either via the top level make check or the one in > src/mesa/main/tests) and passes, while the former produces the same > issues as you've spotted. I was getting the same failures as Matt with your v2 patch (using lt-main-test), make clean seems to have resolved it. Tested-by: Michel Dänzer -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v4 1/2] intel: 48b ppgtt support (EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag)
On 11/18/2015 10:53 PM, Kristian Høgsberg wrote: On Wed, Oct 14, 2015 at 5:11 AM, Michel Thierrywrote: On 10/14/2015 8:19 AM, Daniel Vetter wrote: On Tue, Oct 13, 2015 at 02:51:36PM -0700, Kristian Høgsberg wrote: On Tue, Oct 13, 2015 at 7:55 AM, Michel Thierry wrote: On 10/13/2015 3:13 PM, Emil Velikov wrote: On 13 October 2015 at 13:16, Michel Thierry wrote: On 10/6/2015 2:12 PM, Michel Thierry wrote: On 10/5/2015 7:06 PM, Kristian Høgsberg wrote: On Mon, Oct 5, 2015 at 7:03 AM, Michel Thierry wrote: On 9/14/2015 2:54 PM, Michał Winiarski wrote: On Thu, Sep 03, 2015 at 03:23:58PM +0100, Michel Thierry wrote: Gen8+ supports 48-bit virtual addresses, but some objects must always be allocated inside the 32-bit address range. In specific, any resource used with flat/heapless (0x-0xf000) General State Heap (GSH) or Instruction State Heap (ISH) must be in a 32-bit range, because the General State Offset and Instruction State Offset are limited to 32-bits. The i915 driver has been modified to provide a flag to set when the 4GB limit is not necessary in a given bo (EXEC_OBJECT_SUPPORTS_48B_ADDRESS). 48-bit range will only be used when explicitly requested. Callers to the existing drm_intel_bo_emit_reloc function should set the use_48b_address_range flag beforehand, in order to use full ppgtt range. v2: Make set/clear functions nops on pre-gen8 platforms, and use them internally in emit_reloc functions (Ben) s/48BADDRESS/48B_ADDRESS/ (Dave) v3: Keep set/clear functions internal, no-one needs to use them directly. v4: Don't set 48bit-support flag in emit reloc, check for ppgtt type before enabling set/clear function, print full offsets in debug statements, using port of lower_32_bits and upper_32_bits from linux kernel (Michał) References: http://lists.freedesktop.org/archives/intel-gfx/2015-July/072612.html Cc: Ben Widawsky Cc: Michał Winiarski +Kristian LGTM. Acked-by: Michał Winiarski Signed-off-by: Michel Thierry Hi Kristian, More comments on this? I've resent the mesa patch with the last changes (http://lists.freedesktop.org/archives/dri-devel/2015-October/091752.html). Michał has agreed with the interface and will use it for his work. If mesa doesn't plan to use this for now, it's ok. The kernel changes have been done to prevent any regressions when the support 48-bit flag is not set. I didn't get any replies to my last comments on this: http://lists.freedesktop.org/archives/mesa-dev/2015-August/091398.html Basically, I tried explicitly placing buffers above 8G and didn't see the HW problem described (ie it all worked fine). I still think there's some confusion as to what the W/A is about. Here's my take: the W/A is a SW-only workaround to handle the cases where a driver uses heapless and 48-bit ppgtt. The problem is that the heaps are limited to 4G but with 48bit ppgtt a buffer can be placed anywhere it the 48 bit address space. If that happens it's consideredd out-of-bounds for the heap and access fails. To prevent this we need to make sure the bos we address in a heapless fashion are placed below 4g. For mesa, we only configure general state base as heap-less, which affects scratch bos. What this boils down to is that we need the 4G restriction only for the scratch bos set up on 3DSTATE_VS, 3DSTATE_GS and 3DSTATE_PS (and tesselation stage eventually). Look for the OUT_RELOC64 for stage->scratch_bo in gen8_vs_state.c, gen8_gs_state.c and gen8_ps_state.c I think it also affects _3DSTATE_VIEWPORT_STATE_POINTERS_CC, maybe it isn't exclusive to the scratch bos (the error state points to that instruction). Hi, Anymore inputs about this patch? AFAIK, if changes are required based on further comments from Kristian, these will be in the mesa patch[1], not libdrm. Also, Michał will use this flag in another project. The comment seems quite brief and I'm not sure it fully addresses Kristian's concern. I'd suspect that providing reference to the HW documentation (confirming your assumption) might be beneficial. Sure, attached is the hang I get if I don't set the restriction in gen8_misc_state.c and try to use the full 48-bit range (i915_error_state_nowa.txt). This is just running gfxbench t-rex. I see the same hang signature when it is only applied to the scratch bos (in gen8_vs_state.c, gen8_gs_state.c and gen8_ps_state.c - i915_error_state_scratchbo.txt). 3DSTATE_VIEWPORT_STATE_POINTERS_CC (0x7823) is defined here: https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-bdw-vol02a-commandreference-instructions_0.pdf (page 256) I applied your mesa and libdrm patches and then backed out the 4G restriction in the STATE_BASE_ADDRESS
Re: [Mesa-dev] [PATCH 00/11] do not wrap header inclusion in extern "C"
On 24 November 2015 at 16:29, Emil Velikovwrote: > Hi all, > > As pointed out by Jose a while back adding extern C guard around header > inclusion is a bad idea. > > As I noticed a few in the glsl/nir code I went on and grepped through > mesa. > > I have build tested the series on Linux only, yet patch 10/11 contains > some windows/haiku specific code. Regardless, the lot is quite trivial. > > The series is in branch no-extern-c at https://github.com/evelikov/Mesa > Humble ping... patch 7 was dropped while 10 and 11 are the only ones reviewed. I'm inclined to just push the lot, although I'd give it another week for people to take comment/review. Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 10/10] vc4/nir: Use the new unified io intrinsics
Hi Eric, On 4 December 2015 at 02:03, Eric Anholtwrote: > Jason Ekstrand writes: > >> On Dec 3, 2015 3:47 PM, "Eric Anholt" wrote: >>> >>> Jason Ekstrand writes: >>> >>> > Cc: Eric Anholt >>> >>> OK, I've pushed a branch of partial fixes for this series to nir-loads >>> of my fdo tree, but it's still super broken. I've spent too much of >>> today on it, and this series was not ready for review. There are still >>> const_index[0] references all over that are obviously from intrinsics >>> you've removed the const_index on. >> >> I'm going to start looking at this again tomorrow. I've had other things >> on my plate this week. Rob already pointed me at some crashing shaders for >> ir3 and his little standalone compiler. I'll start there. Is there >> something similar I can use to debug vc4? > > Right now from configure.ac I've got either targeting-hardware mode, or > targeting-simulator mode. What I need to add is something in between, > where we use the ioctls from simulator mode so that people can run the > driver on i965 systems, and then just don't make the simulator calls at > the very end of the drawing pipeline since you don't have the simulator. > Then people could even run shader-db without having the actual hardware, > and that would get a lot of coverage of compiler changes. > > You could knock this together with some #ifdefs around all 7 mentions of > "simpenrose", but I'll try to get a clean patch sent out. I'm just > trying to decide whether this should be a --enable argument or "oh, hey, > you're not on arm, you must want either simulator or > everything-but-simulator mode" > > I'd also like something a bit nicer than my symlink trick for loading > the driver (right now I link the built vc4_dri.so to i965_dri.so in a > subdirectory, and LIBGL_DRIVERS_PATH it). Maybe if I getenv() a name > From loader.c for computing driver name, I could also skip a bunch of > the i965 symbols I have to add. > What would it take to have things a bit like virgl -> ie. single vc4 gallium/driver but two winsys (vc4/drm and vc4/sim) ? Afaict one can (should be able to) get things doing with this approach and toggle the simulator "mode" via an env variable. I might have some time to add LIBGL_DRIVER_NAME next week, although I'd rather avoid having such a "hack". On the standalone compiler thing if you shuffle things ala winsys style as suggested you should be able to get a nice clean standalone compiler like nouveau and freedreno. I'll be glad to help, mostly on the ugly pipe-loader and autohell side of things. Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 1/1] i965: add opportunistic behaviour to opt_vector_float()
On Thu, 2015-12-03 at 11:04 -0800, Matt Turner wrote: > > The improvement obtained regarding current upstream (56aff6bb4eaf) > is: > > > > total instructions in shared programs: 6819484 -> 6811698 (-0.11%) > > instructions in affected programs: 387245 -> 379459 (-2.01%) > > total loops in shared programs: 1971 -> 1971 (0.00%) > > helped: 3980 > > HURT: 0 > > GAINED: 3 > > Run the shader-db on just the program that had the problem here, add > it to your before-results, and rerun report.py. I've rebased against new master, applied the changes you suggested, and run again the tests. That has been enough to get rid of the GAINED shaders. New results are: total instructions in shared programs: 6819828 -> 6812042 (-0.11%) instructions in affected programs: 387245 -> 379459 (-2.01%) total loops in shared programs:1971 -> 1971 (0.00%) helped:3980 HURT: 0 GAINED:0 LOST: 0 I'll send a second version of the patch. J.A. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 1/1] i965: add opportunistic behaviour to opt_vector_float()
opt_vector_float() transforms several scalar MOV operations to a single vectorial MOV. This is done when those MOV covers all the components of the destination register. So something like: mov vgrf3.0.xy:D, 0D mov vgrf3.0.w:D, 1065353216D mov vgrf3.0.z:D, 0D is transformed in: mov vgrf3.0:F, [0F, 0F, 0F, 1F] But there are cases where not all the components are written. For example, in: mov vgrf2.0.x:D, 1073741824D mov vgrf3.0.xy:D, 0D mov vgrf3.0.w:D, 1065353216D mov vgrf4.0.xy:D, 1065353216D mov vgrf4.0.w:D, 0D mov vgrf6.0:UD, u4.xyzw:UD Nor vgrf3 nor vgrf4 .z components are written, so the optimization is not applied. But it could be applied anyway with the components covered, using a writemask to select the ones written. So we could transform it in: mov vgrf2.0.x:D, 1073741824D mov vgrf3.0.xyw:F, [0F, 0F, 0F, 1F] mov vgrf4.0.xyw:F, [1F, 1F, 0F, 0F] mov vgrf6.0:UD, u4.xyzw:UD This commit does precisely that: opportunistically apply opt_vector_float() when possible. The improvement obtained regarding current upstream (11.1-branchpoint-310-gd566382) is: total instructions in shared programs: 6819828 -> 6812042 (-0.11%) instructions in affected programs: 387245 -> 379459 (-2.01%) total loops in shared programs:1971 -> 1971 (0.00%) helped:3980 HURT: 0 GAINED:0 LOST: 0 v2: change vectorize_mov() signature (Matt). Signed-off-by: Juan A. Suarez Romero--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 61 ++ src/mesa/drivers/dri/i965/brw_vec4.h | 4 +++ 2 files changed, 43 insertions(+), 22 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index a697bdf..104df0f 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -309,6 +309,29 @@ src_reg::equals(const src_reg ) const } bool +vec4_visitor::vectorize_mov(bblock_t *block, vec4_instruction *inst, uint8_t imm[4], +vec4_instruction *imm_inst[4], int inst_count, +unsigned writemask) +{ + if (inst_count < 2) { + return false; + } + + unsigned vf; + memcpy(, imm, sizeof(vf)); + vec4_instruction *mov = MOV(imm_inst[0]->dst, brw_imm_vf(vf)); + mov->dst.type = BRW_REGISTER_TYPE_F; + mov->dst.writemask = writemask; + inst->insert_before(block, mov); + + for (int i = 0; i < inst_count; i++) { + imm_inst[i]->remove(block); + } + + return true;} + + +bool vec4_visitor::opt_vector_float() { bool progress = false; @@ -316,27 +339,36 @@ vec4_visitor::opt_vector_float() int last_reg = -1, last_reg_offset = -1; enum brw_reg_file last_reg_file = BAD_FILE; - int remaining_channels = 0; - uint8_t imm[4]; + uint8_t imm[4] = { 0 }; int inst_count = 0; vec4_instruction *imm_inst[4]; + unsigned writemask = 0; foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) { if (last_reg != inst->dst.nr || last_reg_offset != inst->dst.reg_offset || last_reg_file != inst->dst.file) { + + progress |= vectorize_mov(block, inst, imm, imm_inst, inst_count, writemask); + + inst_count = 0; + writemask = 0; last_reg = inst->dst.nr; last_reg_offset = inst->dst.reg_offset; last_reg_file = inst->dst.file; - remaining_channels = WRITEMASK_XYZW; - - inst_count = 0; + for (int i = 0; i < 4; i++) { +imm[i] = 0; + } } if (inst->opcode != BRW_OPCODE_MOV || inst->dst.writemask == WRITEMASK_XYZW || - inst->src[0].file != IMM) + inst->src[0].file != IMM) { + progress |= vectorize_mov(block, inst, imm, imm_inst, inst_count, writemask); + inst_count = 0; + last_reg = -1; continue; + } int vf = brw_float_to_vf(inst->src[0].f); if (vf == -1) @@ -351,23 +383,8 @@ vec4_visitor::opt_vector_float() if ((inst->dst.writemask & WRITEMASK_W) != 0) imm[3] = vf; + writemask |= inst->dst.writemask; imm_inst[inst_count++] = inst; - - remaining_channels &= ~inst->dst.writemask; - if (remaining_channels == 0) { - unsigned vf; - memcpy(, imm, sizeof(vf)); - vec4_instruction *mov = MOV(inst->dst, brw_imm_vf(vf)); - mov->dst.type = BRW_REGISTER_TYPE_F; - mov->dst.writemask = WRITEMASK_XYZW; - inst->insert_after(block, mov); - last_reg = -1; - - for (int i = 0; i < inst_count; i++) { -imm_inst[i]->remove(block); - } - progress = true; - } } if (progress) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index ae5bf69..6476d3b 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++
Re: [Mesa-dev] [PATCH] gallium/drivers/nouveau: Clean up clear_texture()
This will break depth/stencil clears. On Fri, Dec 4, 2015 at 9:18 AM, Edward O'Callaghanwrote: > We should translate and set the format before dispatching > the call to pipe->create_surface(), instead of creating > a surface with the given format and changing it after. > > Signed-off-by: Edward O'Callaghan > --- > src/gallium/drivers/nouveau/nv50/nv50_surface.c | 72 > - > 1 file changed, 35 insertions(+), 37 deletions(-) > > diff --git a/src/gallium/drivers/nouveau/nv50/nv50_surface.c > b/src/gallium/drivers/nouveau/nv50/nv50_surface.c > index 86be1b4..6364aa3 100644 > --- a/src/gallium/drivers/nouveau/nv50/nv50_surface.c > +++ b/src/gallium/drivers/nouveau/nv50/nv50_surface.c > @@ -447,8 +447,42 @@ nv50_clear_texture(struct pipe_context *pipe, > const void *data) > { > struct pipe_surface tmpl = {{0}}, *sf; > + union pipe_color_union color; > + float depth = 0; > + uint8_t stencil = 0; > + unsigned clear = 0; > + > + switch (util_format_get_blocksizebits(res->format)) { > + case 128: > + tmpl.format = PIPE_FORMAT_R32G32B32A32_UINT; > + memcpy(, data, 128 / 8); > + break; > + case 64: > + tmpl.format = PIPE_FORMAT_R32G32_UINT; > + memcpy(, data, 64 / 8); > + memset([2], 0, 64 / 8); > + break; > + case 32: > + tmpl.format = PIPE_FORMAT_R32_UINT; > + memcpy(, data, 32 / 8); > + memset([1], 0, 96 / 8); > + break; > + case 16: > + tmpl.format = PIPE_FORMAT_R16_UINT; > + color.ui[0] = util_cpu_to_le32( > + util_le16_to_cpu(*(unsigned short *)data)); > + memset([1], 0, 96 / 8); > + break; > + case 8: > + tmpl.format = PIPE_FORMAT_R8_UINT; > + color.ui[0] = util_cpu_to_le32(*(unsigned char *)data); > + memset([1], 0, 96 / 8); > + break; > + default: > + assert(!"Unknown texel element size"); > + return; > + } > > - tmpl.format = res->format; > tmpl.u.tex.first_layer = box->z; > tmpl.u.tex.last_layer = box->z + box->depth - 1; > tmpl.u.tex.level = level; > @@ -457,9 +491,6 @@ nv50_clear_texture(struct pipe_context *pipe, >return; > > if (util_format_is_depth_or_stencil(res->format)) { > - float depth = 0; > - uint8_t stencil = 0; > - unsigned clear = 0; >const struct util_format_description *desc = > util_format_description(res->format); > > @@ -474,39 +505,6 @@ nv50_clear_texture(struct pipe_context *pipe, >pipe->clear_depth_stencil(pipe, sf, clear, depth, stencil, > box->x, box->y, box->width, box->height); > } else { > - union pipe_color_union color; > - > - switch (util_format_get_blocksizebits(res->format)) { > - case 128: > - sf->format = PIPE_FORMAT_R32G32B32A32_UINT; > - memcpy(, data, 128 / 8); > - break; > - case 64: > - sf->format = PIPE_FORMAT_R32G32_UINT; > - memcpy(, data, 64 / 8); > - memset([2], 0, 64 / 8); > - break; > - case 32: > - sf->format = PIPE_FORMAT_R32_UINT; > - memcpy(, data, 32 / 8); > - memset([1], 0, 96 / 8); > - break; > - case 16: > - sf->format = PIPE_FORMAT_R16_UINT; > - color.ui[0] = util_cpu_to_le32( > -util_le16_to_cpu(*(unsigned short *)data)); > - memset([1], 0, 96 / 8); > - break; > - case 8: > - sf->format = PIPE_FORMAT_R8_UINT; > - color.ui[0] = util_cpu_to_le32(*(unsigned char *)data); > - memset([1], 0, 96 / 8); > - break; > - default: > - assert(!"Unknown texel element size"); > - return; > - } > - >pipe->clear_render_target(pipe, sf, , > box->x, box->y, box->width, box->height); > } > -- > 2.5.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3v2] xvmc: force assertion in XvMC tests
Hi Giuseppe, On 29 November 2015 at 17:19, Giuseppe Bilottawrote: > This follows the src/util/u_atomic_test.c model of undefining NDEBUG > unconditionally throughouth the XvMC tests, to force asserts regardless > of debug mode. > > The comment on u_atomic_test.c is also fixed (read 'debug' where it > should have been 'release'). > I've split that one to a separate patch, but for the future please do so during submission. Also can you take a look at the list of patches you've submitted [1] and let me know if any of those are outstanding, so that I can mark the remainder as accepted/superseded. Thanks Emil [1] http://patchwork.freedesktop.org/project/mesa/patches/?submitter=299 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] gallium/drivers/nouveau: Clean up clear_texture()
We should translate and set the format before dispatching the call to pipe->create_surface(), instead of creating a surface with the given format and changing it after. Signed-off-by: Edward O'Callaghan--- src/gallium/drivers/nouveau/nv50/nv50_surface.c | 72 - 1 file changed, 35 insertions(+), 37 deletions(-) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_surface.c b/src/gallium/drivers/nouveau/nv50/nv50_surface.c index 86be1b4..6364aa3 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_surface.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_surface.c @@ -447,8 +447,42 @@ nv50_clear_texture(struct pipe_context *pipe, const void *data) { struct pipe_surface tmpl = {{0}}, *sf; + union pipe_color_union color; + float depth = 0; + uint8_t stencil = 0; + unsigned clear = 0; + + switch (util_format_get_blocksizebits(res->format)) { + case 128: + tmpl.format = PIPE_FORMAT_R32G32B32A32_UINT; + memcpy(, data, 128 / 8); + break; + case 64: + tmpl.format = PIPE_FORMAT_R32G32_UINT; + memcpy(, data, 64 / 8); + memset([2], 0, 64 / 8); + break; + case 32: + tmpl.format = PIPE_FORMAT_R32_UINT; + memcpy(, data, 32 / 8); + memset([1], 0, 96 / 8); + break; + case 16: + tmpl.format = PIPE_FORMAT_R16_UINT; + color.ui[0] = util_cpu_to_le32( + util_le16_to_cpu(*(unsigned short *)data)); + memset([1], 0, 96 / 8); + break; + case 8: + tmpl.format = PIPE_FORMAT_R8_UINT; + color.ui[0] = util_cpu_to_le32(*(unsigned char *)data); + memset([1], 0, 96 / 8); + break; + default: + assert(!"Unknown texel element size"); + return; + } - tmpl.format = res->format; tmpl.u.tex.first_layer = box->z; tmpl.u.tex.last_layer = box->z + box->depth - 1; tmpl.u.tex.level = level; @@ -457,9 +491,6 @@ nv50_clear_texture(struct pipe_context *pipe, return; if (util_format_is_depth_or_stencil(res->format)) { - float depth = 0; - uint8_t stencil = 0; - unsigned clear = 0; const struct util_format_description *desc = util_format_description(res->format); @@ -474,39 +505,6 @@ nv50_clear_texture(struct pipe_context *pipe, pipe->clear_depth_stencil(pipe, sf, clear, depth, stencil, box->x, box->y, box->width, box->height); } else { - union pipe_color_union color; - - switch (util_format_get_blocksizebits(res->format)) { - case 128: - sf->format = PIPE_FORMAT_R32G32B32A32_UINT; - memcpy(, data, 128 / 8); - break; - case 64: - sf->format = PIPE_FORMAT_R32G32_UINT; - memcpy(, data, 64 / 8); - memset([2], 0, 64 / 8); - break; - case 32: - sf->format = PIPE_FORMAT_R32_UINT; - memcpy(, data, 32 / 8); - memset([1], 0, 96 / 8); - break; - case 16: - sf->format = PIPE_FORMAT_R16_UINT; - color.ui[0] = util_cpu_to_le32( -util_le16_to_cpu(*(unsigned short *)data)); - memset([1], 0, 96 / 8); - break; - case 8: - sf->format = PIPE_FORMAT_R8_UINT; - color.ui[0] = util_cpu_to_le32(*(unsigned char *)data); - memset([1], 0, 96 / 8); - break; - default: - assert(!"Unknown texel element size"); - return; - } - pipe->clear_render_target(pipe, sf, , box->x, box->y, box->width, box->height); } -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] gallium/drivers/nouveau: Clean up clear_texture()
I don't actually have access to Nvid ASIC's at the moment so this is untested for regressions in ARB_clear_texture piglits. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/7] gallium/winsys: Make use of ARRAY_SIZE macro
Hi Edward, A few trivial suggestions (for future patches) and a humble request On 4 December 2015 at 06:50, Edward O'Callaghanwrote: > Signed-off-by: Edward O'Callaghan > --- > src/gallium/winsys/amdgpu/drm/amdgpu_surface.c | 6 ++ For the patches do shorten up the prefix - here we'd want "winsys/amdgpu" while on the driver side just "nouveau" or "llvmpipe" should suffice. When in doubt git log is your friend You mentioned that you've done these with a Coccinelle schematic patch (script) - mind if we get that (actually start collecting) in tree ... perhaps in bin/ or bin/cocci/ ? Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 13/44] i965: Implement L3 state atom.
Kenneth Graunkewrites: > On Tuesday, December 01, 2015 12:19:31 AM Jordan Justen wrote: >> From: Francisco Jerez >> >> The L3 state atom calculates the target L3 partition weights when the >> program bound to some shader stage is modified, and in case they are >> far enough from the current partitioning it makes sure that the L3 >> state is re-emitted. >> >> v3: Fix for inconsistent units the context URB size is expressed in. >> Clamp URB size to 1008 KB on SKL due to FF hardware limitation. >> --- >> src/mesa/drivers/dri/i965/brw_context.h | 6 +++ >> src/mesa/drivers/dri/i965/brw_state.h | 1 + >> src/mesa/drivers/dri/i965/gen7_l3_state.c | 81 >> +++ >> 3 files changed, 88 insertions(+) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_context.h >> b/src/mesa/drivers/dri/i965/brw_context.h >> index 0b91147..48024c6 100644 >> --- a/src/mesa/drivers/dri/i965/brw_context.h >> +++ b/src/mesa/drivers/dri/i965/brw_context.h >> @@ -672,6 +672,8 @@ enum brw_predicate_state { >> >> struct shader_times; >> >> +struct brw_l3_config; >> + >> /** >> * brw_context is derived from gl_context. >> */ >> @@ -1214,6 +1216,10 @@ struct brw_context >> int basevertex; >> >> struct { >> + const struct brw_l3_config *config; >> + } l3; >> + >> + struct { >>drm_intel_bo *bo; >>const char **names; >>int *ids; >> diff --git a/src/mesa/drivers/dri/i965/brw_state.h >> b/src/mesa/drivers/dri/i965/brw_state.h >> index 94734ba..49f301a 100644 >> --- a/src/mesa/drivers/dri/i965/brw_state.h >> +++ b/src/mesa/drivers/dri/i965/brw_state.h >> @@ -129,6 +129,7 @@ extern const struct brw_tracked_state gen7_depthbuffer; >> extern const struct brw_tracked_state gen7_clip_state; >> extern const struct brw_tracked_state gen7_disable_stages; >> extern const struct brw_tracked_state gen7_gs_state; >> +extern const struct brw_tracked_state gen7_l3_state; >> extern const struct brw_tracked_state gen7_ps_state; >> extern const struct brw_tracked_state gen7_push_constant_space; >> extern const struct brw_tracked_state gen7_sbe_state; >> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c >> b/src/mesa/drivers/dri/i965/gen7_l3_state.c >> index 0b5e231..a895723 100644 >> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c >> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c >> @@ -418,3 +418,84 @@ setup_l3_config(struct brw_context *brw, const struct >> brw_l3_config *cfg) >>} >> } >> } >> + >> +/** >> + * Return the unit brw_context::urb::size is expressed in, in KB. \sa >> + * brw_device_info::urb::size. >> + */ >> +static unsigned >> +get_urb_size_scale(const struct brw_device_info *devinfo) >> +{ >> + return (devinfo->gen >= 8 ? devinfo->num_slices : 1); >> +} >> + >> +/** >> + * Update the URB size in the context state for the specified L3 >> + * configuration. >> + */ >> +static void >> +update_urb_size(struct brw_context *brw, const struct brw_l3_config *cfg) >> +{ >> + const struct brw_device_info *devinfo = brw->intelScreen->devinfo; >> + /* From the SKL "L3 Allocation and Programming" documentation: >> +* >> +* "URB is limited to 1008KB due to programming restrictions. This is >> not >> +* a restriction of the L3 implementation, but of the FF and other >> clients. >> +* Therefore, in a GT4 implementation it is possible for the programmed >> +* allocation of the L3 data array to provide 3*384KB=1152KB for URB, but >> +* only 1008KB of this will be used." >> +*/ >> + const unsigned max = (devinfo->gen == 9 ? 1008 : ~0); > > I think this would be clearer as devinfo->gt == 4, since it applies to > all currently known GT4 parts, and doesn't apply to GT1-3. The reason is that only Gen9 has this limitation. Gen10 will have a different limit on the URB space addressable by fixed function clients. > Presumably GT1-3 just don't have enough URB to hit that limit, so it's > probably moot...*shrug*. > Yeah, in theory this limitation applies to all Gen9 hardware but only GT4 devices have a large enough L3 to hit it regularly. In principle you could hit the limit on GT3 too if you tried out some non-validated configuration so it seems sensible to apply it to Gen9 in general even if only GT4 parts are going to hit it in practice. It doesn't hurt anyway. > Acked-by: Kenneth Graunke > >> + const unsigned sz = >> + MIN2(max, cfg->n[L3P_URB] * get_l3_way_size(devinfo)) / >> + get_urb_size_scale(devinfo); >> + >> + if (brw->urb.size != sz) { >> + brw->urb.size = sz; >> + brw->ctx.NewDriverState |= BRW_NEW_URB_SIZE; >> + } >> +} >> + >> +static void >> +emit_l3_state(struct brw_context *brw) >> +{ >> + const struct brw_l3_weights w = get_pipeline_state_l3_weights(brw); >> + const float dw = diff_l3_weights(w, >> get_config_l3_weights(brw->l3.config)); >> + /* The distance between any two compatible
Re: [Mesa-dev] [PATCH v3 07/44] i965: Import tables enumerating the set of validated L3 configurations.
Kenneth Graunkewrites: > On Tuesday, December 01, 2015 12:19:25 AM Jordan Justen wrote: >> From: Francisco Jerez >> >> It should be possible to use additional L3 configurations other than >> the ones listed in the tables of validated allocations ("BSpec » >> 3D-Media-GPGPU Engine » L3 Cache and URB [IVB+] » L3 Cache and URB [*] >> » L3 Allocation and Programming"), but it seems sensible for now to >> hard-code the tables in order to stick to the hardware docs. Instead >> of setting up the arbitrary L3 partitioning given as input, the >> closest validated L3 configuration will be looked up in these tables >> and used to program the hardware. >> >> The included tables should work for Gen7-9. Note that the quantities >> are specified in ways rather than in KB, this is because the L3 >> control registers expect the value in ways, and because by doing that >> we can re-use a single table for all GT variants of the same >> generation (and in the case of IVB/HSW and CHV/SKL across different >> generations) which generally have different L3 way sizes but allow the >> same combinations of way allocations. >> >> v3: Use slice count from the devinfo structure instead of the gt >> number to implement get_l3_way_size(). >> >> Reviewed-by: Samuel Iglesias Gonsálvez >> --- >> src/mesa/drivers/dri/i965/Makefile.sources | 1 + >> src/mesa/drivers/dri/i965/gen7_l3_state.c | 163 >> + >> 2 files changed, 164 insertions(+) >> create mode 100644 src/mesa/drivers/dri/i965/gen7_l3_state.c >> >> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources >> b/src/mesa/drivers/dri/i965/Makefile.sources >> index 5e805fa..b413f73 100644 >> --- a/src/mesa/drivers/dri/i965/Makefile.sources >> +++ b/src/mesa/drivers/dri/i965/Makefile.sources >> @@ -182,6 +182,7 @@ i965_FILES = \ >> gen7_cs_state.c \ >> gen7_disable.c \ >> gen7_gs_state.c \ >> +gen7_l3_state.c \ >> gen7_misc_state.c \ >> gen7_sf_state.c \ >> gen7_sol_state.c \ >> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c >> b/src/mesa/drivers/dri/i965/gen7_l3_state.c >> new file mode 100644 >> index 000..8765b11 >> --- /dev/null >> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c >> @@ -0,0 +1,163 @@ >> +/* >> + * Copyright (c) 2015 Intel Corporation >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without limitation >> + * the rights to use, copy, modify, merge, publish, distribute, sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice (including the next >> + * paragraph) shall be included in all copies or substantial portions of the >> + * Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS >> OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR >> OTHER >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING >> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER >> DEALINGS >> + * IN THE SOFTWARE. >> + */ >> + >> +#include "brw_context.h" >> +#include "brw_defines.h" >> +#include "brw_state.h" >> +#include "intel_batchbuffer.h" >> + >> +/** >> + * Chunk of L3 cache reserved for some specific purpose. >> + */ >> +enum brw_l3_partition { >> + /** Shared local memory. */ >> + L3P_SLM = 0, >> + /** Unified return buffer. */ >> + L3P_URB, >> + /** Union of DC and RO. */ >> + L3P_ALL, >> + /** Data cluster RW partition. */ >> + L3P_DC, >> + /** Union of IS, C and T. */ >> + L3P_RO, >> + /** Instruction and state cache. */ >> + L3P_IS, >> + /** Constant cache. */ >> + L3P_C, >> + /** Texture cache. */ >> + L3P_T, >> + /** Number of supported L3 partitions. */ >> + NUM_L3P >> +}; >> + >> +/** >> + * L3 configuration represented as the number of ways allocated for each >> + * partition. \sa get_l3_way_size(). >> + */ >> +struct brw_l3_config { >> + unsigned n[NUM_L3P]; >> +}; >> + >> +/** >> + * IVB/HSW validated L3 configurations. >> + */ >> +static const struct brw_l3_config ivb_l3_configs[] = { > > It might be nice to add comments above the columns, similar to > brw_surface_formats.c: > >/* SLM URB ALL DC RO ISCT */ >{{ 0, 32, 0, 0, 32, 0, 0, 0 }}, > Sure, fixed locally. > Either way, > Acked-by: Kenneth Graunke > Thanks. :) > Thanks Curro! > >> + {{ 0, 32, 0, 0, 32, 0, 0, 0 }}, >> +
Re: [Mesa-dev] [PATCH 0/3] misc janitorial
On Tue, Dec 1, 2015 at 6:04 PM, Emil Velikovwrote: > With the small comment in patch 3 addressed the series is > Reviewed-by: Emil Velikov And also: > Reviewed-by: Ian Romanick for patch 1 and 2. Who should I ping for to get them pushed? ;-) >>> The long version: >>> 1. Grep for RADEON_R{1,2}00 and look at the guarded code >>> 1.1 Some guards can just be dropped (CHIP_FAMILY_*, get_chip_family_name), >>> 1.2 Other hunks of code can be split/moved into the respective >>> r100/r200 codebase (radeonTexBufferExtension and >>> r200TexBufferExtension). >>> 1.3 Or folded into the vtbl dispatch - get_depth_z32/16 or the >>> respective callers, depending on the overhead (neither one is likely >>> to be noticeable). >>> 2. Copy/pasta an existing radeon Makefile to radeon_common. Tweak to >>> produce an empty library and get it into the final mega_dri_drivers.so >>> 2.1 With most/all of the RADEON_R*00 guards gone, one should be able >>> to just git mv the files into the new location. (alongside dropping >>> the symlinks) >>> 2.2 Remove the symbol redefinition hack. > Three reasons - symlinks in git is frowned upon, binary is "bloated" > and nasty symbol redefinition(hack). Well, much of the “bloat” would probably be there even after the cleanup, since I'm not sure (from the cursory look I've given) how many of the functions would manage to go into a radeon-common. I suspect there'll end up being quite some code duplication to get rid of the symlinks and symbol redefinition hack. This is one of those contexts in which things could be done marginally more elegantly with C++. I am proficient enough with C macros and xincludes that I could probably hack together something that avoids the symlinks and the code duplication, but whether or not this would be considered an improvement is very much up to personal taste. -- Giuseppe "Oblomov" Bilotta ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/3] misc janitorial
On 4 December 2015 at 13:27, Giuseppe Bilottawrote: > On Tue, Dec 1, 2015 at 6:04 PM, Emil Velikov wrote: > >> With the small comment in patch 3 addressed the series is >> Reviewed-by: Emil Velikov > > And also: > >> Reviewed-by: Ian Romanick > > for patch 1 and 2. Who should I ping for to get them pushed? ;-) > Hmm I thought I pushed those in a couple of days ago. Doing it now, thanks ! The long version: 1. Grep for RADEON_R{1,2}00 and look at the guarded code 1.1 Some guards can just be dropped (CHIP_FAMILY_*, get_chip_family_name), 1.2 Other hunks of code can be split/moved into the respective r100/r200 codebase (radeonTexBufferExtension and r200TexBufferExtension). 1.3 Or folded into the vtbl dispatch - get_depth_z32/16 or the respective callers, depending on the overhead (neither one is likely to be noticeable). 2. Copy/pasta an existing radeon Makefile to radeon_common. Tweak to produce an empty library and get it into the final mega_dri_drivers.so 2.1 With most/all of the RADEON_R*00 guards gone, one should be able to just git mv the files into the new location. (alongside dropping the symlinks) 2.2 Remove the symbol redefinition hack. > >> Three reasons - symlinks in git is frowned upon, binary is "bloated" >> and nasty symbol redefinition(hack). > > Well, much of the “bloat” would probably be there even after the > cleanup, since I'm not sure (from the cursory look I've given) how > many of the functions would manage to go into a radeon-common. I > suspect there'll end up being quite some code duplication to get rid > of the symlinks and symbol redefinition hack. Not quite ... currently ~50% of the "r200" sources are (almost) identical to the radeon ones (ls -la src/mesa/drivers/dri/r200/). > This is one of those > contexts in which things could be done marginally more elegantly with > C++. I am proficient enough with C macros and xincludes that I could > probably hack together something that avoids the symlinks and the code > duplication, but whether or not this would be considered an > improvement is very much up to personal taste. > Before you start moving things around please fish around for RADEON_R(12)00. Once things are free of those (either the macros are just removed or the functions are split into r100 and r200 variant, all the movement comes into play. The last one can be tedious if you're unfamiliar with autotools, so just ping me on IRC (not there today) and I can guide you around. Cheers. Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU
https://bugs.freedesktop.org/show_bug.cgi?id=79688 --- Comment #21 from Tobias Klausmann--- (In reply to Axel Davy from comment #20) > Should this bug report be closed ? > DRI3 has complete DRI_PRIME support now. > As far as it concerns me, DRI3 Support is fine and this bug can be marked as "resolved fixed". > Also I'm not sure why you do post these benchmarks here. > Is there some hidden message about something not working right ? -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3v2] xvmc: force assertion in XvMC tests
Hello Emil, >> The comment on u_atomic_test.c is also fixed (read 'debug' where it >> should have been 'release'). > I've split that one to a separate patch, but for the future please do > so during submission. Doh, sorry. > Also can you take a look at the list of patches you've submitted [1] > and let me know if any of those are outstanding, so that I can mark > the remainder as accepted/superseded. > [1] > http://patchwork.freedesktop.org/project/mesa/patches/?submitter=299 All except for the oldest one ("clover: always return number of devices in clGetDeviceIDs") are iterations of a patchset that has been merged as commits 9280f17e823cbdbddf30a4ae5e2de9f2d327d33c to 7932b30892ef898ec4c74ac1f972cb462f19962b. For the oldest one I remember it being discussed and rejected (since in case of errors user input shouldn't be touched), but I can't find the discussion about it in my archive. -- Giuseppe "Oblomov" Bilotta ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 93089] mesa fails to check for gcc atomic primitives before using them
https://bugs.freedesktop.org/show_bug.cgi?id=93089 --- Comment #8 from Jonathan Gray--- There is no way to backport builtins for architectures that lack atomics at the ISA level. On arm/hppa/sh linux has a syscall which the gcc documentation claims can only do 64 bit values on arm: https://gcc.gnu.org/wiki/Atomic Even on Linux with the latest toolchains this should be checked as there are cases where some or none of the atomics may be present. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/7] gallium/winsys: Make use of ARRAY_SIZE macro
On 2015-12-05 01:33, Emil Velikov wrote: Hi Edward, A few trivial suggestions (for future patches) and a humble request On 4 December 2015 at 06:50, Edward O'Callaghanwrote: Signed-off-by: Edward O'Callaghan --- src/gallium/winsys/amdgpu/drm/amdgpu_surface.c | 6 ++ For the patches do shorten up the prefix - here we'd want "winsys/amdgpu" while on the driver side just "nouveau" or "llvmpipe" should suffice. When in doubt git log is your friend Sure no worries, will do on future patches. Thanks. You mentioned that you've done these with a Coccinelle schematic patch (script) - mind if we get that (actually start collecting) in tree ... perhaps in bin/ or bin/cocci/ ? Definitely! This is in fact part of my proposed plan after I get some of the major churn out the way and see what passes review and hence which .cocci scripts are suitable for mesa. Thanks Emil Kind Regards, ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91806] configure does not test whether assembler supports sse4.1
https://bugs.freedesktop.org/show_bug.cgi?id=91806 --- Comment #2 from Jonathan Gray--- gcc returns 1. I tried various changes to configure.ac but didn't get anywhere. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 93089] mesa fails to check for gcc atomic primitives before using them
https://bugs.freedesktop.org/show_bug.cgi?id=93089 --- Comment #7 from Emil Velikov--- What Matt is saying is that surely Mesa is not the only project out there which relies on atomics, thus adding (backporting) those in GCC wouldn't be that bad of an idea. That aside I'm thinking that porting the libdrm solution wouldn't be that bad of an idea. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 93188] "nir/nir.h", line 552: Error: Unexpected type name "nir_src" encountered.
https://bugs.freedesktop.org/show_bug.cgi?id=93188 --- Comment #5 from Emil Velikov--- (In reply to Ian Romanick from comment #4) > I guess the question is whether or not this builds with Visual Studio's C++ > compiler. If not, maybe we should reopen with changes to the summary? Afaict MSVC was the first one to use anonymous struct/unions. After all GCC allowed/implemented it as -fms-extensions way before C99 came along. That aside I did explicitly pointed out the lack of portability as this was introduced, yet was greeted by hostility (~ish) and absolutely no support on the topic. Oh well :-\ -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91806] configure does not test whether assembler supports sse4.1
https://bugs.freedesktop.org/show_bug.cgi?id=91806 --- Comment #1 from Emil Velikov--- In the failing case does gcc return a non zero value ? I'm leaning that this is some sort of GCC/autotools bug as to the best of my knowledge we're doing things the "right way" (almost identical hunk is being used in pixman and other projects). That aside if you have some ideas how we can improve things please let us know. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] nv50/ir: flip shl(add, imm) into add(shl, imm)
This works when the add also has an immediate. This often happens in address calculations. These addresses can then be inlined as well. On code targeted to SM35: total instructions in shared programs : 6223346 -> 6206257 (-0.27%) total gprs used in shared programs: 911075 -> 911045 (-0.00%) total local used in shared programs : 39072 -> 39072 (0.00%) localgpr inst bytes helped 0 11936643664 hurt 0 74 15 15 Signed-off-by: Ilia Mirkin--- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 38 +++--- 1 file changed, 34 insertions(+), 4 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 5f47887..539edaf 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -1132,13 +1132,43 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue , int s) break; // try to concatenate shifts Instruction *si = i->getSrc(0)->getInsn(); - if (!si || si->op != OP_SHL) + if (!si) break; ImmediateValue imm1; - if (si->src(1).getImmediate(imm1)) { + switch (si->op) { + case OP_SHL: + if (si->src(1).getImmediate(imm1)) { +bld.setPosition(i, false); +i->setSrc(0, si->getSrc(0)); +i->setSrc(1, bld.loadImm(NULL, imm0.reg.data.u32 + imm1.reg.data.u32)); + } + break; + case OP_SUB: + case OP_ADD: + int adds; + if (isFloatType(si->dType)) +return; + if (si->op != OP_SUB && si->src(0).getImmediate(imm1)) +adds = 0; + else if (si->src(1).getImmediate(imm1)) +adds = 1; + else +return; + // SHL(ADD(x, y), z) = ADD(SHL(x, z), SHL(y, z)) + + // This is more operations, but if one of x, y is an immediate, then + // we can get a situation where (a) we can use ISCADD, or (b) + // propagate the add bit into an indirect load. bld.setPosition(i, false); - i->setSrc(0, si->getSrc(0)); - i->setSrc(1, bld.loadImm(NULL, imm0.reg.data.u32 + imm1.reg.data.u32)); + i->op = si->op; + i->setSrc(adds, bld.loadImm(NULL, imm1.reg.data.u32 << imm0.reg.data.u32)); + i->setSrc(!adds, bld.mkOp2v(OP_SHL, i->dType, + bld.getSSA(i->def(0).getSize(), i->def(0).getFile()), + si->getSrc(!adds), + bld.mkImm(imm0.reg.data.u32))); + break; + default: + return; } } break; -- 2.4.10 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] nv50/ir: propagate indirect loads into instructions
This way $r1 = $r0 + 4; c1[$r1] becomes c1[$r0+4]. On SM35: total instructions in shared programs : 6206257 -> 6185058 (-0.34%) total gprs used in shared programs: 911045 -> 910722 (-0.04%) total local used in shared programs : 39072 -> 39072 (0.00%) localgpr inst bytes helped 0 41741954195 hurt 0 280 0 0 Signed-off-by: Ilia Mirkin--- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 49 ++ 1 file changed, 49 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 539edaf..6dec208 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -265,6 +265,54 @@ LoadPropagation::visit(BasicBlock *bb) // = +class IndirectPropagation : public Pass +{ +private: + virtual bool visit(BasicBlock *); +}; + +bool +IndirectPropagation::visit(BasicBlock *bb) +{ + const Target *targ = prog->getTarget(); + Instruction *next; + + for (Instruction *i = bb->getEntry(); i; i = next) { + next = i->next; + + for (int s = 0; i->srcExists(s); ++s) { + Instruction *insn; + ImmediateValue imm; + if (!i->src(s).isIndirect(0)) +continue; + insn = i->getIndirect(s, 0)->getInsn(); + if (!insn) +continue; + if (insn->op == OP_ADD && !isFloatType(insn->dType)) { +if (insn->src(0).getFile() != targ->nativeFile(FILE_ADDRESS) || +!insn->src(1).getImmediate(imm)) + continue; +i->setIndirect(s, 0, insn->getSrc(0)); +i->src(s).get()->reg.data.offset += imm.reg.data.u32; + } else if (insn->op == OP_SUB && !isFloatType(insn->dType)) { +if (insn->src(0).getFile() != targ->nativeFile(FILE_ADDRESS) || +!insn->src(1).getImmediate(imm)) + continue; +i->setIndirect(s, 0, insn->getSrc(0)); +i->src(s).get()->reg.data.offset -= imm.reg.data.u32; + } else if (insn->op == OP_MOV) { +if (!insn->src(0).getImmediate(imm)) + continue; +i->setIndirect(s, 0, NULL); +i->src(s).get()->reg.data.offset += imm.reg.data.u32; + } + } + } + return true; +} + +// = + // Evaluate constant expressions. class ConstantFolding : public Pass { @@ -3135,6 +3183,7 @@ Program::optimizeSSA(int level) RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks RUN_PASS(1, ConstantFolding, foldAll); RUN_PASS(1, LoadPropagation, run); + RUN_PASS(1, IndirectPropagation, run); RUN_PASS(2, MemoryOpt, run); RUN_PASS(2, LocalCSE, run); RUN_PASS(0, DeadCodeElim, buryAll); -- 2.4.10 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On 04/12/15 20:04, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 3:01 PM, Brian Paulwrote: On 12/04/2015 12:46 PM, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul wrote: + if (templ->poly_smooth && svga->debug.callback.debug_message) { + /* note: we always need a % something in the message string */ Why? Did I mess something up? If I write it without the dummy %s I get: In file included from ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0, from svga_pipe_rasterizer.c:29: svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state': ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected expression before ')' token fmt, __VA_ARGS__); \ ^ svga_pipe_rasterizer.c:357:7: note: in expansion of macro 'pipe_debug_message' pipe_debug_message(>debug.callback, CONFORMANCE, ^ H... so that'd be a "yes" to me messing it up :) I'll have a look to see if there's something simple I can do here. Off-hand it feels like it should be #__VA_ARGS__? Been a while since I looked at the specifics of all this. BTW, right now this callback is only set for debug contexts. Perhaps that was not an ideal decision... just an FYI though. apitrace creates a debug context so that's the main thing for me right now. But it might be nice if setting MESA_DEBUG would cause the debug/info messages to be printed to stderr for those apps that don't use GL_ARB_debug_output. Yeah, we should probably also throw in Enable/Disable hooks into st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever the enum is). BTW, note that apitrace cuts messages off after 100 of them which is pretty annoying -- I opened an issue at https://github.com/apitrace/apitrace/issues/395 a while back. Not just "100", but 100 *per ID*. Which is very different. Why is it so annoying? A patch for an option is welcome anyway BTW. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
Use the new debug callback hook to report conformance, performance and fallbacks to the state tracker. The state tracker, in turn can report this issues to the user via the GL_ARB_debug_output extension. More issues can be reported in the future; this is just a start. --- src/gallium/drivers/svga/svga_context.h | 3 +++ src/gallium/drivers/svga/svga_draw_arrays.c | 7 +++ src/gallium/drivers/svga/svga_pipe_blend.c | 11 +++ src/gallium/drivers/svga/svga_pipe_misc.c| 17 + src/gallium/drivers/svga/svga_pipe_rasterizer.c | 6 ++ src/gallium/drivers/svga/svga_state_need_swtnl.c | 23 +++ 6 files changed, 67 insertions(+) diff --git a/src/gallium/drivers/svga/svga_context.h b/src/gallium/drivers/svga/svga_context.h index 6a4f9d8..c4284cc 100644 --- a/src/gallium/drivers/svga/svga_context.h +++ b/src/gallium/drivers/svga/svga_context.h @@ -392,6 +392,9 @@ struct svga_context boolean no_line_width; boolean force_hw_line_stipple; + + /** To report perf/conformance/etc issues to the state tracker */ + struct pipe_debug_callback callback; } debug; struct { diff --git a/src/gallium/drivers/svga/svga_draw_arrays.c b/src/gallium/drivers/svga/svga_draw_arrays.c index acb2e95..f6956b5 100644 --- a/src/gallium/drivers/svga/svga_draw_arrays.c +++ b/src/gallium/drivers/svga/svga_draw_arrays.c @@ -26,6 +26,7 @@ #include "svga_cmd.h" #include "util/u_inlines.h" +#include "util/u_prim.h" #include "indices/u_indices.h" #include "svga_hw_reg.h" @@ -277,6 +278,12 @@ svga_hwtnl_draw_arrays(struct svga_hwtnl *hwtnl, if (ret != PIPE_OK) goto done; + if (svga->debug.callback.debug_message) { + pipe_debug_message(>debug.callback, PERF_INFO, +"generating temporary index buffer for drawing %s", +u_prim_name(prim)); + } + ret = svga_hwtnl_simple_draw_range_elements(hwtnl, gen_buf, gen_size, diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c b/src/gallium/drivers/svga/svga_pipe_blend.c index 0c9d612..5538193 100644 --- a/src/gallium/drivers/svga/svga_pipe_blend.c +++ b/src/gallium/drivers/svga/svga_pipe_blend.c @@ -243,6 +243,17 @@ svga_create_blend_state(struct pipe_context *pipe, blend->rt[i].srcblend_alpha = blend->rt[i].srcblend; blend->rt[i].dstblend_alpha = blend->rt[i].dstblend; blend->rt[i].blendeq_alpha = blend->rt[i].blendeq; + + if (svga->debug.callback.debug_message) { +if (templ->logicop_func == PIPE_LOGICOP_XOR) { + pipe_debug_message(>debug.callback, CONFORMANCE, + "XOR logicop mode has limited support%s", ""); +} +else if (templ->logicop_func != PIPE_LOGICOP_COPY) { + pipe_debug_message(>debug.callback, CONFORMANCE, + "general logicops are not supported%s", ""); +} + } } else { /* Note: the vgpu10 device does not yet support independent diff --git a/src/gallium/drivers/svga/svga_pipe_misc.c b/src/gallium/drivers/svga/svga_pipe_misc.c index c8020da..af9356d 100644 --- a/src/gallium/drivers/svga/svga_pipe_misc.c +++ b/src/gallium/drivers/svga/svga_pipe_misc.c @@ -244,6 +244,22 @@ static void svga_set_viewport_states( struct pipe_context *pipe, } +/** + * Called by state tracker to specify a callback function the driver + * can use to report info back to the state tracker. + */ +static void +svga_set_debug_callback(struct pipe_context *pipe, +const struct pipe_debug_callback *cb) +{ + struct svga_context *svga = svga_context(pipe); + + if (cb) + svga->debug.callback = *cb; + else + memset(>debug.callback, 0, sizeof(svga->debug.callback)); +} + void svga_init_misc_functions( struct svga_context *svga ) { @@ -252,6 +268,7 @@ void svga_init_misc_functions( struct svga_context *svga ) svga->pipe.set_framebuffer_state = svga_set_framebuffer_state; svga->pipe.set_clip_state = svga_set_clip_state; svga->pipe.set_viewport_states = svga_set_viewport_states; + svga->pipe.set_debug_callback = svga_set_debug_callback; } diff --git a/src/gallium/drivers/svga/svga_pipe_rasterizer.c b/src/gallium/drivers/svga/svga_pipe_rasterizer.c index 6310b7a..c6215e6 100644 --- a/src/gallium/drivers/svga/svga_pipe_rasterizer.c +++ b/src/gallium/drivers/svga/svga_pipe_rasterizer.c @@ -352,6 +352,12 @@ svga_create_rasterizer_state(struct pipe_context *pipe, define_rasterizer_object(svga, rast); } + if (templ->poly_smooth && svga->debug.callback.debug_message) { + /* note: we always need a % something in the message string */ + pipe_debug_message(>debug.callback, CONFORMANCE, +
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On Fri, Dec 4, 2015 at 2:42 PM, Brian Paulwrote: > + if (templ->poly_smooth && svga->debug.callback.debug_message) { > + /* note: we always need a % something in the message string */ Why? Did I mess something up? BTW, right now this callback is only set for debug contexts. Perhaps that was not an ideal decision... just an FYI though. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On 12/04/2015 12:46 PM, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 2:42 PM, Brian Paulwrote: + if (templ->poly_smooth && svga->debug.callback.debug_message) { + /* note: we always need a % something in the message string */ Why? Did I mess something up? If I write it without the dummy %s I get: In file included from ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0, from svga_pipe_rasterizer.c:29: svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state': ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected expression before ')' token fmt, __VA_ARGS__); \ ^ svga_pipe_rasterizer.c:357:7: note: in expansion of macro 'pipe_debug_message' pipe_debug_message(>debug.callback, CONFORMANCE, ^ BTW, right now this callback is only set for debug contexts. Perhaps that was not an ideal decision... just an FYI though. apitrace creates a debug context so that's the main thing for me right now. But it might be nice if setting MESA_DEBUG would cause the debug/info messages to be printed to stderr for those apps that don't use GL_ARB_debug_output. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On Fri, Dec 4, 2015 at 3:01 PM, Brian Paulwrote: > On 12/04/2015 12:46 PM, Ilia Mirkin wrote: >> >> On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul wrote: >>> >>> + if (templ->poly_smooth && svga->debug.callback.debug_message) { >>> + /* note: we always need a % something in the message string */ >> >> >> Why? Did I mess something up? > > > If I write it without the dummy %s I get: > > In file included from > ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0, > from svga_pipe_rasterizer.c:29: > svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state': > ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected > expression before ')' token > fmt, __VA_ARGS__); \ > ^ > svga_pipe_rasterizer.c:357:7: note: in expansion of macro > 'pipe_debug_message' >pipe_debug_message(>debug.callback, CONFORMANCE, >^ > H... so that'd be a "yes" to me messing it up :) I'll have a look to see if there's something simple I can do here. Off-hand it feels like it should be #__VA_ARGS__? Been a while since I looked at the specifics of all this. > >> >> BTW, right now this callback is only set for debug contexts. Perhaps >> that was not an ideal decision... just an FYI though. > > > apitrace creates a debug context so that's the main thing for me right now. > But it might be nice if setting MESA_DEBUG would cause the debug/info > messages to be printed to stderr for those apps that don't use > GL_ARB_debug_output. Yeah, we should probably also throw in Enable/Disable hooks into st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever the enum is). BTW, note that apitrace cuts messages off after 100 of them which is pretty annoying -- I opened an issue at https://github.com/apitrace/apitrace/issues/395 a while back. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nv50/ir: fold shl + mul with immediates
On SM35 this gives: total instructions in shared programs : 6185058 -> 6181090 (-0.06%) total gprs used in shared programs: 910722 -> 910722 (0.00%) total local used in shared programs : 39072 -> 39072 (0.00%) localgpr inst bytes helped 0 021372137 hurt 0 0 0 0 Signed-off-by: Ilia Mirkin--- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 8 1 file changed, 8 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 6dec208..9760128 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -1191,6 +1191,14 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue , int s) i->setSrc(1, bld.loadImm(NULL, imm0.reg.data.u32 + imm1.reg.data.u32)); } break; + case OP_MUL: + if (!isFloatType(si->dType) && si->src(1).getImmediate(imm1)) { +bld.setPosition(i, false); +i->op = OP_MUL; +i->setSrc(0, si->getSrc(0)); +i->setSrc(1, bld.loadImm(NULL, (1 << imm0.reg.data.u32) * imm1.reg.data.u32)); + } + break; case OP_SUB: case OP_ADD: int adds; -- 2.4.10 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] radeonsi: apply the streamout workaround to Fiji as well
From: Marek OlšákCc: 11.0 11.1 --- src/gallium/drivers/radeonsi/si_state_draw.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c index 771d206..ee84a1f 100644 --- a/src/gallium/drivers/radeonsi/si_state_draw.c +++ b/src/gallium/drivers/radeonsi/si_state_draw.c @@ -872,7 +872,9 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info) /* Workaround for a VGT hang when streamout is enabled. * It must be done after drawing. */ - if ((sctx->b.family == CHIP_HAWAII || sctx->b.family == CHIP_TONGA) && + if ((sctx->b.family == CHIP_HAWAII || +sctx->b.family == CHIP_TONGA || +sctx->b.family == CHIP_FIJI) && (sctx->b.streamout.streamout_enabled || sctx->b.streamout.prims_gen_query_enabled)) { sctx->b.flags |= SI_CONTEXT_VGT_STREAMOUT_SYNC; -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] radeonsi: don't use the CP DMA workaround on Fiji and newer
From: Marek Olšák--- src/gallium/drivers/radeonsi/si_cp_dma.c | 36 ++-- 1 file changed, 20 insertions(+), 16 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c b/src/gallium/drivers/radeonsi/si_cp_dma.c index 0bf85a0..a5e3d79 100644 --- a/src/gallium/drivers/radeonsi/si_cp_dma.c +++ b/src/gallium/drivers/radeonsi/si_cp_dma.c @@ -273,22 +273,26 @@ void si_copy_buffer(struct si_context *sctx, dst_offset += r600_resource(dst)->gpu_address; src_offset += r600_resource(src)->gpu_address; - /* If the size is not aligned, we must add a dummy copy at the end -* just to align the internal counter. Otherwise, the DMA engine -* would slow down by an order of magnitude for following copies. -*/ - if (size % CP_DMA_ALIGNMENT) - realign_size = CP_DMA_ALIGNMENT - (size % CP_DMA_ALIGNMENT); - - /* If the copy begins unaligned, we must start copying from the next -* aligned block and the skipped part should be copied after everything -* else has been copied. Only the src alignment matters, not dst. -*/ - if (src_offset % CP_DMA_ALIGNMENT) { - skipped_size = CP_DMA_ALIGNMENT - (src_offset % CP_DMA_ALIGNMENT); - /* The main part will be skipped if the size is too small. */ - skipped_size = MIN2(skipped_size, size); - size -= skipped_size; + /* The workarounds aren't needed on Fiji and beyond. */ + if (sctx->b.family <= CHIP_CARRIZO || + sctx->b.family == CHIP_STONEY) { + /* If the size is not aligned, we must add a dummy copy at the end +* just to align the internal counter. Otherwise, the DMA engine +* would slow down by an order of magnitude for following copies. +*/ + if (size % CP_DMA_ALIGNMENT) + realign_size = CP_DMA_ALIGNMENT - (size % CP_DMA_ALIGNMENT); + + /* If the copy begins unaligned, we must start copying from the next +* aligned block and the skipped part should be copied after everything +* else has been copied. Only the src alignment matters, not dst. +*/ + if (src_offset % CP_DMA_ALIGNMENT) { + skipped_size = CP_DMA_ALIGNMENT - (src_offset % CP_DMA_ALIGNMENT); + /* The main part will be skipped if the size is too small. */ + skipped_size = MIN2(skipped_size, size); + size -= skipped_size; + } } /* Flush the caches. */ -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] radeonsi: apply the streamout workaround to Fiji as well
On Fri, Dec 4, 2015 at 3:27 PM, Marek Olšákwrote: > From: Marek Olšák > > Cc: 11.0 11.1 For the series: Reviewed-by: Alex Deucher > --- > src/gallium/drivers/radeonsi/si_state_draw.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c > b/src/gallium/drivers/radeonsi/si_state_draw.c > index 771d206..ee84a1f 100644 > --- a/src/gallium/drivers/radeonsi/si_state_draw.c > +++ b/src/gallium/drivers/radeonsi/si_state_draw.c > @@ -872,7 +872,9 @@ void si_draw_vbo(struct pipe_context *ctx, const struct > pipe_draw_info *info) > > /* Workaround for a VGT hang when streamout is enabled. > * It must be done after drawing. */ > - if ((sctx->b.family == CHIP_HAWAII || sctx->b.family == CHIP_TONGA) && > + if ((sctx->b.family == CHIP_HAWAII || > +sctx->b.family == CHIP_TONGA || > +sctx->b.family == CHIP_FIJI) && > (sctx->b.streamout.streamout_enabled || > sctx->b.streamout.prims_gen_query_enabled)) { > sctx->b.flags |= SI_CONTEXT_VGT_STREAMOUT_SYNC; > -- > 2.1.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On Fri, Dec 4, 2015 at 5:40 PM, Jose Fonsecawrote: > On 04/12/15 20:04, Ilia Mirkin wrote: >> >> On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul wrote: >>> >>> On 12/04/2015 12:46 PM, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul wrote: > > > + if (templ->poly_smooth && svga->debug.callback.debug_message) { > + /* note: we always need a % something in the message string */ Why? Did I mess something up? >>> >>> >>> >>> If I write it without the dummy %s I get: >>> >>> In file included from >>> ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0, >>> from svga_pipe_rasterizer.c:29: >>> svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state': >>> ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected >>> expression before ')' token >>> fmt, __VA_ARGS__); \ >>> ^ >>> svga_pipe_rasterizer.c:357:7: note: in expansion of macro >>> 'pipe_debug_message' >>> pipe_debug_message(>debug.callback, CONFORMANCE, >>> ^ >>> >> >> H... so that'd be a "yes" to me messing it up :) I'll have a look >> to see if there's something simple I can do here. Off-hand it feels >> like it should be #__VA_ARGS__? Been a while since I looked at the >> specifics of all this. >> >>> BTW, right now this callback is only set for debug contexts. Perhaps that was not an ideal decision... just an FYI though. >>> >>> >>> >>> apitrace creates a debug context so that's the main thing for me right >>> now. >>> But it might be nice if setting MESA_DEBUG would cause the debug/info >>> messages to be printed to stderr for those apps that don't use >>> GL_ARB_debug_output. >> >> >> Yeah, we should probably also throw in Enable/Disable hooks into >> st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever >> the enum is). >> >> BTW, note that apitrace cuts messages off after 100 of them which is >> pretty annoying -- I opened an issue at >> https://github.com/apitrace/apitrace/issues/395 a while back. > > > Not just "100", but 100 *per ID*. > > Which is very different. > > Why is it so annoying? A patch for an option is welcome anyway BTW. Right, but but an ID is a message. So let's say the message is "falling back to XYZ", why would you only want to know the first 100 times? Or let's say the message is "shader compiles to XYZ instructions" -- again, why the first 100 times? In fact, *given* that you enable these messages, I don't see why it would ever be desirable to limit them. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On 04/12/15 23:00, Jose Fonseca wrote: On 04/12/15 22:45, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 5:40 PM, Jose Fonsecawrote: On 04/12/15 20:04, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul wrote: On 12/04/2015 12:46 PM, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul wrote: + if (templ->poly_smooth && svga->debug.callback.debug_message) { + /* note: we always need a % something in the message string */ Why? Did I mess something up? If I write it without the dummy %s I get: In file included from ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0, from svga_pipe_rasterizer.c:29: svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state': ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected expression before ')' token fmt, __VA_ARGS__); \ ^ svga_pipe_rasterizer.c:357:7: note: in expansion of macro 'pipe_debug_message' pipe_debug_message(>debug.callback, CONFORMANCE, ^ H... so that'd be a "yes" to me messing it up :) I'll have a look to see if there's something simple I can do here. Off-hand it feels like it should be #__VA_ARGS__? Been a while since I looked at the specifics of all this. BTW, right now this callback is only set for debug contexts. Perhaps that was not an ideal decision... just an FYI though. apitrace creates a debug context so that's the main thing for me right now. But it might be nice if setting MESA_DEBUG would cause the debug/info messages to be printed to stderr for those apps that don't use GL_ARB_debug_output. Yeah, we should probably also throw in Enable/Disable hooks into st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever the enum is). BTW, note that apitrace cuts messages off after 100 of them which is pretty annoying -- I opened an issue at https://github.com/apitrace/apitrace/issues/395 a while back. Not just "100", but 100 *per ID*. Which is very different. Why is it so annoying? A patch for an option is welcome anyway BTW. Right, but but an ID is a message. So let's say the message is "falling back to XYZ", why would you only want to know the first 100 times? Or let's say the message is "shader compiles to XYZ instructions" -- again, why the first 100 times? In fact, *given* that you enable these messages, I don't see why it would ever be desirable to limit them. Isn't it obvious? OpenGL applications are repetitive by nature. You're bound to have the same error over and over again. Futhermore, some IHVs have messages so frequent that completely overwel everything else. So, when you take a trace of an arbitrary app, and replay e.g., on NVIDIA, you'll get lots and performance hints. Nevertheless, you'll want to be able to spot rare and more important issues among all that. limiting to a fix number, e.g., 100, means that after a while, they all silence, so that any new issue will stand out. Soemthing that happens all the time is invariably interesting. BTW, this is nothing novel -- most logging systems have mechanism in place to rate limit messages. Something bad/unexpected happening 100 times or 10 times is all the same -- it happened too many times, and needs to be fixed, which will bring the count to zero, or doesn't need to be fixed and it can be ignored. Eitherway, limiting message prevents it from flooding and hiding other potentially important issues. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On 04/12/15 19:42, Brian Paul wrote: Use the new debug callback hook to report conformance, performance and fallbacks to the state tracker. The state tracker, in turn can report this issues to the user via the GL_ARB_debug_output extension. More issues can be reported in the future; this is just a start. --- src/gallium/drivers/svga/svga_context.h | 3 +++ src/gallium/drivers/svga/svga_draw_arrays.c | 7 +++ src/gallium/drivers/svga/svga_pipe_blend.c | 11 +++ src/gallium/drivers/svga/svga_pipe_misc.c| 17 + src/gallium/drivers/svga/svga_pipe_rasterizer.c | 6 ++ src/gallium/drivers/svga/svga_state_need_swtnl.c | 23 +++ 6 files changed, 67 insertions(+) diff --git a/src/gallium/drivers/svga/svga_context.h b/src/gallium/drivers/svga/svga_context.h index 6a4f9d8..c4284cc 100644 --- a/src/gallium/drivers/svga/svga_context.h +++ b/src/gallium/drivers/svga/svga_context.h @@ -392,6 +392,9 @@ struct svga_context boolean no_line_width; boolean force_hw_line_stipple; + + /** To report perf/conformance/etc issues to the state tracker */ + struct pipe_debug_callback callback; } debug; struct { diff --git a/src/gallium/drivers/svga/svga_draw_arrays.c b/src/gallium/drivers/svga/svga_draw_arrays.c index acb2e95..f6956b5 100644 --- a/src/gallium/drivers/svga/svga_draw_arrays.c +++ b/src/gallium/drivers/svga/svga_draw_arrays.c @@ -26,6 +26,7 @@ #include "svga_cmd.h" #include "util/u_inlines.h" +#include "util/u_prim.h" #include "indices/u_indices.h" #include "svga_hw_reg.h" @@ -277,6 +278,12 @@ svga_hwtnl_draw_arrays(struct svga_hwtnl *hwtnl, if (ret != PIPE_OK) goto done; + if (svga->debug.callback.debug_message) { + pipe_debug_message(>debug.callback, PERF_INFO, +"generating temporary index buffer for drawing %s", +u_prim_name(prim)); + } + ret = svga_hwtnl_simple_draw_range_elements(hwtnl, gen_buf, gen_size, diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c b/src/gallium/drivers/svga/svga_pipe_blend.c index 0c9d612..5538193 100644 --- a/src/gallium/drivers/svga/svga_pipe_blend.c +++ b/src/gallium/drivers/svga/svga_pipe_blend.c @@ -243,6 +243,17 @@ svga_create_blend_state(struct pipe_context *pipe, blend->rt[i].srcblend_alpha = blend->rt[i].srcblend; blend->rt[i].dstblend_alpha = blend->rt[i].dstblend; blend->rt[i].blendeq_alpha = blend->rt[i].blendeq; + + if (svga->debug.callback.debug_message) { +if (templ->logicop_func == PIPE_LOGICOP_XOR) { + pipe_debug_message(>debug.callback, CONFORMANCE, + "XOR logicop mode has limited support%s", ""); +} +else if (templ->logicop_func != PIPE_LOGICOP_COPY) { + pipe_debug_message(>debug.callback, CONFORMANCE, + "general logicops are not supported%s", ""); +} + } } else { /* Note: the vgpu10 device does not yet support independent diff --git a/src/gallium/drivers/svga/svga_pipe_misc.c b/src/gallium/drivers/svga/svga_pipe_misc.c index c8020da..af9356d 100644 --- a/src/gallium/drivers/svga/svga_pipe_misc.c +++ b/src/gallium/drivers/svga/svga_pipe_misc.c @@ -244,6 +244,22 @@ static void svga_set_viewport_states( struct pipe_context *pipe, } +/** + * Called by state tracker to specify a callback function the driver + * can use to report info back to the state tracker. + */ +static void +svga_set_debug_callback(struct pipe_context *pipe, +const struct pipe_debug_callback *cb) +{ + struct svga_context *svga = svga_context(pipe); + + if (cb) + svga->debug.callback = *cb; + else + memset(>debug.callback, 0, sizeof(svga->debug.callback)); +} + void svga_init_misc_functions( struct svga_context *svga ) { @@ -252,6 +268,7 @@ void svga_init_misc_functions( struct svga_context *svga ) svga->pipe.set_framebuffer_state = svga_set_framebuffer_state; svga->pipe.set_clip_state = svga_set_clip_state; svga->pipe.set_viewport_states = svga_set_viewport_states; + svga->pipe.set_debug_callback = svga_set_debug_callback; } diff --git a/src/gallium/drivers/svga/svga_pipe_rasterizer.c b/src/gallium/drivers/svga/svga_pipe_rasterizer.c index 6310b7a..c6215e6 100644 --- a/src/gallium/drivers/svga/svga_pipe_rasterizer.c +++ b/src/gallium/drivers/svga/svga_pipe_rasterizer.c @@ -352,6 +352,12 @@ svga_create_rasterizer_state(struct pipe_context *pipe, define_rasterizer_object(svga, rast); } + if (templ->poly_smooth && svga->debug.callback.debug_message) { + /* note: we always need a % something in the message string */ +
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On Fri, Dec 4, 2015 at 6:25 PM, Jose Fonsecawrote: > On 04/12/15 19:42, Brian Paul wrote: >> >> Use the new debug callback hook to report conformance, performance >> and fallbacks to the state tracker. The state tracker, in turn can >> report this issues to the user via the GL_ARB_debug_output extension. >> >> More issues can be reported in the future; this is just a start. >> --- >> src/gallium/drivers/svga/svga_context.h | 3 +++ >> src/gallium/drivers/svga/svga_draw_arrays.c | 7 +++ >> src/gallium/drivers/svga/svga_pipe_blend.c | 11 +++ >> src/gallium/drivers/svga/svga_pipe_misc.c| 17 + >> src/gallium/drivers/svga/svga_pipe_rasterizer.c | 6 ++ >> src/gallium/drivers/svga/svga_state_need_swtnl.c | 23 >> +++ >> 6 files changed, 67 insertions(+) >> >> diff --git a/src/gallium/drivers/svga/svga_context.h >> b/src/gallium/drivers/svga/svga_context.h >> index 6a4f9d8..c4284cc 100644 >> --- a/src/gallium/drivers/svga/svga_context.h >> +++ b/src/gallium/drivers/svga/svga_context.h >> @@ -392,6 +392,9 @@ struct svga_context >> >> boolean no_line_width; >> boolean force_hw_line_stipple; >> + >> + /** To report perf/conformance/etc issues to the state tracker */ >> + struct pipe_debug_callback callback; >> } debug; >> >> struct { >> diff --git a/src/gallium/drivers/svga/svga_draw_arrays.c >> b/src/gallium/drivers/svga/svga_draw_arrays.c >> index acb2e95..f6956b5 100644 >> --- a/src/gallium/drivers/svga/svga_draw_arrays.c >> +++ b/src/gallium/drivers/svga/svga_draw_arrays.c >> @@ -26,6 +26,7 @@ >> #include "svga_cmd.h" >> >> #include "util/u_inlines.h" >> +#include "util/u_prim.h" >> #include "indices/u_indices.h" >> >> #include "svga_hw_reg.h" >> @@ -277,6 +278,12 @@ svga_hwtnl_draw_arrays(struct svga_hwtnl *hwtnl, >> if (ret != PIPE_OK) >>goto done; >> >> + if (svga->debug.callback.debug_message) { >> + pipe_debug_message(>debug.callback, PERF_INFO, >> +"generating temporary index buffer for >> drawing %s", >> +u_prim_name(prim)); >> + } >> + >> ret = svga_hwtnl_simple_draw_range_elements(hwtnl, >> gen_buf, >> gen_size, >> diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c >> b/src/gallium/drivers/svga/svga_pipe_blend.c >> index 0c9d612..5538193 100644 >> --- a/src/gallium/drivers/svga/svga_pipe_blend.c >> +++ b/src/gallium/drivers/svga/svga_pipe_blend.c >> @@ -243,6 +243,17 @@ svga_create_blend_state(struct pipe_context *pipe, >>blend->rt[i].srcblend_alpha = blend->rt[i].srcblend; >>blend->rt[i].dstblend_alpha = blend->rt[i].dstblend; >>blend->rt[i].blendeq_alpha = blend->rt[i].blendeq; >> + >> + if (svga->debug.callback.debug_message) { >> +if (templ->logicop_func == PIPE_LOGICOP_XOR) { >> + pipe_debug_message(>debug.callback, CONFORMANCE, >> + "XOR logicop mode has limited >> support%s", ""); >> +} >> +else if (templ->logicop_func != PIPE_LOGICOP_COPY) { >> + pipe_debug_message(>debug.callback, CONFORMANCE, >> + "general logicops are not supported%s", >> ""); >> +} >> + } >> } >> else { >>/* Note: the vgpu10 device does not yet support independent >> diff --git a/src/gallium/drivers/svga/svga_pipe_misc.c >> b/src/gallium/drivers/svga/svga_pipe_misc.c >> index c8020da..af9356d 100644 >> --- a/src/gallium/drivers/svga/svga_pipe_misc.c >> +++ b/src/gallium/drivers/svga/svga_pipe_misc.c >> @@ -244,6 +244,22 @@ static void svga_set_viewport_states( struct >> pipe_context *pipe, >> } >> >> >> +/** >> + * Called by state tracker to specify a callback function the driver >> + * can use to report info back to the state tracker. >> + */ >> +static void >> +svga_set_debug_callback(struct pipe_context *pipe, >> +const struct pipe_debug_callback *cb) >> +{ >> + struct svga_context *svga = svga_context(pipe); >> + >> + if (cb) >> + svga->debug.callback = *cb; >> + else >> + memset(>debug.callback, 0, sizeof(svga->debug.callback)); >> +} >> + >> >> void svga_init_misc_functions( struct svga_context *svga ) >> { >> @@ -252,6 +268,7 @@ void svga_init_misc_functions( struct svga_context >> *svga ) >> svga->pipe.set_framebuffer_state = svga_set_framebuffer_state; >> svga->pipe.set_clip_state = svga_set_clip_state; >> svga->pipe.set_viewport_states = svga_set_viewport_states; >> + svga->pipe.set_debug_callback = svga_set_debug_callback; >> } >> >> >> diff --git a/src/gallium/drivers/svga/svga_pipe_rasterizer.c >> b/src/gallium/drivers/svga/svga_pipe_rasterizer.c >> index
Re: [Mesa-dev] [PATCH] radeonsi: disable DCC on Stoney
On Fri, Dec 4, 2015 at 2:35 PM, Marek Olšákwrote: > From: Marek Olšák > > Cc: 11.0 11.1 Reviewed-by: Alex Deucher > --- > src/gallium/drivers/radeon/r600_texture.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/src/gallium/drivers/radeon/r600_texture.c > b/src/gallium/drivers/radeon/r600_texture.c > index e2947d9..e7b2aea 100644 > --- a/src/gallium/drivers/radeon/r600_texture.c > +++ b/src/gallium/drivers/radeon/r600_texture.c > @@ -497,6 +497,10 @@ static void vi_texture_alloc_dcc_separate(struct > r600_common_screen *rscreen, > if (rscreen->debug_flags & DBG_NO_DCC) > return; > > + /* TODO: DCC is broken on Stoney */ > + if (rscreen->family == CHIP_STONEY) > + return; > + > rtex->dcc_buffer = (struct r600_resource *) > r600_aligned_buffer_create(>b, PIPE_BIND_CUSTOM, >PIPE_USAGE_DEFAULT, > rtex->surface.dcc_size, rtex->surface.dcc_alignment); > -- > 2.1.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 1/9] vc4: Do all uniform loads with byte offsets
Previously, we lowered direct uniform loads to dword offsets and indirect loads to byte offsets in vc4_nir_lower_io. However, it simplifies things a bit if we just use byte offsets for everything and then divide by 4 when we handle the direct uniform load. Cc: Eric Anholt--- src/gallium/drivers/vc4/vc4_nir_lower_io.c | 17 + src/gallium/drivers/vc4/vc4_program.c | 11 +++ 2 files changed, 12 insertions(+), 16 deletions(-) diff --git a/src/gallium/drivers/vc4/vc4_nir_lower_io.c b/src/gallium/drivers/vc4/vc4_nir_lower_io.c index 1afe52a..e2581c8 100644 --- a/src/gallium/drivers/vc4/vc4_nir_lower_io.c +++ b/src/gallium/drivers/vc4/vc4_nir_lower_io.c @@ -338,8 +338,8 @@ static void vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder *b, nir_intrinsic_instr *intr) { -/* All TGSI-to-NIR uniform loads are vec4, but we may create dword - * loads in our lowering passes. +/* All TGSI-to-NIR uniform loads are vec4, but we need byte offsets + * byte offsets in the backend. */ if (intr->num_components == 1) return; @@ -355,6 +355,9 @@ vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder *b, intr_comp->num_components = 1; nir_ssa_dest_init(_comp->instr, _comp->dest, 1, NULL); +/* Convert the base offset to bytes and add the component */ +intr_comp->const_index[0] = (intr->const_index[0] * 16 + i * 4); + if (intr->intrinsic == nir_intrinsic_load_uniform_indirect) { /* Convert the variable TGSI register index to a byte * offset. @@ -363,16 +366,6 @@ vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder *b, nir_src_for_ssa(nir_ishl(b, intr->src[0].ssa, nir_imm_int(b, 4))); - -/* Convert the offset to be a byte index, too. */ -intr_comp->const_index[0] = (intr->const_index[0] * 16 + - i * 4); -} else { -/* We want a dword index for non-indirect uniform - * loads. - */ -intr_comp->const_index[0] = (intr->const_index[0] * 4 + - i); } dests[i] = _comp->dest.ssa; diff --git a/src/gallium/drivers/vc4/vc4_program.c b/src/gallium/drivers/vc4/vc4_program.c index 197577b..1a45bac 100644 --- a/src/gallium/drivers/vc4/vc4_program.c +++ b/src/gallium/drivers/vc4/vc4_program.c @@ -1433,6 +1433,7 @@ static void ntq_emit_intrinsic(struct vc4_compile *c, nir_intrinsic_instr *instr) { const nir_intrinsic_info *info = _intrinsic_infos[instr->intrinsic]; +int dword_offset; struct qreg *dest = NULL; if (info->has_dest) { @@ -1442,11 +1443,13 @@ ntq_emit_intrinsic(struct vc4_compile *c, nir_intrinsic_instr *instr) switch (instr->intrinsic) { case nir_intrinsic_load_uniform: assert(instr->num_components == 1); -if (instr->const_index[0] < VC4_NIR_STATE_UNIFORM_OFFSET) { -*dest = qir_uniform(c, QUNIFORM_UNIFORM, -instr->const_index[0]); +/* The offset is in bytes, but we need dwords */ +assert(instr->const_index[0] % 4 == 0); +dword_offset = instr->const_index[0] / 4; +if (dword_offset < VC4_NIR_STATE_UNIFORM_OFFSET) { +*dest = qir_uniform(c, QUNIFORM_UNIFORM, dword_offset); } else { -*dest = qir_uniform(c, instr->const_index[0] - +*dest = qir_uniform(c, dword_offset - VC4_NIR_STATE_UNIFORM_OFFSET, 0); } -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 3/9] nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect
--- src/glsl/nir/glsl_to_nir.cpp | 47 +++- 1 file changed, 7 insertions(+), 40 deletions(-) diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp index 45d045c..ba23b91 100644 --- a/src/glsl/nir/glsl_to_nir.cpp +++ b/src/glsl/nir/glsl_to_nir.cpp @@ -853,24 +853,12 @@ nir_visitor::visit(ir_call *ir) ir_constant *write_mask = ((ir_instruction *)param)->as_constant(); assert(write_mask); - /* Check if we need the indirect version */ - ir_constant *const_offset = offset->as_constant(); - if (!const_offset) { -op = nir_intrinsic_store_ssbo_indirect; -ralloc_free(instr); -instr = nir_intrinsic_instr_create(shader, op); -instr->src[2] = nir_src_for_ssa(evaluate_rvalue(offset)); -instr->const_index[0] = 0; - } else { -instr->const_index[0] = const_offset->value.u[0]; - } - - instr->const_index[1] = write_mask->value.u[0]; - instr->src[0] = nir_src_for_ssa(evaluate_rvalue(val)); + instr->src[1] = nir_src_for_ssa(evaluate_rvalue(block)); + instr->src[2] = nir_src_for_ssa(evaluate_rvalue(offset)); + instr->const_index[0] = write_mask->value.u[0]; instr->num_components = val->type->vector_elements; - instr->src[1] = nir_src_for_ssa(evaluate_rvalue(block)); nir_builder_instr_insert(, >instr); break; } @@ -881,20 +869,8 @@ nir_visitor::visit(ir_call *ir) param = param->get_next(); ir_rvalue *offset = ((ir_instruction *)param)->as_rvalue(); - /* Check if we need the indirect version */ - ir_constant *const_offset = offset->as_constant(); - if (!const_offset) { -op = nir_intrinsic_load_ssbo_indirect; -ralloc_free(instr); -instr = nir_intrinsic_instr_create(shader, op); -instr->src[1] = nir_src_for_ssa(evaluate_rvalue(offset)); -instr->const_index[0] = 0; -dest = >dest; - } else { -instr->const_index[0] = const_offset->value.u[0]; - } - instr->src[0] = nir_src_for_ssa(evaluate_rvalue(block)); + instr->src[1] = nir_src_for_ssa(evaluate_rvalue(offset)); const glsl_type *type = ir->return_deref->var->type; instr->num_components = type->vector_elements; @@ -1174,20 +1150,11 @@ nir_visitor::visit(ir_expression *ir) /* Some special cases */ switch (ir->operation) { case ir_binop_ubo_load: { - ir_constant *const_index = ir->operands[1]->as_constant(); - - nir_intrinsic_op op; - if (const_index) { - op = nir_intrinsic_load_ubo; - } else { - op = nir_intrinsic_load_ubo_indirect; - } - nir_intrinsic_instr *load = nir_intrinsic_instr_create(this->shader, op); + nir_intrinsic_instr *load = + nir_intrinsic_instr_create(this->shader, nir_intrinsic_load_ubo); load->num_components = ir->type->vector_elements; - load->const_index[0] = const_index ? const_index->value.u[0] : 0; /* base offset */ load->src[0] = nir_src_for_ssa(evaluate_rvalue(ir->operands[0])); - if (!const_index) - load->src[1] = nir_src_for_ssa(evaluate_rvalue(ir->operands[1])); + load->src[1] = nir_src_for_ssa(evaluate_rvalue(ir->operands[1])); add_instr(>instr, ir->type->vector_elements); /* -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 6/9] i965/vec4: Get rid of load/store_foo_indirect
--- src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 7 +-- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 91 +++ 2 files changed, 40 insertions(+), 58 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp index e51ef4b..6f66978 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp @@ -60,19 +60,19 @@ vec4_gs_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) src_reg src; switch (instr->intrinsic) { - case nir_intrinsic_load_per_vertex_input_indirect: - assert(!"EmitNoIndirectInput should prevent this."); case nir_intrinsic_load_per_vertex_input: { /* The EmitNoIndirectInput flag guarantees our vertex index will * be constant. We should handle indirects someday. */ nir_const_value *vertex = nir_src_as_const_value(instr->src[0]); + nir_const_value *offset = nir_src_as_const_value(instr->src[1]); /* Make up a type...we have no way of knowing... */ const glsl_type *const type = glsl_type::ivec(instr->num_components); src = src_reg(ATTR, BRW_VARYING_SLOT_COUNT * vertex->u[0] + - instr->const_index[0], type); + instr->const_index[0] + offset->u[0], +type); dest = get_nir_dest(instr->dest, src.type); dest.writemask = brw_writemask_for_size(instr->num_components); emit(MOV(dest, src)); @@ -80,7 +80,6 @@ vec4_gs_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) } case nir_intrinsic_load_input: - case nir_intrinsic_load_input_indirect: unreachable("nir_lower_io should have produced per_vertex intrinsics"); case nir_intrinsic_emit_vertex_with_counter: { diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 1e03022..dd2d7e7 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp @@ -369,22 +369,17 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) dst_reg dest; src_reg src; - bool has_indirect = false; - switch (instr->intrinsic) { - case nir_intrinsic_load_input_indirect: - has_indirect = true; - /* fallthrough */ case nir_intrinsic_load_input: { - int offset = instr->const_index[0]; - src = src_reg(ATTR, offset, glsl_type::uvec4_type); + nir_const_value *const_offset = nir_src_as_const_value(instr->src[0]); + + /* We set EmitNoIndirectInput for VS */ + assert(const_offset); + + src = src_reg(ATTR, instr->const_index[0] + const_offset->u[0], +glsl_type::uvec4_type); - if (has_indirect) { - dest.reladdr = new(mem_ctx) src_reg(get_nir_src(instr->src[0], - BRW_REGISTER_TYPE_D, - 1)); - } dest = get_nir_dest(instr->dest, src.type); dest.writemask = brw_writemask_for_size(instr->num_components); @@ -392,11 +387,11 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) break; } - case nir_intrinsic_store_output_indirect: - unreachable("nir_lower_outputs_to_temporaries should prevent this"); - case nir_intrinsic_store_output: { - int varying = instr->const_index[0]; + nir_const_value *const_offset = nir_src_as_const_value(instr->src[1]); + assert(const_offset); + + int varying = instr->const_index[0] + const_offset->u[0]; src = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_F, instr->num_components); @@ -431,9 +426,6 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) break; } - case nir_intrinsic_store_ssbo_indirect: - has_indirect = true; - /* fallthrough */ case nir_intrinsic_store_ssbo: { assert(devinfo->gen >= 7); @@ -458,20 +450,19 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) } /* Offset */ - src_reg offset_reg = src_reg(this, glsl_type::uint_type); - unsigned const_offset_bytes = 0; - if (has_indirect) { - emit(MOV(dst_reg(offset_reg), get_nir_src(instr->src[2], 1))); + src_reg offset_reg; + nir_const_value *const_offset = nir_src_as_const_value(instr->src[2]); + if (const_offset) { + offset_reg = brw_imm_ud(const_offset->u[0]); } else { - const_offset_bytes = instr->const_index[0]; - emit(MOV(dst_reg(offset_reg), brw_imm_ud(const_offset_bytes))); + offset_reg = get_nir_src(instr->src[2], 1); } /* Value */ src_reg val_reg = get_nir_src(instr->src[0], 4); /* Writemask */ - unsigned write_mask = instr->const_index[1]; + unsigned write_mask = instr->const_index[0]; /* IvyBridge does not have a native SIMD4x2 untyped write
[Mesa-dev] [PATCH v2 9/9] vc4/nir: Use the new unified io intrinsics
Cc: Eric Anholt--- src/gallium/drivers/vc4/vc4_nir_lower_io.c | 16 ++- src/gallium/drivers/vc4/vc4_program.c | 31 -- 2 files changed, 23 insertions(+), 24 deletions(-) diff --git a/src/gallium/drivers/vc4/vc4_nir_lower_io.c b/src/gallium/drivers/vc4/vc4_nir_lower_io.c index e2581c8..ba05797 100644 --- a/src/gallium/drivers/vc4/vc4_nir_lower_io.c +++ b/src/gallium/drivers/vc4/vc4_nir_lower_io.c @@ -358,15 +358,12 @@ vc4_nir_lower_uniform(struct vc4_compile *c, nir_builder *b, /* Convert the base offset to bytes and add the component */ intr_comp->const_index[0] = (intr->const_index[0] * 16 + i * 4); -if (intr->intrinsic == nir_intrinsic_load_uniform_indirect) { -/* Convert the variable TGSI register index to a byte - * offset. - */ -intr_comp->src[0] = -nir_src_for_ssa(nir_ishl(b, - intr->src[0].ssa, - nir_imm_int(b, 4))); -} +/* Convert the offset to bytes. If it happens to be a + * constant, constant-folding will clean up the shift for us. + */ +intr_comp->src[0] = +nir_src_for_ssa(nir_ishl(b, intr->src[0].ssa, + nir_imm_int(b, 4))); dests[i] = _comp->dest.ssa; @@ -397,7 +394,6 @@ vc4_nir_lower_io_instr(struct vc4_compile *c, nir_builder *b, break; case nir_intrinsic_load_uniform: -case nir_intrinsic_load_uniform_indirect: case nir_intrinsic_load_user_clip_plane: vc4_nir_lower_uniform(c, b, intr); break; diff --git a/src/gallium/drivers/vc4/vc4_program.c b/src/gallium/drivers/vc4/vc4_program.c index 1a45bac..1423d6e 100644 --- a/src/gallium/drivers/vc4/vc4_program.c +++ b/src/gallium/drivers/vc4/vc4_program.c @@ -1433,7 +1433,7 @@ static void ntq_emit_intrinsic(struct vc4_compile *c, nir_intrinsic_instr *instr) { const nir_intrinsic_info *info = _intrinsic_infos[instr->intrinsic]; -int dword_offset; +nir_const_value *const_offset; struct qreg *dest = NULL; if (info->has_dest) { @@ -1443,23 +1443,26 @@ ntq_emit_intrinsic(struct vc4_compile *c, nir_intrinsic_instr *instr) switch (instr->intrinsic) { case nir_intrinsic_load_uniform: assert(instr->num_components == 1); -/* The offset is in bytes, but we need dwords */ -assert(instr->const_index[0] % 4 == 0); -dword_offset = instr->const_index[0] / 4; -if (dword_offset < VC4_NIR_STATE_UNIFORM_OFFSET) { -*dest = qir_uniform(c, QUNIFORM_UNIFORM, dword_offset); +const_offset = nir_src_as_const_value(instr->src[0]); +if (const_offset) { +unsigned offset = const_offset->u[0] + + instr->const_index[0]; +assert(offset % 4 == 0); +/* We need dwords */ +offset = offset / 4; +if (offset < VC4_NIR_STATE_UNIFORM_OFFSET) { +*dest = qir_uniform(c, QUNIFORM_UNIFORM, +offset); +} else { +*dest = qir_uniform(c, offset - + VC4_NIR_STATE_UNIFORM_OFFSET, +0); +} } else { -*dest = qir_uniform(c, dword_offset - -VC4_NIR_STATE_UNIFORM_OFFSET, -0); +*dest = indirect_uniform_load(c, instr); } break; -case nir_intrinsic_load_uniform_indirect: -*dest = indirect_uniform_load(c, instr); - -break; - case nir_intrinsic_load_user_clip_plane: *dest = qir_uniform(c, QUNIFORM_USER_CLIP_PLANE, instr->const_index[0]); -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 4/9] nir/lower_io: Get rid of load/store_foo_indirect
--- src/glsl/nir/nir.h | 2 +- src/glsl/nir/nir_lower_io.c | 111 +--- 2 files changed, 44 insertions(+), 69 deletions(-) diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index e161b70..2e72e66 100644 --- a/src/glsl/nir/nir.h +++ b/src/glsl/nir/nir.h @@ -1969,7 +1969,7 @@ void nir_assign_var_locations(struct exec_list *var_list, void nir_lower_io(nir_shader *shader, nir_variable_mode mode, int (*type_size)(const struct glsl_type *)); -nir_src *nir_get_io_indirect_src(nir_intrinsic_instr *instr); +nir_src *nir_get_io_offset_src(nir_intrinsic_instr *instr); nir_src *nir_get_io_vertex_index_src(nir_intrinsic_instr *instr); void nir_lower_vars_to_ssa(nir_shader *shader); diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c index f64ac69..a2723d5 100644 --- a/src/glsl/nir/nir_lower_io.c +++ b/src/glsl/nir/nir_lower_io.c @@ -86,10 +86,9 @@ is_per_vertex_output(struct lower_io_state *state, nir_variable *var) stage == MESA_SHADER_TESS_CTRL; } -static unsigned +static nir_ssa_def * get_io_offset(nir_builder *b, nir_deref_var *deref, nir_ssa_def **vertex_index, - nir_ssa_def **out_indirect, int (*type_size)(const struct glsl_type *)) { nir_deref *tail = >deref; @@ -109,8 +108,8 @@ get_io_offset(nir_builder *b, nir_deref_var *deref, *vertex_index = vtx; } - nir_ssa_def *indirect = NULL; - unsigned base_offset = 0; + /* Just emit code and let constant-folding go to town */ + nir_ssa_def *offset = nir_imm_int(b, 0); while (tail->child != NULL) { const struct glsl_type *parent_type = tail->type; @@ -120,55 +119,46 @@ get_io_offset(nir_builder *b, nir_deref_var *deref, nir_deref_array *deref_array = nir_deref_as_array(tail); unsigned size = type_size(tail->type); - base_offset += size * deref_array->base_offset; + offset = nir_iadd(b, offset, + nir_imm_int(b, size * deref_array->base_offset)); if (deref_array->deref_array_type == nir_deref_array_type_indirect) { nir_ssa_def *mul = nir_imul(b, nir_imm_int(b, size), nir_ssa_for_src(b, deref_array->indirect, 1)); -indirect = indirect ? nir_iadd(b, indirect, mul) : mul; +offset = nir_iadd(b, offset, mul); } } else if (tail->deref_type == nir_deref_type_struct) { nir_deref_struct *deref_struct = nir_deref_as_struct(tail); + unsigned field_offset = 0; for (unsigned i = 0; i < deref_struct->index; i++) { -base_offset += type_size(glsl_get_struct_field(parent_type, i)); +field_offset += type_size(glsl_get_struct_field(parent_type, i)); } + offset = nir_iadd(b, offset, nir_imm_int(b, field_offset)); } } - *out_indirect = indirect; - return base_offset; + return offset; } static nir_intrinsic_op load_op(struct lower_io_state *state, -nir_variable_mode mode, bool per_vertex, bool has_indirect) +nir_variable_mode mode, bool per_vertex) { nir_intrinsic_op op; switch (mode) { case nir_var_shader_in: - if (per_vertex) { - op = has_indirect ? nir_intrinsic_load_per_vertex_input_indirect : - nir_intrinsic_load_per_vertex_input; - } else { - op = has_indirect ? nir_intrinsic_load_input_indirect : - nir_intrinsic_load_input; - } + op = per_vertex ? nir_intrinsic_load_per_vertex_input : +nir_intrinsic_load_input; break; case nir_var_shader_out: - if (per_vertex) { - op = has_indirect ? nir_intrinsic_load_per_vertex_output_indirect : - nir_intrinsic_load_per_vertex_output; - } else { - op = has_indirect ? nir_intrinsic_load_output_indirect : - nir_intrinsic_load_output; - } + op = per_vertex ? nir_intrinsic_load_per_vertex_output : +nir_intrinsic_load_output; break; case nir_var_uniform: - op = has_indirect ? nir_intrinsic_load_uniform_indirect : - nir_intrinsic_load_uniform; + op = nir_intrinsic_load_uniform; break; default: unreachable("Unknown variable mode"); @@ -211,32 +201,25 @@ nir_lower_io_block(nir_block *block, void *void_state) is_per_vertex_input(state, intrin->variables[0]->var) || is_per_vertex_output(state, intrin->variables[0]->var); - nir_ssa_def *indirect; + nir_ssa_def *offset; nir_ssa_def *vertex_index; - unsigned offset = get_io_offset(b, intrin->variables[0], - per_vertex ? _index : NULL, - , state->type_size); +
[Mesa-dev] [PATCH v2 8/9] ir3/nir: Use the new unified io intrinsics
Cc: Rob Clark--- .../drivers/freedreno/ir3/ir3_compiler_nir.c | 79 +- 1 file changed, 47 insertions(+), 32 deletions(-) diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c index 8617704..99b4264 100644 --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c @@ -1215,6 +1215,7 @@ emit_intrinsic_load_ubo(struct ir3_compile *ctx, nir_intrinsic_instr *intr, { struct ir3_block *b = ctx->block; struct ir3_instruction *addr, *src0, *src1; + nir_const_value *const_offset; /* UBO addresses are the first driver params: */ unsigned ubo = regid(ctx->so->first_driver_param + IR3_UBOS_OFF, 0); unsigned off = intr->const_index[0]; @@ -1228,7 +1229,10 @@ emit_intrinsic_load_ubo(struct ir3_compile *ctx, nir_intrinsic_instr *intr, addr = create_uniform_indirect(ctx, ubo, get_addr(ctx, src0)); } - if (intr->intrinsic == nir_intrinsic_load_ubo_indirect) { + const_offset = nir_src_as_const_value(intr->src[1]); + if (const_offset) { + off += const_offset->u[0]; + } else { /* For load_ubo_indirect, second src is indirect offset: */ src1 = get_src(ctx, >src[1])[0]; @@ -1391,6 +1395,7 @@ emit_intrinisic(struct ir3_compile *ctx, nir_intrinsic_instr *intr) struct ir3_instruction **dst, **src; struct ir3_block *b = ctx->block; unsigned idx = intr->const_index[0]; + nir_const_value *const_offset; if (info->has_dest) { dst = get_dst(ctx, >dest, intr->num_components); @@ -1400,43 +1405,49 @@ emit_intrinisic(struct ir3_compile *ctx, nir_intrinsic_instr *intr) switch (intr->intrinsic) { case nir_intrinsic_load_uniform: - for (int i = 0; i < intr->num_components; i++) { - unsigned n = idx * 4 + i; - dst[i] = create_uniform(ctx, n); - } - break; - case nir_intrinsic_load_uniform_indirect: - src = get_src(ctx, >src[0]); - for (int i = 0; i < intr->num_components; i++) { - unsigned n = idx * 4 + i; - dst[i] = create_uniform_indirect(ctx, n, - get_addr(ctx, src[0])); + const_offset = nir_src_as_const_value(intr->src[0]); + if (const_offset) { + idx += const_offset->u[0]; + for (int i = 0; i < intr->num_components; i++) { + unsigned n = idx * 4 + i; + dst[i] = create_uniform(ctx, n); + } + } else { + src = get_src(ctx, >src[0]); + for (int i = 0; i < intr->num_components; i++) { + unsigned n = idx * 4 + i; + dst[i] = create_uniform_indirect(ctx, n, + get_addr(ctx, src[0])); + } + /* NOTE: if relative addressing is used, we set +* constlen in the compiler (to worst-case value) +* since we don't know in the assembler what the max +* addr reg value can be: +*/ + ctx->so->constlen = ctx->s->num_uniforms; } - /* NOTE: if relative addressing is used, we set constlen in -* the compiler (to worst-case value) since we don't know in -* the assembler what the max addr reg value can be: -*/ - ctx->so->constlen = ctx->s->num_uniforms; break; case nir_intrinsic_load_ubo: - case nir_intrinsic_load_ubo_indirect: emit_intrinsic_load_ubo(ctx, intr, dst); break; case nir_intrinsic_load_input: - for (int i = 0; i < intr->num_components; i++) { - unsigned n = idx * 4 + i; - dst[i] = ctx->ir->inputs[n]; - } - break; - case nir_intrinsic_load_input_indirect: - src = get_src(ctx, >src[0]); - struct ir3_instruction *collect = - create_collect(b, ctx->ir->inputs, ctx->ir->ninputs); - struct ir3_instruction *addr = get_addr(ctx, src[0]); - for (int i = 0; i < intr->num_components; i++) { - unsigned n = idx * 4 + i; - dst[i] = create_indirect_load(ctx, ctx->ir->ninputs, - n, addr, collect); + const_offset = nir_src_as_const_value(intr->src[0]); + if
[Mesa-dev] [PATCH v2 5/9] i965/fs: Get rid of load/store_foo_indirect
--- src/mesa/drivers/dri/i965/brw_fs.h | 2 +- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 153 ++- src/mesa/drivers/dri/i965/brw_nir.c | 45 +++-- 3 files changed, 105 insertions(+), 95 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index bca4589..b55589f 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -278,7 +278,7 @@ public: unsigned stream_id); void emit_gs_thread_end(); void emit_gs_input_load(const fs_reg , const nir_src _src, - const fs_reg _offset, unsigned imm_offset, + unsigned base_offset, const nir_src _src, unsigned num_components); void emit_cs_terminate(); fs_reg *emit_cs_local_invocation_id_setup(); diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 9b50e4e..a7a565b 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -1621,29 +1621,31 @@ fs_visitor::emit_gs_vertex(const nir_src _count_nir_src, void fs_visitor::emit_gs_input_load(const fs_reg , const nir_src _src, - const fs_reg _offset, - unsigned imm_offset, + unsigned base_offset, + const nir_src _src, unsigned num_components) { struct brw_gs_prog_data *gs_prog_data = (struct brw_gs_prog_data *) prog_data; + nir_const_value *vertex_const = nir_src_as_const_value(vertex_src); + nir_const_value *offset_const = nir_src_as_const_value(offset_src); + const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8; + /* Offset 0 is the VUE header, which contains VARYING_SLOT_LAYER [.y], * VARYING_SLOT_VIEWPORT [.z], and VARYING_SLOT_PSIZ [.w]. Only * gl_PointSize is available as a GS input, however, so it must be that. */ - const bool is_point_size = - indirect_offset.file == BAD_FILE && imm_offset == 0; - - nir_const_value *vertex_const = nir_src_as_const_value(vertex_src); - const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8; + const bool is_point_size = (base_offset == 0); - if (indirect_offset.file == BAD_FILE && vertex_const != NULL && - 4 * imm_offset < push_reg_count) { - imm_offset = 4 * imm_offset + vertex_const->u[0] * push_reg_count; + if (offset_const != NULL && vertex_const != NULL && + 4 * offset_const->u[0] < push_reg_count) { + int imm_offset = (base_offset + offset_const->u[0]) * 4 + + vertex_const->u[0] * push_reg_count; /* This input was pushed into registers. */ if (is_point_size) { /* gl_PointSize comes in .w */ - bld.MOV(dst, fs_reg(ATTR, imm_offset + 3, dst.type)); + assert(imm_offset == 0); + bld.MOV(dst, fs_reg(ATTR, 3, dst.type)); } else { for (unsigned i = 0; i < num_components; i++) { bld.MOV(offset(dst, bld, i), @@ -1701,21 +1703,21 @@ fs_visitor::emit_gs_input_load(const fs_reg , } fs_inst *inst; - if (indirect_offset.file == BAD_FILE) { + if (offset_const) { /* Constant indexing - use global offset. */ inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst, icp_handle); - inst->offset = imm_offset; + inst->offset = base_offset + offset_const->u[0]; inst->base_mrf = -1; inst->mlen = 1; inst->regs_written = num_components; } else { /* Indirect indexing - use per-slot offsets as well. */ - const fs_reg srcs[] = { icp_handle, indirect_offset }; + const fs_reg srcs[] = { icp_handle, get_nir_src(offset_src) }; fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, 2); bld.LOAD_PAYLOAD(payload, srcs, ARRAY_SIZE(srcs), 0); inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dst, payload); - inst->offset = imm_offset; + inst->offset = base_offset; inst->base_mrf = -1; inst->mlen = 2; inst->regs_written = num_components; @@ -1781,17 +1783,12 @@ fs_visitor::nir_emit_gs_intrinsic(const fs_builder , retype(fs_reg(brw_vec8_grf(2, 0)), BRW_REGISTER_TYPE_UD)); break; - case nir_intrinsic_load_input_indirect: case nir_intrinsic_load_input: unreachable("load_input intrinsics are invalid for the GS stage"); - case nir_intrinsic_load_per_vertex_input_indirect: - indirect_offset = retype(get_nir_src(instr->src[1]), BRW_REGISTER_TYPE_D); - /* fallthrough */ case nir_intrinsic_load_per_vertex_input: - emit_gs_input_load(dest, instr->src[0], - indirect_offset, instr->const_index[0], -
[Mesa-dev] [PATCH v2 0/9] nir: Remove the *_indirect form of input/output
This is a second crack at my attempt to get rid of the *_indirect for of the input/output intrinsics. The primary change is that we decided to keep the const_index[0] base offset for all of the load/store intrinsics that operate on opaque memory. Along with this, the base offset is now much better defined: it is always the driver_location of the actual variable. If, for instance, you access the second column of a mat4, const_index[0] is the location of the whole mat4 and the offset source is load_const(2). Thanks to keeping the base offset arround, this series is a lot more straightforward than the first version. I think the chances are substantially higher that I didn't break vc4 or ir3. Cc: Eric AnholtCc: Rob Clark Jason Ekstrand (9): vc4: Do all uniform loads with byte offsets nir: Get rid of *_indirect variants of input/output load/store intrinsics nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect nir/lower_io: Get rid of load/store_foo_indirect i965/fs: Get rid of load/store_foo_indirect i965/vec4: Get rid of load/store_foo_indirect tgsi_to_nir: Get rid of load/store_foo_indirect ir3/nir: Use the new unified io intrinsics vc4/nir: Use the new unified io intrinsics src/gallium/auxiliary/nir/tgsi_to_nir.c| 51 +++ .../drivers/freedreno/ir3/ir3_compiler_nir.c | 79 ++- src/gallium/drivers/vc4/vc4_nir_lower_io.c | 33 ++--- src/gallium/drivers/vc4/vc4_program.c | 28 ++-- src/glsl/nir/glsl_to_nir.cpp | 47 +-- src/glsl/nir/nir.h | 2 +- src/glsl/nir/nir_intrinsics.h | 81 +-- src/glsl/nir/nir_lower_io.c| 111 ++- src/glsl/nir/nir_lower_phis_to_scalar.c| 4 - src/glsl/nir/nir_print.c | 9 +- src/mesa/drivers/dri/i965/brw_fs.h | 2 +- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 153 ++--- src/mesa/drivers/dri/i965/brw_nir.c| 45 -- src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 7 +- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 91 +--- 15 files changed, 340 insertions(+), 403 deletions(-) -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 2/9] nir: Get rid of *_indirect variants of input/output load/store intrinsics
There is some special-casing needed in a competent back-end. However, they can do their special-casing easily enough based on whether or not the offset is a constant. In the mean time, having the *_indirect variants adds special cases a number of places where they don't need to be and, in general, only complicates things. To complicate matters, NIR had no way to convdert an indirect load/store to a direct one in the case that the indirect was a constant so we would still not really get what the back-ends wanted. The best solution seems to be to get rid of the *_indirect variants entirely. --- src/glsl/nir/nir_intrinsics.h | 81 + src/glsl/nir/nir_lower_phis_to_scalar.c | 4 -- src/glsl/nir/nir_print.c| 9 +--- 3 files changed, 42 insertions(+), 52 deletions(-) diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h index b2565c5..63df21e 100644 --- a/src/glsl/nir/nir_intrinsics.h +++ b/src/glsl/nir/nir_intrinsics.h @@ -228,54 +228,55 @@ SYSTEM_VALUE(num_work_groups, 3, 0) SYSTEM_VALUE(helper_invocation, 1, 0) /* - * The format of the indices depends on the type of the load. For uniforms, - * the first index is the base address and the second index is an offset that - * should be added to the base address. (This way you can determine in the - * back-end which variable is being accessed even in an array.) For inputs, - * the one and only index corresponds to the attribute slot. UBO loads also - * have a single index which is the base address to load from. + * Load operations pull data from some piece of GPU memory. All load + * operations operate in terms of offsets into some piece of theoretical + * memory. Loads from externally visible memory (UBO and SSBO) simply take a + * byte offset as a source. Loads from opaque memory (uniforms, inputs, etc.) + * take a base+offset pair where the base (const_index[0]) gives the location + * of the start of the variable being loaded and and the offset source is a + * offset into that variable. * - * UBO loads have a (possibly constant) source which is the UBO buffer index. - * For each type of load, the _indirect variant has one additional source - * (the second in the case of UBO's) that is the is an indirect to be added to - * the constant address or base offset to compute the final offset. + * Some load operations such as UBO/SSBO load and per_vertex loads take an + * additional source to specify which UBO/SSBO/vertex to load from. * - * For vector backends, the address is in terms of one vec4, and so each array - * element is +4 scalar components from the previous array element. For scalar - * backends, the address is in terms of a single 4-byte float/int and arrays - * elements begin immediately after the previous array element. + * The exact address type depends on the lowering pass that generates the + * load/store intrinsics. Typically, this is vec4 units for things such as + * varying slots and float units for fragment shader inputs. UBO and SSBO + * offsets are always in bytes. */ -#define LOAD(name, extra_srcs, indices, flags) \ - INTRINSIC(load_##name, extra_srcs, ARR(1), true, 0, 0, indices, flags) \ - INTRINSIC(load_##name##_indirect, extra_srcs + 1, ARR(1, 1), \ - true, 0, 0, indices, flags) +#define LOAD(name, srcs, indices, flags) \ + INTRINSIC(load_##name, srcs, ARR(1, 1, 1, 1), true, 0, 0, indices, flags) -LOAD(uniform, 0, 2, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) -LOAD(ubo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) -LOAD(input, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) -LOAD(per_vertex_input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) -LOAD(ssbo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE) -LOAD(output, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE) -LOAD(per_vertex_output, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE) +/* src[] = { offset }. const_index[] = { base } */ +LOAD(uniform, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) +/* src[] = { buffer_index, offset }. No const_index */ +LOAD(ubo, 2, 0, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) +/* src[] = { offset }. const_index[0] = { base } */ +LOAD(input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) +/* src[] = { vertex, offset }. const_index[] = { base } */ +LOAD(per_vertex_input, 2, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) +/* src[] = { buffer_index, offset }. No const_index */ +LOAD(ssbo, 2, 0, NIR_INTRINSIC_CAN_ELIMINATE) +/* src[] = { offset }. const_index[] = { base } */ +LOAD(output, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE) +/* src[] = { vertex, offset }. const_index[] = { base } */ +LOAD(per_vertex_output, 2, 1, NIR_INTRINSIC_CAN_ELIMINATE) /* - * Stores work the same way as loads, except now the first register input is - * the value or array to store and the optional second input is the indirect - * offset. SSBO stores are similar, but they accept
[Mesa-dev] [PATCH v2 7/9] tgsi_to_nir: Get rid of load/store_foo_indirect
Cc: Eric AnholtCc: Rob Clark --- src/gallium/auxiliary/nir/tgsi_to_nir.c | 51 + 1 file changed, 27 insertions(+), 24 deletions(-) diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c b/src/gallium/auxiliary/nir/tgsi_to_nir.c index 5fef542..480c0b9 100644 --- a/src/gallium/auxiliary/nir/tgsi_to_nir.c +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c @@ -468,7 +468,7 @@ ttn_emit_immediate(struct ttn_compile *c) nir_builder_instr_insert(b, _const->instr); } -static nir_src +static nir_ssa_def * ttn_src_for_indirect(struct ttn_compile *c, struct tgsi_ind_register *indirect); /* generate either a constant or indirect deref chain for accessing an @@ -487,7 +487,7 @@ ttn_array_deref(struct ttn_compile *c, nir_intrinsic_instr *instr, if (indirect) { arr->deref_array_type = nir_deref_array_type_indirect; - arr->indirect = ttn_src_for_indirect(c, indirect); + arr->indirect = nir_src_for_ssa(ttn_src_for_indirect(c, indirect)); } else { arr->deref_array_type = nir_deref_array_type_direct; } @@ -586,19 +586,14 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index, switch (file) { case TGSI_FILE_INPUT: - op = indirect ? nir_intrinsic_load_input_indirect : - nir_intrinsic_load_input; + op = nir_intrinsic_load_input; assert(!dim); break; case TGSI_FILE_CONSTANT: if (dim) { -op = indirect ? nir_intrinsic_load_ubo_indirect : -nir_intrinsic_load_ubo; -/* convert index from vec4 to byte: */ -index *= 16; +op = nir_intrinsic_load_ubo; } else { -op = indirect ? nir_intrinsic_load_uniform_indirect : -nir_intrinsic_load_uniform; +op = nir_intrinsic_load_uniform; } break; default: @@ -609,7 +604,6 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index, load = nir_intrinsic_instr_create(b->shader, op); load->num_components = 4; - load->const_index[0] = index; if (dim) { if (dimind) { load->src[srcn] = @@ -622,17 +616,25 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index, } srcn++; } - if (indirect) { - load->src[srcn] = ttn_src_for_indirect(c, indirect); - if (dim) { -assert(load->src[srcn].is_ssa); -/* we also need to covert vec4 to byte here too: */ -load->src[srcn] = - nir_src_for_ssa(nir_ishl(b, load->src[srcn].ssa, -nir_imm_int(b, 4))); + + nir_ssa_def *offset; + if (dim) { + /* UBO loads don't have a const_index[0] base offset. */ + offset = nir_imm_int(b, index); + if (indirect) { +offset = nir_iadd(b, offset, ttn_src_for_indirect(c, indirect)); +offset = nir_ishl(b, offset, nir_imm_int(b, 4)); + } + } else { + load->const_index[0] = index; + if (indirect) { +offset = ttn_src_for_indirect(c, indirect); + } else { +offset = nir_imm_int(b, 0); } - srcn++; } + load->src[srcn++] = nir_src_for_ssa(offset); + nir_ssa_dest_init(>instr, >dest, 4, NULL); nir_builder_instr_insert(b, >instr); @@ -648,7 +650,7 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index, return src; } -static nir_src +static nir_ssa_def * ttn_src_for_indirect(struct ttn_compile *c, struct tgsi_ind_register *indirect) { nir_builder *b = >build; @@ -660,7 +662,7 @@ ttn_src_for_indirect(struct ttn_compile *c, struct tgsi_ind_register *indirect) indirect->File, indirect->Index, NULL, NULL, NULL); - return nir_src_for_ssa(nir_imov_alu(b, src, 1)); + return nir_imov_alu(b, src, 1); } static nir_alu_dest @@ -729,7 +731,7 @@ ttn_get_dest(struct ttn_compile *c, struct tgsi_full_dst_register *tgsi_fdst) if (tgsi_dst->Indirect && (tgsi_dst->File != TGSI_FILE_TEMPORARY)) { nir_src *indirect = ralloc(c->build.shader, nir_src); - *indirect = ttn_src_for_indirect(c, _fdst->Indirect); + *indirect = nir_src_for_ssa(ttn_src_for_indirect(c, _fdst->Indirect)); dest.dest.reg.indirect = indirect; } @@ -1927,9 +1929,10 @@ ttn_add_output_stores(struct ttn_compile *c) nir_intrinsic_instr_create(b->shader, nir_intrinsic_store_output); unsigned loc = var->data.driver_location + i; store->num_components = 4; - store->const_index[0] = loc; store->src[0].reg.reg = c->output_regs[loc].reg;
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On 04/12/15 23:08, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 6:00 PM, Jose Fonsecawrote: On 04/12/15 22:45, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 5:40 PM, Jose Fonseca wrote: On 04/12/15 20:04, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul wrote: On 12/04/2015 12:46 PM, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul wrote: + if (templ->poly_smooth && svga->debug.callback.debug_message) { + /* note: we always need a % something in the message string */ Why? Did I mess something up? If I write it without the dummy %s I get: In file included from ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0, from svga_pipe_rasterizer.c:29: svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state': ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected expression before ')' token fmt, __VA_ARGS__); \ ^ svga_pipe_rasterizer.c:357:7: note: in expansion of macro 'pipe_debug_message' pipe_debug_message(>debug.callback, CONFORMANCE, ^ H... so that'd be a "yes" to me messing it up :) I'll have a look to see if there's something simple I can do here. Off-hand it feels like it should be #__VA_ARGS__? Been a while since I looked at the specifics of all this. BTW, right now this callback is only set for debug contexts. Perhaps that was not an ideal decision... just an FYI though. apitrace creates a debug context so that's the main thing for me right now. But it might be nice if setting MESA_DEBUG would cause the debug/info messages to be printed to stderr for those apps that don't use GL_ARB_debug_output. Yeah, we should probably also throw in Enable/Disable hooks into st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever the enum is). BTW, note that apitrace cuts messages off after 100 of them which is pretty annoying -- I opened an issue at https://github.com/apitrace/apitrace/issues/395 a while back. Not just "100", but 100 *per ID*. Which is very different. Why is it so annoying? A patch for an option is welcome anyway BTW. Right, but but an ID is a message. So let's say the message is "falling back to XYZ", why would you only want to know the first 100 times? Or let's say the message is "shader compiles to XYZ instructions" -- again, why the first 100 times? In fact, *given* that you enable these messages, I don't see why it would ever be desirable to limit them. Isn't it obvious? OpenGL applications are repetitive by nature. You're bound to have the same error over and over again. Futhermore, some IHVs have messages so frequent that completely overwel everything else. So, when you take a trace of an arbitrary app, and replay e.g., on NVIDIA, you'll get lots and performance hints. Nevertheless, you'll want to be able to spot rare and more important issues among all that. limiting to a fix number, e.g., 100, means that after a while, they all silence, so that any new issue will stand out. Soemthing that happens all the time is invariably interesting. I doubt we're going to change each others' mind on this. I'm not trying to change your mind. You said you couldn't see any reason to ever limit, and I gave you an example otherwise. > I've contemplated throwing prints into mesa, but haven't been sufficiently motivated to worry about it. Like you say, GL applications tend to be repetitive, but they tend to have a "startup" phase, and a "do repetitive thing" phase. If the message gets triggered > 100x during the startup phase, I don't get to see it when I'm actually interested in the results. Yes, I agree there might be situations where you might want to see all. That's why I already agreed to have the apitrace option to cofigure an arbitraryly/infinite limit. But it's OK, with Matt's run from shader-db, it all works, which was the primary motivation for piping this thing through to gallium. Super. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On 04/12/15 23:20, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 6:14 PM, Jose Fonsecawrote: On 04/12/15 22:40, Jose Fonseca wrote: On 04/12/15 20:04, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul wrote: On 12/04/2015 12:46 PM, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul wrote: + if (templ->poly_smooth && svga->debug.callback.debug_message) { + /* note: we always need a % something in the message string */ Why? Did I mess something up? If I write it without the dummy %s I get: In file included from ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0, from svga_pipe_rasterizer.c:29: svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state': ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected expression before ')' token fmt, __VA_ARGS__); \ ^ svga_pipe_rasterizer.c:357:7: note: in expansion of macro 'pipe_debug_message' pipe_debug_message(>debug.callback, CONFORMANCE, ^ H... so that'd be a "yes" to me messing it up :) I'll have a look to see if there's something simple I can do here. Off-hand it feels like it should be #__VA_ARGS__? Been a while since I looked at the specifics of all this. BTW, right now this callback is only set for debug contexts. Perhaps that was not an ideal decision... just an FYI though. apitrace creates a debug context so that's the main thing for me right now. But it might be nice if setting MESA_DEBUG would cause the debug/info messages to be printed to stderr for those apps that don't use GL_ARB_debug_output. Yeah, we should probably also throw in Enable/Disable hooks into st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever the enum is). BTW, note that apitrace cuts messages off after 100 of them which is pretty annoying -- I opened an issue at https://github.com/apitrace/apitrace/issues/395 a while back. Not just "100", but 100 *per ID*. Which is very different. Why is it so annoying? A patch for an option is welcome anyway BTW. I noticed that pipe_debug_message() doesn't take a message ID. It's a little clever... this is why pipe_debug_message is a macro. Every separate callsite gets its own id. It's initialized to 0, and passed in as a pointer... the mesa function actually assigns it a new id if the passed in id is 0. It's a bit racey and definitely a weird API, but... it was easy :) Ah, I see it now. It's not reproducible, but at least different message sites have different id. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On 04/12/15 22:40, Jose Fonseca wrote: On 04/12/15 20:04, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 3:01 PM, Brian Paulwrote: On 12/04/2015 12:46 PM, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul wrote: + if (templ->poly_smooth && svga->debug.callback.debug_message) { + /* note: we always need a % something in the message string */ Why? Did I mess something up? If I write it without the dummy %s I get: In file included from ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0, from svga_pipe_rasterizer.c:29: svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state': ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected expression before ')' token fmt, __VA_ARGS__); \ ^ svga_pipe_rasterizer.c:357:7: note: in expansion of macro 'pipe_debug_message' pipe_debug_message(>debug.callback, CONFORMANCE, ^ H... so that'd be a "yes" to me messing it up :) I'll have a look to see if there's something simple I can do here. Off-hand it feels like it should be #__VA_ARGS__? Been a while since I looked at the specifics of all this. BTW, right now this callback is only set for debug contexts. Perhaps that was not an ideal decision... just an FYI though. apitrace creates a debug context so that's the main thing for me right now. But it might be nice if setting MESA_DEBUG would cause the debug/info messages to be printed to stderr for those apps that don't use GL_ARB_debug_output. Yeah, we should probably also throw in Enable/Disable hooks into st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever the enum is). BTW, note that apitrace cuts messages off after 100 of them which is pretty annoying -- I opened an issue at https://github.com/apitrace/apitrace/issues/395 a while back. Not just "100", but 100 *per ID*. Which is very different. Why is it so annoying? A patch for an option is welcome anyway BTW. I noticed that pipe_debug_message() doesn't take a message ID. It looks like the same id is passed for every context: /** * Handles generating a GL_ARB_debug_output message ID generated by the GL or * GLSL compiler. * * The GL API has this "ID" mechanism, where the intention is to allow a * client to filter in/out messages based on source, type, and ID. Of course, * building a giant enum list of all debug output messages that Mesa might * generate is ridiculous, so instead we have our caller pass us a pointer to * static storage where the ID should get stored. This ID will be shared * across all contexts for that message (which seems like a desirable * property, even if it's not expected by the spec), but note that it won't be * the same between executions if messages aren't generated in the same order. */ This is far from ideal. I think at east Mesa should use ID namespaces: each Mesa driver / pipe driver should be assigned an unique driver ID (high word), then use low word internally. For the low word, it doesn't need to be an enum per message, but it should be more fined grained than one ID per context. If nothing else, you could use the address or hash of the printf-like format string to make an id. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On Fri, Dec 4, 2015 at 6:14 PM, Jose Fonsecawrote: > On 04/12/15 22:40, Jose Fonseca wrote: >> >> On 04/12/15 20:04, Ilia Mirkin wrote: >>> >>> On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul wrote: On 12/04/2015 12:46 PM, Ilia Mirkin wrote: > > > On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul wrote: >> >> >> + if (templ->poly_smooth && svga->debug.callback.debug_message) { >> + /* note: we always need a % something in the message string */ > > > > Why? Did I mess something up? If I write it without the dummy %s I get: In file included from ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0, from svga_pipe_rasterizer.c:29: svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state': ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected expression before ')' token fmt, __VA_ARGS__); \ ^ svga_pipe_rasterizer.c:357:7: note: in expansion of macro 'pipe_debug_message' pipe_debug_message(>debug.callback, CONFORMANCE, ^ >>> >>> H... so that'd be a "yes" to me messing it up :) I'll have a look >>> to see if there's something simple I can do here. Off-hand it feels >>> like it should be #__VA_ARGS__? Been a while since I looked at the >>> specifics of all this. >>> > > BTW, right now this callback is only set for debug contexts. Perhaps > that was not an ideal decision... just an FYI though. apitrace creates a debug context so that's the main thing for me right now. But it might be nice if setting MESA_DEBUG would cause the debug/info messages to be printed to stderr for those apps that don't use GL_ARB_debug_output. >>> >>> >>> Yeah, we should probably also throw in Enable/Disable hooks into >>> st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever >>> the enum is). >>> >>> BTW, note that apitrace cuts messages off after 100 of them which is >>> pretty annoying -- I opened an issue at >>> https://github.com/apitrace/apitrace/issues/395 a while back. >> >> >> Not just "100", but 100 *per ID*. >> >> Which is very different. >> >> Why is it so annoying? A patch for an option is welcome anyway BTW. >> > > I noticed that pipe_debug_message() doesn't take a message ID. It's a little clever... this is why pipe_debug_message is a macro. Every separate callsite gets its own id. It's initialized to 0, and passed in as a pointer... the mesa function actually assigns it a new id if the passed in id is 0. It's a bit racey and definitely a weird API, but... it was easy :) -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On 04/12/15 23:27, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 6:25 PM, Jose Fonsecawrote: On 04/12/15 19:42, Brian Paul wrote: Use the new debug callback hook to report conformance, performance and fallbacks to the state tracker. The state tracker, in turn can report this issues to the user via the GL_ARB_debug_output extension. More issues can be reported in the future; this is just a start. --- src/gallium/drivers/svga/svga_context.h | 3 +++ src/gallium/drivers/svga/svga_draw_arrays.c | 7 +++ src/gallium/drivers/svga/svga_pipe_blend.c | 11 +++ src/gallium/drivers/svga/svga_pipe_misc.c| 17 + src/gallium/drivers/svga/svga_pipe_rasterizer.c | 6 ++ src/gallium/drivers/svga/svga_state_need_swtnl.c | 23 +++ 6 files changed, 67 insertions(+) diff --git a/src/gallium/drivers/svga/svga_context.h b/src/gallium/drivers/svga/svga_context.h index 6a4f9d8..c4284cc 100644 --- a/src/gallium/drivers/svga/svga_context.h +++ b/src/gallium/drivers/svga/svga_context.h @@ -392,6 +392,9 @@ struct svga_context boolean no_line_width; boolean force_hw_line_stipple; + + /** To report perf/conformance/etc issues to the state tracker */ + struct pipe_debug_callback callback; } debug; struct { diff --git a/src/gallium/drivers/svga/svga_draw_arrays.c b/src/gallium/drivers/svga/svga_draw_arrays.c index acb2e95..f6956b5 100644 --- a/src/gallium/drivers/svga/svga_draw_arrays.c +++ b/src/gallium/drivers/svga/svga_draw_arrays.c @@ -26,6 +26,7 @@ #include "svga_cmd.h" #include "util/u_inlines.h" +#include "util/u_prim.h" #include "indices/u_indices.h" #include "svga_hw_reg.h" @@ -277,6 +278,12 @@ svga_hwtnl_draw_arrays(struct svga_hwtnl *hwtnl, if (ret != PIPE_OK) goto done; + if (svga->debug.callback.debug_message) { + pipe_debug_message(>debug.callback, PERF_INFO, +"generating temporary index buffer for drawing %s", +u_prim_name(prim)); + } + ret = svga_hwtnl_simple_draw_range_elements(hwtnl, gen_buf, gen_size, diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c b/src/gallium/drivers/svga/svga_pipe_blend.c index 0c9d612..5538193 100644 --- a/src/gallium/drivers/svga/svga_pipe_blend.c +++ b/src/gallium/drivers/svga/svga_pipe_blend.c @@ -243,6 +243,17 @@ svga_create_blend_state(struct pipe_context *pipe, blend->rt[i].srcblend_alpha = blend->rt[i].srcblend; blend->rt[i].dstblend_alpha = blend->rt[i].dstblend; blend->rt[i].blendeq_alpha = blend->rt[i].blendeq; + + if (svga->debug.callback.debug_message) { +if (templ->logicop_func == PIPE_LOGICOP_XOR) { + pipe_debug_message(>debug.callback, CONFORMANCE, + "XOR logicop mode has limited support%s", ""); +} +else if (templ->logicop_func != PIPE_LOGICOP_COPY) { + pipe_debug_message(>debug.callback, CONFORMANCE, + "general logicops are not supported%s", ""); +} + } } else { /* Note: the vgpu10 device does not yet support independent diff --git a/src/gallium/drivers/svga/svga_pipe_misc.c b/src/gallium/drivers/svga/svga_pipe_misc.c index c8020da..af9356d 100644 --- a/src/gallium/drivers/svga/svga_pipe_misc.c +++ b/src/gallium/drivers/svga/svga_pipe_misc.c @@ -244,6 +244,22 @@ static void svga_set_viewport_states( struct pipe_context *pipe, } +/** + * Called by state tracker to specify a callback function the driver + * can use to report info back to the state tracker. + */ +static void +svga_set_debug_callback(struct pipe_context *pipe, +const struct pipe_debug_callback *cb) +{ + struct svga_context *svga = svga_context(pipe); + + if (cb) + svga->debug.callback = *cb; + else + memset(>debug.callback, 0, sizeof(svga->debug.callback)); +} + void svga_init_misc_functions( struct svga_context *svga ) { @@ -252,6 +268,7 @@ void svga_init_misc_functions( struct svga_context *svga ) svga->pipe.set_framebuffer_state = svga_set_framebuffer_state; svga->pipe.set_clip_state = svga_set_clip_state; svga->pipe.set_viewport_states = svga_set_viewport_states; + svga->pipe.set_debug_callback = svga_set_debug_callback; } diff --git a/src/gallium/drivers/svga/svga_pipe_rasterizer.c b/src/gallium/drivers/svga/svga_pipe_rasterizer.c index 6310b7a..c6215e6 100644 --- a/src/gallium/drivers/svga/svga_pipe_rasterizer.c +++ b/src/gallium/drivers/svga/svga_pipe_rasterizer.c @@ -352,6 +352,12 @@ svga_create_rasterizer_state(struct pipe_context *pipe, define_rasterizer_object(svga, rast); } + if (templ->poly_smooth
[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU
https://bugs.freedesktop.org/show_bug.cgi?id=79688 --- Comment #24 from Tobias Klausmann--- (In reply to poma from comment #22) > (In reply to Tobias Klausmann from comment #21) > > (In reply to Axel Davy from comment #20) > > > Should this bug report be closed ? > > > DRI3 has complete DRI_PRIME support now. > > > > > > > As far as it concerns me, DRI3 Support is fine and this bug can be marked as > > "resolved fixed". > > > > Do you refer to DRI3 in general, or is it DRI_PRIME? I was referring to RI_PRIME, as this bug was about DRI_PRIME > > Tobias, while DRI3 enabled, can you run: > $ ssh -Y [user@]hostname firefox > > What is the result? If you mean a running firefox process without a window showing up, i would guess this is not DRI3's fault, as other applications are showing up just fine, others don't. As far as i could elaborate in under 5 minutes, newer frameworks (gtk3, kde5 (qt5)) do not work, older ones do work just fine...if you have more information about this beeing DRI3, please open a bugreport about this specific problem! -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] gallium/util: fix pipe_debug_message macro to allow 0 args
Signed-off-by: Ilia Mirkin--- src/gallium/auxiliary/util/u_debug.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/auxiliary/util/u_debug.h b/src/gallium/auxiliary/util/u_debug.h index aaf223c..9900703 100644 --- a/src/gallium/auxiliary/util/u_debug.h +++ b/src/gallium/auxiliary/util/u_debug.h @@ -270,7 +270,7 @@ void _debug_assert_fail(const char *expr, static unsigned id = 0; \ _pipe_debug_message(cb, , \ PIPE_DEBUG_TYPE_ ## type, \ - fmt, __VA_ARGS__); \ + fmt, ##__VA_ARGS__); \ } while (0) struct pipe_debug_callback; -- 2.4.10 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC] r600g: evergreen/cayman tessellation support
Am 04.12.2015 08:48, schrieb eocallag...@alterapraxis.com: On 2015-12-04 14:03, Dieter Nützel wrote: Am 03.12.2015 19:57, schrieb Dave Airlie: On 4 Dec 2015 03:01, "Aaron Watry"wrote: Hi Dave (and others), I cloned your fdo r600g-tess-submit branch and gave it a spin on CEDAR (Radeon 5400, kernel 4.3.0) with Heaven, and ran into a few issues. Just grab r600g-tess-staging -submit only worked on cayman, but i d posted it here so didn't want to change it during review. Dave. Hello to all, r600g-tess-staging WORKS on Turks XT (6670) with poor 2 GB, RAM width 128bits DDR, but slow. Time for R9 290 or Tonga (R9 380X). Tonga already works, Tonga is radeonsi not r600g, this series only applies to r600g. The radeonsi gallium mesa driver has supported tess for a while now. You didn't got it: It is time for ME for R9 xxx (including Tonga)... I'm with Radeon since the beginning on Linux ;-) Only issue so far: Switching tessellation (with mouse) and/or wireframe (F2) in Heaven-4.0 let system hang. But NOT hard. SysRq Key + S | U | B works But haven't had time to hunt for a useful log. GREAT WORK! Dieter ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 93188] "nir/nir.h", line 552: Error: Unexpected type name "nir_src" encountered.
https://bugs.freedesktop.org/show_bug.cgi?id=93188 --- Comment #6 from Jason Ekstrand--- I had a patch at one point to help C++ out here. I ended up not sending it out because it only fixed warnings that only came up if you had extra warnings turned on. basically it boiled down to #ifdef __cplusplus # define NIR_SRC_INIT nir_src() #else # define NIR_SRC_INIT (nir_src) { { NULL } } #endif And the same thing for NIR_DEST_INIT. Can you try that? -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On 04/12/15 22:45, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 5:40 PM, Jose Fonsecawrote: On 04/12/15 20:04, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul wrote: On 12/04/2015 12:46 PM, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul wrote: + if (templ->poly_smooth && svga->debug.callback.debug_message) { + /* note: we always need a % something in the message string */ Why? Did I mess something up? If I write it without the dummy %s I get: In file included from ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0, from svga_pipe_rasterizer.c:29: svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state': ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: expected expression before ')' token fmt, __VA_ARGS__); \ ^ svga_pipe_rasterizer.c:357:7: note: in expansion of macro 'pipe_debug_message' pipe_debug_message(>debug.callback, CONFORMANCE, ^ H... so that'd be a "yes" to me messing it up :) I'll have a look to see if there's something simple I can do here. Off-hand it feels like it should be #__VA_ARGS__? Been a while since I looked at the specifics of all this. BTW, right now this callback is only set for debug contexts. Perhaps that was not an ideal decision... just an FYI though. apitrace creates a debug context so that's the main thing for me right now. But it might be nice if setting MESA_DEBUG would cause the debug/info messages to be printed to stderr for those apps that don't use GL_ARB_debug_output. Yeah, we should probably also throw in Enable/Disable hooks into st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever the enum is). BTW, note that apitrace cuts messages off after 100 of them which is pretty annoying -- I opened an issue at https://github.com/apitrace/apitrace/issues/395 a while back. Not just "100", but 100 *per ID*. Which is very different. Why is it so annoying? A patch for an option is welcome anyway BTW. Right, but but an ID is a message. So let's say the message is "falling back to XYZ", why would you only want to know the first 100 times? Or let's say the message is "shader compiles to XYZ instructions" -- again, why the first 100 times? In fact, *given* that you enable these messages, I don't see why it would ever be desirable to limit them. Isn't it obvious? OpenGL applications are repetitive by nature. You're bound to have the same error over and over again. Futhermore, some IHVs have messages so frequent that completely overwel everything else. So, when you take a trace of an arbitrary app, and replay e.g., on NVIDIA, you'll get lots and performance hints. Nevertheless, you'll want to be able to spot rare and more important issues among all that. limiting to a fix number, e.g., 100, means that after a while, they all silence, so that any new issue will stand out. Soemthing that happens all the time is invariably interesting. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On Fri, Dec 4, 2015 at 6:00 PM, Jose Fonsecawrote: > On 04/12/15 22:45, Ilia Mirkin wrote: >> >> On Fri, Dec 4, 2015 at 5:40 PM, Jose Fonseca wrote: >>> >>> On 04/12/15 20:04, Ilia Mirkin wrote: On Fri, Dec 4, 2015 at 3:01 PM, Brian Paul wrote: > > > On 12/04/2015 12:46 PM, Ilia Mirkin wrote: >> >> >> >> On Fri, Dec 4, 2015 at 2:42 PM, Brian Paul wrote: >>> >>> >>> >>> + if (templ->poly_smooth && svga->debug.callback.debug_message) { >>> + /* note: we always need a % something in the message string */ >> >> >> >> >> Why? Did I mess something up? > > > > > If I write it without the dummy %s I get: > > In file included from > ../../../../src/gallium/auxiliary/util/u_inlines.h:36:0, >from svga_pipe_rasterizer.c:29: > svga_pipe_rasterizer.c: In function 'svga_create_rasterizer_state': > ../../../../src/gallium/auxiliary/util/u_debug.h:273:40: error: > expected > expression before ')' token > fmt, __VA_ARGS__); \ > ^ > svga_pipe_rasterizer.c:357:7: note: in expansion of macro > 'pipe_debug_message' > pipe_debug_message(>debug.callback, CONFORMANCE, > ^ > H... so that'd be a "yes" to me messing it up :) I'll have a look to see if there's something simple I can do here. Off-hand it feels like it should be #__VA_ARGS__? Been a while since I looked at the specifics of all this. > >> >> BTW, right now this callback is only set for debug contexts. Perhaps >> that was not an ideal decision... just an FYI though. > > > > > apitrace creates a debug context so that's the main thing for me right > now. > But it might be nice if setting MESA_DEBUG would cause the debug/info > messages to be printed to stderr for those apps that don't use > GL_ARB_debug_output. Yeah, we should probably also throw in Enable/Disable hooks into st/mesa that flip it on/off when flipping GL_DEBUG_OUTPUT (or whatever the enum is). BTW, note that apitrace cuts messages off after 100 of them which is pretty annoying -- I opened an issue at https://github.com/apitrace/apitrace/issues/395 a while back. >>> >>> >>> >>> Not just "100", but 100 *per ID*. >>> >>> Which is very different. >>> >>> Why is it so annoying? A patch for an option is welcome anyway BTW. >> >> >> Right, but but an ID is a message. So let's say the message is >> "falling back to XYZ", why would you only want to know the first 100 >> times? Or let's say the message is "shader compiles to XYZ >> instructions" -- again, why the first 100 times? In fact, *given* that >> you enable these messages, I don't see why it would ever be desirable >> to limit them. > > > Isn't it obvious? > > > OpenGL applications are repetitive by nature. You're bound to have the same > error over and over again. > > Futhermore, some IHVs have messages so frequent that completely overwel > everything else. > > So, when you take a trace of an arbitrary app, and replay e.g., on NVIDIA, > you'll get lots and performance hints. Nevertheless, you'll want to be able > to spot rare and more important issues among all that. limiting to a fix > number, e.g., 100, means that after a while, they all silence, so that any > new issue will stand out. > > Soemthing that happens all the time is invariably interesting. I doubt we're going to change each others' mind on this. I've contemplated throwing prints into mesa, but haven't been sufficiently motivated to worry about it. Like you say, GL applications tend to be repetitive, but they tend to have a "startup" phase, and a "do repetitive thing" phase. If the message gets triggered > 100x during the startup phase, I don't get to see it when I'm actually interested in the results. But it's OK, with Matt's run from shader-db, it all works, which was the primary motivation for piping this thing through to gallium. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium/util: fix pipe_debug_message macro to allow 0 args
On 12/04/2015 01:09 PM, Ilia Mirkin wrote: Signed-off-by: Ilia Mirkin--- src/gallium/auxiliary/util/u_debug.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/auxiliary/util/u_debug.h b/src/gallium/auxiliary/util/u_debug.h index aaf223c..9900703 100644 --- a/src/gallium/auxiliary/util/u_debug.h +++ b/src/gallium/auxiliary/util/u_debug.h @@ -270,7 +270,7 @@ void _debug_assert_fail(const char *expr, static unsigned id = 0; \ _pipe_debug_message(cb, , \ PIPE_DEBUG_TYPE_ ## type, \ - fmt, __VA_ARGS__); \ + fmt, ##__VA_ARGS__); \ } while (0) struct pipe_debug_callback; Reviewed-by: Brian Paul Tested-by: Brian Paul Thanks. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] svga: use the debug callback to report issues to the state tracker
Use the new debug callback hook to report conformance, performance and fallbacks to the state tracker. The state tracker, in turn can report this issues to the user via the GL_ARB_debug_output extension. More issues can be reported in the future; this is just a start. v2: remove conditionals around pipe_debug_message() calls since the check is now done in the macro itself. --- src/gallium/drivers/svga/svga_context.h | 3 +++ src/gallium/drivers/svga/svga_draw_arrays.c | 5 + src/gallium/drivers/svga/svga_pipe_blend.c | 9 + src/gallium/drivers/svga/svga_pipe_misc.c| 17 + src/gallium/drivers/svga/svga_pipe_rasterizer.c | 5 + src/gallium/drivers/svga/svga_state_need_swtnl.c | 23 +++ 6 files changed, 62 insertions(+) diff --git a/src/gallium/drivers/svga/svga_context.h b/src/gallium/drivers/svga/svga_context.h index 6a4f9d8..c4284cc 100644 --- a/src/gallium/drivers/svga/svga_context.h +++ b/src/gallium/drivers/svga/svga_context.h @@ -392,6 +392,9 @@ struct svga_context boolean no_line_width; boolean force_hw_line_stipple; + + /** To report perf/conformance/etc issues to the state tracker */ + struct pipe_debug_callback callback; } debug; struct { diff --git a/src/gallium/drivers/svga/svga_draw_arrays.c b/src/gallium/drivers/svga/svga_draw_arrays.c index acb2e95..b02adeb 100644 --- a/src/gallium/drivers/svga/svga_draw_arrays.c +++ b/src/gallium/drivers/svga/svga_draw_arrays.c @@ -26,6 +26,7 @@ #include "svga_cmd.h" #include "util/u_inlines.h" +#include "util/u_prim.h" #include "indices/u_indices.h" #include "svga_hw_reg.h" @@ -277,6 +278,10 @@ svga_hwtnl_draw_arrays(struct svga_hwtnl *hwtnl, if (ret != PIPE_OK) goto done; + pipe_debug_message(>debug.callback, PERF_INFO, + "generating temporary index buffer for drawing %s", + u_prim_name(prim)); + ret = svga_hwtnl_simple_draw_range_elements(hwtnl, gen_buf, gen_size, diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c b/src/gallium/drivers/svga/svga_pipe_blend.c index 0c9d612..b485703 100644 --- a/src/gallium/drivers/svga/svga_pipe_blend.c +++ b/src/gallium/drivers/svga/svga_pipe_blend.c @@ -243,6 +243,15 @@ svga_create_blend_state(struct pipe_context *pipe, blend->rt[i].srcblend_alpha = blend->rt[i].srcblend; blend->rt[i].dstblend_alpha = blend->rt[i].dstblend; blend->rt[i].blendeq_alpha = blend->rt[i].blendeq; + + if (templ->logicop_func == PIPE_LOGICOP_XOR) { +pipe_debug_message(>debug.callback, CONFORMANCE, + "XOR logicop mode has limited support%s", ""); + } + else if (templ->logicop_func != PIPE_LOGICOP_COPY) { +pipe_debug_message(>debug.callback, CONFORMANCE, + "general logicops are not supported%s", ""); + } } else { /* Note: the vgpu10 device does not yet support independent diff --git a/src/gallium/drivers/svga/svga_pipe_misc.c b/src/gallium/drivers/svga/svga_pipe_misc.c index c8020da..af9356d 100644 --- a/src/gallium/drivers/svga/svga_pipe_misc.c +++ b/src/gallium/drivers/svga/svga_pipe_misc.c @@ -244,6 +244,22 @@ static void svga_set_viewport_states( struct pipe_context *pipe, } +/** + * Called by state tracker to specify a callback function the driver + * can use to report info back to the state tracker. + */ +static void +svga_set_debug_callback(struct pipe_context *pipe, +const struct pipe_debug_callback *cb) +{ + struct svga_context *svga = svga_context(pipe); + + if (cb) + svga->debug.callback = *cb; + else + memset(>debug.callback, 0, sizeof(svga->debug.callback)); +} + void svga_init_misc_functions( struct svga_context *svga ) { @@ -252,6 +268,7 @@ void svga_init_misc_functions( struct svga_context *svga ) svga->pipe.set_framebuffer_state = svga_set_framebuffer_state; svga->pipe.set_clip_state = svga_set_clip_state; svga->pipe.set_viewport_states = svga_set_viewport_states; + svga->pipe.set_debug_callback = svga_set_debug_callback; } diff --git a/src/gallium/drivers/svga/svga_pipe_rasterizer.c b/src/gallium/drivers/svga/svga_pipe_rasterizer.c index 6310b7a..fa1744f 100644 --- a/src/gallium/drivers/svga/svga_pipe_rasterizer.c +++ b/src/gallium/drivers/svga/svga_pipe_rasterizer.c @@ -352,6 +352,11 @@ svga_create_rasterizer_state(struct pipe_context *pipe, define_rasterizer_object(svga, rast); } + if (templ->poly_smooth) { + pipe_debug_message(>debug.callback, CONFORMANCE, + "GL_POLYGON_SMOOTH not supported"); + } + svga->hud.num_state_objects++; return rast; diff --git a/src/gallium/drivers/svga/svga_state_need_swtnl.c
Re: [Mesa-dev] [PATCH] svga: use the debug callback to report issues to the state tracker
On Fri, Dec 4, 2015 at 6:32 PM, Jose Fonsecawrote: > On 04/12/15 23:27, Ilia Mirkin wrote: >> >> On Fri, Dec 4, 2015 at 6:25 PM, Jose Fonseca wrote: >>> >>> On 04/12/15 19:42, Brian Paul wrote: Use the new debug callback hook to report conformance, performance and fallbacks to the state tracker. The state tracker, in turn can report this issues to the user via the GL_ARB_debug_output extension. More issues can be reported in the future; this is just a start. --- src/gallium/drivers/svga/svga_context.h | 3 +++ src/gallium/drivers/svga/svga_draw_arrays.c | 7 +++ src/gallium/drivers/svga/svga_pipe_blend.c | 11 +++ src/gallium/drivers/svga/svga_pipe_misc.c| 17 + src/gallium/drivers/svga/svga_pipe_rasterizer.c | 6 ++ src/gallium/drivers/svga/svga_state_need_swtnl.c | 23 +++ 6 files changed, 67 insertions(+) diff --git a/src/gallium/drivers/svga/svga_context.h b/src/gallium/drivers/svga/svga_context.h index 6a4f9d8..c4284cc 100644 --- a/src/gallium/drivers/svga/svga_context.h +++ b/src/gallium/drivers/svga/svga_context.h @@ -392,6 +392,9 @@ struct svga_context boolean no_line_width; boolean force_hw_line_stipple; + + /** To report perf/conformance/etc issues to the state tracker */ + struct pipe_debug_callback callback; } debug; struct { diff --git a/src/gallium/drivers/svga/svga_draw_arrays.c b/src/gallium/drivers/svga/svga_draw_arrays.c index acb2e95..f6956b5 100644 --- a/src/gallium/drivers/svga/svga_draw_arrays.c +++ b/src/gallium/drivers/svga/svga_draw_arrays.c @@ -26,6 +26,7 @@ #include "svga_cmd.h" #include "util/u_inlines.h" +#include "util/u_prim.h" #include "indices/u_indices.h" #include "svga_hw_reg.h" @@ -277,6 +278,12 @@ svga_hwtnl_draw_arrays(struct svga_hwtnl *hwtnl, if (ret != PIPE_OK) goto done; + if (svga->debug.callback.debug_message) { + pipe_debug_message(>debug.callback, PERF_INFO, +"generating temporary index buffer for drawing %s", +u_prim_name(prim)); + } + ret = svga_hwtnl_simple_draw_range_elements(hwtnl, gen_buf, gen_size, diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c b/src/gallium/drivers/svga/svga_pipe_blend.c index 0c9d612..5538193 100644 --- a/src/gallium/drivers/svga/svga_pipe_blend.c +++ b/src/gallium/drivers/svga/svga_pipe_blend.c @@ -243,6 +243,17 @@ svga_create_blend_state(struct pipe_context *pipe, blend->rt[i].srcblend_alpha = blend->rt[i].srcblend; blend->rt[i].dstblend_alpha = blend->rt[i].dstblend; blend->rt[i].blendeq_alpha = blend->rt[i].blendeq; + + if (svga->debug.callback.debug_message) { +if (templ->logicop_func == PIPE_LOGICOP_XOR) { + pipe_debug_message(>debug.callback, CONFORMANCE, + "XOR logicop mode has limited support%s", ""); +} +else if (templ->logicop_func != PIPE_LOGICOP_COPY) { + pipe_debug_message(>debug.callback, CONFORMANCE, + "general logicops are not supported%s", ""); +} + } } else { /* Note: the vgpu10 device does not yet support independent diff --git a/src/gallium/drivers/svga/svga_pipe_misc.c b/src/gallium/drivers/svga/svga_pipe_misc.c index c8020da..af9356d 100644 --- a/src/gallium/drivers/svga/svga_pipe_misc.c +++ b/src/gallium/drivers/svga/svga_pipe_misc.c @@ -244,6 +244,22 @@ static void svga_set_viewport_states( struct pipe_context *pipe, } +/** + * Called by state tracker to specify a callback function the driver + * can use to report info back to the state tracker. + */ +static void +svga_set_debug_callback(struct pipe_context *pipe, +const struct pipe_debug_callback *cb) +{ + struct svga_context *svga = svga_context(pipe); + + if (cb) + svga->debug.callback = *cb; + else + memset(>debug.callback, 0, sizeof(svga->debug.callback)); +} + void svga_init_misc_functions( struct svga_context *svga ) { @@ -252,6 +268,7 @@ void svga_init_misc_functions( struct svga_context *svga )
[Mesa-dev] [Bug 93257] [SKL, bisected] ASTC dEQP tests segfault
https://bugs.freedesktop.org/show_bug.cgi?id=93257 Nanley Cherychanged: What|Removed |Added Component|Mesa core |Drivers/DRI/i965 QA Contact|mesa-dev@lists.freedesktop. |intel-3d-bugs@lists.freedes |org |ktop.org -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 01/26] nir: Add a writemask to store intrinsics.
On Wed, Dec 2, 2015 at 4:15 PM, Kenneth Graunkewrote: > Tessellation control shaders need to be careful when writing outputs. > Because multiple threads can concurrently write the same output > variables, we need to only write the exact components we were told. > > Traditionally, for sub-vector writes, we've read the whole vector, > updated the temporary, and written the whole vector back. This breaks > down with concurrent access. > > This patch prepares the way for a solution by adding a writemask field > to store_var intrinsics, as well as the other store intrinsics. It then > updates all produces to emit a writemask of "all channels enabled". It > updates nir_lower_io to copy the writemask to output store intrinsics. > > Finally, it updates nir_lower_vars_to_ssa to handle partial writemasks > by doing a read-modify-write cycle (which is safe, because local > variables are specific to a single thread). > > This should have no functional change, since no one actually emits > partial writemasks yet. > > Signed-off-by: Kenneth Graunke > --- > src/gallium/auxiliary/nir/tgsi_to_nir.c| 1 + > .../drivers/freedreno/ir3/ir3_compiler_nir.c | 6 +++ > src/glsl/nir/glsl_to_nir.cpp | 2 + > src/glsl/nir/nir_builder.h | 4 +- > src/glsl/nir/nir_intrinsics.h | 14 +++ > src/glsl/nir/nir_lower_gs_intrinsics.c | 3 +- > src/glsl/nir/nir_lower_io.c| 2 + > src/glsl/nir/nir_lower_locals_to_regs.c| 2 +- > src/glsl/nir/nir_lower_var_copies.c| 1 + > src/glsl/nir/nir_lower_vars_to_ssa.c | 46 > -- > src/glsl/nir/nir_validate.c| 1 + > src/mesa/program/prog_to_nir.c | 2 + > 12 files changed, 63 insertions(+), 21 deletions(-) > > Technically, I don't think I need the ir3 changes here - it should work > without any changes. I think I was just overzealously preparing things > to handle writemasks before I realized there shouldn't be any. > > But, it might be worth doing anyway. Not sure. > > Jason and I have already talked about the fact that we're conflicting. > I like what he's doing, and he suggested this; we just need to figure out > what lands first. I decided to send these out anyway for people to look > at and start reviewing, since there's a lot to do there. We'll sort it > out in v2. > > diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c > b/src/gallium/auxiliary/nir/tgsi_to_nir.c > index 86c2ffa..f5547c8 100644 > --- a/src/gallium/auxiliary/nir/tgsi_to_nir.c > +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c > @@ -1894,6 +1894,7 @@ ttn_emit_instruction(struct ttn_compile *c) > _dst->Indirect : NULL; > >store->num_components = 4; > + store->const_index[0] = 0xf; >store->variables[0] = ttn_array_deref(c, store, var, offset, indirect); >store->src[0] = nir_src_for_reg(dest.dest.reg.reg); > > diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c > b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c > index 8617704..3beed0e 100644 > --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c > +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c > @@ -1314,6 +1314,9 @@ emit_intrinisic_store_var(struct ir3_compile *ctx, > nir_intrinsic_instr *intr) > case nir_deref_array_type_direct: > /* direct access does not require anything special: */ > for (int i = 0; i < intr->num_components; i++) { > + if (!(intr->const_index[0] & (1 << i))) > + continue; > + > unsigned n = darr->base_offset * 4 + i; > compile_assert(ctx, n < arr->length); > arr->arr[n] = src[i]; > @@ -1326,6 +1329,9 @@ emit_intrinisic_store_var(struct ir3_compile *ctx, > nir_intrinsic_instr *intr) > struct ir3_instruction *addr = > get_addr(ctx, get_src(ctx, > >indirect)[0]); > for (int i = 0; i < intr->num_components; i++) { > + if (!(intr->const_index[0] & (1 << i))) > + continue; > + > struct ir3_instruction *store; > unsigned n = darr->base_offset * 4 + i; > compile_assert(ctx, n < arr->length); > diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp > index 45d045c..5c84b0d 100644 > --- a/src/glsl/nir/glsl_to_nir.cpp > +++ b/src/glsl/nir/glsl_to_nir.cpp > @@ -982,6 +982,7 @@ nir_visitor::visit(ir_call *ir) > nir_intrinsic_instr *store_instr = > nir_intrinsic_instr_create(shader, nir_intrinsic_store_var); > store_instr->num_components = >
[Mesa-dev] [PATCH 1/2] gallium/util: move conditional from _pipe_debug_message() to macro
To avoid function calls when the pointer is null. Also, assert that the pipe_debug_callback object is non-null. It looks like all the drivers which use this feature never pass a non-null pointer. --- src/gallium/auxiliary/util/u_debug.c | 3 +-- src/gallium/auxiliary/util/u_debug.h | 9 ++--- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/src/gallium/auxiliary/util/u_debug.c b/src/gallium/auxiliary/util/u_debug.c index 7029536..c42505d 100644 --- a/src/gallium/auxiliary/util/u_debug.c +++ b/src/gallium/auxiliary/util/u_debug.c @@ -79,8 +79,7 @@ _pipe_debug_message( { va_list args; va_start(args, fmt); - if (cb && cb->debug_message) - cb->debug_message(cb->data, id, type, fmt, args); + cb->debug_message(cb->data, id, type, fmt, args); va_end(args); } diff --git a/src/gallium/auxiliary/util/u_debug.h b/src/gallium/auxiliary/util/u_debug.h index 9900703..4cc3d63 100644 --- a/src/gallium/auxiliary/util/u_debug.h +++ b/src/gallium/auxiliary/util/u_debug.h @@ -268,9 +268,12 @@ void _debug_assert_fail(const char *expr, */ #define pipe_debug_message(cb, type, fmt, ...) do { \ static unsigned id = 0; \ - _pipe_debug_message(cb, , \ - PIPE_DEBUG_TYPE_ ## type, \ - fmt, ##__VA_ARGS__); \ + assert(cb); \ + if ((cb)->debug_message) { \ + _pipe_debug_message(cb, , \ + PIPE_DEBUG_TYPE_ ## type, \ + fmt, ##__VA_ARGS__); \ + } \ } while (0) struct pipe_debug_callback; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH] glsl: always re-validate program pipeline
On Wednesday, December 02, 2015 10:45:02 AM Timothy Arceri wrote: > On Tue, 2015-12-01 at 13:46 +0200, Tapani Pälli wrote: > > On 12/01/2015 02:13 AM, Timothy Arceri wrote: > > > Just because the validation passed the last time is was called > > > doesn't > > > automatically mean it will pass again the next time its called. > > > > This is a rather large hammer though :/ Maybe we should look at > > invalidating the pipeline when samplers are changed instead. Was > > there > > other issues this resolves/fixes than sampler validation? > > There are no other tests that this fixes however I believe that's just > because the tests use ValidateProgramPipeline to check for problems > rather than issuing a draw call. For example this patch is the > equivalent of doing this in ValidateProgramPipeline > > if (!ctx->_Shader->Validated) >_mesa_validate_program_pipeline(ctx, pipe, false); > > And this causes the CTS pipeline test to fail. The interesting thing is > that the piglit test this fixes also fails on Nvidia. > > The spec is somewhat confusing in its wording: > > "Therefore validation is done when the first rendering command which > triggers shader invocations is issued, to determine if the set of > active program objects can be executed." > > This seem like its saying we only need to check it on the first call > however even then ctx->_Shader->Validated may have been set by > ValidateProgramPipeline rather than validation triggered a rendering > command. > > Also the spec goes on to say: > > "An INVALID_OPERATION error is generated by any command that trans-fers > vertices to the GL or launches compute work" > > Which seems to conflict with the first statement. > > It seems to me the intention is to call validation each time as > relinking or changing samplers could cause validation to fail. > > Maybe we should have a test which does a draw call, then relinks with > SEPARABLE to false, then calls draw again and checks for a failure. > > A less hammer type approach may be to reset ctx->_Shader->Validated > after linking and when samplers are changed but we really need more > tests to make sure we get this right. I agree with Tapani, performing shader validation on every draw call is a pretty big hammer, and will almost certainly impact performance - we really want less draw-time validation, where possible. Setting Validated = false when re-linking, changing program stages, or otherwise altering the pipeline sounds like a good idea. That way we validate it on the first draw following the change, but can avoid it after that. --Ken signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 93257] [SKL, bisected] ASTC dEQP tests segfault
https://bugs.freedesktop.org/show_bug.cgi?id=93257 Bug ID: 93257 Summary: [SKL, bisected] ASTC dEQP tests segfault Product: Mesa Version: git Hardware: Other OS: All Status: ASSIGNED Severity: normal Priority: medium Component: Mesa core Assignee: b...@bwidawsk.net Reporter: nanleych...@gmail.com QA Contact: mesa-dev@lists.freedesktop.org CC: chad.vers...@intel.com, n...@linux.intel.com Test: dEQP-GLES3.functional.texture.compressed.astc.void_extent_ldr.5x4 Expected result: Test fails due to comparison errors. Actual result: Test segfaults. The segfaults seem to occur for all ASTC tests with block size dimensions >= 5x4. More tests may be affected. This issue does not occur on Jenkins, but it does occur locally on the given configuration: HW: SKL dEQP: https://github.com/chadversary/deqp chadv CMAKE_BUILD_TYPE Debug DEQP_TARGET drm Mesa: commit 0288f92e7b0ce5f0d821f2d0ddef522a23776ecb Author: Ben WidawskyDate: Tue Oct 13 20:50:27 2015 -0700 i965/gen9: Support fast clears for 32b float SKL supports the ability to do fast clears and resolves of 32b RGBA as both integer and floats. This patch only enables float color clears because we haven't yet enabled integer color clears, (HW support for that was added in BDW). v2: Remove LUMINANCE16F and INTENSITY16F special cases since they are now handled by Neil's patch to disable MSAA fast clears. Signed-off-by: Ben Widawsky Reviewed-by: Neil Roberts Reviewed-by: Chad Versace Core dump info: Signal: 11 (SEGV) Command Line: ./modules/gles3/deqp-gles3 --deqp-case=dEQP-GLES3.functional.texture.compressed.astc.void_extent_ldr.5x4 --deqp-surface-type=fbo --deqp-log-images=disable --deqp-surface-width=64 --deqp-surface-height=64 Stack trace: #0 0x7fa3556059b9 dri2InvalidateDrawable (i965_dri.so) #1 0x7fa3556222dc intel_viewport (i965_dri.so) #2 0x7fa355372ace _mesa_set_viewport (i965_dri.so) #3 0x7fa35547e7a8 _mesa_meta_end (i965_dri.so) #4 0x7fa35562fa40 brw_meta_resolve_color (i965_dri.so) #5 0x7fa355680bdd intel_miptree_resolve_color (i965_dri.so) #6 0x7fa355685ee2 intel_readpixels_tiled_memcpy (i965_dri.so) #7 0x7fa35568637b intelReadPixels (i965_dri.so) #8 0x7fa355317716 _mesa_ReadnPixelsARB (i965_dri.so) #9 0x7fa355317767 _mesa_ReadPixels (i965_dri.so) #10 0x7fa358127640 shared_dispatch_stub_256 (libglapi.so.0) #11 0x010343eb _ZN3glu10readPixelsERKNS_13RenderContextEiiRKN3tcu17PixelBufferAccessE (deqp-gles3) #12 0x00c6772f _ZN4deqp5gles310Functional29ASTCDecompressionCaseInternal14ASTCRenderer2D6renderERN3tcu7SurfaceES6_RKN3glu9Texture2DERKNS4_13TextureFormatE (deqp-gles3) #13 0x00c6839f _ZN4deqp5gles310Functional15ASTCBlockCase2D7iterateEv (deqp-gles3) #14 0x00feab6b _ZN3tcu15TestCaseWrapper15iterateTestCaseEPNS_8TestCaseE (deqp-gles3) #15 0x00aacb9f _ZN4deqp5gles315TestCaseWrapper15iterateTestCaseEPN3tcu8TestCaseE (deqp-gles3) #16 0x00fec6cd _ZN3tcu12TestExecutor7iterateEv (deqp-gles3) #17 0x00fc31e7 _ZN3tcu3App7iterateEv (deqp-gles3) #18 0x00aa92d9 main (deqp-gles3) #19 0x7fa3583a0610 __libc_start_main (libc.so.6) #20 0x00aa90c9 _start (deqp-gles3) -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/26] nir: Use writemasked store_vars in glsl_to_nir.
Yeah, I like this. On Wed, Dec 2, 2015 at 4:15 PM, Kenneth Graunkewrote: > Instead of performing the read-modify-write cycle in glsl->nir, we can > simply emit a partial writemask. For locals, nir_lower_vars_to_ssa will > do the equivalent read-modify-write cycle for us, so we continue to get > the same SSA values we had before. > > Because glsl_to_nir calls nir_lower_outputs_to_temporaries, all outputs > are shadowed with temporary values, and written out as whole vectors at > the end of the shader. So, most consumers will still not see partial > writemasks. > > However, nir_lower_outputs_to_temporaries bails for tessellation control > shader outputs. So those remain actual variables, and stores to those > variables now get a writemask. nir_lower_io passes that through. This > means that TCS outputs should actually work now. > > This is a functional change for tessellation control shaders. > > Signed-off-by: Kenneth Graunke > --- > src/glsl/nir/glsl_to_nir.cpp | 39 +-- > 1 file changed, 9 insertions(+), 30 deletions(-) > > diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp > index 5c84b0d..0d33eaa 100644 > --- a/src/glsl/nir/glsl_to_nir.cpp > +++ b/src/glsl/nir/glsl_to_nir.cpp > @@ -1044,44 +1044,23 @@ nir_visitor::visit(ir_assignment *ir) > nir_ssa_def *src = evaluate_rvalue(ir->rhs); > > if (ir->write_mask != (1 << num_components) - 1 && ir->write_mask != 0) { > - /* > - * We have no good way to update only part of a variable, so just load > - * the LHS and do a vec operation to combine the old with the new, and > - * then store it > - * back into the LHS. Copy propagation should get rid of the mess. > + /* GLSL IR will give us the input to the write-masked assignment in a > + * single packed vector. So, for example, if the writemask is xzw, > then > + * we have to swizzle x -> x, y -> z, and z -> w and get the y > component > + * from the load. > */ > - > - nir_intrinsic_instr *load = > - nir_intrinsic_instr_create(this->shader, nir_intrinsic_load_var); > - load->num_components = ir->lhs->type->vector_elements; > - nir_ssa_dest_init(>instr, >dest, num_components, NULL); > - load->variables[0] = lhs_deref; > - ralloc_steal(load, load->variables[0]); > - nir_builder_instr_insert(, >instr); > - > - nir_ssa_def *srcs[4]; > - > + unsigned swiz[4]; >unsigned component = 0; > - for (unsigned i = 0; i < ir->lhs->type->vector_elements; i++) { > - if (ir->write_mask & (1 << i)) { > -/* GLSL IR will give us the input to the write-masked assignment > - * in a single packed vector. So, for example, if the > - * writemask is xzw, then we have to swizzle x -> x, y -> z, > - * and z -> w and get the y component from the load. > - */ > -srcs[i] = nir_channel(, src, component++); > - } else { > -srcs[i] = nir_channel(, >dest.ssa, i); > - } > + for (unsigned i = 0; i < 4; i++) { > + swiz[i] = ir->write_mask & (1 << i) ? component++ : 0; >} > - > - src = nir_vec(, srcs, ir->lhs->type->vector_elements); > + src = nir_swizzle(, src, swiz, num_components, !supports_ints); > } > > nir_intrinsic_instr *store = >nir_intrinsic_instr_create(this->shader, nir_intrinsic_store_var); > store->num_components = ir->lhs->type->vector_elements; > - store->const_index[0] = (1 << store->num_components) - 1; > + store->const_index[0] = ir->write_mask; > nir_deref *store_deref = nir_copy_deref(store, _deref->deref); > store->variables[0] = nir_deref_as_var(store_deref); > store->src[0] = nir_src_for_ssa(src); > -- > 2.6.2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 01/26] nir: Add a writemask to store intrinsics.
On Friday, December 04, 2015 04:45:03 PM Jason Ekstrand wrote: > On Wed, Dec 2, 2015 at 4:15 PM, Kenneth Graunkewrote: > > Tessellation control shaders need to be careful when writing outputs. > > Because multiple threads can concurrently write the same output > > variables, we need to only write the exact components we were told. > > > > Traditionally, for sub-vector writes, we've read the whole vector, > > updated the temporary, and written the whole vector back. This breaks > > down with concurrent access. > > > > This patch prepares the way for a solution by adding a writemask field > > to store_var intrinsics, as well as the other store intrinsics. It then > > updates all produces to emit a writemask of "all channels enabled". It > > updates nir_lower_io to copy the writemask to output store intrinsics. > > > > Finally, it updates nir_lower_vars_to_ssa to handle partial writemasks > > by doing a read-modify-write cycle (which is safe, because local > > variables are specific to a single thread). > > > > This should have no functional change, since no one actually emits > > partial writemasks yet. > > > > Signed-off-by: Kenneth Graunke > > --- > > src/gallium/auxiliary/nir/tgsi_to_nir.c| 1 + > > .../drivers/freedreno/ir3/ir3_compiler_nir.c | 6 +++ > > src/glsl/nir/glsl_to_nir.cpp | 2 + > > src/glsl/nir/nir_builder.h | 4 +- > > src/glsl/nir/nir_intrinsics.h | 14 +++ > > src/glsl/nir/nir_lower_gs_intrinsics.c | 3 +- > > src/glsl/nir/nir_lower_io.c| 2 + > > src/glsl/nir/nir_lower_locals_to_regs.c| 2 +- > > src/glsl/nir/nir_lower_var_copies.c| 1 + > > src/glsl/nir/nir_lower_vars_to_ssa.c | 46 > > -- > > src/glsl/nir/nir_validate.c| 1 + > > src/mesa/program/prog_to_nir.c | 2 + > > 12 files changed, 63 insertions(+), 21 deletions(-) > > > > Technically, I don't think I need the ir3 changes here - it should work > > without any changes. I think I was just overzealously preparing things > > to handle writemasks before I realized there shouldn't be any. > > > > But, it might be worth doing anyway. Not sure. > > > > Jason and I have already talked about the fact that we're conflicting. > > I like what he's doing, and he suggested this; we just need to figure out > > what lands first. I decided to send these out anyway for people to look > > at and start reviewing, since there's a lot to do there. We'll sort it > > out in v2. > > > > diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c > > b/src/gallium/auxiliary/nir/tgsi_to_nir.c > > index 86c2ffa..f5547c8 100644 > > --- a/src/gallium/auxiliary/nir/tgsi_to_nir.c > > +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c > > @@ -1894,6 +1894,7 @@ ttn_emit_instruction(struct ttn_compile *c) > > _dst->Indirect : NULL; > > > >store->num_components = 4; > > + store->const_index[0] = 0xf; > >store->variables[0] = ttn_array_deref(c, store, var, offset, > > indirect); > >store->src[0] = nir_src_for_reg(dest.dest.reg.reg); > > > > diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c > > b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c > > index 8617704..3beed0e 100644 > > --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c > > +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c > > @@ -1314,6 +1314,9 @@ emit_intrinisic_store_var(struct ir3_compile *ctx, > > nir_intrinsic_instr *intr) > > case nir_deref_array_type_direct: > > /* direct access does not require anything special: */ > > for (int i = 0; i < intr->num_components; i++) { > > + if (!(intr->const_index[0] & (1 << i))) > > + continue; > > + > > unsigned n = darr->base_offset * 4 + i; > > compile_assert(ctx, n < arr->length); > > arr->arr[n] = src[i]; > > @@ -1326,6 +1329,9 @@ emit_intrinisic_store_var(struct ir3_compile *ctx, > > nir_intrinsic_instr *intr) > > struct ir3_instruction *addr = > > get_addr(ctx, get_src(ctx, > > >indirect)[0]); > > for (int i = 0; i < intr->num_components; i++) { > > + if (!(intr->const_index[0] & (1 << i))) > > + continue; > > + > > struct ir3_instruction *store; > > unsigned n = darr->base_offset * 4 + i; > > compile_assert(ctx, n < arr->length); > > diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp > > index 45d045c..5c84b0d 100644 > > --- a/src/glsl/nir/glsl_to_nir.cpp > > +++
Re: [Mesa-dev] [PATCH 2/2] svga: use the debug callback to report issues to the state tracker
On Fri, Dec 4, 2015 at 7:52 PM, Brian Paulwrote: > Use the new debug callback hook to report conformance, performance > and fallbacks to the state tracker. The state tracker, in turn can > report this issues to the user via the GL_ARB_debug_output extension. > > More issues can be reported in the future; this is just a start. > > v2: remove conditionals around pipe_debug_message() calls since the > check is now done in the macro itself. > --- > src/gallium/drivers/svga/svga_context.h | 3 +++ > src/gallium/drivers/svga/svga_draw_arrays.c | 5 + > src/gallium/drivers/svga/svga_pipe_blend.c | 9 + > src/gallium/drivers/svga/svga_pipe_misc.c| 17 + > src/gallium/drivers/svga/svga_pipe_rasterizer.c | 5 + > src/gallium/drivers/svga/svga_state_need_swtnl.c | 23 +++ > 6 files changed, 62 insertions(+) > > diff --git a/src/gallium/drivers/svga/svga_context.h > b/src/gallium/drivers/svga/svga_context.h > index 6a4f9d8..c4284cc 100644 > --- a/src/gallium/drivers/svga/svga_context.h > +++ b/src/gallium/drivers/svga/svga_context.h > @@ -392,6 +392,9 @@ struct svga_context > >boolean no_line_width; >boolean force_hw_line_stipple; > + > + /** To report perf/conformance/etc issues to the state tracker */ > + struct pipe_debug_callback callback; > } debug; > > struct { > diff --git a/src/gallium/drivers/svga/svga_draw_arrays.c > b/src/gallium/drivers/svga/svga_draw_arrays.c > index acb2e95..b02adeb 100644 > --- a/src/gallium/drivers/svga/svga_draw_arrays.c > +++ b/src/gallium/drivers/svga/svga_draw_arrays.c > @@ -26,6 +26,7 @@ > #include "svga_cmd.h" > > #include "util/u_inlines.h" > +#include "util/u_prim.h" > #include "indices/u_indices.h" > > #include "svga_hw_reg.h" > @@ -277,6 +278,10 @@ svga_hwtnl_draw_arrays(struct svga_hwtnl *hwtnl, >if (ret != PIPE_OK) > goto done; > > + pipe_debug_message(>debug.callback, PERF_INFO, > + "generating temporary index buffer for drawing %s", > + u_prim_name(prim)); > + >ret = svga_hwtnl_simple_draw_range_elements(hwtnl, >gen_buf, >gen_size, > diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c > b/src/gallium/drivers/svga/svga_pipe_blend.c > index 0c9d612..b485703 100644 > --- a/src/gallium/drivers/svga/svga_pipe_blend.c > +++ b/src/gallium/drivers/svga/svga_pipe_blend.c > @@ -243,6 +243,15 @@ svga_create_blend_state(struct pipe_context *pipe, > blend->rt[i].srcblend_alpha = blend->rt[i].srcblend; > blend->rt[i].dstblend_alpha = blend->rt[i].dstblend; > blend->rt[i].blendeq_alpha = blend->rt[i].blendeq; > + > + if (templ->logicop_func == PIPE_LOGICOP_XOR) { > +pipe_debug_message(>debug.callback, CONFORMANCE, > + "XOR logicop mode has limited support%s", ""); > + } > + else if (templ->logicop_func != PIPE_LOGICOP_COPY) { > +pipe_debug_message(>debug.callback, CONFORMANCE, > + "general logicops are not supported%s", ""); These %s's should no longer be necessary after my ##__VA_ARGS__ fix right? > + } >} >else { > /* Note: the vgpu10 device does not yet support independent > diff --git a/src/gallium/drivers/svga/svga_pipe_misc.c > b/src/gallium/drivers/svga/svga_pipe_misc.c > index c8020da..af9356d 100644 > --- a/src/gallium/drivers/svga/svga_pipe_misc.c > +++ b/src/gallium/drivers/svga/svga_pipe_misc.c > @@ -244,6 +244,22 @@ static void svga_set_viewport_states( struct > pipe_context *pipe, > } > > > +/** > + * Called by state tracker to specify a callback function the driver > + * can use to report info back to the state tracker. > + */ > +static void > +svga_set_debug_callback(struct pipe_context *pipe, > +const struct pipe_debug_callback *cb) > +{ > + struct svga_context *svga = svga_context(pipe); > + > + if (cb) > + svga->debug.callback = *cb; > + else > + memset(>debug.callback, 0, sizeof(svga->debug.callback)); > +} > + > > void svga_init_misc_functions( struct svga_context *svga ) > { > @@ -252,6 +268,7 @@ void svga_init_misc_functions( struct svga_context *svga ) > svga->pipe.set_framebuffer_state = svga_set_framebuffer_state; > svga->pipe.set_clip_state = svga_set_clip_state; > svga->pipe.set_viewport_states = svga_set_viewport_states; > + svga->pipe.set_debug_callback = svga_set_debug_callback; > } > > > diff --git a/src/gallium/drivers/svga/svga_pipe_rasterizer.c > b/src/gallium/drivers/svga/svga_pipe_rasterizer.c > index 6310b7a..fa1744f 100644 > --- a/src/gallium/drivers/svga/svga_pipe_rasterizer.c > +++ b/src/gallium/drivers/svga/svga_pipe_rasterizer.c > @@ -352,6 +352,11 @@
Re: [Mesa-dev] [PATCH v3 10/44] i965: Define and use REG_MASK macro to make masked MMIO writes slightly more readable.
On Tue, Dec 01, 2015 at 12:19:28AM -0800, Jordan Justen wrote: > From: Francisco Jerez> > Reviewed-by: Samuel Iglesias Gonsálvez > --- > src/mesa/drivers/dri/i965/brw_defines.h | 6 ++ > src/mesa/drivers/dri/i965/brw_state_upload.c | 2 +- > src/mesa/drivers/dri/i965/gen7_l3_state.c| 2 +- > src/mesa/drivers/dri/i965/intel_reg.h| 2 +- > 4 files changed, 9 insertions(+), 3 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > b/src/mesa/drivers/dri/i965/brw_defines.h > index a511d5c..94f878b 100644 > --- a/src/mesa/drivers/dri/i965/brw_defines.h > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > @@ -41,6 +41,12 @@ > #define GET_BITS(data, high, low) ((data & INTEL_MASK((high), (low))) >> > (low)) > #define GET_FIELD(word, field) (((word) & field ## _MASK) >> field ## > _SHIFT) > > +/** > + * For use with masked MMIO registers where the upper 16 bits control which > + * of the lower bits are committed to the register. > + */ > +#define REG_MASK(value) ((value) << 16) Might nice to also take the value of the bit and do #define REG_MASK(bit, enable) ( ((bit) << 16) | (enable ? (bit) : 0) ) > + > #ifndef BRW_DEFINES_H > #define BRW_DEFINES_H > > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c > b/src/mesa/drivers/dri/i965/brw_state_upload.c > index aab5c91..3d14d65 100644 > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c > @@ -382,7 +382,7 @@ brw_upload_initial_gpu_state(struct brw_context *brw) >BEGIN_BATCH(3); >OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2)); >OUT_BATCH(GEN7_CACHE_MODE_1); > - OUT_BATCH((GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC << 16) | > + OUT_BATCH(REG_MASK(GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC) | > GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC); and then this becomes OUT_BATCH(REG_MASK(GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC, true)); which avoids having to type GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC twice. > } > diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c > b/src/mesa/drivers/dri/i965/gen7_l3_state.c > index 9aad563..2c692be 100644 > --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c > +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c > @@ -264,7 +264,7 @@ setup_l3_config(struct brw_context *brw, const struct > brw_l3_config *cfg) > OUT_BATCH(HSW_SCRATCH1); > OUT_BATCH(has_dc ? 0 : HSW_SCRATCH1_L3_ATOMIC_DISABLE); > OUT_BATCH(HSW_ROW_CHICKEN3); > - OUT_BATCH(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16 | > + OUT_BATCH(REG_MASK(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE) | > (has_dc ? 0 : HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE)); and this is even nicer: OUT_BATCH(REG_MASK(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE, has_dc)); Also, multipe REG_MASK()s can be bitwise or-ed together to set or clear multiple bits. > ADVANCE_BATCH(); >} > diff --git a/src/mesa/drivers/dri/i965/intel_reg.h > b/src/mesa/drivers/dri/i965/intel_reg.h > index 0b167d5..d6f 100644 > --- a/src/mesa/drivers/dri/i965/intel_reg.h > +++ b/src/mesa/drivers/dri/i965/intel_reg.h > @@ -183,7 +183,7 @@ > # define GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE (1 << 13) > # define GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC (1 << 1) > # define GEN8_HIZ_PMA_MASK_BITS \ > - ((GEN8_HIZ_NP_PMA_FIX_ENABLE | GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE) << 16) > + REG_MASK(GEN8_HIZ_NP_PMA_FIX_ENABLE | GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE) > > /* Predicate registers */ > #define MI_PREDICATE_SRC0 0x2400 > -- > 2.6.2 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] gallium/util: move conditional from _pipe_debug_message() to macro
Actually nouveau can, in some situations, pass in a null debug object. I have a thing in the fence wait to say how long it waited, which can be invoked without a pipe_context, and hence no debug object. Other (most) times it will be invoked with a pipe_context in hand and be able to log the messages. Oh, but perhaps for other reasons nouveau actually checks if it's null before calling pipe_debug_message. Either way, I think it's a nice property that the object is allowed to be null... I'd rather keep that option open. -ilia On Fri, Dec 4, 2015 at 7:52 PM, Brian Paulwrote: > To avoid function calls when the pointer is null. > > Also, assert that the pipe_debug_callback object is non-null. It looks > like all the drivers which use this feature never pass a non-null pointer. > --- > src/gallium/auxiliary/util/u_debug.c | 3 +-- > src/gallium/auxiliary/util/u_debug.h | 9 ++--- > 2 files changed, 7 insertions(+), 5 deletions(-) > > diff --git a/src/gallium/auxiliary/util/u_debug.c > b/src/gallium/auxiliary/util/u_debug.c > index 7029536..c42505d 100644 > --- a/src/gallium/auxiliary/util/u_debug.c > +++ b/src/gallium/auxiliary/util/u_debug.c > @@ -79,8 +79,7 @@ _pipe_debug_message( > { > va_list args; > va_start(args, fmt); > - if (cb && cb->debug_message) > - cb->debug_message(cb->data, id, type, fmt, args); > + cb->debug_message(cb->data, id, type, fmt, args); > va_end(args); > } > > diff --git a/src/gallium/auxiliary/util/u_debug.h > b/src/gallium/auxiliary/util/u_debug.h > index 9900703..4cc3d63 100644 > --- a/src/gallium/auxiliary/util/u_debug.h > +++ b/src/gallium/auxiliary/util/u_debug.h > @@ -268,9 +268,12 @@ void _debug_assert_fail(const char *expr, > */ > #define pipe_debug_message(cb, type, fmt, ...) do { \ > static unsigned id = 0; \ > - _pipe_debug_message(cb, , \ > - PIPE_DEBUG_TYPE_ ## type, \ > - fmt, ##__VA_ARGS__); \ > + assert(cb); \ > + if ((cb)->debug_message) { \ > + _pipe_debug_message(cb, , \ > + PIPE_DEBUG_TYPE_ ## type, \ > + fmt, ##__VA_ARGS__); \ > + } \ > } while (0) > > struct pipe_debug_callback; > -- > 1.9.1 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] gallium/util: move conditional from _pipe_debug_message() to macro
I suspect that adding the check in the macro amight cause compilers to warn that "taking the address of cb can never be false". Either way, series is Reviewed-by: Jose FonsecaOn 05/12/15 02:17, Ilia Mirkin wrote: Actually nouveau can, in some situations, pass in a null debug object. I have a thing in the fence wait to say how long it waited, which can be invoked without a pipe_context, and hence no debug object. Other (most) times it will be invoked with a pipe_context in hand and be able to log the messages. Oh, but perhaps for other reasons nouveau actually checks if it's null before calling pipe_debug_message. Either way, I think it's a nice property that the object is allowed to be null... I'd rather keep that option open. -ilia On Fri, Dec 4, 2015 at 7:52 PM, Brian Paul wrote: To avoid function calls when the pointer is null. Also, assert that the pipe_debug_callback object is non-null. It looks like all the drivers which use this feature never pass a non-null pointer. --- src/gallium/auxiliary/util/u_debug.c | 3 +-- src/gallium/auxiliary/util/u_debug.h | 9 ++--- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/src/gallium/auxiliary/util/u_debug.c b/src/gallium/auxiliary/util/u_debug.c index 7029536..c42505d 100644 --- a/src/gallium/auxiliary/util/u_debug.c +++ b/src/gallium/auxiliary/util/u_debug.c @@ -79,8 +79,7 @@ _pipe_debug_message( { va_list args; va_start(args, fmt); - if (cb && cb->debug_message) - cb->debug_message(cb->data, id, type, fmt, args); + cb->debug_message(cb->data, id, type, fmt, args); va_end(args); } diff --git a/src/gallium/auxiliary/util/u_debug.h b/src/gallium/auxiliary/util/u_debug.h index 9900703..4cc3d63 100644 --- a/src/gallium/auxiliary/util/u_debug.h +++ b/src/gallium/auxiliary/util/u_debug.h @@ -268,9 +268,12 @@ void _debug_assert_fail(const char *expr, */ #define pipe_debug_message(cb, type, fmt, ...) do { \ static unsigned id = 0; \ - _pipe_debug_message(cb, , \ - PIPE_DEBUG_TYPE_ ## type, \ - fmt, ##__VA_ARGS__); \ + assert(cb); \ + if ((cb)->debug_message) { \ + _pipe_debug_message(cb, , \ + PIPE_DEBUG_TYPE_ ## type, \ + fmt, ##__VA_ARGS__); \ + } \ } while (0) struct pipe_debug_callback; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/3] misc janitorial
> Hmm I thought I pushed those in a couple of days ago. Doing it now, thanks ! No, thank you 8-) >> This is one of those >> contexts in which things could be done marginally more elegantly with >> C++. I am proficient enough with C macros and xincludes that I could >> probably hack together something that avoids the symlinks and the code >> duplication, but whether or not this would be considered an >> improvement is very much up to personal taste. >> > Before you start moving things around please fish around for > RADEON_R(12)00. Once things are free of those (either the macros are > just removed or the functions are split into r100 and r200 variant, > all the movement comes into play. The last one can be tedious if > you're unfamiliar with autotools, so just ping me on IRC (not there > today) and I can guide you around. I've started looking into the defines and it's not just a matter of tediousness. The problem with splitting out the conditional functions is with the callers. Consider for example radeon_map_renderbuffer_s8z24, which uses get_depth_z32, which has different forms for R100 and R200. The caller could go in the common library, since the code is exactly the same, but it would need to know which get_depth_z32 incarnation to invoke. What would be a nice way to approach this? Function pointers? Runtime switch? Code duplication of the entire call chain? Other? -- Giuseppe "Oblomov" Bilotta ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir: Optimize useless comparisons against true/false.
Matt Turnerwrites: > # Written in the form (, ) where is an expression > # and is either an expression or a value. An expression is > # defined as a tuple of the form (, , , , ) > @@ -94,6 +97,8 @@ optimizations = [ > (('inot', ('ige', a, b)), ('ilt', a, b)), > (('inot', ('ieq', a, b)), ('ine', a, b)), > (('inot', ('ine', a, b)), ('ieq', a, b)), > + (('ieq', 'a@bool', true), a), > + (('ine', 'a@bool', false), a), I think this second one is already in the file on line 187 here: # Boolean simplifications (('ine', 'a@bool', 0), 'a'), (('ieq', 'a@bool', 0), ('inot', 'a')), Maybe you could add the first one near these two? It could be good to add a line like this as well to complete the set: (('ine', 'a@bool', true), ('inot', a)), Regards, - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 06/10] i965/fs: Get rid of load/store_foo_indirect
On Wed, Dec 2, 2015 at 12:33 AM, Kenneth Graunkewrote: > On Wednesday, November 25, 2015 09:00:09 PM Jason Ekstrand wrote: >> --- >> src/mesa/drivers/dri/i965/brw_fs.h | 3 +- >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 145 >> ++- >> src/mesa/drivers/dri/i965/brw_nir.c | 36 ++-- >> 3 files changed, 92 insertions(+), 92 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h >> b/src/mesa/drivers/dri/i965/brw_fs.h >> index 2d408b2..b8891dd 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs.h >> +++ b/src/mesa/drivers/dri/i965/brw_fs.h >> @@ -296,8 +296,7 @@ public: >> unsigned stream_id); >> void emit_gs_thread_end(); >> void emit_gs_input_load(const fs_reg , const nir_src _src, >> - const fs_reg _offset, unsigned >> imm_offset, >> - unsigned num_components); >> + const nir_src _src, unsigned >> num_components); >> void emit_cs_terminate(); >> fs_reg *emit_cs_local_invocation_id_setup(); >> fs_reg *emit_cs_work_group_id_setup(); >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> index c439da2..b2e6ee8 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> @@ -1601,25 +1601,26 @@ fs_visitor::emit_gs_vertex(const nir_src >> _count_nir_src, >> void >> fs_visitor::emit_gs_input_load(const fs_reg , >> const nir_src _src, >> - const fs_reg _offset, >> - unsigned imm_offset, >> + const nir_src _src, >> unsigned num_components) >> { >> struct brw_gs_prog_data *gs_prog_data = (struct brw_gs_prog_data *) >> prog_data; >> >> + nir_const_value *vertex_const = nir_src_as_const_value(vertex_src); >> + nir_const_value *offset_const = nir_src_as_const_value(offset_src); >> + const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8; >> + >> /* Offset 0 is the VUE header, which contains VARYING_SLOT_LAYER [.y], >> * VARYING_SLOT_VIEWPORT [.z], and VARYING_SLOT_PSIZ [.w]. Only >> * gl_PointSize is available as a GS input, however, so it must be that. >> */ >> const bool is_point_size = >> - indirect_offset.file == BAD_FILE && imm_offset == 0; >> - >> - nir_const_value *vertex_const = nir_src_as_const_value(vertex_src); >> - const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8; >> + offset_const != NULL && offset_const->u[0] == 0; >> >> - if (indirect_offset.file == BAD_FILE && vertex_const != NULL && >> - 4 * imm_offset < push_reg_count) { >> - imm_offset = 4 * imm_offset + vertex_const->u[0] * push_reg_count; >> + if (offset_const != NULL && vertex_const != NULL && >> + 4 * offset_const->u[0] < push_reg_count) { > > Hmm, I thought you were planning to write a NIR helper that looked if an > expression was ( + ) and return both pieces? As is, > this will generate different code...I used to use the global offset for > the constant part, and the per-slot offsets for only the non-constant > part. I think that can save some ADD instructions. I'll be sending out a V2 that puts the base_offset back in but in a well-defined and sane manner. Eric and rob both want to keep the base+offset pair for opaque storage. >> + int imm_offset = offset_const->u[0] * 4 + >> + vertex_const->u[0] * push_reg_count; >>/* This input was pushed into registers. */ >>if (is_point_size) { >> /* gl_PointSize comes in .w */ >> @@ -1681,21 +1682,21 @@ fs_visitor::emit_gs_input_load(const fs_reg , >>} >> >>fs_inst *inst; >> - if (indirect_offset.file == BAD_FILE) { >> + if (offset_const) { >> /* Constant indexing - use global offset. */ >> inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst, icp_handle); >> - inst->offset = imm_offset; >> + inst->offset = offset_const->u[0]; >> inst->base_mrf = -1; >> inst->mlen = 1; >> inst->regs_written = num_components; >>} else { >> /* Indirect indexing - use per-slot offsets as well. */ >> - const fs_reg srcs[] = { icp_handle, indirect_offset }; >> + const fs_reg srcs[] = { icp_handle, get_nir_src(offset_src) }; >> fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, 2); >> bld.LOAD_PAYLOAD(payload, srcs, ARRAY_SIZE(srcs), 0); >> >> inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dst, >> payload); >> - inst->offset = imm_offset; >> + inst->offset = 0; >> inst->base_mrf = -1; >> inst->mlen = 2; >> inst->regs_written = num_components; >> @@ -1761,16 +1762,11 @@
[Mesa-dev] [PATCH 2/3] winsys/amdgpu: addrlib - port Checks mip 0 for czDispCompatible
From: Sonny JiangChange-Id: I3a6ae6b76b13aa95103f68ca3a2a8d03d5d0aeb1 Signed-off-by: Sonny Jiang Reviewed-by: Alex Deucher --- src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp | 4 +++- src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.h | 3 ++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp b/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp index 110e3d0..088b645 100644 --- a/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp +++ b/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp @@ -352,6 +352,7 @@ BOOL_32 EgBasedAddrLib::ComputeSurfaceInfoMicroTiled( ComputeSurfaceAlignmentsMicroTiled(expTileMode, pIn->bpp, pIn->flags, + pIn->mipLevel, numSamples, >baseAlign, >pitchAlign, @@ -647,6 +648,7 @@ BOOL_32 EgBasedAddrLib::ComputeSurfaceAlignmentsMicroTiled( AddrTileModetileMode, ///< [in] tile mode UINT_32 bpp, ///< [in] bits per pixel ADDR_SURFACE_FLAGS flags, ///< [in] surface flags +UINT_32 mipLevel, ///< [in] mip level UINT_32 numSamples,///< [in] number of samples UINT_32*pBaseAlign,///< [out] base address alignment in bytes UINT_32*pPitchAlign, ///< [out] pitch alignment in pixels @@ -669,7 +671,7 @@ BOOL_32 EgBasedAddrLib::ComputeSurfaceAlignmentsMicroTiled( // ECR#393489 // Workaround 2 for 1D tiling - There is HW bug for Carrizo // where it requires the following alignments for 1D tiling. -if (flags.czDispCompatible) +if (flags.czDispCompatible && (mipLevel == 0)) { *pBaseAlign = PowTwoAlign(*pBaseAlign, 4096); //Base address MOD 4096 = 0 *pPitchAlign = PowTwoAlign(*pPitchAlign, 512 / (BITS_TO_BYTES(bpp))); //(8 lines * pitch * bytes per pixel) MOD 4096 = 0 diff --git a/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.h b/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.h index 84adb66..25e3896 100644 --- a/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.h +++ b/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.h @@ -315,7 +315,8 @@ private: UINT_32* pBaseAlign, UINT_32* pPitchAlign, UINT_32* pHeightAlign) const; BOOL_32 ComputeSurfaceAlignmentsMicroTiled( -AddrTileMode tileMode, UINT_32 bpp, ADDR_SURFACE_FLAGS flags, UINT_32 numSamples, +AddrTileMode tileMode, UINT_32 bpp, ADDR_SURFACE_FLAGS flags, +UINT_32 mipLevel, UINT_32 numSamples, UINT_32* pBaseAlign, UINT_32* pPitchAlign, UINT_32* pHeightAlign) const; BOOL_32 ComputeSurfaceAlignmentsMacroTiled( -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] winsys/amdgpu: addrlib - port fix error for workaround for 1D tiling
From: Sonny JiangChange-Id: I1383d2b2670ed02bc6a6761531ee20f27fd6492f Signed-off-by: Sonny Jiang Reviewed-by: Alex Deucher --- src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp b/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp index b1e008b..110e3d0 100644 --- a/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp +++ b/src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp @@ -672,7 +672,7 @@ BOOL_32 EgBasedAddrLib::ComputeSurfaceAlignmentsMicroTiled( if (flags.czDispCompatible) { *pBaseAlign = PowTwoAlign(*pBaseAlign, 4096); //Base address MOD 4096 = 0 -*pPitchAlign = PowTwoAlign(*pPitchAlign, 512 >> (BITS_TO_BYTES(bpp))); //(8 lines * pitch * bytes per pixel) MOD 4096 = 0 +*pPitchAlign = PowTwoAlign(*pPitchAlign, 512 / (BITS_TO_BYTES(bpp))); //(8 lines * pitch * bytes per pixel) MOD 4096 = 0 } // end Carrizo workaround for 1D tilling -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] winsys/amdgpu: addrlib - port a Fiji bug fix
From: Sonny JiangFiji: Fixed tiled resource failures Change-Id: Idbad4c09ded6dd5cb28d6342f9b04a3345377c82 Signed-off-by: Sonny Jiang Reviewed-by: Alex Deucher --- .../winsys/amdgpu/drm/addrlib/r800/ciaddrlib.cpp | 45 +- .../winsys/amdgpu/drm/addrlib/r800/ciaddrlib.h | 2 + 2 files changed, 46 insertions(+), 1 deletion(-) diff --git a/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.cpp b/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.cpp index 7393953..995f276 100644 --- a/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.cpp +++ b/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.cpp @@ -896,6 +896,49 @@ BOOL_32 CIAddrLib::HwlOverrideTileMode( /** *** +* CiAddrLib::GetPrtSwitchP4Threshold +* +* @brief +* Return the threshold of switching to P4_* instead of P16_* for PRT resources +*** +*/ +UINT_32 CiAddrLib::GetPrtSwitchP4Threshold() const +{ +UINT_32 threshold; + +switch (m_pipes) +{ +case 8: +threshold = 32; +break; +case 16: +if (m_settings.isFiji) +{ +threshold = 16; +} +else if (m_settings.isHawaii) +{ +threshold = 8; +} +else +{ +///@todo add for possible new ASICs. +ADDR_ASSERT_ALWAYS(); +threshold = 16; +} +break; +default: +///@todo add for possible new ASICs. +ADDR_ASSERT_ALWAYS(); +threshold = 32; +break; +} + +return threshold; +} + +/** +*** * CIAddrLib::HwlSetupTileInfo * * @brief @@ -1123,7 +1166,7 @@ VOID CIAddrLib::HwlSetupTileInfo( { UINT_32 bytesXSamples = bpp * numSamples / 8; UINT_32 bytesXThickness = bpp * thickness / 8; -UINT_32 switchP4Threshold = (m_pipes == 16) ? 8 : 32; +UINT_32 switchP4Threshold = GetPrtSwitchP4Threshold(); if ((bytesXSamples > switchP4Threshold) || (bytesXThickness > switchP4Threshold)) { diff --git a/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.h b/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.h index 4515086..4cbe970 100644 --- a/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.h +++ b/src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.h @@ -167,6 +167,8 @@ private: VOID ReadGbMacroTileCfg( UINT_32 regValue, ADDR_TILEINFO* pCfg) const; +UINT_32 GetPrtSwitchP4Threshold() const; + BOOL_32 InitTileSettingTable( const UINT_32 *pSetting, UINT_32 noOfEntries); -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU
https://bugs.freedesktop.org/show_bug.cgi?id=79688 --- Comment #22 from poma--- (In reply to Tobias Klausmann from comment #21) > (In reply to Axel Davy from comment #20) > > Should this bug report be closed ? > > DRI3 has complete DRI_PRIME support now. > > > > As far as it concerns me, DRI3 Support is fine and this bug can be marked as > "resolved fixed". > Do you refer to DRI3 in general, or is it DRI_PRIME? Tobias, while DRI3 enabled, can you run: $ ssh -Y [user@]hostname firefox What is the result? -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU
https://bugs.freedesktop.org/show_bug.cgi?id=79688 --- Comment #23 from poma--- (In reply to Axel Davy from comment #20) > Should this bug report be closed ? > DRI3 has complete DRI_PRIME support now. > Actually, I have some questions, but if you closing, bye bye. > Also I'm not sure why you do post these benchmarks here. > Is there some hidden message about something not working right ? For me these are ordinary tests, no benchmarks, to see is it working, working properly, or not. phoronix.com has some benchmarks. -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] radeonsi: disable DCC on Stoney
From: Marek OlšákCc: 11.0 11.1 --- src/gallium/drivers/radeon/r600_texture.c | 4 1 file changed, 4 insertions(+) diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c index e2947d9..e7b2aea 100644 --- a/src/gallium/drivers/radeon/r600_texture.c +++ b/src/gallium/drivers/radeon/r600_texture.c @@ -497,6 +497,10 @@ static void vi_texture_alloc_dcc_separate(struct r600_common_screen *rscreen, if (rscreen->debug_flags & DBG_NO_DCC) return; + /* TODO: DCC is broken on Stoney */ + if (rscreen->family == CHIP_STONEY) + return; + rtex->dcc_buffer = (struct r600_resource *) r600_aligned_buffer_create(>b, PIPE_BIND_CUSTOM, PIPE_USAGE_DEFAULT, rtex->surface.dcc_size, rtex->surface.dcc_alignment); -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev