[Mesa-dev] [PATCH] r600g: Don't leak bytecode on shader compile failure
From: Michel Dänzer <michel.daen...@amd.com>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74868
Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Michel Dänzer <michel.daen...@amd.com>
---
 src/gallium/drivers/r600/r600_shader.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index ddf79ee..b198359 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -155,7 +155,7 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
 	r = r600_shader_from_tgsi(rctx, shader, key);
 	if (r) {
 		R600_ERR("translation from TGSI failed !\n");
-		return r;
+		goto error;
 	}

 	/* disable SB for geom shaders - it can't handle the CF_EMIT instructions */
@@ -169,7 +169,7 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
 		r = r600_bytecode_build(&shader->shader.bc);
 		if (r) {
 			R600_ERR("building bytecode failed !\n");
-			return r;
+			goto error;
 		}
 	}

@@ -182,7 +182,7 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
 		                             dump, use_sb);
 		if (r) {
 			R600_ERR("r600_sb_bytecode_process failed !\n");
-			return r;
+			goto error;
 		}
 	}

@@ -192,16 +192,16 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
 			r = r600_sb_bytecode_process(rctx, &shader->gs_copy_shader->shader.bc,
 						     &shader->gs_copy_shader->shader, dump, 0);
 			if (r)
-				return r;
+				goto error;
 		}

 		if ((r = store_shader(ctx, shader->gs_copy_shader)))
-			return r;
+			goto error;
 	}

 	/* Store the shader in a buffer. */
 	if ((r = store_shader(ctx, shader)))
-		return r;
+		goto error;

 	/* Build state. */
 	switch (shader->shader.processor_type) {
@@ -235,9 +235,13 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
 		}
 		break;
 	default:
-		return -EINVAL;
+		goto error;
 	}
 	return 0;
+
+error:
+	r600_pipe_shader_destroy(ctx, shader);
+	return r;
 }

 void r600_pipe_shader_destroy(struct pipe_context *ctx, struct r600_pipe_shader *shader)
-- 
1.9.0

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
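The single-exit cleanup pattern this patch introduces can be sketched in isolation as below. This is a minimal illustration, not the r600g code: `struct shader`, `shader_create` and `shader_destroy` are hypothetical names, and plain `malloc` stands in for the bytecode and state allocations. As far as the visible hunks show, the `default:` branch in the posted diff reaches `error` while `r` still holds 0 from the successful `store_shader` call, so the sketch assigns the error code before jumping.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shader object that owns two allocations. */
struct shader {
	void *bytecode;
	void *state;
};

/* Release everything the shader owns; safe on a partially built shader. */
static void shader_destroy(struct shader *s)
{
	free(s->bytecode);
	free(s->state);
	s->bytecode = NULL;
	s->state = NULL;
}

/* Build a shader; on any failure, free what was allocated so far. */
static int shader_create(struct shader *s, int processor_type)
{
	int r = 0;

	s->bytecode = NULL;
	s->state = NULL;

	s->bytecode = malloc(64);
	if (!s->bytecode) {
		r = -ENOMEM;
		goto error;
	}

	switch (processor_type) {
	case 0:
	case 1:
		s->state = malloc(16);
		if (!s->state) {
			r = -ENOMEM;
			goto error;
		}
		break;
	default:
		/* r is still 0 here, so set the error code before jumping. */
		r = -EINVAL;
		goto error;
	}
	return 0;

error:
	shader_destroy(s);
	return r;
}
```

Every failure path funnels through one label, so adding a new allocation only requires extending `shader_destroy`.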
Re: [Mesa-dev] [PATCH 0/5] util: Rework endian handling in python code
José, any chance you could work with Richard to get this in?

On Mon, 2014-04-07 at 11:35 +0100, Richard Sandiford wrote:
> Ping.
>
> Richard Sandiford <rsand...@linux.vnet.ibm.com> writes:
> > Ping (with fixed subject)
> >
> > Richard Sandiford <rsand...@linux.vnet.ibm.com> writes:
> > > This is a refresh of:
> > >
> > >   http://lists.freedesktop.org/archives/mesa-dev/2013-June/040594.html
> > >
> > > At the moment the python code uses sys.byteorder to decide whether
> > > u_format_table.c should be for big or little endian. With this series
> > > it instead generates both forms, using blocks like:
> > >
> > >   #ifdef PIPE_ARCH_BIG_ENDIAN
> > >   ...
> > >   #else
> > >   ...
> > >   #endif
> > >
> > > in cases where endianness matters. Doing it this way is more
> > > cross-compiler-friendly. It also means people working on LE systems
> > > can see what the differences would be for BE.
> > >
> > > Tested on x86_64 and z. I don't have commit access so please apply
> > > if OK.
> > >
> > > Thanks,
> > > Richard

-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer
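To illustrate the "generate both forms" idea from the quoted cover letter, here is a rough sketch of what a generated table entry guarded by PIPE_ARCH_BIG_ENDIAN might look like. Only the macro name comes from the message; `struct channel_shifts`, `rgba8_shifts`, `red_of` and `alpha_of` are hypothetical, and the compiler-macro check at the top is just a stand-in for Mesa's own architecture detection.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for Mesa's own detection (assumption: GCC/Clang predefined macros). */
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__)
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define PIPE_ARCH_BIG_ENDIAN
#endif
#endif

/* Hypothetical table entry: bit position of each channel once the four
 * bytes R, G, B, A in memory have been loaded as one native 32-bit word. */
struct channel_shifts {
   unsigned r, g, b, a;
};

/* Both forms live in the source; the target compiler picks one. */
static const struct channel_shifts rgba8_shifts =
#ifdef PIPE_ARCH_BIG_ENDIAN
   { 24, 16, 8, 0 };
#else
   { 0, 8, 16, 24 };
#endif

/* Load four bytes as a native word and extract one channel. */
static unsigned channel_of(const uint8_t bytes[4], unsigned shift)
{
   uint32_t word;
   memcpy(&word, bytes, sizeof word);
   return (word >> shift) & 0xff;
}

static unsigned red_of(const uint8_t bytes[4])
{
   return channel_of(bytes, rgba8_shifts.r);
}

static unsigned alpha_of(const uint8_t bytes[4])
{
   return channel_of(bytes, rgba8_shifts.a);
}
```

Because the choice is made by the compiler for the target rather than by the build host's Python interpreter, the same generated file works when cross-compiling.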
[Mesa-dev] [Bug 77208] VdpPresentationQueueGetTime does not return a monotonic time
https://bugs.freedesktop.org/show_bug.cgi?id=77208

Christian König <deathsim...@vodafone.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #2 from Christian König <deathsim...@vodafone.de> ---
That's a known issue; it's a design problem of DRI2.

Essentially there is no global time you could return from VdpPresentationQueueGetTime. Instead, what you return is always an estimate based on the vsync counter, and that is unfortunately per output instead of global.

In addition, I never bothered adding the difference between the last flip and the current time to the result of VdpPresentationQueueGetTime. So when you don't have a page flip, the result of VdpPresentationQueueGetTime will probably stand still.

*** This bug has been marked as a duplicate of bug 66384 ***

-- 
You are receiving this mail because:
You are the assignee for the bug.
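The missing step described in the comment (adding the time elapsed since the last flip) could look roughly like the sketch below. This is not the VDPAU state tracker code: `struct flip_estimate` and `estimate_queue_time` are hypothetical, times are assumed to be in nanoseconds, and the final clamp is an extra assumption added here so the returned value never goes backwards when a new per-output estimate is lower than the previous result.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for one presentation queue. */
struct flip_estimate {
   uint64_t flip_time_ns;     /* time derived from the vsync counter at the last flip */
   uint64_t host_at_flip_ns;  /* host monotonic clock sampled at that flip */
   uint64_t last_returned_ns; /* last value handed back to the caller */
};

/* Estimate the current queue time: flip estimate plus host time elapsed since. */
static uint64_t estimate_queue_time(struct flip_estimate *e, uint64_t host_now_ns)
{
   uint64_t elapsed = 0;
   uint64_t t;

   if (host_now_ns > e->host_at_flip_ns)
      elapsed = host_now_ns - e->host_at_flip_ns;

   t = e->flip_time_ns + elapsed;

   /* Clamp so successive calls never return a smaller value. */
   if (t < e->last_returned_ns)
      t = e->last_returned_ns;

   e->last_returned_ns = t;
   return t;
}
```

With this, the returned time keeps advancing between page flips instead of standing still, although it remains an estimate tied to one output.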
Re: [Mesa-dev] [PATCH] docs: Expand ARB_gpu_shader5 to describe status of individual features
On Wed, Apr 9, 2014 at 7:23 AM, Ian Romanick <i...@freedesktop.org> wrote:
> I believe UBO array indices are also dynamically uniform.

I was surprised to find this when building the list too, but I believe it's unrestricted. The GLSL 4.0 spec, section 4.3.7 (bottom of p43), says:

"Any integral expression can be used to index a uniform block array."

The corresponding language for arrays of samplers is:

"...a dynamically uniform integral expression, otherwise results are undefined."
Re: [Mesa-dev] [PATCH 7/7] winsys/radeon: fix a race condition in initialization of radeon_winsys::screen
On 09.04.2014 05:44, Michel Dänzer wrote:
> On Wed, 2014-04-09 at 02:15 +0200, Marek Olšák wrote:
>> From: Marek Olšák <marek.ol...@amd.com>
>>
>> Create the screen in the winsys while the mutex is locked.
>>
>> This also results in a nice code cleanup!
> [...]
>
>> diff --git a/src/gallium/targets/egl-static/egl_pipe.c b/src/gallium/targets/egl-static/egl_pipe.c
>> index eb1cff9..ce734fb 100644
>> --- a/src/gallium/targets/egl-static/egl_pipe.c
>> +++ b/src/gallium/targets/egl-static/egl_pipe.c
>> @@ -119,19 +119,9 @@ pipe_r300_create_screen(int fd)
>>  {
>>  #if _EGL_PIPE_R300
>>     struct radeon_winsys *sws;
>> -   struct pipe_screen *screen;
>> -
>> -   sws = radeon_drm_winsys_create(fd);
>> -   if (!sws)
>> -      return NULL;
>> -
>> -   screen = r300_screen_create(sws);
>> -   if (!screen)
>> -      return NULL;
>>
>> -   screen = debug_screen_wrap(screen);
>> -
>> -   return screen;
>> +   sws = radeon_drm_winsys_create(fd, r300_screen_create);
>> +   return sws ? debug_screen_wrap(sws->screen) : NULL;
> I think it would be clearer to keep this as:
>
>    sws = radeon_drm_winsys_create(fd, r300_screen_create);
>    if (!sws)
>       return NULL;
>
>    return debug_screen_wrap(sws->screen);
>
> Either way though, the series is
>
> Reviewed-by: Michel Dänzer <michel.daen...@amd.com>

I actually like the shorter form, but anyway, thanks a lot for this; it removes quite a bunch of todos from my list.

I didn't think that creating the screen from the winsys was OK with the design, but if you are fine with it I'm not about to complain.

For the series:

Reviewed-by: Christian König <christian.koe...@amd.com>
[Mesa-dev] [PATCH 4/7] radeon/vce: implement B-frame support
From: Christian König <christian.koe...@amd.com>

Signed-off-by: Slava Grigorev <slava.grigo...@amd.com>
Signed-off-by: Christian König <christian.koe...@amd.com>
---
 src/gallium/drivers/radeon/radeon_vce.h        |  2 +-
 src/gallium/drivers/radeon/radeon_vce_40_2_2.c | 73 ++
 2 files changed, 53 insertions(+), 22 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.h b/src/gallium/drivers/radeon/radeon_vce.h
index f815cad..7cc87be 100644
--- a/src/gallium/drivers/radeon/radeon_vce.h
+++ b/src/gallium/drivers/radeon/radeon_vce.h
@@ -45,7 +45,7 @@
 #define RVCE_READWRITE(buf, domain) RVCE_CS(RVCE_RELOC(buf, RADEON_USAGE_READWRITE, domain) * 4)
 #define RVCE_END() *begin = (&enc->cs->buf[enc->cs->cdw] - begin) * 4; }

-#define RVCE_NUM_CPB_FRAMES 2
+#define RVCE_NUM_CPB_FRAMES 3

 struct r600_common_screen;

diff --git a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
index 1327d64..3b67b31 100644
--- a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
+++ b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
@@ -54,6 +54,11 @@ static struct rvce_cpb_slot *l0_slot(struct rvce_encoder *enc)
 	return LIST_ENTRY(struct rvce_cpb_slot, enc->cpb_slots.next, list);
 }

+static struct rvce_cpb_slot *l1_slot(struct rvce_encoder *enc)
+{
+	return LIST_ENTRY(struct rvce_cpb_slot, enc->cpb_slots.next->next, list);
+}
+
 static void frame_offset(struct rvce_encoder *enc, struct rvce_cpb_slot *slot,
 			 unsigned *luma_offset, unsigned *chroma_offset)
 {
@@ -99,8 +104,8 @@ static void create(struct rvce_encoder *enc)

 	RVCE_BEGIN(0x01000001); // create cmd
 	RVCE_CS(0x00000000); // encUseCircularBuffer
-	RVCE_CS(0x00000041); // encProfile
-	RVCE_CS(0x0000000a); // encLevel
+	RVCE_CS(0x0000004d); // encProfile: Main
+	RVCE_CS(0x0000002a); // encLevel: 4.2
 	RVCE_CS(0x00000000); // encPicStructRestriction
 	RVCE_CS(enc->base.width); // encImageWidth
 	RVCE_CS(enc->base.height); // encImageHeight
@@ -175,12 +180,12 @@ static void pic_control(struct rvce_encoder *enc)
 	RVCE_CS(0x00000000); // encSPSID
 	RVCE_CS(0x00000000); // encPPSID
 	RVCE_CS(0x00000040); // encConstraintSetFlags
-	RVCE_CS(0x00000000); // encBPicPattern
+	RVCE_CS(MAX2(enc->base.max_references, 1) - 1); // encBPicPattern
 	RVCE_CS(0x00000000); // weightPredModeBPicture
 	RVCE_CS(MIN2(enc->base.max_references, 2)); // encNumberOfReferenceFrames
 	RVCE_CS(enc->base.max_references + 1); // encMaxNumRefFrames
-	RVCE_CS(0x00000000); // encNumDefaultActiveRefL0
-	RVCE_CS(0x00000000); // encNumDefaultActiveRefL1
+	RVCE_CS(0x00000001); // encNumDefaultActiveRefL0
+	RVCE_CS(0x00000001); // encNumDefaultActiveRefL1
 	RVCE_CS(0x00000000); // encSliceMode
 	RVCE_CS(0x00000000); // encMaxSliceSize
 	RVCE_END();
@@ -275,7 +280,7 @@ static void encode(struct rvce_encoder *enc)
 	RVCE_CS(0x00000000); // encInputPic(Addr|Array)Mode
 	RVCE_CS(0x00000000); // encInputPicTileConfig
 	RVCE_CS(enc->pic.picture_type); // encPicType
-	RVCE_CS(enc->pic.picture_type == 3); // encIdrFlag
+	RVCE_CS(enc->pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_IDR); // encIdrFlag
 	RVCE_CS(0x00000000); // encIdrPicId
 	RVCE_CS(0x00000000); // encMGSKeyPic
 	RVCE_CS(0x00000001); // encReferenceFlag
@@ -283,7 +288,17 @@ static void encode(struct rvce_encoder *enc)
 	RVCE_CS(0x00000000); // num_ref_idx_active_override_flag
 	RVCE_CS(0x00000000); // num_ref_idx_l0_active_minus1
 	RVCE_CS(0x00000000); // num_ref_idx_l1_active_minus1
-	for (i = 0; i < 4; ++i) {
+
+	i = enc->pic.frame_num - enc->pic.ref_idx_l0;
+	if (i > 1 && enc->pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_P) {
+		RVCE_CS(0x00000001); // encRefListModificationOp
+		RVCE_CS(i - 1); // encRefListModificationNum
+	} else {
+		RVCE_CS(0x00000000); // encRefListModificationOp
+		RVCE_CS(0x00000000); // encRefListModificationNum
+	}
+
+	for (i = 0; i < 3; ++i) {
 		RVCE_CS(0x00000000); // encRefListModificationOp
 		RVCE_CS(0x00000000); // encRefListModificationNum
 	}
@@ -291,22 +306,14 @@ static void encode(struct rvce_encoder *enc)
 		RVCE_CS(0x00000000); // encDecodedPictureMarkingOp
 		RVCE_CS(0x00000000); // encDecodedPictureMarkingNum
 		RVCE_CS(0x00000000); // encDecodedPictureMarkingIdx
-	}
-	for (i = 0; i < 4; ++i) {
 		RVCE_CS(0x00000000); // encDecodedRefBasePictureMarkingOp
 		RVCE_CS(0x00000000); // encDecodedRefBasePictureMarkingNum
 	}

+	// encReferencePictureL0[0]
 	RVCE_CS(0x00000000); // pictureStructure
-
-	if (enc->pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_IDR) {
-
[Mesa-dev] [PATCH 5/7] st/omx/enc: separate input buffer private and task structure
From: Christian König <christian.koe...@amd.com>

Keep tasks as a linked list; this way we can associate more than one encoding task with each buffer.

Signed-off-by: Christian König <christian.koe...@amd.com>
---
 src/gallium/state_trackers/omx/vid_enc.c | 182 +--
 src/gallium/state_trackers/omx/vid_enc.h |   4 +
 2 files changed, 127 insertions(+), 59 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_enc.c b/src/gallium/state_trackers/omx/vid_enc.c
index 080730b..88d15a9 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -54,12 +54,18 @@
 #include "entrypoint.h"
 #include "vid_enc.h"

-struct input_buf_private {
+struct encode_task {
+   struct list_head list;
+
    struct pipe_video_buffer *buf;
    struct pipe_resource *bitstream;
    void *feedback;
 };

+struct input_buf_private {
+   struct list_head tasks;
+};
+
 struct output_buf_private {
    struct pipe_resource *bitstream;
    struct pipe_transfer *transfer;
@@ -79,6 +85,8 @@ static OMX_ERRORTYPE vid_enc_AllocateOutBuffer(omx_base_PortType *comp, OMX_INOU
 static OMX_ERRORTYPE vid_enc_FreeOutBuffer(omx_base_PortType *port, OMX_U32 idx, OMX_BUFFERHEADERTYPE *buf);
 static void vid_enc_BufferEncoded(OMX_COMPONENTTYPE *comp, OMX_BUFFERHEADERTYPE* input, OMX_BUFFERHEADERTYPE* output);

+static void enc_ReleaseTasks(struct list_head *head);
+
 static void vid_enc_name(char str[OMX_MAX_STRINGNAME_SIZE])
 {
    snprintf(str, OMX_MAX_STRINGNAME_SIZE, OMX_VID_ENC_BASE_NAME, driver_descriptor.name);
@@ -243,6 +251,9 @@ static OMX_ERRORTYPE vid_enc_Constructor(OMX_COMPONENTTYPE *comp, OMX_STRING nam
    priv->scale.xWidth = OMX_VID_ENC_SCALING_WIDTH_DEFAULT;
    priv->scale.xHeight = OMX_VID_ENC_SCALING_WIDTH_DEFAULT;

+   LIST_INITHEAD(&priv->free_tasks);
+   LIST_INITHEAD(&priv->used_tasks);
+
    return OMX_ErrorNone;
 }

@@ -251,6 +262,9 @@ static OMX_ERRORTYPE vid_enc_Destructor(OMX_COMPONENTTYPE *comp)
    vid_enc_PrivateType* priv = comp->pComponentPrivate;
    int i;

+   enc_ReleaseTasks(&priv->free_tasks);
+   enc_ReleaseTasks(&priv->used_tasks);
+
    if (priv->ports) {
       for (i = 0; i < priv->sPortTypesParam[OMX_PortDomainVideo].nPorts; ++i) {
          if(priv->ports[i])
@@ -563,9 +577,10 @@ static OMX_ERRORTYPE vid_enc_MessageHandler(OMX_COMPONENTTYPE* comp, internalReq
 static OMX_ERRORTYPE vid_enc_FreeInBuffer(omx_base_PortType *port, OMX_U32 idx, OMX_BUFFERHEADERTYPE *buf)
 {
    struct input_buf_private *inp = buf->pInputPortPrivate;
-   pipe_resource_reference(&inp->bitstream, NULL);
-   inp->buf->destroy(inp->buf);
-   FREE(inp);
+   if (inp) {
+      enc_ReleaseTasks(&inp->tasks);
+      FREE(inp);
+   }
    return base_port_FreeBuffer(port, idx, buf);
 }

@@ -607,22 +622,25 @@ static OMX_ERRORTYPE vid_enc_FreeOutBuffer(omx_base_PortType *port, OMX_U32 idx,
    return base_port_FreeBuffer(port, idx, buf);
 }

-static OMX_ERRORTYPE enc_NeedInputPortPrivate(omx_base_PortType *port, OMX_BUFFERHEADERTYPE *buf)
+static struct encode_task *enc_NeedTask(omx_base_PortType *port)
 {
+   OMX_VIDEO_PORTDEFINITIONTYPE *def = &port->sPortParam.format.video;
    OMX_COMPONENTTYPE* comp = port->standCompContainer;
    vid_enc_PrivateType *priv = comp->pComponentPrivate;
-   OMX_VIDEO_PORTDEFINITIONTYPE *def = &port->sPortParam.format.video;
-   struct input_buf_private **inp = (struct input_buf_private **)&buf->pInputPortPrivate;
+
    struct pipe_video_buffer templat = {};
+   struct encode_task *task;

-   if (*inp) {
-      pipe_resource_reference(&(*inp)->bitstream, NULL);
-      return OMX_ErrorNone;
+   if (!LIST_IS_EMPTY(&priv->free_tasks)) {
+      task = LIST_ENTRY(struct encode_task, priv->free_tasks.next, list);
+      LIST_DEL(&task->list);
+      return task;
    }

-   if (!(*inp = CALLOC(1, sizeof(struct input_buf_private)))) {
-      return OMX_ErrorInsufficientResources;
-   }
+   /* allocate a new one */
+   task = CALLOC_STRUCT(encode_task);
+   if (!task)
+      return NULL;

    templat.buffer_format = PIPE_FORMAT_NV12;
    templat.chroma_format = PIPE_VIDEO_CHROMA_FORMAT_420;
@@ -630,25 +648,46 @@ static OMX_ERRORTYPE enc_NeedInputPortPrivate(omx_base_PortType *port, OMX_BUFFE
    templat.height = def->nFrameHeight;
    templat.interlaced = false;

-   if (!((*inp)->buf = priv->s_pipe->create_video_buffer(priv->s_pipe, &templat))) {
-      FREE(*inp);
-      return OMX_ErrorInsufficientResources;
+   task->buf = priv->s_pipe->create_video_buffer(priv->s_pipe, &templat);
+   if (!task->buf) {
+      FREE(task);
+      return NULL;
    }

-   return OMX_ErrorNone;
+   return task;
+}
+
+static void enc_MoveTasks(struct list_head *from, struct list_head *to)
+{
+   to->prev->next = from->next;
+   from->next->prev = to->prev;
+   from->prev->next = to;
+   to->prev = from->prev;
+   LIST_INITHEAD(from);
 }

-static OMX_ERRORTYPE enc_LoadImage(omx_base_PortType *port, OMX_BUFFERHEADERTYPE *buf)
+static void enc_ReleaseTasks(struct list_head *head)
+{
+   struct encode_task
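The pointer surgery in enc_MoveTasks is a splice of one circular doubly linked list onto the tail of another. A self-contained sketch of the same operation follows; the `list_*` helpers here are hypothetical minimal stand-ins for Mesa's list macros, and the early return for an empty source list is an addition of this sketch rather than something taken from the patch.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list; the head is a sentinel node. */
struct list_head {
   struct list_head *prev, *next;
};

static void list_init(struct list_head *head)
{
   head->prev = head;
   head->next = head;
}

static bool list_is_empty(const struct list_head *head)
{
   return head->next == head;
}

/* Append item just before the sentinel, i.e. at the tail. */
static void list_add_tail(struct list_head *item, struct list_head *head)
{
   item->prev = head->prev;
   item->next = head;
   head->prev->next = item;
   head->prev = item;
}

/* Move every entry of 'from' to the tail of 'to', leaving 'from' empty. */
static void list_move_all(struct list_head *from, struct list_head *to)
{
   if (list_is_empty(from))
      return;

   to->prev->next = from->next;   /* old tail of 'to' -> first of 'from' */
   from->next->prev = to->prev;   /* first of 'from' -> old tail of 'to' */
   from->prev->next = to;         /* last of 'from' -> sentinel of 'to' */
   to->prev = from->prev;         /* sentinel of 'to' -> last of 'from' */
   list_init(from);
}

/* Count the entries, not including the sentinel. */
static unsigned list_length(const struct list_head *head)
{
   unsigned n = 0;
   const struct list_head *it;

   for (it = head->next; it != head; it = it->next)
      n++;
   return n;
}
```

The splice costs four pointer writes regardless of how many tasks are moved, which is why moving a whole batch of tasks does not need a per-entry loop.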
[Mesa-dev] [PATCH 1/7] radeon/vce: remove RVCE_NUM_CPB_EXTRA_FRAMES
From: Christian König <christian.koe...@amd.com>

Doesn't seem to be needed any more.

Signed-off-by: Christian König <christian.koe...@amd.com>
---
 src/gallium/drivers/radeon/radeon_vce.c        | 2 +-
 src/gallium/drivers/radeon/radeon_vce.h        | 1 -
 src/gallium/drivers/radeon/radeon_vce_40_2_2.c | 3 +--
 3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.c b/src/gallium/drivers/radeon/radeon_vce.c
index 4b824f9..012b4f8 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -262,7 +262,7 @@ struct pipe_video_codec *rvce_create_encoder(struct pipe_context *context,
 	vpitch = align(tmp_surf->npix_y, 16);
 	tmp_buf->destroy(tmp_buf);
 	if (!rvid_create_buffer(enc->ws, &enc->cpb,
-			pitch * vpitch * 1.5 * (RVCE_NUM_CPB_FRAMES + RVCE_NUM_CPB_EXTRA_FRAMES),
+			pitch * vpitch * 1.5 * RVCE_NUM_CPB_FRAMES,
 			RADEON_DOMAIN_VRAM)) {
 		RVID_ERR("Can't create CPB buffer.\n");
 		goto error;
diff --git a/src/gallium/drivers/radeon/radeon_vce.h b/src/gallium/drivers/radeon/radeon_vce.h
index 9dc0c68..3ea738b 100644
--- a/src/gallium/drivers/radeon/radeon_vce.h
+++ b/src/gallium/drivers/radeon/radeon_vce.h
@@ -44,7 +44,6 @@
 #define RVCE_END() *begin = (&enc->cs->buf[enc->cs->cdw] - begin) * 4; }

 #define RVCE_NUM_CPB_FRAMES 2
-#define RVCE_NUM_CPB_EXTRA_FRAMES 2

 struct r600_common_screen;

diff --git a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
index 26c3629..c41b2d0 100644
--- a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
+++ b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
@@ -224,9 +224,8 @@ static void frame_offset(struct rvce_encoder *enc, unsigned frame_num,
 	unsigned pitch = align(enc->luma->level[0].pitch_bytes, 128);
 	unsigned vpitch = align(enc->luma->npix_y, 16);
 	unsigned fsize = pitch * (vpitch + vpitch / 2);
-	unsigned base_offset = RVCE_NUM_CPB_EXTRA_FRAMES * fsize;

-	*luma_offset = base_offset + (frame_num % RVCE_NUM_CPB_FRAMES) * fsize;
+	*luma_offset = (frame_num % RVCE_NUM_CPB_FRAMES) * fsize;
 	*chroma_offset = *luma_offset + pitch * vpitch;
 }

-- 
1.8.3.2
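The offset arithmetic that remains in frame_offset after this patch can be tried out in isolation. The sketch below mirrors the formula from the diff for an NV12 frame (a luma plane followed by a half-height chroma plane); `align_up`, `cpb_frame_offset` and `NUM_CPB_FRAMES` are hypothetical stand-ins for the driver's `align`, `frame_offset` and `RVCE_NUM_CPB_FRAMES`, and the plain integer parameters replace the surface structure.

```c
#include <assert.h>

/* Stand-in for RVCE_NUM_CPB_FRAMES, which is 2 at this point in the series. */
#define NUM_CPB_FRAMES 2

/* Round value up to the next multiple of a power-of-two alignment. */
static unsigned align_up(unsigned value, unsigned alignment)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	return (value + alignment - 1) & ~(alignment - 1);
}

/* Compute where a frame's luma and chroma planes start inside the CPB buffer. */
static void cpb_frame_offset(unsigned pitch_bytes, unsigned height,
			     unsigned frame_num,
			     unsigned *luma_offset, unsigned *chroma_offset)
{
	unsigned pitch = align_up(pitch_bytes, 128);
	unsigned vpitch = align_up(height, 16);
	/* One frame: full-size luma plane plus half-size chroma plane. */
	unsigned fsize = pitch * (vpitch + vpitch / 2);

	*luma_offset = (frame_num % NUM_CPB_FRAMES) * fsize;
	*chroma_offset = *luma_offset + pitch * vpitch;
}
```

For a 1920x1080 frame with a 1920-byte pitch, the aligned size is 1920 x 1088, one frame takes 3,133,440 bytes, and frames alternate between offsets 0 and 3,133,440.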
[Mesa-dev] [PATCH 2/7] vl: add interface for H264 B-frame encoding
From: Christian König <christian.koe...@amd.com>

Signed-off-by: Christian König <christian.koe...@amd.com>
---
 src/gallium/drivers/radeon/radeon_vce_40_2_2.c | 11 ++++++-----
 src/gallium/include/pipe/p_video_state.h       |  3 +++
 src/gallium/state_trackers/omx/vid_enc.c       |  8 +++++++-
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
index c41b2d0..33a58f3 100644
--- a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
+++ b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
@@ -156,8 +156,8 @@ static void pic_control(struct rvce_encoder *enc)
 	RVCE_CS(0x00000040); // encConstraintSetFlags
 	RVCE_CS(0x00000000); // encBPicPattern
 	RVCE_CS(0x00000000); // weightPredModeBPicture
-	RVCE_CS(0x00000001); // encNumberOfReferenceFrames
-	RVCE_CS(0x00000001); // encMaxNumRefFrames
+	RVCE_CS(MIN2(enc->base.max_references, 2)); // encNumberOfReferenceFrames
+	RVCE_CS(enc->base.max_references + 1); // encMaxNumRefFrames
 	RVCE_CS(0x00000000); // encNumDefaultActiveRefL0
 	RVCE_CS(0x00000000); // encNumDefaultActiveRefL1
 	RVCE_CS(0x00000000); // encSliceMode
@@ -297,8 +297,9 @@ static void encode(struct rvce_encoder *enc)
 		RVCE_CS(0x00000000); // chromaOffset
 	}
 	else if(enc->pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_P) {
-		frame_offset(enc, enc->pic.frame_num - 1, &luma_offset, &chroma_offset);
+		frame_offset(enc, enc->pic.ref_idx_l0, &luma_offset, &chroma_offset);
 		RVCE_CS(0x00000000); // encPicType
+		// TODO: Stores these in the CPB backtrack
 		RVCE_CS(enc->pic.frame_num - 1); // frameNumber
 		RVCE_CS(enc->pic.frame_num - 1); // pictureOrderCount
 		RVCE_CS(luma_offset); // lumaOffset
@@ -322,8 +323,8 @@ static void encode(struct rvce_encoder *enc)
 	RVCE_CS(0x00000000); // encReferenceRefBasePictureLumaOffset
 	RVCE_CS(0x00000000); // encReferenceRefBasePictureChromaOffset
 	RVCE_CS(0x00000000); // pictureCount
-	RVCE_CS(0x00000000); // frameNumber
-	RVCE_CS(0x00000000); // pictureOrderCount
+	RVCE_CS(enc->pic.frame_num); // frameNumber
+	RVCE_CS(enc->pic.pic_order_cnt); // pictureOrderCount
 	RVCE_CS(0x00000000); // numIPicRemainInRCGOP
 	RVCE_CS(0x00000000); // numPPicRemainInRCGOP
 	RVCE_CS(0x00000000); // numBPicRemainInRCGOP
diff --git a/src/gallium/include/pipe/p_video_state.h b/src/gallium/include/pipe/p_video_state.h
index f9721dc..0256a8f 100644
--- a/src/gallium/include/pipe/p_video_state.h
+++ b/src/gallium/include/pipe/p_video_state.h
@@ -368,6 +368,9 @@ struct pipe_h264_enc_picture_desc

    enum pipe_h264_enc_picture_type picture_type;
    unsigned frame_num;
+   unsigned pic_order_cnt;
+   unsigned ref_idx_l0;
+   unsigned ref_idx_l1;
 };

 #ifdef __cplusplus
diff --git a/src/gallium/state_trackers/omx/vid_enc.c b/src/gallium/state_trackers/omx/vid_enc.c
index 8ec0439..080730b 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -769,11 +769,17 @@ static void enc_ControlPicture(omx_base_PortType *port,

    if (!(priv->frame_num % OMX_VID_ENC_IDR_PERIOD_DEFAULT) || priv->force_pic_type.IntraRefreshVOP) {
       picture->picture_type = PIPE_H264_ENC_PICTURE_TYPE_IDR;
+      picture->ref_idx_l0 = 0;
+      picture->ref_idx_l1 = 0;
       priv->frame_num = 0;
-   } else
+   } else {
       picture->picture_type = PIPE_H264_ENC_PICTURE_TYPE_P;
+      picture->ref_idx_l0 = priv->frame_num - 1;
+      picture->ref_idx_l1 = 0;
+   }

    picture->frame_num = priv->frame_num++;
+   picture->pic_order_cnt = picture->frame_num;
    priv->force_pic_type.IntraRefreshVOP = OMX_FALSE;
 }

-- 
1.8.3.2
[Mesa-dev] [PATCH 3/7] radeon/vce: add proper CPB backtrack
From: Christian König <christian.koe...@amd.com>

Remember what frames we encoded at which position.

Signed-off-by: Christian König <christian.koe...@amd.com>
---
 src/gallium/drivers/radeon/radeon_vce.c        | 87 --
 src/gallium/drivers/radeon/radeon_vce.h        | 15 +
 src/gallium/drivers/radeon/radeon_vce_40_2_2.c | 44 -
 3 files changed, 123 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.c b/src/gallium/drivers/radeon/radeon_vce.c
index 012b4f8..a7dfcda 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -80,6 +80,57 @@ static void dump_feedback(struct rvce_encoder *enc, struct rvid_buffer *fb)
 #endif

 /**
+ * reset the CPB handling
+ */
+static void reset_cpb(struct rvce_encoder *enc)
+{
+	unsigned i;
+
+	LIST_INITHEAD(&enc->cpb_slots);
+	for (i = 0; i < RVCE_NUM_CPB_FRAMES; ++i) {
+		struct rvce_cpb_slot *slot = &enc->cpb_array[i];
+		slot->index = i;
+		slot->picture_type = PIPE_H264_ENC_PICTURE_TYPE_SKIP;
+		slot->frame_num = 0;
+		slot->pic_order_cnt = 0;
+		LIST_ADDTAIL(&slot->list, &enc->cpb_slots);
+	}
+}
+
+/**
+ * sort l0 and l1 to the top of the list
+ */
+static void sort_cpb(struct rvce_encoder *enc)
+{
+	struct rvce_cpb_slot *i, *l0 = NULL, *l1 = NULL;
+
+	LIST_FOR_EACH_ENTRY(i, &enc->cpb_slots, list) {
+		if (i->frame_num == enc->pic.ref_idx_l0)
+			l0 = i;
+
+		if (i->frame_num == enc->pic.ref_idx_l1)
+			l1 = i;
+
+		if (enc->pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_P && l0)
+			break;
+
+		if (enc->pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_B &&
+		    l0 && l1)
+			break;
+	}
+
+	if (l1) {
+		LIST_DEL(&l1->list);
+		LIST_ADD(&l1->list, &enc->cpb_slots);
+	}
+
+	if (l0) {
+		LIST_DEL(&l0->list);
+		LIST_ADD(&l0->list, &enc->cpb_slots);
+	}
+}
+
+/**
  * destroy this video encoder
  */
 static void rvce_destroy(struct pipe_video_codec *encoder)
@@ -97,6 +148,7 @@ static void rvce_destroy(struct pipe_video_codec *encoder)
 	}
 	rvid_destroy_buffer(&enc->cpb);
 	enc->ws->cs_destroy(enc->cs);
+	FREE(enc->cpb_array);
 	FREE(enc);
 }

@@ -118,6 +170,12 @@ static void rvce_begin_frame(struct pipe_video_codec *encoder,

 	enc->get_buffer(vid_buf->resources[0], &enc->handle, &enc->luma);
 	enc->get_buffer(vid_buf->resources[1], NULL, &enc->chroma);
+
+	if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_IDR)
+		reset_cpb(enc);
+	else if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_P ||
+	         pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_B)
+		sort_cpb(enc);

 	if (!enc->stream_handle) {
 		struct rvid_buffer fb;
@@ -167,7 +225,17 @@ static void rvce_end_frame(struct pipe_video_codec *encoder,
 			   struct pipe_picture_desc *picture)
 {
 	struct rvce_encoder *enc = (struct rvce_encoder*)encoder;
+	struct rvce_cpb_slot *slot = LIST_ENTRY(
+		struct rvce_cpb_slot, enc->cpb_slots.prev, list);
+
 	flush(enc);
+
+	/* update the CPB backtrack with the just encoded frame */
+	LIST_DEL(&slot->list);
+	slot->picture_type = enc->pic.picture_type;
+	slot->frame_num = enc->pic.frame_num;
+	slot->pic_order_cnt = enc->pic.pic_order_cnt;
+	LIST_ADD(&slot->list, &enc->cpb_slots);
 }

 static void rvce_get_feedback(struct pipe_video_codec *encoder,
@@ -213,7 +281,7 @@ struct pipe_video_codec *rvce_create_encoder(struct pipe_context *context,
 	struct rvce_encoder *enc;
 	struct pipe_video_buffer *tmp_buf, templat = {};
 	struct radeon_surface *tmp_surf;
-	unsigned pitch, vpitch;
+	unsigned cpb_size;

 	if (!rscreen->info.vce_fw_version) {
 		RVID_ERR("Kernel doesn't supports VCE!\n");
@@ -258,16 +326,22 @@ struct pipe_video_codec *rvce_create_encoder(struct pipe_context *context,
 	}

 	get_buffer(((struct vl_video_buffer *)tmp_buf)->resources[0], NULL, &tmp_surf);
-	pitch = align(tmp_surf->level[0].pitch_bytes, 128);
-	vpitch = align(tmp_surf->npix_y, 16);
+	cpb_size = align(tmp_surf->level[0].pitch_bytes, 128);
+	cpb_size = cpb_size * align(tmp_surf->npix_y, 16);
+	cpb_size = cpb_size * 3 / 2;
+	cpb_size = cpb_size * RVCE_NUM_CPB_FRAMES;
 	tmp_buf->destroy(tmp_buf);
-	if (!rvid_create_buffer(enc->ws, &enc->cpb,
-			pitch * vpitch * 1.5 * RVCE_NUM_CPB_FRAMES,
-			RADEON_DOMAIN_VRAM)) {
+	if (!rvid_create_buffer(enc->ws, &enc->cpb, cpb_size, RADEON_DOMAIN_VRAM)) {
 		RVID_ERR("Can't create CPB buffer.\n");
 		goto error;
[Mesa-dev] [PATCH 7/7] st/omx/enc: enable B-frames
From: Christian König <christian.koe...@amd.com>

Signed-off-by: Christian König <christian.koe...@amd.com>
---
 src/gallium/state_trackers/omx/vid_enc.c | 10 +++++++---
 src/gallium/state_trackers/omx/vid_enc.h |  1 +
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_enc.c b/src/gallium/state_trackers/omx/vid_enc.c
index 7633cd6..1e6a189 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -563,7 +563,7 @@ static OMX_ERRORTYPE vid_enc_MessageHandler(OMX_COMPONENTTYPE* comp, internalReq
                             priv->scale.xWidth : port->sPortParam.format.video.nFrameWidth;
          templat.height = priv->scale_buffer[priv->current_scale_buffer] ?
                             priv->scale.xHeight : port->sPortParam.format.video.nFrameHeight;
-         templat.max_references = 1;
+         templat.max_references = OMX_VID_ENC_P_PERIOD_DEFAULT;
          priv->codec = priv->s_pipe->create_video_codec(priv->s_pipe, &templat);

@@ -907,13 +907,17 @@ static OMX_ERRORTYPE vid_enc_EncodeFrame(omx_base_PortType *port, OMX_BUFFERHEAD
    }

    /* -------------- determine picture type --------- */
-   if (!(priv->pic_order_cnt % OMX_VID_ENC_IDR_PERIOD_DEFAULT) || priv->force_pic_type.IntraRefreshVOP) {
+   if (!(priv->pic_order_cnt % OMX_VID_ENC_IDR_PERIOD_DEFAULT) ||
+       priv->force_pic_type.IntraRefreshVOP) {
       enc_ClearBframes(port, inp);
       picture_type = PIPE_H264_ENC_PICTURE_TYPE_IDR;
       priv->force_pic_type.IntraRefreshVOP = OMX_FALSE;
       priv->frame_num = 0;
+   } else if (!(priv->pic_order_cnt % OMX_VID_ENC_P_PERIOD_DEFAULT) ||
+              (buf->nFlags & OMX_BUFFERFLAG_EOS)) {
+      picture_type = PIPE_H264_ENC_PICTURE_TYPE_P;
    } else {
-      picture_type = PIPE_H264_ENC_PICTURE_TYPE_P;
+      picture_type = PIPE_H264_ENC_PICTURE_TYPE_B;
    }

    task->pic_order_cnt = priv->pic_order_cnt++;
diff --git a/src/gallium/state_trackers/omx/vid_enc.h b/src/gallium/state_trackers/omx/vid_enc.h
index 6f6226a..c01c959 100644
--- a/src/gallium/state_trackers/omx/vid_enc.h
+++ b/src/gallium/state_trackers/omx/vid_enc.h
@@ -60,6 +60,7 @@
 #define OMX_VID_ENC_SCALING_WIDTH_DEFAULT 0xffffffff
 #define OMX_VID_ENC_SCALING_HEIGHT_DEFAULT 0xffffffff
 #define OMX_VID_ENC_IDR_PERIOD_DEFAULT 1000
+#define OMX_VID_ENC_P_PERIOD_DEFAULT 4

 #define OMX_VID_ENC_NUM_SCALING_BUFFERS 4

-- 
1.8.3.2
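The picture type decision added by this patch reduces to a small pure function of the display-order counter and two flags. The sketch below restates that decision outside the OMX state tracker; `enum pic_type`, `choose_picture_type` and the two period macros are hypothetical names, with the period values (1000 and 4) taken from the defines in the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the PIPE_H264_ENC_PICTURE_TYPE_* values. */
enum pic_type {
   PIC_TYPE_P,
   PIC_TYPE_B,
   PIC_TYPE_IDR
};

#define IDR_PERIOD 1000   /* mirrors OMX_VID_ENC_IDR_PERIOD_DEFAULT */
#define P_PERIOD   4      /* mirrors OMX_VID_ENC_P_PERIOD_DEFAULT */

/* Pick the type for the frame with the given display-order count. */
static enum pic_type choose_picture_type(unsigned pic_order_cnt,
                                         bool force_idr, bool end_of_stream)
{
   /* Every IDR_PERIOD-th frame, or an explicit request, starts a new IDR. */
   if (pic_order_cnt % IDR_PERIOD == 0 || force_idr)
      return PIC_TYPE_IDR;

   /* Every P_PERIOD-th frame, or the last frame of the stream, is a P frame
    * so that the preceding B frames have a future reference. */
   if (pic_order_cnt % P_PERIOD == 0 || end_of_stream)
      return PIC_TYPE_P;

   return PIC_TYPE_B;
}
```

With these defaults the display-order pattern for frames 0 to 8 is IDR, B, B, B, P, B, B, B, P.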
[Mesa-dev] [PATCH 6/7] st/omx/enc: implement frame reordering and B-frames
From: Christian König <christian.koe...@amd.com>

Signed-off-by: Christian König <christian.koe...@amd.com>
---
 src/gallium/state_trackers/omx/vid_enc.c | 91 +---
 src/gallium/state_trackers/omx/vid_enc.h |  5 +-
 2 files changed, 76 insertions(+), 20 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_enc.c b/src/gallium/state_trackers/omx/vid_enc.c
index 88d15a9..7633cd6 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -58,6 +58,7 @@ struct encode_task {
    struct list_head list;

    struct pipe_video_buffer *buf;
+   unsigned pic_order_cnt;
    struct pipe_resource *bitstream;
    void *feedback;
 };
@@ -247,12 +248,14 @@ static OMX_ERRORTYPE vid_enc_Constructor(OMX_COMPONENTTYPE *comp, OMX_STRING nam

    priv->force_pic_type.IntraRefreshVOP = OMX_FALSE;
    priv->frame_num = 0;
+   priv->pic_order_cnt = 0;

    priv->scale.xWidth = OMX_VID_ENC_SCALING_WIDTH_DEFAULT;
    priv->scale.xHeight = OMX_VID_ENC_SCALING_WIDTH_DEFAULT;

    LIST_INITHEAD(&priv->free_tasks);
    LIST_INITHEAD(&priv->used_tasks);
+   LIST_INITHEAD(&priv->b_frames);

    return OMX_ErrorNone;
 }
@@ -264,6 +267,7 @@ static OMX_ERRORTYPE vid_enc_Destructor(OMX_COMPONENTTYPE *comp)

    enc_ReleaseTasks(&priv->free_tasks);
    enc_ReleaseTasks(&priv->used_tasks);
+   enc_ReleaseTasks(&priv->b_frames);

    if (priv->ports) {
       for (i = 0; i < priv->sPortTypesParam[OMX_PortDomainVideo].nPorts; ++i) {
@@ -803,23 +807,13 @@ static void enc_ControlPicture(omx_base_PortType *port, struct pipe_h264_enc_pic
    picture->quant_p_frames = priv->quant.nQpP;
    picture->quant_b_frames = priv->quant.nQpB;

-   if (!(priv->frame_num % OMX_VID_ENC_IDR_PERIOD_DEFAULT) || priv->force_pic_type.IntraRefreshVOP) {
-      picture->picture_type = PIPE_H264_ENC_PICTURE_TYPE_IDR;
-      picture->ref_idx_l0 = 0;
-      picture->ref_idx_l1 = 0;
-      priv->frame_num = 0;
-   } else {
-      picture->picture_type = PIPE_H264_ENC_PICTURE_TYPE_P;
-      picture->ref_idx_l0 = priv->frame_num - 1;
-      picture->ref_idx_l1 = 0;
-   }
-
-   picture->frame_num = priv->frame_num++;
-   picture->pic_order_cnt = picture->frame_num;
-   priv->force_pic_type.IntraRefreshVOP = OMX_FALSE;
+   picture->frame_num = priv->frame_num;
+   picture->ref_idx_l0 = priv->ref_idx_l0;
+   picture->ref_idx_l1 = priv->ref_idx_l1;
 }

-static void enc_HandleTask(omx_base_PortType *port, struct encode_task *task)
+static void enc_HandleTask(omx_base_PortType *port, struct encode_task *task,
+                           enum pipe_h264_enc_picture_type picture_type)
 {
    OMX_COMPONENTTYPE* comp = port->standCompContainer;
    vid_enc_PrivateType *priv = comp->pComponentPrivate;
@@ -834,6 +828,9 @@ static void enc_HandleTask(omx_base_PortType *port, struct encode_task *task)
    /* -------------- allocate output buffer --------- */
    task->bitstream = pipe_buffer_create(priv->s_pipe->screen, PIPE_BIND_VERTEX_BUFFER,
                                         PIPE_USAGE_STREAM, size);
+
+   picture.picture_type = picture_type;
+   picture.pic_order_cnt = task->pic_order_cnt;
    enc_ControlPicture(port, &picture);

    /* -------------- encode frame --------- */
@@ -842,11 +839,39 @@ static void enc_HandleTask(omx_base_PortType *port, struct encode_task *task)
    priv->codec->end_frame(priv->codec, vbuf, &picture.base);
 }

+static void enc_ClearBframes(omx_base_PortType *port, struct input_buf_private *inp)
+{
+   OMX_COMPONENTTYPE* comp = port->standCompContainer;
+   vid_enc_PrivateType *priv = comp->pComponentPrivate;
+   struct encode_task *task;
+
+   if (LIST_IS_EMPTY(&priv->b_frames))
+      return;
+
+   task = LIST_ENTRY(struct encode_task, priv->b_frames.prev, list);
+   LIST_DEL(&task->list);
+
+   /* promote last from to P frame */
+   priv->ref_idx_l0 = priv->ref_idx_l1;
+   enc_HandleTask(port, task, PIPE_H264_ENC_PICTURE_TYPE_P);
+   LIST_ADDTAIL(&task->list, &inp->tasks);
+   priv->ref_idx_l1 = priv->frame_num++;
+
+   /* handle B frames */
+   LIST_FOR_EACH_ENTRY(task, &priv->b_frames, list) {
+      enc_HandleTask(port, task, PIPE_H264_ENC_PICTURE_TYPE_B);
+      priv->ref_idx_l0 = priv->frame_num++;
+   }
+
+   enc_MoveTasks(&priv->b_frames, &inp->tasks);
+}
+
 static OMX_ERRORTYPE vid_enc_EncodeFrame(omx_base_PortType *port, OMX_BUFFERHEADERTYPE *buf)
 {
    OMX_COMPONENTTYPE* comp = port->standCompContainer;
    vid_enc_PrivateType *priv = comp->pComponentPrivate;
    struct input_buf_private *inp = buf->pInputPortPrivate;
+   enum pipe_h264_enc_picture_type picture_type;
    struct encode_task *task;
    OMX_ERRORTYPE err;

@@ -863,8 +888,10 @@ static OMX_ERRORTYPE vid_enc_EncodeFrame(omx_base_PortType *port, OMX_BUFFERHEAD
       return OMX_ErrorInsufficientResources;

    if (buf->nFilledLen == 0) {
-      if (buf->nFlags & OMX_BUFFERFLAG_EOS)
+      if (buf->nFlags & OMX_BUFFERFLAG_EOS) {
          buf->nFilledLen = buf->nAllocLen;
+         enc_ClearBframes(port, inp);
+      }
       return
[Mesa-dev] [PATCH 04/10] glsl/linker: initialize explicit uniform locations
Patch initializes the UniformRemapTable for explicit locations. This needs
to happen before optimizations to make sure all inactive uniforms get their
explicit locations correctly.

Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
---
 src/glsl/linker.cpp | 99 +
 1 file changed, 99 insertions(+)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 7c194a2..1b4cb63 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -74,6 +74,7 @@
 #include "link_varyings.h"
 #include "ir_optimization.h"
 #include "ir_rvalue_visitor.h"
+#include "ir_uniform.h"
 
 extern "C" {
 #include "main/shaderobj.h"
@@ -2089,6 +2090,100 @@ check_image_resources(struct gl_context *ctx, struct gl_shader_program *prog)
       linker_error(prog, "Too many combined image uniforms and fragment outputs");
 }
 
+
+/**
+ * Initializes explicit location slots point to -1 for a variable,
+ * checks for overlaps between other uniforms using explicit locations.
+ */
+static bool
+reserve_explicit_locations(struct gl_shader_program *prog,
+                           string_to_uint_map *map, ir_variable *var)
+{
+   unsigned max_loc = var->data.location + var->type->component_slots() - 1;
+
+   /* Resize remap table if locations do not fit in the current one. */
+   if (max_loc + 1 > prog->NumUniformRemapTable) {
+      prog->UniformRemapTable =
+         reralloc(prog, prog->UniformRemapTable,
+                  gl_uniform_storage *,
+                  max_loc + 1);
+      prog->NumUniformRemapTable = max_loc + 1;
+   }
+
+   for (unsigned i = 0; i < var->type->component_slots(); i++) {
+      unsigned loc = var->data.location + i;
+
+      /* Check if location is already used. */
+      if (prog->UniformRemapTable[loc] == (gl_uniform_storage *) -1) {
+
+         /* Possibly same uniform from a different stage, this is ok. */
+         unsigned hash_loc;
+         if (map->get(hash_loc, var->name) && hash_loc == loc - i)
+            continue;
+
+         /* ARB_explicit_uniform_location specification states:
+          *
+          *     "No two default-block uniform variables in the program can have
+          *     the same location, even if they are unused, otherwise a compiler
+          *     or linker error will be generated."
+          */
+         linker_error(prog, "location qualifier "
+                      "for uniform %s "
+                      "overlaps previously used location",
+                      var->name);
+         return false;
+      }
+
+      prog->UniformRemapTable[loc] = (gl_uniform_storage *) -1;
+   }
+
+   /* Note, base location used for arrays. */
+   map->put(var->data.location, var->name);
+
+   return true;
+}
+
+/**
+ * Check and reserve all explicit uniform locations, called before
+ * any optimizations happen to handle also inactive uniforms and
+ * inactive array elements that may get trimmed away.
+ */
+static void
+check_explicit_uniform_locations(struct gl_context *ctx,
+                                 struct gl_shader_program *prog)
+{
+   if (!ctx->Extensions.ARB_explicit_uniform_location)
+      return;
+
+   /* This map is used to detect if overlapping explicit locations
+    * occur with the same uniform (from different stage) or a different one.
+    */
+   string_to_uint_map *uniform_map = new string_to_uint_map;
+
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+      struct gl_shader *sh = prog->_LinkedShaders[i];
+
+      if (!sh)
+         continue;
+
+      foreach_list(node, sh->ir) {
+         ir_variable *var = ((ir_instruction *)node)->as_variable();
+         if ((var && var->data.mode == ir_var_uniform) &&
+             var->data.explicit_location) {
+            if (!reserve_explicit_locations(prog, uniform_map, var))
+               return;
+
+            /* Initialize locations that were allocated but left unused. */
+            for (unsigned i = 0; i < prog->NumUniformRemapTable; i++)
+               if (prog->UniformRemapTable[i] != (gl_uniform_storage *) -1)
+                  prog->UniformRemapTable[i] = NULL;
+         }
+      }
+   }
+
+   delete uniform_map;
+}
+
 void
 link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
 {
@@ -2232,6 +2327,10 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
          break;
    }
 
+   check_explicit_uniform_locations(ctx, prog);
+   if (!prog->LinkStatus)
+      goto done;
+
    /* Validate the inputs of each stage with the output of the preceding
     * stage.
     */
--
1.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
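The reservation step this patch describes can be sketched in plain C. This is only an illustration under stated assumptions: `struct storage`, `struct remap_table`, `struct name_map`, `reserve()` and the `RESERVED` macro are hypothetical stand-ins rather than Mesa's types, and the small linear map replaces `string_to_uint_map`. The sketch also clears new slots when the table grows, which is a choice made here and not something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for gl_uniform_storage. */
struct storage { int unused; };

/* Sentinel: slot claimed by an explicit location (uniform may be inactive). */
#define RESERVED ((struct storage *)(intptr_t)-1)

struct remap_table {
    struct storage **slots;   /* one entry per uniform location */
    unsigned count;
};

/* Tiny name -> base location map, standing in for string_to_uint_map. */
#define MAX_NAMES 16
struct name_map {
    const char *names[MAX_NAMES];
    unsigned base[MAX_NAMES];
    unsigned n;
};

static bool map_get(const struct name_map *m, const char *name, unsigned *out)
{
    for (unsigned i = 0; i < m->n; i++) {
        if (strcmp(m->names[i], name) == 0) {
            *out = m->base[i];
            return true;
        }
    }
    return false;
}

static void map_put(struct name_map *m, const char *name, unsigned base)
{
    unsigned prev;
    if (map_get(m, name, &prev))
        return;                      /* already recorded by another stage */
    assert(m->n < MAX_NAMES);        /* sketch only: fixed capacity */
    m->names[m->n] = name;
    m->base[m->n] = base;
    m->n++;
}

/* Reserve `slots` consecutive locations starting at `base` for `name`.
 * Returns false when a slot is already taken by a different uniform. */
static bool reserve(struct remap_table *t, struct name_map *m,
                    const char *name, unsigned base, unsigned slots)
{
    unsigned end = base + slots;

    /* Grow the table if the range does not fit; new slots start empty. */
    if (end > t->count) {
        struct storage **grown = realloc(t->slots, end * sizeof *grown);
        if (!grown)
            return false;
        for (unsigned i = t->count; i < end; i++)
            grown[i] = NULL;
        t->slots = grown;
        t->count = end;
    }

    for (unsigned i = 0; i < slots; i++) {
        unsigned loc = base + i;
        if (t->slots[loc] == RESERVED) {
            /* Same uniform seen again (e.g. from another stage) is fine. */
            unsigned prev;
            if (map_get(m, name, &prev) && prev == base)
                continue;
            return false;            /* overlap with a different uniform */
        }
        t->slots[loc] = RESERVED;
    }

    map_put(m, name, base);          /* base location is used for arrays */
    return true;
}
```

Calling `reserve()` once per explicitly located uniform, per stage, before any dead-code elimination mirrors the ordering argument in the commit message: inactive uniforms still claim their slots.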
[Mesa-dev] [PATCH 00/10] GL_ARB_explicit_uniform_location v2
Hi;

These patches implement the extension. There are no Piglit regressions and
all the tests for the extension pass. Location initialization and assignment
are done as Ian suggested, which removed quite a bit of code since there is
no longer any need to store inactive uniforms temporarily.

Here's a branch with the patches:
http://cgit.freedesktop.org/~tpalli/mesa/log/?h=exp_uniform_loc_v2

// Tapani

Tapani Pälli (10):
  glapi: add GL_ARB_explicit_uniform_location
  mesa: add enable bit for ARB_explicit_uniform_location
  mesa: add new enum MAX_UNIFORM_LOCATIONS
  glsl/linker: initialize explicit uniform locations
  glsl/linker: assign explicit uniform locations
  mesa: support inactive uniforms in glUniform* functions
  glsl: add enable bit for ARB_explicit_uniform_location
  glsl: parser changes for GL_ARB_explicit_uniform_location
  Enable GL_ARB_explicit_uniform_location in the drivers.
  docs: update ARB_explicit_uniform_location status

 docs/GL3.txt                                 |  2 +-
 src/glsl/ast_to_hir.cpp                      | 37 +++
 src/glsl/glcpp/glcpp-parse.y                 |  3 +
 src/glsl/glsl_lexer.ll                       |  1 +
 src/glsl/glsl_parser_extras.cpp              |  1 +
 src/glsl/glsl_parser_extras.h                | 16 +
 src/glsl/ir_uniform.h                        |  5 +-
 src/glsl/link_uniforms.cpp                   | 56 ++--
 src/glsl/linker.cpp                          | 99
 src/mapi/glapi/gen/gl_API.xml                |  6 ++
 src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
 src/mesa/main/context.c                      | 10 ++-
 src/mesa/main/extensions.c                   |  1 +
 src/mesa/main/get.c                          |  1 +
 src/mesa/main/get_hash_params.py             |  1 +
 src/mesa/main/mtypes.h                       |  6 ++
 src/mesa/main/tests/enum_strings.cpp         |  1 +
 src/mesa/main/uniform_query.cpp              | 15 +
 src/mesa/state_tracker/st_extensions.c       |  1 +
 19 files changed, 254 insertions(+), 9 deletions(-)

--
1.9.0
[Mesa-dev] [PATCH 03/10] mesa: add new enum MAX_UNIFORM_LOCATIONS
Patch adds a new implementation-dependent value required by the
GL_ARB_explicit_uniform_location extension. The default value for
user-assignable locations is calculated as the sum of MaxUniformComponents
over all stages.

Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
---
 src/mesa/main/context.c              | 10 +-
 src/mesa/main/get.c                  |  1 +
 src/mesa/main/get_hash_params.py     |  1 +
 src/mesa/main/mtypes.h               |  5 +
 src/mesa/main/tests/enum_strings.cpp |  1 +
 5 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 860ae86..8b77df1 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -610,8 +610,16 @@ _mesa_init_constants(struct gl_context *ctx)
    ctx->Const.MaxUniformBlockSize = 16384;
    ctx->Const.UniformBufferOffsetAlignment = 1;
 
-   for (i = 0; i < MESA_SHADER_STAGES; i++)
+   /* GL_ARB_explicit_uniform_location, initial value calculated
+    * as sum of MaxUniformComponents for each stage.
+    */
+   ctx->Const.MaxUserAssignableUniformLocations = 0;
+
+   for (i = 0; i < MESA_SHADER_STAGES; i++) {
       init_program_limits(ctx, i, &ctx->Const.Program[i]);
+      ctx->Const.MaxUserAssignableUniformLocations +=
+         ctx->Const.Program[i].MaxUniformComponents;
+   }
 
    ctx->Const.MaxProgramMatrices = MAX_PROGRAM_MATRICES;
    ctx->Const.MaxProgramMatrixStackDepth = MAX_PROGRAM_MATRIX_STACK_DEPTH;
diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 6d95790..8b50441 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -395,6 +395,7 @@ EXTRA_EXT(ARB_viewport_array);
 EXTRA_EXT(ARB_compute_shader);
 EXTRA_EXT(ARB_gpu_shader5);
 EXTRA_EXT2(ARB_transform_feedback3, ARB_gpu_shader5);
+EXTRA_EXT(ARB_explicit_uniform_location);
 
 static const int
 extra_ARB_color_buffer_float_or_glcore[] = {
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 06d0bba..5709d42 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -474,6 +474,7 @@ descriptor=[
   [ "MAX_LIST_NESTING", "CONST(MAX_LIST_NESTING)", "NO_EXTRA" ],
   [ "MAX_NAME_STACK_DEPTH", "CONST(MAX_NAME_STACK_DEPTH)", "NO_EXTRA" ],
   [ "MAX_PIXEL_MAP_TABLE", "CONST(MAX_PIXEL_MAP_TABLE)", "NO_EXTRA" ],
+  [ "MAX_UNIFORM_LOCATIONS", "CONTEXT_INT(Const.MaxUserAssignableUniformLocations)", "NO_EXTRA" ],
   [ "NAME_STACK_DEPTH", "CONTEXT_INT(Select.NameStackDepth)", "NO_EXTRA" ],
   [ "PACK_LSB_FIRST", "CONTEXT_BOOL(Pack.LsbFirst)", "NO_EXTRA" ],
   [ "PACK_SWAP_BYTES", "CONTEXT_BOOL(Pack.SwapBytes)", "NO_EXTRA" ],
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 7ac6bbe..fefbe06 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3311,6 +3311,11 @@ struct gl_constants
    GLuint UniformBufferOffsetAlignment;
    /** @} */
 
+   /**
+    * GL_ARB_explicit_uniform_location
+    */
+   GLuint MaxUserAssignableUniformLocations;
+
    /** GL_ARB_geometry_shader4 */
    GLuint MaxGeometryOutputVertices;
    GLuint MaxGeometryTotalOutputComponents;
diff --git a/src/mesa/main/tests/enum_strings.cpp b/src/mesa/main/tests/enum_strings.cpp
index 3795700..298ff6a 100644
--- a/src/mesa/main/tests/enum_strings.cpp
+++ b/src/mesa/main/tests/enum_strings.cpp
@@ -787,6 +787,7 @@ const struct enum_info everything[] = {
    { 0x8256, "GL_RESET_NOTIFICATION_STRATEGY_ARB" },
    { 0x8257, "GL_PROGRAM_BINARY_RETRIEVABLE_HINT" },
    { 0x8261, "GL_NO_RESET_NOTIFICATION_ARB" },
+   { 0x826E, "GL_MAX_UNIFORM_LOCATIONS" },
    { 0x82DF, "GL_TEXTURE_IMMUTABLE_LEVELS" },
    { 0x8362, "GL_UNSIGNED_BYTE_2_3_3_REV" },
    { 0x8363, "GL_UNSIGNED_SHORT_5_6_5" },
--
1.9.0
[Mesa-dev] [PATCH 02/10] mesa: add enable bit for ARB_explicit_uniform_location
Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
---
 src/mesa/main/extensions.c | 1 +
 src/mesa/main/mtypes.h     | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index a72284c..8605189 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -99,6 +99,7 @@ static const struct extension extension_table[] = {
    { "GL_ARB_draw_indirect",               o(ARB_draw_indirect),               GLC, 2010 },
    { "GL_ARB_draw_instanced",              o(ARB_draw_instanced),              GL,  2008 },
    { "GL_ARB_explicit_attrib_location",    o(ARB_explicit_attrib_location),    GL,  2009 },
+   { "GL_ARB_explicit_uniform_location",   o(ARB_explicit_uniform_location),   GL,  2012 },
    { "GL_ARB_fragment_coord_conventions",  o(ARB_fragment_coord_conventions),  GL,  2009 },
    { "GL_ARB_fragment_program",            o(ARB_fragment_program),            GLL, 2002 },
    { "GL_ARB_fragment_program_shadow",     o(ARB_fragment_program_shadow),     GLL, 2003 },
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 4d014d1..7ac6bbe 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3508,6 +3508,7 @@ struct gl_extensions
    GLboolean ARB_fragment_shader;
    GLboolean ARB_framebuffer_object;
    GLboolean ARB_explicit_attrib_location;
+   GLboolean ARB_explicit_uniform_location;
    GLboolean ARB_geometry_shader4;
    GLboolean ARB_gpu_shader5;
    GLboolean ARB_half_float_vertex;
--
1.9.0
[Mesa-dev] [PATCH 01/10] glapi: add GL_ARB_explicit_uniform_location
Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
---
 src/mapi/glapi/gen/gl_API.xml | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 9200cd6..d269d7d 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8312,6 +8312,12 @@
 
 <!-- ARB extensions #128...#131 -->
 
+<category name="GL_ARB_explicit_uniform_location" number="128">
+    <enum name="MAX_UNIFORM_LOCATIONS" count="1" value="0x826E">
+        <size name="Get" mode="get"/>
+    </enum>
+</category>
+
 <xi:include href="ARB_invalidate_subdata.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
 
 <!-- ARB extensions #134...#138 -->
--
1.9.0
[Mesa-dev] [PATCH 06/10] mesa: support inactive uniforms in glUniform* functions
Support inactive uniforms that have explicit location set in glUniform*
functions.

Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
---
 src/mesa/main/uniform_query.cpp | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index 5f1af08..e33800a 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -253,6 +253,21 @@ validate_uniform_parameters(struct gl_context *ctx,
       return false;
    }
 
+   /* If the driver storage pointer in remap table is -1, we ignore silently.
+    *
+    * GL_ARB_explicit_uniform_location spec says:
+    *     "What happens if Uniform* is called with an explicitly defined
+    *     uniform location, but that uniform is deemed inactive by the
+    *     linker?
+    *
+    *     RESOLVED: The call is ignored for inactive uniform variables and
+    *     no error is generated."
+    *
+    */
+   if (ctx->Extensions.ARB_explicit_uniform_location &&
+       shProg->UniformRemapTable[location] == (gl_uniform_storage *) -1)
+      return false;
+
    _mesa_uniform_split_location_offset(shProg, location, loc, array_index);
 
    if (shProg->UniformStorage[*loc].array_elements == 0 && count > 1) {
--
1.9.0
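The behaviour described here, a silent no-op for an explicitly located but inactive uniform, can be sketched as follows. The names (`struct storage`, `INACTIVE_EXPLICIT`, `set_uniform`, `enum set_result`) are hypothetical and chosen for illustration; this is not the `validate_uniform_parameters()` code path, only a minimal model of the three outcomes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-uniform driver storage. */
struct storage { float value; };

/* Sentinel for a location claimed by an inactive, explicitly located uniform. */
#define INACTIVE_EXPLICIT ((struct storage *)(intptr_t)-1)

enum set_result { SET_OK, SET_IGNORED, SET_INVALID };

/* Model of a glUniform1f-style call against a remap table. */
static enum set_result set_uniform(struct storage **table, unsigned count,
                                   int location, float v)
{
    assert(table != NULL);

    /* Location -1 is silently ignored by the Uniform* commands. */
    if (location == -1)
        return SET_IGNORED;

    /* Out of range or never assigned: treated as an error in this sketch. */
    if (location < 0 || (unsigned)location >= count || table[location] == NULL)
        return SET_INVALID;

    /* Explicit location whose uniform was optimized away: ignore, no error. */
    if (table[location] == INACTIVE_EXPLICIT)
        return SET_IGNORED;

    table[location]->value = v;
    return SET_OK;
}
```

The sentinel has to be checked before the pointer is dereferenced, which is why the patch places its test ahead of `_mesa_uniform_split_location_offset()`.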
[Mesa-dev] [PATCH 07/10] glsl: add enable bit for ARB_explicit_uniform_location
Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
---
 src/glsl/glsl_parser_extras.cpp | 1 +
 src/glsl/glsl_parser_extras.h   | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index a42f3d2..d6415ab 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -505,6 +505,7 @@ static const _mesa_glsl_extension _mesa_glsl_supported_extensions[] = {
    EXT(ARB_draw_buffers,               true,  false, dummy_true),
    EXT(ARB_draw_instanced,             true,  false, ARB_draw_instanced),
    EXT(ARB_explicit_attrib_location,   true,  false, ARB_explicit_attrib_location),
+   EXT(ARB_explicit_uniform_location,  true,  false, ARB_explicit_uniform_location),
    EXT(ARB_fragment_coord_conventions, true,  false, ARB_fragment_coord_conventions),
    EXT(ARB_texture_rectangle,          true,  false, dummy_true),
    EXT(EXT_texture_array,              true,  false, EXT_texture_array),
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 3ad205c..c53c583 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -345,6 +345,8 @@ struct _mesa_glsl_parse_state {
    bool ARB_draw_instanced_warn;
    bool ARB_explicit_attrib_location_enable;
    bool ARB_explicit_attrib_location_warn;
+   bool ARB_explicit_uniform_location_enable;
+   bool ARB_explicit_uniform_location_warn;
    bool ARB_fragment_coord_conventions_enable;
    bool ARB_fragment_coord_conventions_warn;
    bool ARB_texture_rectangle_enable;
--
1.9.0
[Mesa-dev] [PATCH 09/10] Enable GL_ARB_explicit_uniform_location in the drivers.
Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
 src/mesa/state_tracker/st_extensions.c       | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 15fcd30..f8abf98 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -170,6 +170,7 @@ intelInitExtensions(struct gl_context *ctx)
    ctx->Extensions.ARB_draw_instanced = true;
    ctx->Extensions.ARB_ES2_compatibility = true;
    ctx->Extensions.ARB_explicit_attrib_location = true;
+   ctx->Extensions.ARB_explicit_uniform_location = true;
    ctx->Extensions.ARB_fragment_coord_conventions = true;
    ctx->Extensions.ARB_fragment_program = true;
    ctx->Extensions.ARB_fragment_program_shadow = true;
diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index 3e1e45d..5b11e7b 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -534,6 +534,7 @@ void st_init_extensions(struct st_context *st)
    ctx->Extensions.ARB_ES2_compatibility = GL_TRUE;
    ctx->Extensions.ARB_draw_elements_base_vertex = GL_TRUE;
    ctx->Extensions.ARB_explicit_attrib_location = GL_TRUE;
+   ctx->Extensions.ARB_explicit_uniform_location = GL_TRUE;
    ctx->Extensions.ARB_fragment_coord_conventions = GL_TRUE;
    ctx->Extensions.ARB_fragment_program = GL_TRUE;
    ctx->Extensions.ARB_fragment_shader = GL_TRUE;
--
1.9.0
[Mesa-dev] [PATCH 08/10] glsl: parser changes for GL_ARB_explicit_uniform_location
Patch adds a preprocessor define for the extension and stores explicit
location data for uniforms during AST->HIR conversion. It also makes the
layout token available when the extension is enabled.

Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
---
 src/glsl/ast_to_hir.cpp       | 37 +
 src/glsl/glcpp/glcpp-parse.y  |  3 +++
 src/glsl/glsl_lexer.ll        |  1 +
 src/glsl/glsl_parser_extras.h | 14 ++
 4 files changed, 55 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 8d55ee3..7431ad7 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2170,6 +2170,43 @@ validate_explicit_location(const struct ast_type_qualifier *qual,
 {
    bool fail = false;
 
+   /* Checks for GL_ARB_explicit_uniform_location. */
+   if (qual->flags.q.uniform) {
+
+      if (!state->check_explicit_uniform_location_allowed(loc, var))
+         return;
+
+      const struct gl_context *const ctx = state->ctx;
+      unsigned max_loc = qual->location + var->type->component_slots() - 1;
+
+      /* ARB_explicit_uniform_location specification states:
+       *
+       *     "The explicitly defined locations and the generated locations
+       *     must be in the range of 0 to MAX_UNIFORM_LOCATIONS minus one."
+       *
+       *     "Valid locations for default-block uniform variable locations
+       *     are in the range of 0 to the implementation-defined maximum
+       *     number of uniform locations."
+       */
+      if (qual->location < 0) {
+         _mesa_glsl_error(loc, state,
+                          "explicit location < 0 for uniform %s", var->name);
+         return;
+      }
+
+      if (max_loc >= ctx->Const.MaxUserAssignableUniformLocations) {
+         _mesa_glsl_error(loc, state, "location qualifier for uniform %s "
+                          ">= MAX_UNIFORM_LOCATIONS (%u)",
+                          var->name,
+                          ctx->Const.MaxUserAssignableUniformLocations);
+         return;
+      }
+
+      var->data.explicit_location = true;
+      var->data.location = qual->location;
+      return;
+   }
+
    /* Between GL_ARB_explicit_attrib_location an
     * GL_ARB_separate_shader_objects, the inputs and outputs of any shader
     * stage can be assigned explicit locations.  The checking here associates
diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index f28d853..6d42138 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -2087,6 +2087,9 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
          if (extensions->ARB_explicit_attrib_location)
             add_builtin_define(parser, "GL_ARB_explicit_attrib_location", 1);
 
+         if (extensions->ARB_explicit_uniform_location)
+            add_builtin_define(parser, "GL_ARB_explicit_uniform_location", 1);
+
          if (extensions->ARB_shader_texture_lod)
             add_builtin_define(parser, "GL_ARB_shader_texture_lod", 1);
 
diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll
index 7602351..83f0b6d 100644
--- a/src/glsl/glsl_lexer.ll
+++ b/src/glsl/glsl_lexer.ll
@@ -393,6 +393,7 @@ layout		{
 		      || yyextra->AMD_conservative_depth_enable
 		      || yyextra->ARB_conservative_depth_enable
 		      || yyextra->ARB_explicit_attrib_location_enable
+		      || yyextra->ARB_explicit_uniform_location_enable
 		      || yyextra->has_separate_shader_objects()
 		      || yyextra->ARB_uniform_buffer_object_enable
 		      || yyextra->ARB_fragment_coord_conventions_enable
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index c53c583..20879a0 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -152,6 +152,20 @@ struct _mesa_glsl_parse_state {
       return true;
    }
 
+   bool check_explicit_uniform_location_allowed(YYLTYPE *locp,
+                                                const ir_variable *var)
+   {
+      /* Requires OpenGL 3.3 or ARB_explicit_attrib_location. */
+      if (ctx->Version < 33 && !ctx->Extensions.ARB_explicit_attrib_location) {
+         _mesa_glsl_error(locp, this, "%s explicit location requires "
+                          "GL_ARB_explicit_attrib_location extension "
+                          "or OpenGL 3.3", mode_string(var));
+         return false;
+      }
+
+      return true;
+   }
+
    bool has_explicit_attrib_location() const
    {
       return ARB_explicit_attrib_location_enable || is_version(330, 300);
--
1.9.0
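The range check added to validate_explicit_location() boils down to a small predicate. A minimal sketch, assuming a hypothetical helper name `location_in_range` and plain integer arguments in place of the qualifier, type and context structures:

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when a uniform occupying `slots` consecutive locations
 * starting at `location` fits in [0, max_locations - 1]. */
static bool location_in_range(int location, unsigned slots,
                              unsigned max_locations)
{
    assert(slots > 0);            /* every uniform takes at least one slot */

    if (location < 0)
        return false;             /* "explicit location < 0" case */

    /* Highest location the uniform touches, as in the patch's max_loc. */
    unsigned max_loc = (unsigned)location + slots - 1;
    return max_loc < max_locations;
}
```

For example, with a limit of 4 locations, a two-slot uniform at location 3 is rejected because it would also occupy location 4.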
[Mesa-dev] [PATCH 05/10] glsl/linker: assign explicit uniform locations
Patch refactors the existing uniform processing so explicit locations are
taken into account during variable processing. These locations are
temporarily stored in gl_uniform_storage before the actual locations are set.

The 'remap_location' variable in gl_uniform_storage is changed to be signed
so that we can use 0 as a valid explicit location and '-1' as an identifier
that no explicit location has been defined.

When locations are set, UniformRemapTable is first populated with uniforms
that have an explicit location set (inactive and active ones); the rest are
placed after the explicit location slots.

Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
---
 src/glsl/ir_uniform.h      |  5 +++--
 src/glsl/link_uniforms.cpp | 56 +-
 2 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/src/glsl/ir_uniform.h b/src/glsl/ir_uniform.h
index 3508509..9dc4a8e 100644
--- a/src/glsl/ir_uniform.h
+++ b/src/glsl/ir_uniform.h
@@ -181,9 +181,10 @@ struct gl_uniform_storage {
 
    /**
     * The 'base location' for this uniform in the uniform remap table. For
-    * arrays this is the first element in the array.
+    * arrays this is the first element in the array. It needs to be signed
+    * so that we can use 0 as valid location and -1 as initial value
     */
-   unsigned remap_location;
+   int remap_location;
 };
 
 #ifdef __cplusplus
diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
index 29dc0b1..0f99082 100644
--- a/src/glsl/link_uniforms.cpp
+++ b/src/glsl/link_uniforms.cpp
@@ -387,6 +387,9 @@ public:
    void set_and_process(struct gl_shader_program *prog,
                         ir_variable *var)
    {
+      current_var = var;
+      field_counter = 0;
+
       ubo_block_index = -1;
       if (var->is_in_uniform_block()) {
          if (var->is_interface_instance() && var->type->is_array()) {
@@ -543,6 +546,22 @@ private:
          return;
       }
 
+      /* Assign explicit locations. */
+      if (current_var->data.explicit_location) {
+         /* Set sequential locations for struct fields. */
+         if (current_var->type->is_record()) {
+            const unsigned entries = MAX2(1, this->uniforms[id].array_elements);
+            this->uniforms[id].remap_location =
+               current_var->data.location + field_counter;
+            field_counter += entries;
+         } else {
+            this->uniforms[id].remap_location = current_var->data.location;
+         }
+      } else {
+         /* Initialize to -1 to indicate that no explicit location is set */
+         this->uniforms[id].remap_location = -1;
+      }
+
       this->uniforms[id].name = ralloc_strdup(this->uniforms, name);
       this->uniforms[id].type = base_type;
       this->uniforms[id].initialized = 0;
@@ -598,6 +617,17 @@ public:
    gl_texture_index targets[MAX_SAMPLERS];
 
    /**
+    * Current variable being processed.
+    */
+   ir_variable *current_var;
+
+   /**
+    * Field counter is used to take care that uniform structures
+    * with explicit locations get sequential locations.
+    */
+   unsigned field_counter;
+
+   /**
     * Mask of samplers used by the current shader stage.
     */
    unsigned shader_samplers_used;
@@ -799,10 +829,6 @@ link_assign_uniform_locations(struct gl_shader_program *prog)
    prog->UniformStorage = NULL;
    prog->NumUserUniformStorage = 0;
 
-   ralloc_free(prog->UniformRemapTable);
-   prog->UniformRemapTable = NULL;
-   prog->NumUniformRemapTable = 0;
-
    if (prog->UniformHash != NULL) {
       prog->UniformHash->clear();
    } else {
@@ -915,9 +941,29 @@ link_assign_uniform_locations(struct gl_shader_program *prog)
              sizeof(prog->_LinkedShaders[i]->SamplerTargets));
    }
 
-   /* Build the uniform remap table that is used to set/get uniform locations */
+   /* Reserve all the explicit locations of the active uniforms. */
+   for (unsigned i = 0; i < num_user_uniforms; i++) {
+      if (uniforms[i].remap_location != -1) {
+         /* How many new entries for this uniform? */
+         const unsigned entries = MAX2(1, uniforms[i].array_elements);
+
+         /* Set remap table entries point to correct gl_uniform_storage. */
+         for (unsigned j = 0; j < entries; j++) {
+            unsigned element_loc = uniforms[i].remap_location + j;
+            assert(prog->UniformRemapTable[element_loc] ==
+                   (gl_uniform_storage *) -1);
+            prog->UniformRemapTable[element_loc] = &uniforms[i];
+         }
+      }
+   }
+
+   /* Reserve locations for rest of the uniforms. */
    for (unsigned i = 0; i < num_user_uniforms; i++) {
 
+      /* Explicit ones have been set already. */
+      if (uniforms[i].remap_location != -1)
+         continue;
+
       /* how many new entries for this uniform? */
       const unsigned entries = MAX2(1, uniforms[i].array_elements);
 
--
1.9.0
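The two-pass ordering described in the commit message (explicit locations first, everything else appended afterwards) can be sketched in isolation. The types and names below (`struct uniform`, `assign_locations`, an `owner` table holding uniform indices) are assumptions made for the example; the real remap table stores `gl_uniform_storage` pointers and is grown with `reralloc`.

```c
#include <assert.h>

/* Hypothetical, reduced view of a uniform. */
struct uniform {
    int remap_location;       /* -1 means no explicit location was given */
    unsigned array_elements;  /* 0 for non-arrays */
};

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* owner[loc] receives the index of the uniform that owns location loc.
 * `reserved` is the number of slots already claimed by explicit locations
 * (active or not); `capacity` is the size of the owner array.
 * Returns the number of locations in use afterwards. */
static unsigned assign_locations(struct uniform *u, unsigned n, int *owner,
                                 unsigned reserved, unsigned capacity)
{
    /* Pass 1: uniforms with an explicit location fill their reserved slots. */
    for (unsigned i = 0; i < n; i++) {
        if (u[i].remap_location == -1)
            continue;
        const unsigned entries = MAX2(1u, u[i].array_elements);
        for (unsigned j = 0; j < entries; j++) {
            unsigned loc = (unsigned)u[i].remap_location + j;
            assert(loc < reserved);
            owner[loc] = (int)i;
        }
    }

    /* Pass 2: the remaining uniforms are appended after the reserved range. */
    unsigned next = reserved;
    for (unsigned i = 0; i < n; i++) {
        if (u[i].remap_location != -1)
            continue;             /* explicit ones were handled in pass 1 */
        const unsigned entries = MAX2(1u, u[i].array_elements);
        assert(next + entries <= capacity);
        u[i].remap_location = (int)next;
        for (unsigned j = 0; j < entries; j++)
            owner[next + j] = (int)i;
        next += entries;
    }
    return next;
}
```

Keeping `remap_location` signed is what allows location 0 to be a legitimate explicit location while -1 marks "not assigned yet", as the commit message notes.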
[Mesa-dev] [PATCH 10/10] docs: update ARB_explicit_uniform_location status
Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
---
 docs/GL3.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index bf51e3a..245a045 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -148,7 +148,7 @@ GL 4.3:
   GL_ARB_compute_shader                                started (Paul Berry)
   GL_ARB_copy_image                                    not started
   GL_KHR_debug                                         DONE (all drivers)
-  GL_ARB_explicit_uniform_location                     not started
+  GL_ARB_explicit_uniform_location                     DONE (all drivers that support GLSL)
   GL_ARB_fragment_layer_viewport                       not started
   GL_ARB_framebuffer_no_attachments                    not started
   GL_ARB_internalformat_query2                         not started
--
1.9.0
Re: [Mesa-dev] [PATCH 03/10] mesa: add new enum MAX_UNIFORM_LOCATIONS
On 04/09/2014 12:56 PM, Tapani Pälli wrote:
> diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
> index 06d0bba..5709d42 100644
> --- a/src/mesa/main/get_hash_params.py
> +++ b/src/mesa/main/get_hash_params.py
> @@ -474,6 +474,7 @@ descriptor=[
>    [ "MAX_LIST_NESTING", "CONST(MAX_LIST_NESTING)", "NO_EXTRA" ],
>    [ "MAX_NAME_STACK_DEPTH", "CONST(MAX_NAME_STACK_DEPTH)", "NO_EXTRA" ],
>    [ "MAX_PIXEL_MAP_TABLE", "CONST(MAX_PIXEL_MAP_TABLE)", "NO_EXTRA" ],
> +  [ "MAX_UNIFORM_LOCATIONS", "CONTEXT_INT(Const.MaxUserAssignableUniformLocations)", "NO_EXTRA" ],
>    [ "NAME_STACK_DEPTH", "CONTEXT_INT(Select.NameStackDepth)", "NO_EXTRA" ],
>    [ "PACK_LSB_FIRST", "CONTEXT_BOOL(Pack.LsbFirst)", "NO_EXTRA" ],
>    [ "PACK_SWAP_BYTES", "CONTEXT_BOOL(Pack.SwapBytes)", "NO_EXTRA" ],

Should that NO_EXTRA be extra_ARB_explicit_uniform_location?

--
Petri Latvala
Re: [Mesa-dev] [PATCH] r600g: Don't leak bytecode on shader compile failure
Reviewed-by: Marek Olšák <marek.ol...@amd.com>

Marek

On Wed, Apr 9, 2014 at 8:39 AM, Michel Dänzer <mic...@daenzer.net> wrote:
> From: Michel Dänzer <michel.daen...@amd.com>
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74868
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Michel Dänzer <michel.daen...@amd.com>
> ---
>  src/gallium/drivers/r600/r600_shader.c | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
> index ddf79ee..b198359 100644
> --- a/src/gallium/drivers/r600/r600_shader.c
> +++ b/src/gallium/drivers/r600/r600_shader.c
> @@ -155,7 +155,7 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
>         r = r600_shader_from_tgsi(rctx, shader, key);
>         if (r) {
>                 R600_ERR("translation from TGSI failed !\n");
> -               return r;
> +               goto error;
>         }
>
>         /* disable SB for geom shaders - it can't handle the CF_EMIT instructions */
> @@ -169,7 +169,7 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
>                 r = r600_bytecode_build(&shader->shader.bc);
>                 if (r) {
>                         R600_ERR("building bytecode failed !\n");
> -                       return r;
> +                       goto error;
>                 }
>         }
>
> @@ -182,7 +182,7 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
>                                              dump, use_sb);
>                 if (r) {
>                         R600_ERR("r600_sb_bytecode_process failed !\n");
> -                       return r;
> +                       goto error;
>                 }
>         }
>
> @@ -192,16 +192,16 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
>                         r = r600_sb_bytecode_process(rctx, &shader->gs_copy_shader->shader.bc,
>                                                      &shader->gs_copy_shader->shader, dump, 0);
>                         if (r)
> -                               return r;
> +                               goto error;
>                 }
>
>                 if ((r = store_shader(ctx, shader->gs_copy_shader)))
> -                       return r;
> +                       goto error;
>         }
>
>         /* Store the shader in a buffer. */
>         if ((r = store_shader(ctx, shader)))
> -               return r;
> +               goto error;
>
>         /* Build state. */
>         switch (shader->shader.processor_type) {
> @@ -235,9 +235,13 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
>                 }
>                 break;
>         default:
> -               return -EINVAL;
> +               goto error;
>         }
>         return 0;
> +
> +error:
> +       r600_pipe_shader_destroy(ctx, shader);
> +       return r;
>  }
>
>  void r600_pipe_shader_destroy(struct pipe_context *ctx, struct r600_pipe_shader *shader)
> --
> 1.9.0
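For readers less familiar with the idiom, the single cleanup label used in the quoted patch can be sketched on a toy object. Everything here (`struct shader`, `build_bytecode`, `store`, `shader_destroy`, `shader_create`) is hypothetical and only mirrors the shape of the change. One detail worth checking in the patch as quoted: the `default:` case jumps to `error` without assigning `r` first, so `r` may still hold 0 from the preceding successful `store_shader()` call. The sketch sets the error code explicitly before the jump.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical shader object owning two heap allocations. */
struct shader {
    void *bytecode;
    void *buffer;
    int type;
};

static int build_bytecode(struct shader *s)
{
    s->bytecode = malloc(64);
    return s->bytecode ? 0 : -ENOMEM;
}

/* `fail` simulates a failing upload so the error path can be exercised. */
static int store(struct shader *s, int fail)
{
    if (fail)
        return -ENOMEM;
    s->buffer = malloc(16);
    return s->buffer ? 0 : -ENOMEM;
}

static void shader_destroy(struct shader *s)
{
    free(s->bytecode);
    free(s->buffer);
    s->bytecode = NULL;
    s->buffer = NULL;
}

static int shader_create(struct shader *s, int fail_store)
{
    int r;

    s->bytecode = NULL;
    s->buffer = NULL;

    r = build_bytecode(s);
    if (r)
        goto error;

    r = store(s, fail_store);
    if (r)
        goto error;

    switch (s->type) {
    case 0:
    case 1:
        break;
    default:
        r = -EINVAL;      /* set the code before jumping to the shared label */
        goto error;
    }
    return 0;

error:
    /* One place releases everything allocated so far. */
    shader_destroy(s);
    return r;
}
```

With early `return r;` statements instead of the label, the bytecode allocated in the first step would be leaked whenever a later step failed, which is the situation the Bugzilla entry describes.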
[Mesa-dev] [Bug 77240] New: khrplatform.h not installed if EGL is disabled
https://bugs.freedesktop.org/show_bug.cgi?id=77240

          Priority: medium
            Bug ID: 77240
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: khrplatform.h not installed if EGL is disabled
          Severity: normal
    Classification: Unclassified
                OS: Linux (All)
          Reporter: eric.le.bihan@free.fr
          Hardware: All
            Status: NEW
           Version: unspecified
         Component: Other
           Product: Mesa

Created attachment 97136
  --> https://bugs.freedesktop.org/attachment.cgi?id=97136&action=edit
Patch to fix missing khrplatform.h

KHR/khrplatform.h is required by the EGL, GLES and VG headers, but is only
installed if Mesa3d is compiled with EGL support.

Configuring with

  $ ./configure --disable-egl --enable-gles1 --enable-gles2 ...

will result in an incomplete header set.

When compiling Cairo with OpenGLESv2 support, the build will fail because of
the missing header:

  /usr/include/GLES2/gl2platform.h:20:29: fatal error: KHR/khrplatform.h: No such file or directory

The attached patch fixes the issue for me.

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 77208] VdpPresentationQueueGetTime does not return a monotonic time
https://bugs.freedesktop.org/show_bug.cgi?id=77208

--- Comment #3 from Andy Furniss <adf.li...@gmail.com> ---
(In reply to comment #1)
> Oh, it seems the pausing issue could be caused by interaction with power
> management. This is what a user posted:
>
> When polling '/sys/kernel/debug/dri/0/radeon_pm_info' you can see that this
> only happens when the power level switches from an UVD power level to a
> non-UVD power level.
>
> Pause -> 1-2 seconds -> non-UVD power level -> Play -> Stutter

Not for me though, my HD4890 doesn't have uvd, and I just tried forcing dpm
to low and high rather than auto and the pausing issue is still there with
both.
[Mesa-dev] [Bug 76856] -Wl, --no-undefined gives undefined references to libc symbols on OpenBSD
https://bugs.freedesktop.org/show_bug.cgi?id=76856

Emil Velikov <emil.l.veli...@gmail.com> changed:

           What    |Removed |Added
             Status|NEW     |RESOLVED
         Resolution|---     |FIXED

--- Comment #8 from Emil Velikov <emil.l.veli...@gmail.com> ---
Pushed to master

commit 11623be934f8573910484de2a5fb50c95f0a1d44
Author: Jonathan Gray <j...@jsg.id.au>
Date:   Thu Apr 3 15:46:01 2014 +1100

    automake: don't enable -Wl,--no-undefined on OpenBSD

    OpenBSD does not have DT_NEEDED entries for libc by design, over
    concerns how the symbols would be referenced after changing the major
    version of the library. So avoid -no-undefined checks on OpenBSD as
    they will fail.

    v2: don't include the -no-undefined libtool option in the variable and
    change -Wl,--no-undefined references in Automake.inc as well.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76856
    Signed-off-by: Jonathan Gray <j...@jsg.id.au>
    Reviewed-by: Emil Velikov <emil.l.veli...@gmail.com>
    Reviewed-by: Matt Turner <matts...@gmail.com>
[Mesa-dev] [Bug 76377] DRI3 should only be enabled on Linux due to a udev dependency
https://bugs.freedesktop.org/show_bug.cgi?id=76377

Emil Velikov emil.l.veli...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Emil Velikov emil.l.veli...@gmail.com ---
Both patches are in master now, and are tagged for the 10.1 stable branch.
Re: [Mesa-dev] [PATCH 1/2] glx: drop obsolete _XUnlock_Mutex in __glXInitialize error path
On 16/03/14 14:10, Emil Velikov wrote:
> With commit 1f1928db001 (glx: Drop _Xglobal_lock while we create and
> initialize glx display) we've split the big _Xglobal_lock handling
> in a more fine-grained manner. Unfortunately we forgot to drop the
> unlock_mutex on the error paths, leading to undefined behaviour as
> the mutex is already unlocked.
>
Gents, Kristian,

Can someone spare a few minutes to review this patch? It addresses a bug
that is more than three years old.

Cheers,
-Emil

> Cc: Kristian Høgsberg k...@bitplanet.net
> Cc: 9.2 10.0 10.1 mesa-sta...@lists.freedesktop.org
> Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
> ---
>  src/glx/glxext.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/src/glx/glxext.c b/src/glx/glxext.c
> index 4a195bd..de73036 100644
> --- a/src/glx/glxext.c
> +++ b/src/glx/glxext.c
> @@ -826,7 +826,6 @@ __glXInitialize(Display * dpy)
>     dpyPriv->codes = XInitExtension(dpy, __glXExtensionName);
>     if (!dpyPriv->codes) {
>        free(dpyPriv);
> -      _XUnlockMutex(_Xglobal_lock);
>        return NULL;
>     }
>
> @@ -842,7 +841,6 @@ __glXInitialize(Display * dpy)
>                       &dpyPriv->majorVersion, &dpyPriv->minorVersion)
>         || (dpyPriv->majorVersion == 1 && dpyPriv->minorVersion < 1)) {
>        free(dpyPriv);
> -      _XUnlockMutex(_Xglobal_lock);
>        return NULL;
>     }
>
> @@ -907,7 +905,7 @@ __glXInitialize(Display * dpy)
>     dpyPriv->next = glx_displays;
>     glx_displays = dpyPriv;
>
> -_XUnlockMutex(_Xglobal_lock);
> +   _XUnlockMutex(_Xglobal_lock);
>
>     return dpyPriv;
>  }
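For anyone skimming the thread, the locking pattern at stake can be sketched roughly as below. This is only an illustration under stated assumptions: `demo_lock`, `demo_unlock`, `demo_display_init` and `struct demo_display` are hypothetical stand-ins (not Xlib or Mesa API), the lock is simulated with a flag so that an unbalanced unlock trips an assertion instead of being undefined behaviour, and the sketch is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-display private data. */
struct demo_display {
   int codes;
};

static bool lock_held;                 /* simulated global lock state */
static struct demo_display *displays;  /* simulated cached display */

static void demo_lock(void)
{
   assert(!lock_held);
   lock_held = true;
}

static void demo_unlock(void)
{
   /* Unlocking a lock we do not hold is the bug class being fixed. */
   assert(lock_held);
   lock_held = false;
}

/* init_ok == false simulates a failing extension or version query. */
struct demo_display *demo_display_init(bool init_ok)
{
   demo_lock();
   if (displays) {
      struct demo_display *found = displays;
      demo_unlock();
      return found;
   }
   /* Drop the lock while the display is created and initialized. */
   demo_unlock();

   struct demo_display *priv = calloc(1, sizeof(*priv));
   if (!priv)
      return NULL;

   if (!init_ok) {
      /* Error path: the lock is NOT held here, so there is no unlock. */
      free(priv);
      return NULL;
   }
   priv->codes = 1;

   /* Re-take the lock only to publish the new display. */
   demo_lock();
   displays = priv;
   demo_unlock();
   return priv;
}
```

Adding a `demo_unlock()` call to the error path would fire the assertion in `demo_unlock`, which is the simulated equivalent of the stray `_XUnlockMutex` calls the patch removes.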
Re: [Mesa-dev] [PATCH 3/7] linker: Fold set_uniform_binding into call site
On 04/04/2014 02:01 PM, Ian Romanick wrote: From: Ian Romanick ian.d.roman...@intel.com In the next patch, we'll see that using gl_shader_program::UniformStorage is not correct for uniform blocks. That means we can't use ::UniformStorage to select between the sampler path and the block path. Instead we want to just use the type of the variable. That's never passed to set_uniform_binding, and it's easier Ehhhmm.then to just remove the function (especially for later patches in the series) than to add another parameter. Signed-off-by: Ian Romanick ian.d.roman...@intel.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Cc: 10.1 mesa-sta...@lists.freedesktop.org Cc: git...@socker.lepus.uberspace.de --- src/glsl/link_uniform_initializers.cpp | 33 - 1 file changed, 12 insertions(+), 21 deletions(-) diff --git a/src/glsl/link_uniform_initializers.cpp b/src/glsl/link_uniform_initializers.cpp index 6f15e69..bbdeec9 100644 --- a/src/glsl/link_uniform_initializers.cpp +++ b/src/glsl/link_uniform_initializers.cpp @@ -151,25 +151,6 @@ set_block_binding(void *mem_ctx, gl_shader_program *prog, } void -set_uniform_binding(void *mem_ctx, gl_shader_program *prog, -const char *name, const glsl_type *type, int binding) ...what exactly is this? 
^ -{ - struct gl_uniform_storage *const storage = - get_storage(prog-UniformStorage, prog-NumUserUniformStorage, name); - - if (storage == NULL) { - assert(storage != NULL); - return; - } - - if (storage-type-is_sampler()) { - set_sampler_binding(mem_ctx, prog, name, type, binding); - } else if (storage-block_index != -1) { - set_block_binding(mem_ctx, prog, name, type, binding); - } -} - -void set_uniform_initializer(void *mem_ctx, gl_shader_program *prog, const char *name, const glsl_type *type, ir_constant *val) @@ -268,8 +249,18 @@ link_set_uniform_initializers(struct gl_shader_program *prog) mem_ctx = ralloc_context(NULL); if (var-data.explicit_binding) { -linker::set_uniform_binding(mem_ctx, prog, var-name, -var-type, var-data.binding); +const glsl_type *const type = var-type; Here you're using type, which is var-type, which is exactly what we were already passing. AFAICT all you needed to do was change: if (storage-type-is_sampler()) to if (type-is_sampler() || (type-is_array() type-fields.array-is_sampler())) in set_uniform_binding. + +if (type-is_sampler() +|| (type-is_array() type-fields.array-is_sampler())) { + linker::set_sampler_binding(mem_ctx, prog, var-name, + type, var-data.binding); +} else if (var-is_in_uniform_block()) { + linker::set_block_binding(mem_ctx, prog, var-name, + type, var-data.binding); +} else { + assert(!Explicit binding not on a sampler or UBO.); +} } else if (var-constant_value) { linker::set_uniform_initializer(mem_ctx, prog, var-name, var-type, var-constant_value); signature.asc Description: OpenPGP digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
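To make the point under review concrete, here is a rough C sketch of choosing the binding path from the variable's type rather than from a storage lookup. Everything in it is hypothetical scaffolding (`demo_type`, `demo_variable`, `demo_choose_binding_path`); it mirrors the shape of the condition quoted above and is not the linker's actual data model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-simplified type description. */
enum demo_base_type { DEMO_FLOAT, DEMO_SAMPLER, DEMO_ARRAY };

struct demo_type {
   enum demo_base_type base;
   const struct demo_type *element;   /* element type when base == DEMO_ARRAY */
};

struct demo_variable {
   const struct demo_type *type;
   bool in_uniform_block;
};

enum demo_binding_path { DEMO_BIND_SAMPLER, DEMO_BIND_BLOCK, DEMO_BIND_NONE };

/* True for a sampler or an array whose elements are samplers. */
static bool demo_is_sampler_like(const struct demo_type *type)
{
   if (type->base == DEMO_SAMPLER)
      return true;
   return type->base == DEMO_ARRAY && type->element != NULL &&
          type->element->base == DEMO_SAMPLER;
}

/* Select the binding path from the variable alone, without a storage lookup. */
enum demo_binding_path demo_choose_binding_path(const struct demo_variable *var)
{
   if (demo_is_sampler_like(var->type))
      return DEMO_BIND_SAMPLER;
   if (var->in_uniform_block)
      return DEMO_BIND_BLOCK;
   /* An explicit binding on anything else would be a front-end bug. */
   return DEMO_BIND_NONE;
}
```

Whether this check lives at the call site (as in the patch) or inside a helper that already receives the type (as suggested in the review) does not change the decision logic itself.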
Re: [Mesa-dev] [PATCH 5/7] linker: Set block bindings based on UniformBlocks rather than UniformStorage
On 04/04/2014 02:01 PM, Ian Romanick wrote: From: Ian Romanick ian.d.roman...@intel.com For blocks, gl_shader_program::UniformStorage isn't very useful. The names stored there are the names of the elements of the block, so finding blocks with an instance name is hard. There is also only one entry in ::UniformStorage for each element of a block array, and that is a deal breaker. Using ::UniformBlocks is what _mesa_GetUniformBlockIndex does. I contemplated sharing code between set_block_binding and _mesa_GetUniformBlockIndex, but building the stand-alone compiler and the unit tests make this hard. I plan to return to this effort shortly. Signed-off-by: Ian Romanick ian.d.roman...@intel.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Cc: 10.1 mesa-sta...@lists.freedesktop.org Cc: git...@socker.lepus.uberspace.de --- src/glsl/link_uniform_initializers.cpp | 32 +--- 1 file changed, 21 insertions(+), 11 deletions(-) diff --git a/src/glsl/link_uniform_initializers.cpp b/src/glsl/link_uniform_initializers.cpp index c633850..491eb69 100644 --- a/src/glsl/link_uniform_initializers.cpp +++ b/src/glsl/link_uniform_initializers.cpp @@ -46,6 +46,18 @@ get_storage(gl_uniform_storage *storage, unsigned num_storage, return NULL; } +static unsigned +get_uniform_block_index(const gl_shader_program *shProg, +const char *uniformBlockName) +{ + for (unsigned i = 0; i shProg-NumUniformBlocks; i++) { + if (!strcmp(shProg-UniformBlocks[i].Name, uniformBlockName)) + return i; + } + + return GL_INVALID_INDEX; +} + void copy_constant_to_storage(union gl_constant_value *storage, const ir_constant *val, @@ -123,29 +135,24 @@ set_sampler_binding(gl_shader_program *prog, const char *name, int binding) } void -set_block_binding(gl_shader_program *prog, const char *name, int binding) +set_block_binding(gl_shader_program *prog, const char *block_name, int binding) { - struct gl_uniform_storage *const storage = - get_storage(prog-UniformStorage, prog-NumUserUniformStorage, name); + 
const unsigned block_index = get_uniform_block_index(prog, block_name); - if (storage == NULL) { - assert(storage != NULL); + if (block_index == GL_INVALID_INDEX) { + assert(block_index != GL_INVALID_INDEX); return; } - if (storage-block_index != -1) { /* This is a field of a UBO. val is the binding index. */ for (int i = 0; i MESA_SHADER_STAGES; i++) { - int stage_index = prog-UniformBlockStageIndex[i][storage-block_index]; + int stage_index = prog-UniformBlockStageIndex[i][block_index]; if (stage_index != -1) { struct gl_shader *sh = prog-_LinkedShaders[i]; sh-UniformBlocks[stage_index].Binding = binding; } } - } - - storage-initialized = true; Why is it not necessary to set storage-initialized = true? It goes away here and never seems to come back. } void @@ -253,7 +260,10 @@ link_set_uniform_initializers(struct gl_shader_program *prog) || (type-is_array() type-fields.array-is_sampler())) { linker::set_sampler_binding(prog, var-name, var-data.binding); } else if (var-is_in_uniform_block()) { - linker::set_block_binding(prog, var-name, var-data.binding); + const glsl_type *const iface_type = var-get_interface_type(); + + linker::set_block_binding(prog, iface_type-name, + var-data.binding); } else { assert(!Explicit binding not on a sampler or UBO.); } signature.asc Description: OpenPGP digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
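The name-based lookup this patch introduces can be sketched in isolation as follows. This is a simplified illustration, assuming a flat array of blocks: `demo_block`, `demo_find_block_index`, `demo_set_block_binding` and `DEMO_INVALID_INDEX` are hypothetical names, and the per-stage indirection through `UniformBlockStageIndex` in the real patch is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical sentinel, playing the role of GL_INVALID_INDEX. */
#define DEMO_INVALID_INDEX 0xFFFFFFFFu

struct demo_block {
   const char *name;
   int binding;
};

/* Linear search by block name; returns the sentinel when nothing matches. */
unsigned demo_find_block_index(const struct demo_block *blocks,
                               unsigned num_blocks, const char *name)
{
   for (unsigned i = 0; i < num_blocks; i++) {
      if (strcmp(blocks[i].name, name) == 0)
         return i;
   }
   return DEMO_INVALID_INDEX;
}

/* Apply a binding to the named block; reports failure instead of asserting. */
bool demo_set_block_binding(struct demo_block *blocks, unsigned num_blocks,
                            const char *name, int binding)
{
   const unsigned index = demo_find_block_index(blocks, num_blocks, name);

   if (index == DEMO_INVALID_INDEX)
      return false;

   blocks[index].binding = binding;
   return true;
}
```

The key design point carried over from the patch is that the lookup key is the interface block's name, not the name of one of its members.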
Re: [Mesa-dev] [PATCH v2 6/6] glsl: Ignore loop-too-large heuristic if there's bad variable indexing.
On 04/08/2014 09:20 PM, Kenneth Graunke wrote: Many shaders use a pattern such as: for (int i = 0; i NUM_LIGHTS; i++) { ...access a uniform array, or shader input/output array... } where NUM_LIGHTS is a small constant (such as 2, 4, or 8). The expectation is that the compiler will unroll those loops, turning the array access into constant indexing, which is more efficient, and which may enable array splitting and other optimizations. In many cases, our heuristic fails - either there's another tiny nested loop inside, or the estimated number of instructions is just barely beyond the threshold. So, we fail to unroll the loop, leaving the variable indexing in place. Drivers which don't support the particular flavor of variable indexing will call lower_variable_index_to_cond_assign(), which generates piles and piles of immensely inefficient code. We'd like to avoid generating that. This patch detects unsupported forms of variable-indexing in loops, where the array index is a loop induction variable. In that case, it bypasses the loop-too-large heuristic and forces unrolling. Improves performance in a PCF soft-shadow microbenchmark by 2x. Sorry...this number is incorrect. It improves performance by 21%. The 2x figure was due to a bug in the older version where I literally unrolled everything, which also got rid of variable-indexing of uniform arrays. (The program still worked even with the buggy patch...we just made different unrolling decisions. So, we should still be able to attain 2x...just not with this patch alone.) No changes in shader-db. v2: Check ir-array for being an array or matrix, rather than the ir_dereference_array itself. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/loop_unroll.cpp | 61 +--- 1 file changed, 58 insertions(+), 3 deletions(-) v1 of 6/6 had several bugs, which ended up cancelling out in many cases and making things look like they were working. I think this one is actually good. Sorry for the noise... 
diff --git a/src/glsl/loop_unroll.cpp b/src/glsl/loop_unroll.cpp index 1ce4d58..da53280 100644 --- a/src/glsl/loop_unroll.cpp +++ b/src/glsl/loop_unroll.cpp @@ -63,13 +63,17 @@ is_break(ir_instruction *ir) class loop_unroll_count : public ir_hierarchical_visitor { public: int nodes; + bool unsupported_variable_indexing; /* If there are nested loops, the node count will be inaccurate. */ bool nested_loop; - loop_unroll_count(exec_list *list) + loop_unroll_count(exec_list *list, loop_variable_state *ls, + const struct gl_shader_compiler_options *options) + : ls(ls), options(options) { nodes = 0; nested_loop = false; + unsupported_variable_indexing = false; run(list); } @@ -91,6 +95,54 @@ public: nested_loop = true; return visit_continue; } + + virtual ir_visitor_status visit_enter(ir_dereference_array *ir) + { + /* Check for arrays variably-indexed by a loop induction variable. + * Unrolling the loop may convert that access into constant-indexing. + * + * Many drivers don't support particular kinds of variable indexing, + * and have to resort to using lower_variable_index_to_cond_assign to + * handle it. This results in huge amounts of horrible code, so we'd + * like to avoid that if possible. Here, we just note that it will + * happen. 
+ */ + if ((ir-array-type-is_array() || ir-array-type-is_matrix()) + !ir-array_index-as_constant()) { + ir_variable *array = ir-array-variable_referenced(); + loop_variable *lv = ls-get(ir-array_index-variable_referenced()); + if (array lv lv-is_induction_var()) { +switch (array-data.mode) { +case ir_var_auto: +case ir_var_temporary: +case ir_var_const_in: +case ir_var_function_in: +case ir_var_function_out: +case ir_var_function_inout: + if (options-EmitNoIndirectTemp) + unsupported_variable_indexing = true; + break; +case ir_var_uniform: + if (options-EmitNoIndirectUniform) + unsupported_variable_indexing = true; + break; +case ir_var_shader_in: + if (options-EmitNoIndirectInput) + unsupported_variable_indexing = true; + break; +case ir_var_shader_out: + if (options-EmitNoIndirectOutput) + unsupported_variable_indexing = true; + break; +} + } + } + return visit_continue; + } + +private: + loop_variable_state *ls; + const struct gl_shader_compiler_options *options; }; @@ -257,9 +309,12 @@
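The quoted diff is cut off before the new flag is consumed, so the following is only a guess at the decision the commit message describes: a loop that looks too large is still unrolled when it contains variable indexing the driver cannot handle. The struct, the function name and the `max_nodes` threshold are all hypothetical, not taken from loop_unroll.cpp.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what a counting visitor collected for one loop. */
struct demo_loop_estimate {
   int nodes;                           /* IR nodes in the loop body */
   bool nested_loop;                    /* node count is unreliable if set */
   bool unsupported_variable_indexing;  /* induction-variable indexing the
                                         * driver would have to lower */
};

/* max_nodes is an assumed size budget for the unrolled result. */
bool demo_should_unroll(const struct demo_loop_estimate *est,
                        int iterations, int max_nodes)
{
   const bool too_large =
      est->nested_loop || est->nodes * iterations > max_nodes;

   if (!too_large)
      return true;

   /* Bypass the size heuristic when unrolling is the only way to turn
    * the variable index into a constant one.
    */
   return est->unsupported_variable_indexing;
}
```

Under these assumptions, a nested or oversized loop is rejected unless the indexing flag is set, which matches the behaviour described in the commit message but not necessarily the exact thresholds Mesa uses.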
Re: [Mesa-dev] [PATCH 1/7] linker: Split set_uniform_binding into separate functions for blocks and samplers
On 04/04/2014 02:01 PM, Ian Romanick wrote: From: Ian Romanick ian.d.roman...@intel.com The two code paths are quite different, and there are some problems in the handling of uniform blocks. Future changes will cause these paths to diverge further. Ultimately, selecting between the two functions will happen at the set_uniform_binding call site, and set_uniform_binding will be deleted. NOTE: This patch just moves code around. Signed-off-by: Ian Romanick ian.d.roman...@intel.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Cc: 10.1 mesa-sta...@lists.freedesktop.org Cc: git...@socker.lepus.uberspace.de --- src/glsl/link_uniform_initializers.cpp | 42 +++--- 1 file changed, 39 insertions(+), 3 deletions(-) Assuming you have a reasonable response to my comment on patch 5, this series is: Reviewed-by: Kenneth Graunke kenn...@whitecape.org though, I'm not sure how much that's worth - I had to re-read the GLSL rules and re-discover how our compiler IR for this stuff works. The code seems right, but I could be totally missing something obvious. On that note...is it just me, or is the compiler IR for uniform blocks rather ugly and messy? Anyway, thanks a ton for doing this, Ian. Sorry for dropping the ball when we first implemented 420pack. 
diff --git a/src/glsl/link_uniform_initializers.cpp b/src/glsl/link_uniform_initializers.cpp index 9d6977d..9a10350 100644 --- a/src/glsl/link_uniform_initializers.cpp +++ b/src/glsl/link_uniform_initializers.cpp @@ -84,7 +84,7 @@ copy_constant_to_storage(union gl_constant_value *storage, } void -set_uniform_binding(void *mem_ctx, gl_shader_program *prog, +set_sampler_binding(void *mem_ctx, gl_shader_program *prog, const char *name, const glsl_type *type, int binding) { struct gl_uniform_storage *const storage = @@ -95,7 +95,7 @@ set_uniform_binding(void *mem_ctx, gl_shader_program *prog, return; } - if (storage-type-is_sampler()) { + { unsigned elements = MAX2(storage-array_elements, 1); /* From section 4.4.4 of the GLSL 4.20 specification: @@ -118,7 +118,24 @@ set_uniform_binding(void *mem_ctx, gl_shader_program *prog, } } } - } else if (storage-block_index != -1) { + } + + storage-initialized = true; +} + +void +set_block_binding(void *mem_ctx, gl_shader_program *prog, + const char *name, const glsl_type *type, int binding) +{ + struct gl_uniform_storage *const storage = + get_storage(prog-UniformStorage, prog-NumUserUniformStorage, name); + + if (storage == NULL) { + assert(storage != NULL); + return; + } + + if (storage-block_index != -1) { /* This is a field of a UBO. val is the binding index. 
*/ for (int i = 0; i MESA_SHADER_STAGES; i++) { int stage_index = prog-UniformBlockStageIndex[i][storage-block_index]; @@ -134,6 +151,25 @@ set_uniform_binding(void *mem_ctx, gl_shader_program *prog, } void +set_uniform_binding(void *mem_ctx, gl_shader_program *prog, +const char *name, const glsl_type *type, int binding) +{ + struct gl_uniform_storage *const storage = + get_storage(prog-UniformStorage, prog-NumUserUniformStorage, name); + + if (storage == NULL) { + assert(storage != NULL); + return; + } + + if (storage-type-is_sampler()) { + set_sampler_binding(mem_ctx, prog, name, type, binding); + } else if (storage-block_index != -1) { + set_block_binding(mem_ctx, prog, name, type, binding); + } +} + +void set_uniform_initializer(void *mem_ctx, gl_shader_program *prog, const char *name, const glsl_type *type, ir_constant *val) signature.asc Description: OpenPGP digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create()
Fixes a crash in svga_context_flush_buffers() if we use the 'draw' module
for AA lines (when the device doesn't support that feature). We need to
initialize this list before we set up the swtnl pieces.

Found/fixed by Charmaine Lee.

Cc: 10.0 mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/svga/svga_context.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_context.c b/src/gallium/drivers/svga/svga_context.c
index 0ba09ce..8389384 100644
--- a/src/gallium/drivers/svga/svga_context.c
+++ b/src/gallium/drivers/svga/svga_context.c
@@ -123,6 +123,8 @@ struct pipe_context *svga_context_create( struct pipe_screen *screen,
    if (svga == NULL)
       goto no_svga;
 
+   LIST_INITHEAD(&svga->dirty_buffers);
+
    svga->pipe.screen = screen;
    svga->pipe.priv = priv;
    svga->pipe.destroy = svga_destroy;
@@ -185,8 +187,6 @@ struct pipe_context *svga_context_create( struct pipe_screen *screen,
 
    svga->dirty = ~0;
 
-   LIST_INITHEAD(&svga->dirty_buffers);
-
    check_for_workarounds(svga);
 
    return &svga->pipe;
-- 
1.7.10.4
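The ordering problem can be reproduced in miniature. The sketch below is a hypothetical reduction, not svga code: `demo_list`, `demo_context`, `demo_flush_buffers` and `demo_setup_sw_fallback` are invented names, and the list is a plain circular doubly-linked list of the kind LIST_INITHEAD sets up. With a zero-filled head, the walk in `demo_flush_buffers` would follow a NULL `next` pointer, so the head is initialized before any setup step that might flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Minimal circular doubly-linked list head (hypothetical). */
struct demo_list {
   struct demo_list *prev, *next;
};

static void demo_list_init(struct demo_list *head)
{
   head->prev = head;
   head->next = head;
}

struct demo_context {
   struct demo_list dirty_buffers;
   int flush_calls;
};

/* Walks the list until it wraps back to the head; needs an initialized head. */
static int demo_flush_buffers(struct demo_context *ctx)
{
   int flushed = 0;

   for (struct demo_list *n = ctx->dirty_buffers.next;
        n != &ctx->dirty_buffers; n = n->next)
      flushed++;

   ctx->flush_calls++;
   return flushed;
}

/* Stands in for a software fallback setup step that flushes as a side effect. */
static void demo_setup_sw_fallback(struct demo_context *ctx)
{
   (void) demo_flush_buffers(ctx);
}

struct demo_context *demo_context_create(bool needs_sw_fallback)
{
   struct demo_context *ctx = calloc(1, sizeof(*ctx));
   if (!ctx)
      return NULL;

   /* Initialize the list first: later setup steps may already walk it. */
   demo_list_init(&ctx->dirty_buffers);

   if (needs_sw_fallback)
      demo_setup_sw_fallback(ctx);

   return ctx;
}
```

Moving `demo_list_init` below the fallback setup recreates the crash described in the commit message, since `calloc` leaves `next` as NULL.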
Re: [Mesa-dev] [PATCH 1/4] xa: handle solid-fill src/mask
On 04/03/2014 04:07 PM, Brian Paul wrote:
> On 04/02/2014 11:17 PM, Thomas Hellstrom wrote:
>> On 04/01/2014 05:04 PM, Rob Clark wrote:
>>> From: Rob Clark robcl...@freedesktop.org
>>>
>>> Add support to properly handle solid-fill src and/or mask. Without this
>>> we fall back to sw a lot for common things like text rendering.
>>>
>>> Signed-off-by: Rob Clark robcl...@freedesktop.org
>>> ---
>>>  src/gallium/state_trackers/xa/xa_composite.c | 88
>>>  src/gallium/state_trackers/xa/xa_priv.h      | 7 +-
>>>  src/gallium/state_trackers/xa/xa_renderer.c  | 289 ---
>>>  src/gallium/state_trackers/xa/xa_tgsi.c      | 31 ++-
>>>  4 files changed, 242 insertions(+), 173 deletions(-)
>>
>> Rob,
>> While testing this patch it looks like we sometimes set two samplers,
>> and the first one is NULL. The SVGA driver asserts on that condition.
>> We might need to move the active sampler to the first entry in that case,
>> and adjust tex coords and shader accordingly.
>>
>> I'll discuss with BrianP.
>
> I think the root problem is a disagreement between texture samplers and
> sampler views. If a texture sampler is non-null, the corresponding
> sampler view should be non-null too, and vice versa. We're tripping
> over an assertion when a sampler view is non-null but the corresponding
> sampler is NULL. I'm going to write a patch for the driver to be more
> resilient in that situation.
>
> -Brian

Brian,
This is a different problem. Here, the state tracker sets sampler[0] and
sampler_view[0] to NULL but sampler[1] and sampler_view[1] to non-NULL;
the samplers and sampler views are consistent with each other. The question
is whether that's OK, or whether that's not allowed.

/Thomas
Re: [Mesa-dev] [PATCH] svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create()
Reviewed-by: Thomas Hellstrom thellst...@vmware.com On 04/09/2014 07:40 PM, Brian Paul wrote: Fixes a crash in svga_context_flush_buffers() if we use the 'draw' module for AA lines (when the device doesn't support that feature). We need to initialize this list before we setup the swtnl pieces. Found/fixed by Charmaine Lee. Cc: 10.0 mesa-sta...@lists.freedesktop.org --- src/gallium/drivers/svga/svga_context.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/svga/svga_context.c b/src/gallium/drivers/svga/svga_context.c index 0ba09ce..8389384 100644 --- a/src/gallium/drivers/svga/svga_context.c +++ b/src/gallium/drivers/svga/svga_context.c @@ -123,6 +123,8 @@ struct pipe_context *svga_context_create( struct pipe_screen *screen, if (svga == NULL) goto no_svga; + LIST_INITHEAD(svga-dirty_buffers); + svga-pipe.screen = screen; svga-pipe.priv = priv; svga-pipe.destroy = svga_destroy; @@ -185,8 +187,6 @@ struct pipe_context *svga_context_create( struct pipe_screen *screen, svga-dirty = ~0; - LIST_INITHEAD(svga-dirty_buffers); - check_for_workarounds(svga); return svga-pipe; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create()
On Wed, Apr 9, 2014 at 7:40 PM, Brian Paul bri...@vmware.com wrote:
> Fixes a crash in svga_context_flush_buffers() if we use the 'draw' module
> for AA lines (when the device doesn't support that feature). We need to
> initialize this list before we setup the swtnl pieces.
>
> Found/fixed by Charmaine Lee.
>
> Cc: 10.0 mesa-sta...@lists.freedesktop.org
> ---
>  src/gallium/drivers/svga/svga_context.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Jakob Bornecrantz ja...@vmware.com

Cheers, Jakob
Re: [Mesa-dev] [PATCH v2 4/5] st/xa: handle solid-fill src/mask
Hi, Rob! On 04/08/2014 10:48 PM, Rob Clark wrote: From: Rob Clark robcl...@freedesktop.org Add support to property handle solid-fill src and/or mask. Without this we fallback to sw a lot for common things like text rendering. Signed-off-by: Rob Clark robcl...@freedesktop.org --- src/gallium/state_trackers/xa/xa_composite.c | 115 +-- src/gallium/state_trackers/xa/xa_priv.h | 13 +- src/gallium/state_trackers/xa/xa_renderer.c | 298 --- src/gallium/state_trackers/xa/xa_tgsi.c | 36 +++- 4 files changed, 263 insertions(+), 199 deletions(-) diff --git a/src/gallium/state_trackers/xa/xa_composite.c b/src/gallium/state_trackers/xa/xa_composite.c index 7ae35a1..b70fd47 100644 --- a/src/gallium/state_trackers/xa/xa_composite.c +++ b/src/gallium/state_trackers/xa/xa_composite.c @@ -111,12 +111,6 @@ blend_for_op(struct xa_composite_blend *blend, boolean supported = FALSE; /* - * Temporarily disable component alpha since it appears buggy. - */ -if (mask_pic mask_pic-component_alpha) - return FALSE; - -/* I'll attach the rendercheck logs of two early regression. The first one (log1.txt) happens because we enable component_alpha here. The second one is with component alpha disabled again. 
/Thomas rendercheck 1.4 Render extension version 0.11 Window format: r8g8b8 Found server-supported format: a8 Found server-supported format: a8r8g8b8 Found server-supported format: x8r8g8b8 Found server-supported format: b8g8r8a8 Found server-supported format: b8g8r8x8 Found server-supported format: r8g8b8 Found server-supported format: b8g8r8 Found server-supported format: r5g5b5 Found server-supported format: b5g5r5 Found server-supported format: x1r5g5b5 Found server-supported format: x1b5g5r5 Found server-supported format: r5g6b5 Found server-supported format: b5g6r5 Found server-supported format: x8b8g8r8 Found server-supported format: x2r10g10b10 Found server-supported format: x2b10g10r10 Beginning testing of filling of 1x1R pictures Beginning testing of filling of 10x10 pictures Beginning dest coords test Beginning src coords test Beginning mask coords test mask coords test error of 32. at (1, 0) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 0.00 0.00 1.00 mask coords test error of 32. at (2, 0) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 0.00 0.00 1.00 mask coords test error of 32. at (3, 0) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 0.00 0.00 1.00 mask coords test error of 32. at (4, 0) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 0.00 0.00 1.00 mask coords test error of 32. at (0, 1) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 0.00 0.00 1.00 mask coords test error of 64. at (1, 1) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 1.00 1.00 1.00 mask coords test error of 32. at (2, 1) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 0.00 0.00 1.00 mask coords test error of 64. at (3, 1) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 1.00 1.00 1.00 mask coords test error of 32. at (4, 1) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 0.00 0.00 1.00 mask coords test error of 32. at (0, 2) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 0.00 0.00 1.00 mask coords test error of 64. 
at (1, 2) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 1.00 1.00 1.00 mask coords test error of 32. at (2, 2) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 0.00 0.00 1.00 mask coords test error of 64. at (3, 2) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 1.00 1.00 1.00 mask coords test error of 32. at (4, 2) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 0.00 0.00 1.00 mask coords test error of 32. at (0, 3) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 0.00 0.00 1.00 mask coords test error of 64. at (1, 3) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 1.00 1.00 1.00 mask coords test error of 64. at (2, 3) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 1.00 1.00 1.00 mask coords test error of 32. at (3, 3) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 0.00 0.00 1.00 mask coords test error of 32. at (4, 3) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 0.00 0.00 1.00 mask coords test error of 32. at (0, 4) -- RGBA got: 0.00 0.00 0.00 1.00 expected: 1.00 0.00 0.00 1.00 mask coords test error of 32. at (1, 4) -- RG
Re: [Mesa-dev] [Mesa-stable] [PATCH 1/2] glx: drop obsolete _XUnlock_Mutex in __glXInitialize error path
On 03/16/2014 07:10 AM, Emil Velikov wrote: With commit 1f1928db001(glx: Drop _Xglobal_lock while we create and initialize glx display) we've split the big _Xglobal_lock handling in a more fine grained manner. Unfortunatelly we forgot to drop the unlock_mutex on the error paths, leading to undefined behaviour as the mutex is already unlocked. Cc: Kristian Høgsberg k...@bitplanet.net Cc: 9.2 10.0 10.1 mesa-sta...@lists.freedesktop.org Signed-off-by: Emil Velikov emil.l.veli...@gmail.com Sorry for not looking at this sooner... I checked the code, and this patch is obviously correct. The lock was released just a few lines earlier (outside the patch). Reviewed-by: Ian Romanick ian.d.roman...@intel.com --- src/glx/glxext.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/src/glx/glxext.c b/src/glx/glxext.c index 4a195bd..de73036 100644 --- a/src/glx/glxext.c +++ b/src/glx/glxext.c @@ -826,7 +826,6 @@ __glXInitialize(Display * dpy) dpyPriv-codes = XInitExtension(dpy, __glXExtensionName); if (!dpyPriv-codes) { free(dpyPriv); - _XUnlockMutex(_Xglobal_lock); return NULL; } @@ -842,7 +841,6 @@ __glXInitialize(Display * dpy) dpyPriv-majorVersion, dpyPriv-minorVersion) || (dpyPriv-majorVersion == 1 dpyPriv-minorVersion 1)) { free(dpyPriv); - _XUnlockMutex(_Xglobal_lock); return NULL; } @@ -907,7 +905,7 @@ __glXInitialize(Display * dpy) dpyPriv-next = glx_displays; glx_displays = dpyPriv; -_XUnlockMutex(_Xglobal_lock); + _XUnlockMutex(_Xglobal_lock); return dpyPriv; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] xa: handle solid-fill src/mask
On 04/09/2014 11:50 AM, Thomas Hellstrom wrote:
> On 04/03/2014 04:07 PM, Brian Paul wrote:
>> On 04/02/2014 11:17 PM, Thomas Hellstrom wrote:
>>> On 04/01/2014 05:04 PM, Rob Clark wrote:
>>>> From: Rob Clark robcl...@freedesktop.org
>>>>
>>>> Add support to properly handle solid-fill src and/or mask. Without this
>>>> we fall back to sw a lot for common things like text rendering.
>>>>
>>>> Signed-off-by: Rob Clark robcl...@freedesktop.org
>>>> ---
>>>>  src/gallium/state_trackers/xa/xa_composite.c | 88
>>>>  src/gallium/state_trackers/xa/xa_priv.h      | 7 +-
>>>>  src/gallium/state_trackers/xa/xa_renderer.c  | 289 ---
>>>>  src/gallium/state_trackers/xa/xa_tgsi.c      | 31 ++-
>>>>  4 files changed, 242 insertions(+), 173 deletions(-)
>>>
>>> Rob,
>>> While testing this patch it looks like we sometimes set two samplers,
>>> and the first one is NULL. The SVGA driver asserts on that condition.
>>> We might need to move the active sampler to the first entry in that case,
>>> and adjust tex coords and shader accordingly.
>>>
>>> I'll discuss with BrianP.
>>
>> I think the root problem is a disagreement between texture samplers and
>> sampler views. If a texture sampler is non-null, the corresponding
>> sampler view should be non-null too, and vice versa. We're tripping
>> over an assertion when a sampler view is non-null but the corresponding
>> sampler is NULL. I'm going to write a patch for the driver to be more
>> resilient in that situation.
>>
>> -Brian
>
> Brian,
> This is a different problem. Here, the state tracker sets sampler[0] and
> sampler_view[0] to NULL but sampler[1] and sampler_view[1] to non-NULL;
> the samplers and sampler views are consistent with each other. The question
> is whether that's OK, or whether that's not allowed.

I think that's OK.

-Brian
Re: [Mesa-dev] [PATCH v2 4/5] st/xa: handle solid-fill src/mask
On Wed, Apr 9, 2014 at 1:59 PM, Thomas Hellstrom thellst...@vmware.com wrote:
> Hi, Rob!
>
> On 04/08/2014 10:48 PM, Rob Clark wrote:
>> From: Rob Clark robcl...@freedesktop.org
>>
>> Add support to properly handle solid-fill src and/or mask. Without this
>> we fall back to sw a lot for common things like text rendering.
>>
>> Signed-off-by: Rob Clark robcl...@freedesktop.org
>> ---
>>  src/gallium/state_trackers/xa/xa_composite.c | 115 +--
>>  src/gallium/state_trackers/xa/xa_priv.h      | 13 +-
>>  src/gallium/state_trackers/xa/xa_renderer.c  | 298 ---
>>  src/gallium/state_trackers/xa/xa_tgsi.c      | 36 +++-
>>  4 files changed, 263 insertions(+), 199 deletions(-)
>>
>> diff --git a/src/gallium/state_trackers/xa/xa_composite.c b/src/gallium/state_trackers/xa/xa_composite.c
>> index 7ae35a1..b70fd47 100644
>> --- a/src/gallium/state_trackers/xa/xa_composite.c
>> +++ b/src/gallium/state_trackers/xa/xa_composite.c
>> @@ -111,12 +111,6 @@ blend_for_op(struct xa_composite_blend *blend,
>>      boolean supported = FALSE;
>>
>>      /*
>> -     * Temporarily disable component alpha since it appears buggy.
>> -     */
>> -    if (mask_pic && mask_pic->component_alpha)
>> -        return FALSE;
>> -
>> -    /*

Oh, I guess that hunk should have been a different patch anyway (even if it
worked).

> I'll attach the rendercheck logs of two early regressions. The first one
> (log1.txt) happens because we enable component_alpha here. The second
> one is with component alpha disabled again.

Hmm, that almost looks like a vertex shader issue (if I'm understanding what
rendercheck is saying properly). Like the mask coords are wrong?

BR,
-R

> /Thomas
[Mesa-dev] [PATCH 1/3] i965: Add reads_accumulator_implicitly() function.
---
 src/mesa/drivers/dri/i965/brw_shader.cpp | 16 ++++++++++++++++
 src/mesa/drivers/dri/i965/brw_shader.h   |  1 +
 2 files changed, 17 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
index f194437..c8796b3 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -664,6 +664,22 @@ backend_instruction::can_do_saturate() const
 }
 
 bool
+backend_instruction::reads_accumulator_implicitly() const
+{
+   switch (opcode) {
+   case BRW_OPCODE_MAC:
+   case BRW_OPCODE_MACH:
+   /* FINISHME: Enable these if we ever start emitting them.
+    * case BRW_OPCODE_SADA:
+    * case BRW_OPCODE_SADA2:
+    */
+      return true;
+   default:
+      return false;
+   }
+}
+
+bool
 backend_instruction::has_side_effects() const
 {
    switch (opcode) {
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h b/src/mesa/drivers/dri/i965/brw_shader.h
index 6bd7dc8..9ef08e5 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -47,6 +47,7 @@ public:
    bool is_control_flow() const;
    bool can_do_source_mods() const;
    bool can_do_saturate() const;
+   bool reads_accumulator_implicitly() const;
 
    /**
     * True if the instruction has side effects other than writing to
-- 
1.8.3.2
[Mesa-dev] [PATCH 2/3] i965: Add is_accumulator() function.
From: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com Reviewed-by: Matt Turner matts...@gmail.com Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com --- As a follow-on patch series, we should move common fields from fs_reg and vec4's reg into a backend_reg and consolidate these functions. src/mesa/drivers/dri/i965/brw_fs.cpp | 8 src/mesa/drivers/dri/i965/brw_fs.h | 1 + src/mesa/drivers/dri/i965/brw_vec4.cpp | 17 + src/mesa/drivers/dri/i965/brw_vec4.h | 2 ++ 4 files changed, 28 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 85a5463..e576545 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -501,6 +501,14 @@ fs_reg::is_valid_3src() const return file == GRF || file == UNIFORM; } +bool +fs_reg::is_accumulator() const +{ + return file == HW_REG + fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE + fixed_hw_reg.nr == BRW_ARF_ACCUMULATOR; +} + int fs_visitor::type_size(const struct glsl_type *type) { diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 3d21ee5..1dadccd 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -83,6 +83,7 @@ public: bool is_null() const; bool is_valid_3src() const; bool is_contiguous() const; + bool is_accumulator() const; fs_reg apply_stride(unsigned stride); /** Smear a channel of the reg to all channels. 
*/ diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 740d9ff..38d2b93 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -151,6 +151,15 @@ src_reg::src_reg(dst_reg reg) swizzles[2], swizzles[3]); } +bool +src_reg::is_accumulator() const +{ + return file == HW_REG + fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE + fixed_hw_reg.nr == BRW_ARF_ACCUMULATOR; +} + + void dst_reg::init() { @@ -221,6 +230,14 @@ dst_reg::is_null() const } bool +dst_reg::is_accumulator() const +{ + return file == HW_REG + fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE + fixed_hw_reg.nr == BRW_ARF_ACCUMULATOR; +} + +bool vec4_instruction::is_send_from_grf() { switch (opcode) { diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 159a5bd..b3549a5 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -128,6 +128,7 @@ public: bool equals(src_reg *r); bool is_zero() const; bool is_one() const; + bool is_accumulator() const; src_reg(class vec4_visitor *v, const struct glsl_type *type); @@ -195,6 +196,7 @@ public: explicit dst_reg(src_reg reg); bool is_null() const; + bool is_accumulator() const; int writemask; /** Bitfield of WRITEMASK_[XYZW] */ -- 1.8.3.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] i965: Add writes_accumulator flag
From: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com Our hardware has an accumulator register, which can be used to store intermediate results across multiple instructions. Many instructions can implicitly write a value to the accumulator in addition to their normal destination register. This is enabled by the AccWrEn flag. This patch introduces a new flag, inst->writes_accumulator, which allows us to express the AccWrEn notion in the IR. It also creates an ALU2_ACC macro to easily define emitters for instructions that implicitly write the accumulator. Previously, we only supported implicit accumulator writes from the ADDC, SUBB, and MACH instructions. We always enabled them on those instructions, and left them disabled for other instructions. To take advantage of the MAC (multiply-accumulate) instruction, we need to be able to set AccWrEn on other types of instructions. Reviewed-by: Matt Turner matts...@gmail.com Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com --- I split out is_accumulator() into a separate patch, and made some fixes to the scheduling code. Let me know if these changes look good to you, JP. 
(Patch formatted with -U15 as to see other sections of the scheduling code during review) src/mesa/drivers/dri/i965/brw_fs.cpp | 26 ++ src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 7 +-- .../drivers/dri/i965/brw_schedule_instructions.cpp | 58 ++ src/mesa/drivers/dri/i965/brw_shader.h | 1 + src/mesa/drivers/dri/i965/brw_vec4.cpp | 15 ++ src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 7 +-- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 17 +-- 7 files changed, 95 insertions(+), 36 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index e576545..0eece60 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -52,30 +52,32 @@ extern C { #include glsl/glsl_types.h void fs_inst::init() { memset(this, 0, sizeof(*this)); this-conditional_mod = BRW_CONDITIONAL_NONE; this-dst = reg_undef; this-src[0] = reg_undef; this-src[1] = reg_undef; this-src[2] = reg_undef; /* This will be the case for almost all instructions. 
*/ this-regs_written = 1; + + this-writes_accumulator = false; } fs_inst::fs_inst() { init(); this-opcode = BRW_OPCODE_NOP; } fs_inst::fs_inst(enum opcode opcode) { init(); this-opcode = opcode; } fs_inst::fs_inst(enum opcode opcode, fs_reg dst) @@ -139,63 +141,72 @@ fs_inst::fs_inst(enum opcode opcode, fs_reg dst, #define ALU1(op)\ fs_inst *\ fs_visitor::op(fs_reg dst, fs_reg src0) \ {\ return new(mem_ctx) fs_inst(BRW_OPCODE_##op, dst, src0); \ } #define ALU2(op)\ fs_inst *\ fs_visitor::op(fs_reg dst, fs_reg src0, fs_reg src1) \ {\ return new(mem_ctx) fs_inst(BRW_OPCODE_##op, dst, src0, src1);\ } +#define ALU2_ACC(op)\ + fs_inst *\ + fs_visitor::op(fs_reg dst, fs_reg src0, fs_reg src1) \ + {\ + fs_inst *inst = new(mem_ctx) fs_inst(BRW_OPCODE_##op, dst, src0, src1);\ + inst-writes_accumulator = true; \ + return inst; \ + } + #define ALU3(op)\ fs_inst *\ fs_visitor::op(fs_reg dst, fs_reg src0, fs_reg src1, fs_reg src2)\ {\ return new(mem_ctx) fs_inst(BRW_OPCODE_##op, dst, src0, src1, src2);\ } ALU1(NOT) ALU1(MOV) ALU1(FRC) ALU1(RNDD) ALU1(RNDE) ALU1(RNDZ) ALU2(ADD) ALU2(MUL) -ALU2(MACH) +ALU2_ACC(MACH) ALU2(AND) ALU2(OR) ALU2(XOR) ALU2(SHL) ALU2(SHR) ALU2(ASR) ALU3(LRP) ALU1(BFREV) ALU3(BFE) ALU2(BFI1) ALU3(BFI2) ALU1(FBH) ALU1(FBL) ALU1(CBIT) ALU3(MAD) -ALU2(ADDC) -ALU2(SUBB) +ALU2_ACC(ADDC) +ALU2_ACC(SUBB) ALU2(SEL) /** Gen4 predicated IF. */ fs_inst * fs_visitor::IF(uint32_t predicate) { fs_inst *inst = new(mem_ctx) fs_inst(BRW_OPCODE_IF); inst-predicate = predicate; return inst; } /** Gen6 IF with embedded comparison. */ fs_inst * fs_visitor::IF(fs_reg src0, fs_reg src1,
Re: [Mesa-dev] [PATCH 1/5] i965: Add writes_accumulator flag
On Fri, Apr 4, 2014 at 6:51 AM, Juha-Pekka Heikkila juhapekka.heikk...@gmail.com wrote: diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp index a951459..92f82fd 100644 --- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp +++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp @@ -758,6 +758,7 @@ fs_instruction_scheduler::calculate_deps() schedule_node *last_fixed_grf_write = NULL; int reg_width = v->dispatch_width / 8; + schedule_node *last_accumulator_write = NULL; /* The last instruction always needs to still be the last * instruction. Either it's flow control (IF, ELSE, ENDIF, DO, * WHILE) and scheduling other things after it would disturb the @@ -822,6 +823,10 @@ fs_instruction_scheduler::calculate_deps() The line before this was if (inst->reads_flag()) { add_dep(last_conditional_mod[inst->flag_subreg], n); } + if (inst->writes_accumulator || inst->dst.is_accumulator()) { + add_dep(last_accumulator_write, n); + } But we're checking if we're writing the accumulator here, instead of reading it. We're also not giving the scheduler any benefits from its new knowledge of accumulator dependencies, because we're still calling add_barrier_deps() above when we don't recognize the destination. I hope you don't mind, but I split the is_accumulator() additions into a separate patch, fixed up the scheduler hunks and sent the revised patch. Let me know if it looks right to you.
Re: [Mesa-dev] [PATCH v2 4/5] st/xa: handle solid-fill src/mask
On Wed, Apr 9, 2014 at 1:59 PM, Thomas Hellstrom thellst...@vmware.com wrote: Hi, Rob! On 04/08/2014 10:48 PM, Rob Clark wrote: From: Rob Clark robcl...@freedesktop.org Add support to property handle solid-fill src and/or mask. Without this we fallback to sw a lot for common things like text rendering. Signed-off-by: Rob Clark robcl...@freedesktop.org --- src/gallium/state_trackers/xa/xa_composite.c | 115 +-- src/gallium/state_trackers/xa/xa_priv.h | 13 +- src/gallium/state_trackers/xa/xa_renderer.c | 298 --- src/gallium/state_trackers/xa/xa_tgsi.c | 36 +++- 4 files changed, 263 insertions(+), 199 deletions(-) diff --git a/src/gallium/state_trackers/xa/xa_composite.c b/src/gallium/state_trackers/xa/xa_composite.c index 7ae35a1..b70fd47 100644 --- a/src/gallium/state_trackers/xa/xa_composite.c +++ b/src/gallium/state_trackers/xa/xa_composite.c @@ -111,12 +111,6 @@ blend_for_op(struct xa_composite_blend *blend, boolean supported = FALSE; /* - * Temporarily disable component alpha since it appears buggy. - */ -if (mask_pic mask_pic-component_alpha) - return FALSE; - -/* I'll attach the rendercheck logs of two early regression. The first one (log1.txt) happens because we enable component_alpha here. The second one is with component alpha disabled again. 
hmm, so for the second test, it works for me with --sync: [robclark@reptile:~]$ rendercheck --sync -v -t mcoords rendercheck 1.4 Render extension version 0.11 Window format: r8g8b8 Found server-supported format: a8 Found server-supported format: a8r8g8b8 Found server-supported format: x8r8g8b8 Found server-supported format: b8g8r8a8 Found server-supported format: b8g8r8x8 Found server-supported format: r8g8b8 Found server-supported format: b8g8r8 Found server-supported format: r5g5b5 Found server-supported format: b5g5r5 Found server-supported format: x1r5g5b5 Found server-supported format: x1b5g5r5 Found server-supported format: r5g6b5 Found server-supported format: b5g6r5 Found server-supported format: x8b8g8r8 Found server-supported format: x2r10g10b10 Found server-supported format: x2b10g10r10 Beginning mask coords test 1 tests passed of 1 total Successful Groups: mcoords [robclark@reptile:~]$ but not without (although the error I get is a bit different.. although maybe different rendercheck args?) [robclark@reptile:~]$ rendercheck -v -t mcoords rendercheck 1.4 Render extension version 0.11 Window format: r8g8b8 Found server-supported format: a8 Found server-supported format: a8r8g8b8 Found server-supported format: x8r8g8b8 Found server-supported format: b8g8r8a8 Found server-supported format: b8g8r8x8 Found server-supported format: r8g8b8 Found server-supported format: b8g8r8 Found server-supported format: r5g5b5 Found server-supported format: b5g5r5 Found server-supported format: x1r5g5b5 Found server-supported format: x1b5g5r5 Found server-supported format: r5g6b5 Found server-supported format: b5g6r5 Found server-supported format: x8b8g8r8 Found server-supported format: x2r10g10b10 Found server-supported format: x2b10g10r10 Beginning mask coords test mask coords test error of 255. at (0, 0) -- R G B A got: 1.000 1.000 1.000 1.000 expected: 1.000 0.000 0.000 1.000 mask coords test error of 255. 
at (1, 1) -- R G B A got: 1.000 0.000 0.000 1.000 expected: 1.000 1.000 1.000 1.000 mask coords test error of 255. at (2, 1) -- R G B A got: 1.000 1.000 1.000 1.000 expected: 1.000 0.000 0.000 1.000 mask coords test error of 255. at (3, 1) -- R G B A got: 1.000 0.000 0.000 1.000 expected: 1.000 1.000 1.000 1.000 mask coords test error of 255. at (4, 1) -- R G B A got: 1.000 1.000 1.000 1.000 expected: 1.000 0.000 0.000 1.000 mask coords test error of 255. at (1, 2) -- R G B A got: 1.000 0.000 0.000 1.000 expected: 1.000 1.000 1.000 1.000 mask coords test error of 255. at (2, 2) -- R G B A got: 1.000 1.000 1.000 1.000 expected: 1.000 0.000 0.000 1.000 mask coords test error of 255. at (3, 2) -- R G B A got: 1.000 0.000 0.000 1.000 expected: 1.000 1.000 1.000 1.000 mask coords test error of 255. at (4, 2) -- R G B A got: 1.000 1.000 1.000 1.000 expected: 1.000 0.000 0.000 1.000 mask coords test error of 255. at (1, 3) -- R G B A got: 1.000 0.000 0.000 1.000 expected: 1.000 1.000 1.000 1.000 mask coords test error of 255. at (3, 3) -- R G B A got: 1.000 1.000 1.000 1.000 expected: 1.000 0.000 0.000 1.000 expected vs tested: 1 0 10101 11010 10101 11010 10011 11001 1 1 0 tests passed of 1 total Successful Groups:
Re: [Mesa-dev] [PATCH v2 4/5] st/xa: handle solid-fill src/mask
On Wed, Apr 9, 2014 at 5:12 PM, Rob Clark robdcl...@gmail.com wrote: On Wed, Apr 9, 2014 at 1:59 PM, Thomas Hellstrom thellst...@vmware.com wrote: Hi, Rob! On 04/08/2014 10:48 PM, Rob Clark wrote: From: Rob Clark robcl...@freedesktop.org Add support to property handle solid-fill src and/or mask. Without this we fallback to sw a lot for common things like text rendering. Signed-off-by: Rob Clark robcl...@freedesktop.org --- src/gallium/state_trackers/xa/xa_composite.c | 115 +-- src/gallium/state_trackers/xa/xa_priv.h | 13 +- src/gallium/state_trackers/xa/xa_renderer.c | 298 --- src/gallium/state_trackers/xa/xa_tgsi.c | 36 +++- 4 files changed, 263 insertions(+), 199 deletions(-) diff --git a/src/gallium/state_trackers/xa/xa_composite.c b/src/gallium/state_trackers/xa/xa_composite.c index 7ae35a1..b70fd47 100644 --- a/src/gallium/state_trackers/xa/xa_composite.c +++ b/src/gallium/state_trackers/xa/xa_composite.c @@ -111,12 +111,6 @@ blend_for_op(struct xa_composite_blend *blend, boolean supported = FALSE; /* - * Temporarily disable component alpha since it appears buggy. - */ -if (mask_pic mask_pic-component_alpha) - return FALSE; - -/* I'll attach the rendercheck logs of two early regression. The first one (log1.txt) happens because we enable component_alpha here. The second one is with component alpha disabled again. hmm, so for the second test, it works for me with --sync: oh, and if I bring back disabling of mask pic w/ component alpha, then it passes for me both with and without --sync.. 
BR, -R
[Mesa-dev] [PATCH] i965: Don't make instructions with a null dest a barrier to scheduling.
Now that we properly track accumulator dependencies, the scheduler is able to schedule instructions between the mach and mov in the common integer multiplication pattern:

   mul acc0, x, y
   mach null, x, y
   mov dest, acc0

Since a null destination implies no dependency on the destination, we can also safely schedule instructions (that don't write the accumulator) between the mul and mach. --- This depends on JP's accumulator series. src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp index 3538da5..910b73a 100644 --- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp +++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp @@ -864,7 +864,8 @@ fs_instruction_scheduler::calculate_deps() } else if (inst->dst.is_accumulator()) { add_dep(last_accumulator_write, n); last_accumulator_write = n; - } else if (inst->dst.file != BAD_FILE) { + } else if (inst->dst.file != BAD_FILE && + !inst->dst.is_null()) { add_barrier_deps(n); } @@ -983,7 +984,8 @@ fs_instruction_scheduler::calculate_deps() } } else if (inst->dst.is_accumulator()) { last_accumulator_write = n; - } else if (inst->dst.file != BAD_FILE) { + } else if (inst->dst.file != BAD_FILE && + !inst->dst.is_null()) { add_barrier_deps(n); } @@ -1089,7 +1091,8 @@ vec4_instruction_scheduler::calculate_deps() } else if (inst->dst.is_accumulator()) { add_dep(last_accumulator_write, n); last_accumulator_write = n; - } else if (inst->dst.file != BAD_FILE) { + } else if (inst->dst.file != BAD_FILE && + !inst->dst.is_null()) { add_barrier_deps(n); } @@ -1173,7 +1176,8 @@ vec4_instruction_scheduler::calculate_deps() last_fixed_grf_write = n; } else if (inst->dst.is_accumulator()) { last_accumulator_write = n; - } else if (inst->dst.file != BAD_FILE) { + } else if (inst->dst.file != BAD_FILE && + !inst->dst.is_null()) { add_barrier_deps(n); } -- 1.8.3.2
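The effect of tracking the accumulator explicitly, rather than treating these instructions as barriers, can be illustrated with a toy model. This is a rough sketch under stated assumptions: `Inst` and `accumulator_deps` are hypothetical names, only a single top-down pass is modelled (read-after-write and write-after-write on the accumulator), and the real scheduler's bottom-up pass and its other register files are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, reduced view of an instruction: only its accumulator traffic.
struct Inst {
   bool reads_acc;
   bool writes_acc;
};

// For each instruction, return the index of the earlier instruction it must
// stay behind because of the accumulator, or -1 if it is unconstrained.
inline std::vector<int> accumulator_deps(const std::vector<Inst> &insts)
{
   std::vector<int> dep(insts.size(), -1);
   int last_accumulator_write = -1;

   for (std::size_t i = 0; i < insts.size(); i++) {
      // Readers and writers both depend on the previous accumulator write.
      if (insts[i].reads_acc || insts[i].writes_acc)
         dep[i] = last_accumulator_write;

      if (insts[i].writes_acc)
         last_accumulator_write = static_cast<int>(i);
   }
   return dep;
}
```

For a mul/mach/mov sequence with an unrelated instruction placed between mach and mov, only the three accumulator users get a dependency; the unrelated instruction stays free to move, which is the scheduling freedom the patch is after.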
Re: [Mesa-dev] [PATCH 1/3] i965: Add reads_accumulator_implicitly() function.
Matt Turner matts...@gmail.com writes: --- src/mesa/drivers/dri/i965/brw_shader.cpp | 16 src/mesa/drivers/dri/i965/brw_shader.h | 1 + 2 files changed, 17 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index f194437..c8796b3 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -664,6 +664,22 @@ backend_instruction::can_do_saturate() const } bool +backend_instruction::reads_accumulator_implicitly() const +{ + switch (opcode) { + case BRW_OPCODE_MAC: + case BRW_OPCODE_MACH: + /* FINISHME: Enable these if we ever start emitting them. +* case BRW_OPCODE_SADA: +* case BRW_OPCODE_SADA2: +*/ Let's just uncomment SADA2 right away to prevent pain in the future. SAD2 doesn't read the acc, though. Other than that, the first 2 patches are: Reviewed-by: Eric Anholt e...@anholt.net I think scheduling is still broken in the last one, because you're removing the barrier deps on implicit-accumulator opcodes and replacing them with explicit dependencies, but you're not tracking the accumulator updates by almost-all-instructions pre-gen6. The scheduler would be free to slip in some unrelated instruction after the MUL in the following snippet from brw_vec4_visitor.cpp: emit(MUL(acc, op[0], op[1])); emit(MACH(dst_null_d(), op[0], op[1])); emit(MOV(result_dst, src_reg(acc))); (err, why are we doing MACH and MOV instead of just MACH into result_dst?)
Re: [Mesa-dev] [PATCH 2/3] i965: Add is_accumulator() function.
On 04/09/2014 01:47 PM, Matt Turner wrote: From: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com Reviewed-by: Matt Turner matts...@gmail.com Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com --- As a follow-on patch series, we should move common fields from fs_reg and vec4's reg into a backend_reg and consolidate these functions. Yeah, there's been some talk about creating one of those. But even at present, src_reg and dst_reg both inherit from reg. is_accumulator() should be defined there, not in both subclasses. src/mesa/drivers/dri/i965/brw_fs.cpp | 8 src/mesa/drivers/dri/i965/brw_fs.h | 1 + src/mesa/drivers/dri/i965/brw_vec4.cpp | 17 + src/mesa/drivers/dri/i965/brw_vec4.h | 2 ++ 4 files changed, 28 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 85a5463..e576545 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -501,6 +501,14 @@ fs_reg::is_valid_3src() const return file == GRF || file == UNIFORM; } +bool +fs_reg::is_accumulator() const +{ + return file == HW_REG + fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE + fixed_hw_reg.nr == BRW_ARF_ACCUMULATOR; +} + int fs_visitor::type_size(const struct glsl_type *type) { diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 3d21ee5..1dadccd 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -83,6 +83,7 @@ public: bool is_null() const; bool is_valid_3src() const; bool is_contiguous() const; + bool is_accumulator() const; fs_reg apply_stride(unsigned stride); /** Smear a channel of the reg to all channels. 
*/ diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 740d9ff..38d2b93 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -151,6 +151,15 @@ src_reg::src_reg(dst_reg reg) swizzles[2], swizzles[3]); } +bool +src_reg::is_accumulator() const +{ + return file == HW_REG + fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE + fixed_hw_reg.nr == BRW_ARF_ACCUMULATOR; +} + + void dst_reg::init() { @@ -221,6 +230,14 @@ dst_reg::is_null() const } bool +dst_reg::is_accumulator() const +{ + return file == HW_REG + fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE + fixed_hw_reg.nr == BRW_ARF_ACCUMULATOR; +} + +bool vec4_instruction::is_send_from_grf() { switch (opcode) { diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 159a5bd..b3549a5 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -128,6 +128,7 @@ public: bool equals(src_reg *r); bool is_zero() const; bool is_one() const; + bool is_accumulator() const; src_reg(class vec4_visitor *v, const struct glsl_type *type); @@ -195,6 +196,7 @@ public: explicit dst_reg(src_reg reg); bool is_null() const; + bool is_accumulator() const; int writemask; /** Bitfield of WRITEMASK_[XYZW] */ signature.asc Description: OpenPGP digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] i965: Add reads_accumulator_implicitly() function.
On Wed, Apr 9, 2014 at 3:06 PM, Eric Anholt e...@anholt.net wrote: Matt Turner matts...@gmail.com writes: --- src/mesa/drivers/dri/i965/brw_shader.cpp | 16 src/mesa/drivers/dri/i965/brw_shader.h | 1 + 2 files changed, 17 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index f194437..c8796b3 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -664,6 +664,22 @@ backend_instruction::can_do_saturate() const } bool +backend_instruction::reads_accumulator_implicitly() const +{ + switch (opcode) { + case BRW_OPCODE_MAC: + case BRW_OPCODE_MACH: + /* FINISHME: Enable these if we ever start emitting them. +* case BRW_OPCODE_SADA: +* case BRW_OPCODE_SADA2: +*/ Let's just uncomment SADA2 right away to prevent pain in the future. SAD2 doesn't read the acc, though. Other than that, the first 2 patches are: Reviewed-by: Eric Anholt e...@anholt.net I think scheduling is still broken in the last one, because you're removing the barrier deps on implicit-accumulator opcodes and replacing them with explicit dependencies, but you're not tracking the accumulator updates by almost-all-instructions pre-gen6. The scheduler would be free to slip in some unrelated instruction after the MUL in the following snippet from brw_vec4_visitor.cpp: Ah, that is true. I went looking for text about this, since I didn't know about it until you mentioned it recently. I see in the GM45 docs a 'Accumulator Disable' bit in cr0. I wonder whether all of the false write-after-write dependencies on the accumulator actually cause stalls, and if so whether we should attempt to disable accumulator writes. We don't seem to have any cases where we rely on implicit accumulator updates that we couldn't replace with explicit accumulator destinations. 
emit(MUL(acc, op[0], op[1])); emit(MACH(dst_null_d(), op[0], op[1])); emit(MOV(result_dst, src_reg(acc))); (err, why are we doing MACH and MOV instead of just MACH into result_dst?) mach writes the *high* 32-bits of the result into its destination (so useful for *mulExtended()).
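The high/low distinction can be checked with plain integer arithmetic. The sketch below models only the arithmetic, not the hardware: `mul_extended` and `MulResult` are hypothetical names, and the 64-bit temporary merely stands in for the wider accumulator. It shows why an ordinary 32-bit multiply wants the low half (hence the MOV from acc in the snippet above), while a mulExtended-style operation would also want the high half that mach produces.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result pair: the two 32-bit halves of a 32x32 multiply.
struct MulResult {
   std::uint32_t low;   // what a plain integer multiply returns
   std::uint32_t high;  // what a "high half" instruction would return
};

// Compute the full 64-bit product and split it into halves.
inline MulResult mul_extended(std::uint32_t x, std::uint32_t y)
{
   const std::uint64_t product = static_cast<std::uint64_t>(x) * y;
   return { static_cast<std::uint32_t>(product),
            static_cast<std::uint32_t>(product >> 32) };
}
```

For example, 0xFFFFFFFF times 2 is 0x1FFFFFFFE, so the low half is 0xFFFFFFFE and the high half is 1.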
Re: [Mesa-dev] [PATCH 3/3] i965: Add writes_accumulator flag
On 04/09/2014 01:47 PM, Matt Turner wrote: From: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com Our hardware has an accumulator register, which can be used to store intermediate results across multiple instructions. Many instructions can implicitly write a value to the accumulator in addition to their normal destination register. This is enabled by the AccWrEn flag. This patch introduces a new flag, inst->writes_accumulator, which allows us to express the AccWrEn notion in the IR. It also creates an ALU2_ACC macro to easily define emitters for instructions that implicitly write the accumulator. Previously, we only supported implicit accumulator writes from the ADDC, SUBB, and MACH instructions. We always enabled them on those instructions, and left them disabled for other instructions. To take advantage of the MAC (multiply-accumulate) instruction, we need to be able to set AccWrEn on other types of instructions. Reviewed-by: Matt Turner matts...@gmail.com Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com --- I split out is_accumulator() into a separate patch, and made some fixes to the scheduling code. Let me know if these changes look good to you, JP. 
(Patch formatted with -U15 as to see other sections of the scheduling code during review) src/mesa/drivers/dri/i965/brw_fs.cpp | 26 ++ src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 7 +-- .../drivers/dri/i965/brw_schedule_instructions.cpp | 58 ++ src/mesa/drivers/dri/i965/brw_shader.h | 1 + src/mesa/drivers/dri/i965/brw_vec4.cpp | 15 ++ src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 7 +-- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 17 +-- 7 files changed, 95 insertions(+), 36 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index e576545..0eece60 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp [snip] @@ -2113,40 +2124,35 @@ fs_visitor::dead_code_eliminate() for (int i = 0; i < inst->regs_written; i++) { int var = live_intervals->var_from_vgrf[inst->dst.reg]; assert(live_intervals->end[var + inst->dst.reg_offset + i] >= pc); if (live_intervals->end[var + inst->dst.reg_offset + i] != pc) { dead = false; break; } } if (dead) { /* Don't dead code eliminate instructions that write to the * accumulator as a side-effect. Instead just set the destination * to the null register to free it. */ -switch (inst->opcode) { -case BRW_OPCODE_ADDC: -case BRW_OPCODE_SUBB: -case BRW_OPCODE_MACH: +if (inst->writes_accumulator) { inst->dst = fs_reg(retype(brw_null_reg(), inst->dst.type)); - break; Pre-existing bug: we ought to set progress = true in this case. -default: +} else { inst->remove(); progress = true; - break; } } } pc++; } if (progress) invalidate_live_intervals(); return progress; } struct dead_code_hash_key {
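The review comment about `progress` can be made concrete with a small model of the dead-destination case. This is only a sketch of the idea, not the pass itself: `Inst` and `eliminate_dead_dst` are hypothetical, liveness analysis is assumed to have already decided that the destination is dead, and the early return for an already-null destination is an assumption added here so that repeated passes settle.

```cpp
#include <cassert>

// Hypothetical, reduced instruction state for illustration.
struct Inst {
   bool writes_accumulator = false;
   bool dst_is_null = false;
   bool removed = false;
};

// Handle one instruction whose destination is known to be dead.
// Returns true when the instruction was changed, i.e. progress was made.
inline bool eliminate_dead_dst(Inst &inst)
{
   if (inst.writes_accumulator) {
      // The accumulator side effect must survive, so the instruction stays.
      if (inst.dst_is_null)
         return false;          // nothing left to free
      inst.dst_is_null = true;  // free the destination register
      return true;              // this also counts as progress
   }

   inst.removed = true;
   return true;
}
```

Reporting progress for the null-destination rewrite matters because later passes and the live-interval invalidation are keyed off that flag.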
Re: [Mesa-dev] [PATCH 2/3] i965: Add is_accumulator() function.
On Wed, Apr 9, 2014 at 3:13 PM, Kenneth Graunke kenn...@whitecape.org wrote: On 04/09/2014 01:47 PM, Matt Turner wrote: From: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com Reviewed-by: Matt Turner matts...@gmail.com Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com --- As a follow-on patch series, we should move common fields from fs_reg and vec4's reg into a backend_reg and consolidate these functions. Yeah, there's been some talk about creating one of those. But even at present, src_reg and dst_reg both inherit from reg. is_accumulator() should be defined there, not in both subclasses. Yeah. That's what made me notice. I don't care to do a partial fix now.
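The consolidation discussed here, defining is_accumulator() once on the shared base rather than in each subclass, could look roughly like the following. All types are simplified, hypothetical stand-ins (the real classes carry many more fields), and `kArfAccumulator` is a placeholder value chosen for the sketch, not a number taken from the hardware documentation.

```cpp
#include <cassert>

// Hypothetical, simplified register description.
enum class RegFile { Grf, Uniform, HwReg };
enum class HwFile { General, Architecture };

// Placeholder register number, for illustration only.
constexpr unsigned kArfAccumulator = 0x20;

struct reg {
   RegFile file = RegFile::Grf;
   HwFile hw_file = HwFile::General;
   unsigned hw_nr = 0;

   // Defined once here; every derived register type inherits it.
   bool is_accumulator() const
   {
      return file == RegFile::HwReg &&
             hw_file == HwFile::Architecture &&
             hw_nr == kArfAccumulator;
   }
};

struct src_reg : reg {};
struct dst_reg : reg {};
```

With this layout the two identical definitions in the vec4 backend collapse into one, and a future common backend register type would only need to take over the base.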
[Mesa-dev] [PATCH] i965/fs: Reset reg_from when we can't coalesce.
Not setting this would prevent coalescing after a failed attempt if the sources for both MOVs were the same. total instructions in shared programs: 1654531 -> 1650224 (-0.26%) instructions in affected programs: 423167 -> 418860 (-1.02%) GAINED: 2 LOST: 0 --- src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp index 6e30d16..4e3b611 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp @@ -162,6 +162,7 @@ fs_visitor::register_coalesce() if (!can_coalesce_vars(live_intervals, instructions, inst, var_to[i], var_from[i])) { can_coalesce = false; +reg_from = -1; break; } } -- 1.8.3.2
[Mesa-dev] [PATCH 1/2] i965/gs: Add dummy source to prepare_channel_masks instruction.
The generator uses its destination as a source implicitly, which breaks
some assumptions in dead code elimination. Giving the instruction a
source allows us to reason about it better.

Reviewed-by: Eric Anholt <e...@anholt.net>
---
I can't use the source in the generator because a shl(1) instruction is
emitted from generate_gs_prepare_channel_masks(), so we rely on a bunch
of bits in src being in dst even though we're not writing the whole
register.

 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 2 ++
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 3 ++-
 src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp | 2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index a74514f..47aac75 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -1221,6 +1221,8 @@ vec4_generator::generate_vec4_instruction(vec4_instruction *instruction,
       break;
 
    case GS_OPCODE_PREPARE_CHANNEL_MASKS:
+      assert(dst.file == src[0].file &&
+             dst.reg == src[0].reg);
       generate_gs_prepare_channel_masks(dst);
       break;
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index 13d6d38..1321a94 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -408,7 +408,8 @@ vec4_gs_visitor::emit_control_data_bits()
          src_reg channel_mask(this, glsl_type::uint_type);
          inst = emit(SHL(dst_reg(channel_mask), one, channel));
          inst->force_writemask_all = true;
-         emit(GS_OPCODE_PREPARE_CHANNEL_MASKS, dst_reg(channel_mask));
+         emit(GS_OPCODE_PREPARE_CHANNEL_MASKS, dst_reg(channel_mask),
+              channel_mask);
          emit(GS_OPCODE_SET_CHANNEL_MASKS, mrf_reg, channel_mask);
       }
 
diff --git a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
index b854db5..49e1a97 100644
--- a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
@@ -758,6 +758,8 @@ gen8_vec4_generator::generate_vec4_instruction(vec4_instruction *instruction,
       break;
 
    case GS_OPCODE_PREPARE_CHANNEL_MASKS:
+      assert(dst.file == src[0].file &&
+             dst.reg == src[0].reg);
       generate_gs_prepare_channel_masks(dst);
       break;
 
-- 
1.8.3.2

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
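Why an explicit source helps dead code elimination can be sketched with a toy IR. Everything below is hypothetical (struct toy_inst, toy_dce(), the register numbering, and the simplified operands); it is not the i965 backend's data model, only a minimal backward-liveness pass, assuming a straight-line block with one source per instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NO_REG (-1)
#define TOY_MAX_REGS 8

/* Hypothetical toy instruction: one destination, at most one source. */
struct toy_inst {
   int dst;                /* register written, or TOY_NO_REG */
   int src;                /* register read, or TOY_NO_REG */
   bool has_side_effects;  /* never eliminated, e.g. a message setup */
   bool dead;              /* result of toy_dce() */
};

/* Backward liveness-based DCE over a straight-line block.
 * Returns the number of instructions marked dead.
 */
static int
toy_dce(struct toy_inst *insts, size_t count)
{
   bool live[TOY_MAX_REGS] = { false };
   int removed = 0;

   for (size_t i = count; i-- > 0;) {
      struct toy_inst *inst = &insts[i];
      assert(inst->dst < TOY_MAX_REGS && inst->src < TOY_MAX_REGS);

      if (!inst->has_side_effects &&
          inst->dst != TOY_NO_REG && !live[inst->dst]) {
         inst->dead = true;
         removed++;
         continue;
      }

      inst->dead = false;
      /* The write kills the register first, then the read revives it. */
      if (inst->dst != TOY_NO_REG)
         live[inst->dst] = false;
      if (inst->src != TOY_NO_REG)
         live[inst->src] = true;
   }
   return removed;
}

/* Models SHL -> PREPARE_CHANNEL_MASKS -> SET_CHANNEL_MASKS on register 0.
 * prepare_src is the source the middle instruction declares to the pass:
 * TOY_NO_REG models the implicit read, 0 models the dummy source.
 */
static int
toy_count_dead(int prepare_src)
{
   struct toy_inst block[] = {
      { .dst = 0, .src = TOY_NO_REG },                            /* SHL */
      { .dst = 0, .src = prepare_src },                           /* PREPARE */
      { .dst = TOY_NO_REG, .src = 0, .has_side_effects = true },  /* SET */
   };
   return toy_dce(block, sizeof(block) / sizeof(block[0]));
}
```

In this model, toy_count_dead(TOY_NO_REG) reports the SHL as dead because nothing visibly reads register 0 before PREPARE overwrites it, while toy_count_dead(0) keeps all three instructions.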
Re: [Mesa-dev] [PATCH 00/18] Implement GL_ARB_multi_bind
On Tuesday 08 April 2014, Kenneth Graunke wrote:
> On 01/21/2014 03:35 PM, Fredrik Höglund wrote:
> > So here is my take on GL_ARB_multi_bind.
> >
> > I tried to come up with names for the new hash table functions that
> > don't suggest that they should be used to do unlocked
> > insertions/lookups. I'm not entirely happy with the ones I came up
> > with though, so I'm hoping someone will have better suggestions.
> >
> > When binding 32 textures glBindTextures() seems to be about three
> > times faster than calling glActiveTexture() + glBindTexture() in a
> > loop. When binding 4 textures it's about twice as fast.
> >
> > I hope to land this series this week if there are no major issues.
> >
> > Note that I haven't been able to test the glBindImageTextures()
> > implementation.
> >
> > This series is also available at:
> > git://people.freedesktop.org/~fredrik/mesa arb-multi-bind
>
> Hi Fredrik,
>
> Where are we at with this? It sounds like there were a few review
> comments and suggestions - were you planning to send out a v2?

I plan on sending out a new version shortly.

Fredrik
[Mesa-dev] [PATCH] fixup! i965: Add writes_accumulator flag
---
Eric, how about this squashed in?

On Gen < 6, any accumulator use, with the exception of the implied
update that nearly every instruction does, causes a barrier dep.
Implicit writes, noted by ::writes_accumulator, cause a barrier dep.

On Gen >= 6, we just track the accumulator dependencies with
last_accumulator_write.

 .../drivers/dri/i965/brw_schedule_instructions.cpp | 74
 1 file changed, 55 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 910b73a..8cc6908 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -742,6 +742,8 @@ fs_instruction_scheduler::is_compressed(fs_inst *inst)
 void
 fs_instruction_scheduler::calculate_deps()
 {
+   const bool gen6plus = v->brw->gen >= 6;
+
    /* Pre-register-allocation, this tracks the last write per VGRF (so
     * different reg_offsets within it can interfere when they shouldn't).
     * After register allocation, reg_offsets are gone and we track individual
@@ -801,7 +803,7 @@ fs_instruction_scheduler::calculate_deps()
             } else {
                add_dep(last_fixed_grf_write, n);
             }
-         } else if (inst->src[i].is_accumulator()) {
+         } else if (inst->src[i].is_accumulator() && gen6plus) {
             add_dep(last_accumulator_write, n);
          } else if (inst->src[i].file != BAD_FILE &&
                     inst->src[i].file != IMM &&
@@ -826,7 +828,11 @@ fs_instruction_scheduler::calculate_deps()
       }
 
       if (inst->reads_accumulator_implicitly()) {
-         add_dep(last_accumulator_write, n);
+         if (gen6plus) {
+            add_dep(last_accumulator_write, n);
+         } else {
+            add_barrier_deps(n);
+         }
       }
 
       /* write-after-write deps. */
@@ -861,7 +867,7 @@ fs_instruction_scheduler::calculate_deps()
          } else {
             last_fixed_grf_write = n;
          }
-      } else if (inst->dst.is_accumulator()) {
+      } else if (inst->dst.is_accumulator() && gen6plus) {
          add_dep(last_accumulator_write, n);
          last_accumulator_write = n;
       } else if (inst->dst.file != BAD_FILE &&
@@ -882,8 +888,12 @@ fs_instruction_scheduler::calculate_deps()
       }
 
       if (inst->writes_accumulator) {
-         add_dep(last_accumulator_write, n);
-         last_accumulator_write = n;
+         if (gen6plus) {
+            add_dep(last_accumulator_write, n);
+            last_accumulator_write = n;
+         } else {
+            add_barrier_deps(n);
+         }
       }
    }
 
@@ -923,7 +933,7 @@ fs_instruction_scheduler::calculate_deps()
             } else {
                add_dep(n, last_fixed_grf_write);
             }
-         } else if (inst->src[i].is_accumulator()) {
+         } else if (inst->src[i].is_accumulator() && gen6plus) {
             add_dep(n, last_accumulator_write);
          } else if (inst->src[i].file != BAD_FILE &&
                     inst->src[i].file != IMM &&
@@ -948,7 +958,11 @@ fs_instruction_scheduler::calculate_deps()
       }
 
       if (inst->reads_accumulator_implicitly()) {
-         add_dep(n, last_accumulator_write);
+         if (gen6plus) {
+            add_dep(n, last_accumulator_write);
+         } else {
+            add_barrier_deps(n);
+         }
       }
 
       /* Update the things this instruction wrote, so earlier reads
@@ -982,7 +996,7 @@ fs_instruction_scheduler::calculate_deps()
          } else {
             last_fixed_grf_write = n;
          }
-      } else if (inst->dst.is_accumulator()) {
+      } else if (inst->dst.is_accumulator() && gen6plus) {
          last_accumulator_write = n;
       } else if (inst->dst.file != BAD_FILE &&
                  !inst->dst.is_null()) {
@@ -1000,7 +1014,11 @@ fs_instruction_scheduler::calculate_deps()
       }
 
       if (inst->writes_accumulator) {
-         last_accumulator_write = n;
+         if (gen6plus) {
+            last_accumulator_write = n;
+         } else {
+            add_barrier_deps(n);
+         }
       }
    }
 }
@@ -1008,6 +1026,8 @@ fs_instruction_scheduler::calculate_deps()
 void
 vec4_instruction_scheduler::calculate_deps()
 {
+   const bool gen6plus = v->brw->gen >= 6;
+
    schedule_node *last_grf_write[grf_count];
    schedule_node *last_mrf_write[BRW_MAX_MRF];
    schedule_node *last_conditional_mod = NULL;
@@ -1047,7 +1067,7 @@ vec4_instruction_scheduler::calculate_deps()
                     (inst->src[i].fixed_hw_reg.file ==
                      BRW_GENERAL_REGISTER_FILE)) {
             add_dep(last_fixed_grf_write, n);
-         } else if (inst->src[i].is_accumulator()) {
+         } else if (inst->src[i].is_accumulator() && gen6plus) {
             assert(last_accumulator_write);
             add_dep(last_accumulator_write, n);
          } else if (inst->src[i].file != BAD_FILE &&
@@ -1074,8 +1094,12 @@ vec4_instruction_scheduler::calculate_deps()
       }
[Mesa-dev] state tracker texture sizing fun
So I was looking at adding ARB_texture_query_levels support to gallium,
and hit a bit of a saga in the state tracker texture finalising code.
The commits involved are listed below.

To fix the query levels test I essentially wanted to revert
529b7b355d392b1534ccd8ff7b428dc21cbfdc21, so that the hw is programmed
with the correct last levels and the query tests pass. I then did an
llvmpipe piglit run and found two major regressions: texture arrays
broke and getteximage broke.

Arrays are broken simply because st_texture_image_copy is broken for
arrays; that seems like a not insane fix. However, getteximage is broken
because of Cooper's commit. The test sets GL_NEAREST for everything, but
it uploads a number of layers into the textures with TexImage2D and then
reads them back out with GetTexImage, and due to that sampler check we
totally fail.

I think we should drop Cooper's change: it lacks justification, and it
ties texture backing store and samplers together a bit much for my
liking.

Dave.

commit 529b7b355d392b1534ccd8ff7b428dc21cbfdc21
Author: Brian Paul <bri...@vmware.com>
Date:   Mon May 3 13:04:29 2010 -0600

    st/mesa: restore original last_layer comparison

    Commit e648d4a1d1c0c5f70916e38366b863f0bec79a62 changed the original
    less-than test to a not-equal test. This was an effort to save some
    memory by switching the texture layout to a non-mipmapped layout when
    we mis-guessed about the original layout (thus saving some memory).
    However, this causes us to hit a new (apparently broken) code path
    when copying the old texture's data to the new texture. Simply undo
    this change for the time being until the other/new bug is fixed.

    Fixes fd.o bug 27933.

commit e648d4a1d1c0c5f70916e38366b863f0bec79a62
Author: Brian Paul <bri...@vmware.com>
Date:   Thu Apr 29 15:32:36 2010 -0600

    st/mesa: ignore gl_texture_object::BaseLevel when allocating gallium texture

    Previously, when we created a gallium texture for a corresponding
    Mesa texture we'd only allocate space for mipmap levels >= BaseLevel.
    This patch undoes that mechanism.

    This fixes a render-to-texture bug when rendering to level 0 when
    BaseLevel=1.

    Also, it makes sense to allocate the whole texture object memory when
    BaseLevel > 0 since a common use of GL_TEXTURE_BASE_LEVEL is to
    progressively load/render mipmaps. Eventually, the app almost always
    fills in the level=0 mipmap image.

    Finally, the texture image code is a bit easier to understand now.

commit ae2daacbac7242938cffe0e2409071e030e00863
Author: Cooper Yuan <coopery...@gmail.com>
Date:   Thu Oct 1 17:54:27 2009 +0800

    st/mesa: fix non-mipmap lastLevel calculation. reviewed by Brian Paul.
[Mesa-dev] [PATCH] st/mesa: fix sampler_view REALLOC/FREE macro mix-up
We were using REALLOC() from u_memory.h but FREE() from imports.h.
This mismatch caused us to trash the heap on Windows after we
deleted a texture object.

This fixes a regression from commit 6c59be7776e4d.
---
 src/mesa/state_tracker/st_cb_texture.c |    2 +-
 src/mesa/state_tracker/st_texture.c    |   12 ++++++++++++
 src/mesa/state_tracker/st_texture.h    |    3 +++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_cb_texture.c b/src/mesa/state_tracker/st_cb_texture.c
index 353415b..304dc91 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -155,7 +155,7 @@ st_DeleteTextureObject(struct gl_context *ctx,
 
    pipe_resource_reference(&stObj->pt, NULL);
    st_texture_release_all_sampler_views(stObj);
-   FREE(stObj->sampler_views);
+   st_texture_free_sampler_views(stObj);
    _mesa_delete_texture_object(ctx, texObj);
 }
 
diff --git a/src/mesa/state_tracker/st_texture.c b/src/mesa/state_tracker/st_texture.c
index 8d559df..cfa0605 100644
--- a/src/mesa/state_tracker/st_texture.c
+++ b/src/mesa/state_tracker/st_texture.c
@@ -483,3 +483,15 @@ st_texture_release_all_sampler_views(struct st_texture_object *stObj)
    for (i = 0; i < stObj->num_sampler_views; ++i)
       pipe_sampler_view_reference(&stObj->sampler_views[i], NULL);
 }
+
+
+void
+st_texture_free_sampler_views(struct st_texture_object *stObj)
+{
+   /* NOTE:
+    * We use FREE() here to match REALLOC() above.  Both come from
+    * u_memory.h, not imports.h.  If we mis-match MALLOC/FREE from
+    * those two headers we can trash the heap.
+    */
+   FREE(stObj->sampler_views);
+}
diff --git a/src/mesa/state_tracker/st_texture.h b/src/mesa/state_tracker/st_texture.h
index 87de9f9..f2afaf1 100644
--- a/src/mesa/state_tracker/st_texture.h
+++ b/src/mesa/state_tracker/st_texture.h
@@ -241,4 +241,7 @@ st_texture_release_sampler_view(struct st_context *st,
 extern void
 st_texture_release_all_sampler_views(struct st_texture_object *stObj);
 
+void
+st_texture_free_sampler_views(struct st_texture_object *stObj);
+
 #endif
-- 
1.7.10.4
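The failure mode behind this fix can be sketched in isolation. The code below is a hypothetical model, not Mesa's u_memory.h: it assumes one allocator family hides a bookkeeping header in front of every block, so the pointer it hands out is not the pointer malloc() returned, and passing it to a plain free() would corrupt the heap. All names (util_realloc, util_free, struct tex_obj, and so on) are invented for illustration. The point it shows is the one the patch makes: keep the release helper next to the allocation so both always use the same family.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical "util" allocator family: every block is preceded by a
 * hidden header.  Two size_t fields keep the user pointer suitably
 * aligned for the pointer arrays stored below.
 */
struct util_header {
   size_t size;
   size_t magic;
};

#define UTIL_MAGIC ((size_t)0x1ee7c0de)

static int util_live_blocks = 0;

static void *
util_realloc(void *ptr, size_t new_size)
{
   const bool is_new = (ptr == NULL);
   struct util_header *hdr = is_new ? NULL : (struct util_header *)ptr - 1;

   if (!is_new)
      assert(hdr->magic == UTIL_MAGIC);

   struct util_header *new_hdr = realloc(hdr, sizeof(*new_hdr) + new_size);
   if (!new_hdr)
      return NULL;
   if (is_new)
      util_live_blocks++;
   new_hdr->size = new_size;
   new_hdr->magic = UTIL_MAGIC;
   return new_hdr + 1;   /* NOT the address realloc() returned */
}

static void
util_free(void *ptr)
{
   if (!ptr)
      return;
   struct util_header *hdr = (struct util_header *)ptr - 1;
   assert(hdr->magic == UTIL_MAGIC);
   hdr->magic = 0;
   util_live_blocks--;
   free(hdr);            /* free the real start of the block */
}

/* Hypothetical object that owns a growable array of views. */
struct tex_obj {
   void **views;
   unsigned num_views;
};

static int
tex_obj_add_view(struct tex_obj *obj, void *view)
{
   void **views = util_realloc(obj->views,
                               (obj->num_views + 1) * sizeof(*views));
   if (!views)
      return -1;
   views[obj->num_views++] = view;
   obj->views = views;
   return 0;
}

/* Lives next to tex_obj_add_view() so callers never choose a free(). */
static void
tex_obj_free_views(struct tex_obj *obj)
{
   util_free(obj->views);
   obj->views = NULL;
   obj->num_views = 0;
}

/* Adds n views, then releases them; returns the live block count seen
 * just before the release.
 */
static int
tex_obj_roundtrip(unsigned n)
{
   static int dummy_view;
   struct tex_obj obj = { NULL, 0 };

   for (unsigned i = 0; i < n; i++) {
      if (tex_obj_add_view(&obj, &dummy_view) != 0)
         break;
   }
   int live = util_live_blocks;
   tex_obj_free_views(&obj);
   return live;
}
```

Calling plain free(obj->views) in this model would hand the C library an address sizeof(struct util_header) bytes past the start of the block, which is the kind of mismatch the commit message describes.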
[Mesa-dev] [PATCH 3/5] mesa: s/FREE/free/ in vdpau code
---
 src/mesa/main/vdpau.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/vdpau.c b/src/mesa/main/vdpau.c
index c2cf206..d974593 100644
--- a/src/mesa/main/vdpau.c
+++ b/src/mesa/main/vdpau.c
@@ -88,7 +88,7 @@ unregister_surface(struct set_entry *entry)
    }
 
    _mesa_set_remove(ctx->vdpSurfaces, entry);
-   FREE(surf);
+   free(surf);
 }
 
 void GLAPIENTRY
@@ -145,7 +145,7 @@ register_surface(struct gl_context *ctx, GLboolean isOutput,
 
    if (tex->Immutable) {
       _mesa_unlock_texture(ctx, tex);
-      FREE(surf);
+      free(surf);
       _mesa_error(ctx, GL_INVALID_OPERATION,
                   "VDPAURegisterSurfaceNV(texture is immutable)");
       return (GLintptr)NULL;
@@ -155,7 +155,7 @@ register_surface(struct gl_context *ctx, GLboolean isOutput,
       tex->Target = target;
    else if (tex->Target != target) {
       _mesa_unlock_texture(ctx, tex);
-      FREE(surf);
+      free(surf);
       _mesa_error(ctx, GL_INVALID_OPERATION,
                   "VDPAURegisterSurfaceNV(target mismatch)");
       return (GLintptr)NULL;
@@ -254,7 +254,7 @@ _mesa_VDPAUUnregisterSurfaceNV(GLintptr surface)
    }
 
    _mesa_set_remove(ctx->vdpSurfaces, entry);
-   FREE(surf);
+   free(surf);
 }
 
 void GLAPIENTRY
-- 
1.7.10.4
[Mesa-dev] [PATCH 4/5] xlib: s/FREE/free/
---
 src/mesa/drivers/x11/xm_api.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/x11/xm_api.c b/src/mesa/drivers/x11/xm_api.c
index 4779595..d860569 100644
--- a/src/mesa/drivers/x11/xm_api.c
+++ b/src/mesa/drivers/x11/xm_api.c
@@ -855,7 +855,7 @@ XMesaVisual XMesaCreateVisual( XMesaDisplay *display,
                                accum_red_size, accum_green_size,
                                accum_blue_size, accum_alpha_size,
                                0)) {
-      FREE(v);
+      free(v);
       return NULL;
    }
 
-- 
1.7.10.4
[Mesa-dev] [PATCH 2/5] mesa: s/FREE/free/ in _mesa_free_errors_data()
---
 src/mesa/main/errors.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
index 9151718..d80fda0 100644
--- a/src/mesa/main/errors.c
+++ b/src/mesa/main/errors.c
@@ -980,7 +980,7 @@ _mesa_free_errors_data(struct gl_context *ctx)
       for (i = 0; i <= ctx->Debug->GroupStackDepth; i++) {
          free_errors_data(ctx, i);
       }
-      FREE(ctx->Debug);
+      free(ctx->Debug);
       /* set to NULL just in case it is used before context is completely gone. */
       ctx->Debug = NULL;
    }
-- 
1.7.10.4
[Mesa-dev] [PATCH 1/5] mesa: use malloc/free instead of MALLOC/FREE in attrib stack code
We moved away from MALLOC/FREE in the rest of core Mesa a while ago.
---
 src/mesa/main/attrib.c |   20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/mesa/main/attrib.c b/src/mesa/main/attrib.c
index 5a626f2..c656845 100644
--- a/src/mesa/main/attrib.c
+++ b/src/mesa/main/attrib.c
@@ -217,7 +217,7 @@ push_attrib(struct gl_context *ctx, struct gl_attrib_node **head,
 {
    void *attribute;
 
-   attribute = MALLOC(attr_size);
+   attribute = malloc(attr_size);
    if (attribute == NULL) {
       _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushAttrib");
       return false;
@@ -227,7 +227,7 @@ push_attrib(struct gl_context *ctx, struct gl_attrib_node **head,
       memcpy(attribute, attr_data, attr_size);
    }
    else {
-      FREE(attribute);
+      free(attribute);
       _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushAttrib");
       return false;
    }
@@ -277,7 +277,7 @@ _mesa_PushAttrib(GLbitfield mask)
             attr->DrawBuffer[i] = ctx->DrawBuffer->ColorDrawBuffer[i];
       }
       else {
-         FREE(attr);
+         free(attr);
          _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushAttrib");
          goto end;
       }
@@ -374,7 +374,7 @@ _mesa_PushAttrib(GLbitfield mask)
       attr->FragmentProgram = ctx->FragmentProgram.Enabled;
 
       if (!save_attrib_data(&head, GL_ENABLE_BIT, attr)) {
-         FREE(attr);
+         free(attr);
          _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushAttrib");
          goto end;
       }
@@ -440,7 +440,7 @@ _mesa_PushAttrib(GLbitfield mask)
          attr->ReadBuffer = ctx->ReadBuffer->ColorReadBuffer;
       }
       else {
-         FREE(attr);
+         free(attr);
          _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushAttrib");
          goto end;
       }
@@ -491,7 +491,7 @@ _mesa_PushAttrib(GLbitfield mask)
       }
 
       if (!save_attrib_data(&head, GL_TEXTURE_BIT, texstate)) {
-         FREE(texstate);
+         free(texstate);
          _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushAttrib(GL_TEXTURE_BIT)");
          goto end;
       }
@@ -1626,7 +1626,7 @@ _mesa_PushClientAttrib(GLbitfield mask)
       }
       else {
          _mesa_error( ctx, GL_OUT_OF_MEMORY, "glPushClientAttrib" );
-         FREE(attr);
+         free(attr);
          goto end;
       }
 
@@ -1642,7 +1642,7 @@ _mesa_PushClientAttrib(GLbitfield mask)
          }
          else {
             _mesa_error( ctx, GL_OUT_OF_MEMORY, "glPushClientAttrib" );
-            FREE(attr);
+            free(attr);
             goto end;
          }
       }
@@ -1656,7 +1656,7 @@ _mesa_PushClientAttrib(GLbitfield mask)
       }
 
       if (!init_array_attrib_data(ctx, attr)) {
-         FREE(attr);
+         free(attr);
          goto end;
       }
 
@@ -1666,7 +1666,7 @@ _mesa_PushClientAttrib(GLbitfield mask)
       else {
          free_array_attrib_data(ctx, attr);
          _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushClientAttrib");
-         FREE(attr);
+         free(attr);
          /* goto to keep safe from possible later changes */
          goto end;
       }
-- 
1.7.10.4
[Mesa-dev] [PATCH 5/5] mesa: remove the MALLOC, CALLOC and FREE macros
No longer used anywhere. These also caused trouble in the Gallium state
tracker code where we include both core Mesa and Gallium util headers
(and the macros were defined differently in each world.) Removing these
macros should help avoid macro mix-ups in the future.
---
 src/mesa/main/imports.h |    6 ------
 1 file changed, 6 deletions(-)

diff --git a/src/mesa/main/imports.h b/src/mesa/main/imports.h
index 9e221cc..17a9bd0 100644
--- a/src/mesa/main/imports.h
+++ b/src/mesa/main/imports.h
@@ -49,16 +49,10 @@ extern "C" {
 /** Memory macros */
 /*@{*/
 
-/** Allocate \p BYTES bytes */
-#define MALLOC(BYTES)      malloc(BYTES)
-/** Allocate and zero \p BYTES bytes */
-#define CALLOC(BYTES)      calloc(1, BYTES)
 /** Allocate a structure of type \p T */
 #define MALLOC_STRUCT(T)   (struct T *) malloc(sizeof(struct T))
 /** Allocate and zero a structure of type \p T */
 #define CALLOC_STRUCT(T)   (struct T *) calloc(1, sizeof(struct T))
-/** Free memory */
-#define FREE(PTR)          free(PTR)
 
 /*@}*/
 
-- 
1.7.10.4
[Mesa-dev] [PATCH 1/4] glxinfo: Print XFB, TBO, and UBO limits
---
 src/xdemos/glxinfo.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/src/xdemos/glxinfo.c b/src/xdemos/glxinfo.c
index a116e4a..a77e808 100644
--- a/src/xdemos/glxinfo.c
+++ b/src/xdemos/glxinfo.c
@@ -659,6 +659,28 @@ print_limits(const char *extensions, const char *oglstring)
       { 1, GL_MAX_COLOR_ATTACHMENTS, "GL_MAX_COLOR_ATTACHMENTS", "GL_ARB_framebuffer_object" },
       { 1, GL_MAX_SAMPLES, "GL_MAX_SAMPLES", "GL_ARB_framebuffer_object" },
 #endif
+#if defined (GL_EXT_transform_feedback)
+      { 1, GL_MAX_TRANSFORM_FEEDBACK_BUFFERS, "GL_MAX_TRANSFORM_FEEDBACK_BUFFERS", "GL_EXT_transform_feedback" },
+      { 1, GL_MAX_TRANSFORM_FEEDBACK_INTERLEAVED_COMPONENTS_EXT, "GL_MAX_TRANSFORM_FEEDBACK_INTERLEAVED_COMPONENTS", "GL_EXT_transform_feedback" },
+      { 1, GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS_EXT, "GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS", "GL_EXT_transform_feedback", },
+      { 1, GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS_EXT, "GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS", "GL_EXT_transform_feedback" },
+#endif
+#if defined (GL_ARB_texture_buffer_object)
+      { 1, GL_TEXTURE_BUFFER_OFFSET_ALIGNMENT, "GL_TEXTURE_BUFFER_OFFSET_ALIGNMENT", "GL_ARB_texture_buffer_object" },
+      { 1, GL_MAX_TEXTURE_BUFFER_SIZE, "GL_MAX_TEXTURE_BUFFER_SIZE", "GL_ARB_texture_buffer_object" },
+#endif
+#if defined (GL_ARB_uniform_buffer_object)
+      { 1, GL_MAX_VERTEX_UNIFORM_BLOCKS, "GL_MAX_VERTEX_UNIFORM_BLOCKS", "GL_ARB_uniform_buffer_object" },
+      { 1, GL_MAX_FRAGMENT_UNIFORM_BLOCKS, "GL_MAX_FRAGMENT_UNIFORM_BLOCKS", "GL_ARB_uniform_buffer_object" },
+      { 1, GL_MAX_GEOMETRY_UNIFORM_BLOCKS, "GL_MAX_GEOMETRY_UNIFORM_BLOCKS" , "GL_ARB_uniform_buffer_object" },
+      { 1, GL_MAX_COMBINED_UNIFORM_BLOCKS, "GL_MAX_COMBINED_UNIFORM_BLOCKS", "GL_ARB_uniform_buffer_object" },
+      { 1, GL_MAX_UNIFORM_BUFFER_BINDINGS, "GL_MAX_UNIFORM_BUFFER_BINDINGS", "GL_ARB_uniform_buffer_object" },
+      { 1, GL_MAX_UNIFORM_BLOCK_SIZE, "GL_MAX_UNIFORM_BLOCK_SIZE", "GL_ARB_uniform_buffer_object" },
+      { 1, GL_MAX_COMBINED_VERTEX_UNIFORM_COMPONENTS, "GL_MAX_COMBINED_VERTEX_UNIFORM_COMPONENTS", "GL_ARB_uniform_buffer_object" },
+      { 1, GL_MAX_COMBINED_FRAGMENT_UNIFORM_COMPONENTS, "GL_MAX_COMBINED_FRAGMENT_UNIFORM_COMPONENTS", "GL_ARB_uniform_buffer_object" },
+      { 1, GL_MAX_COMBINED_GEOMETRY_UNIFORM_COMPONENTS, "GL_MAX_COMBINED_GEOMETRY_UNIFORM_COMPONENTS", "GL_ARB_uniform_buffer_object" },
+      { 1, GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, "GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT", "GL_ARB_uniform_buffer_object" },
+#endif
       { 0, (GLenum) 0, NULL, NULL }
    };
    GLint i, max[2];
-- 
1.8.5.3
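The table pattern this patch extends can be sketched without any GL headers. The names below (DEMO_* macros, struct demo_limit, demo_print_limits()) are hypothetical stand-ins, and the runtime check is assumed to be a simple substring match on the extension string, which is cruder than what a real tool would want. The sketch shows the two separate gates: a compile-time `#if defined` so the table builds against old headers, and a run-time test so an entry is only reported when the implementation advertises the extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for GL enums and extension macros. */
#define DEMO_MAX_TEXTURE_SIZE 0x0D33
#define DEMO_EXT_widgets 1
#define DEMO_MAX_WIDGETS 0x9001
/* DEMO_EXT_gadgets is deliberately left undefined. */

struct demo_limit {
   int count;              /* number of values the query returns */
   unsigned token;         /* enum passed to the query */
   const char *name;       /* printed name */
   const char *extension;  /* required extension, or NULL for core */
};

static const struct demo_limit demo_limits[] = {
   { 1, DEMO_MAX_TEXTURE_SIZE, "DEMO_MAX_TEXTURE_SIZE", NULL },
#if defined(DEMO_EXT_widgets)
   { 1, DEMO_MAX_WIDGETS, "DEMO_MAX_WIDGETS", "DEMO_EXT_widgets" },
#endif
#if defined(DEMO_EXT_gadgets)
   /* Not compiled: the "headers" do not know this extension. */
   { 2, DEMO_GADGET_RANGE, "DEMO_GADGET_RANGE", "DEMO_EXT_gadgets" },
#endif
   { 0, 0, NULL, NULL }    /* sentinel */
};

/* Prints every entry whose extension is advertised (or that needs
 * none) and returns how many entries were printed.
 */
static int
demo_print_limits(const char *extensions, FILE *out)
{
   int printed = 0;

   for (size_t i = 0; demo_limits[i].count; i++) {
      const struct demo_limit *l = &demo_limits[i];

      /* Run-time gate; substring matching is a simplification. */
      if (l->extension && !strstr(extensions, l->extension))
         continue;

      fprintf(out, "    %s: token 0x%04x, %d value%s\n",
              l->name, l->token, l->count, l->count == 1 ? "" : "s");
      printed++;
   }
   return printed;
}
```

Adding a new limit in this model is a single table row inside the matching `#if defined` block, which is the shape of the change in the patch.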
[Mesa-dev] [PATCH 4/4] glxinfo: Remove the ARB suffixes from core enums
The suffix is only removed from the printed names in case someone wants
to build glxinfo against an old implementation.
---
 src/xdemos/glxinfo.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/src/xdemos/glxinfo.c b/src/xdemos/glxinfo.c
index f8a4e51..6e00dd3 100644
--- a/src/xdemos/glxinfo.c
+++ b/src/xdemos/glxinfo.c
@@ -538,20 +538,20 @@ static void
 print_shader_limits(GLenum target)
 {
    static const struct token_name vertex_limits[] = {
-      { GL_MAX_VERTEX_UNIFORM_COMPONENTS_ARB, "GL_MAX_VERTEX_UNIFORM_COMPONENTS_ARB" },
-      { GL_MAX_VARYING_FLOATS_ARB, "GL_MAX_VARYING_FLOATS_ARB" },
-      { GL_MAX_VERTEX_ATTRIBS_ARB, "GL_MAX_VERTEX_ATTRIBS_ARB" },
-      { GL_MAX_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_TEXTURE_IMAGE_UNITS_ARB" },
-      { GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB" },
-      { GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS_ARB" },
-      { GL_MAX_TEXTURE_COORDS_ARB, "GL_MAX_TEXTURE_COORDS_ARB" },
+      { GL_MAX_VERTEX_UNIFORM_COMPONENTS_ARB, "GL_MAX_VERTEX_UNIFORM_COMPONENTS" },
+      { GL_MAX_VARYING_FLOATS_ARB, "GL_MAX_VARYING_FLOATS" },
+      { GL_MAX_VERTEX_ATTRIBS_ARB, "GL_MAX_VERTEX_ATTRIBS" },
+      { GL_MAX_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_TEXTURE_IMAGE_UNITS" },
+      { GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS" },
+      { GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS" },
+      { GL_MAX_TEXTURE_COORDS_ARB, "GL_MAX_TEXTURE_COORDS" },
       { GL_MAX_VERTEX_OUTPUT_COMPONENTS , "GL_MAX_VERTEX_OUTPUT_COMPONENTS" },
       { (GLenum) 0, NULL }
    };
    static const struct token_name fragment_limits[] = {
-      { GL_MAX_FRAGMENT_UNIFORM_COMPONENTS_ARB, "GL_MAX_FRAGMENT_UNIFORM_COMPONENTS_ARB" },
-      { GL_MAX_TEXTURE_COORDS_ARB, "GL_MAX_TEXTURE_COORDS_ARB" },
-      { GL_MAX_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_TEXTURE_IMAGE_UNITS_ARB" },
+      { GL_MAX_FRAGMENT_UNIFORM_COMPONENTS_ARB, "GL_MAX_FRAGMENT_UNIFORM_COMPONENTS" },
+      { GL_MAX_TEXTURE_COORDS_ARB, "GL_MAX_TEXTURE_COORDS" },
+      { GL_MAX_TEXTURE_IMAGE_UNITS_ARB, "GL_MAX_TEXTURE_IMAGE_UNITS" },
       { GL_MAX_FRAGMENT_INPUT_COMPONENTS , "GL_MAX_FRAGMENT_INPUT_COMPONENTS" },
       { (GLenum) 0, NULL }
    };
@@ -567,12 +567,12 @@ print_shader_limits(GLenum target)
 
    switch (target) {
    case GL_VERTEX_SHADER:
-      printf("GL_VERTEX_SHADER_ARB:\n");
+      printf("GL_VERTEX_SHADER:\n");
       print_shader_limit_list(vertex_limits);
       break;
 
    case GL_FRAGMENT_SHADER:
-      printf("GL_FRAGMENT_SHADER_ARB:\n");
+      printf("GL_FRAGMENT_SHADER:\n");
       print_shader_limit_list(fragment_limits);
       break;
 
@@ -637,22 +637,22 @@ print_limits(const char *extensions, const char *oglstring)
       { 2, GL_ALIASED_POINT_SIZE_RANGE, "GL_ALIASED_POINT_SIZE_RANGE", NULL },
       { 2, GL_SMOOTH_POINT_SIZE_RANGE, "GL_SMOOTH_POINT_SIZE_RANGE", NULL },
 #if defined(GL_ARB_texture_cube_map)
-      { 1, GL_MAX_CUBE_MAP_TEXTURE_SIZE_ARB, "GL_MAX_CUBE_MAP_TEXTURE_SIZE_ARB", "GL_ARB_texture_cube_map" },
+      { 1, GL_MAX_CUBE_MAP_TEXTURE_SIZE_ARB, "GL_MAX_CUBE_MAP_TEXTURE_SIZE", "GL_ARB_texture_cube_map" },
 #endif
 #if defined(GL_NV_texture_rectangle)
-      { 1, GL_MAX_RECTANGLE_TEXTURE_SIZE_NV, "GL_MAX_RECTANGLE_TEXTURE_SIZE_NV", "GL_NV_texture_rectangle" },
+      { 1, GL_MAX_RECTANGLE_TEXTURE_SIZE_NV, "GL_MAX_RECTANGLE_TEXTURE_SIZE", "GL_NV_texture_rectangle" },
 #endif
 #if defined(GL_ARB_multitexture)
-      { 1, GL_MAX_TEXTURE_UNITS_ARB, "GL_MAX_TEXTURE_UNITS_ARB", "GL_ARB_multitexture" },
+      { 1, GL_MAX_TEXTURE_UNITS_ARB, "GL_MAX_TEXTURE_UNITS", "GL_ARB_multitexture" },
 #endif
 #if defined(GL_EXT_texture_lod_bias)
-      { 1, GL_MAX_TEXTURE_LOD_BIAS_EXT, "GL_MAX_TEXTURE_LOD_BIAS_EXT", "GL_EXT_texture_lod_bias" },
+      { 1, GL_MAX_TEXTURE_LOD_BIAS_EXT, "GL_MAX_TEXTURE_LOD_BIAS", "GL_EXT_texture_lod_bias" },
 #endif
 #if defined(GL_EXT_texture_filter_anisotropic)
       { 1, GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT, "GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT", "GL_EXT_texture_filter_anisotropic" },
 #endif
 #if defined(GL_ARB_draw_buffers)
-      { 1, GL_MAX_DRAW_BUFFERS_ARB, "GL_MAX_DRAW_BUFFERS_ARB", "GL_ARB_draw_buffers" },
+      { 1, GL_MAX_DRAW_BUFFERS_ARB, "GL_MAX_DRAW_BUFFERS", "GL_ARB_draw_buffers" },
 #endif
 #if defined(GL_ARB_blend_func_extended)
       { 1, GL_MAX_DUAL_SOURCE_DRAW_BUFFERS, "GL_MAX_DUAL_SOURCE_DRAW_BUFFERS", "GL_ARB_blend_func_extended" },
-- 
1.8.5.3
[Mesa-dev] [PATCH 2/4] glxinfo: Print GL_ARB_vertex_attrib_binding limits
---
 src/xdemos/glxinfo.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/xdemos/glxinfo.c b/src/xdemos/glxinfo.c
index a77e808..f97ba3e 100644
--- a/src/xdemos/glxinfo.c
+++ b/src/xdemos/glxinfo.c
@@ -681,6 +681,11 @@ print_limits(const char *extensions, const char *oglstring)
       { 1, GL_MAX_COMBINED_GEOMETRY_UNIFORM_COMPONENTS, "GL_MAX_COMBINED_GEOMETRY_UNIFORM_COMPONENTS", "GL_ARB_uniform_buffer_object" },
       { 1, GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, "GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT", "GL_ARB_uniform_buffer_object" },
 #endif
+#if defined (GL_ARB_vertex_attrib_binding)
+      { 1, GL_MAX_VERTEX_ATTRIB_RELATIVE_OFFSET, "GL_MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", "GL_ARB_vertex_attrib_binding" },
+      { 1, GL_MAX_VERTEX_ATTRIB_STRIDE, "GL_MAX_VERTEX_ATTRIB_STRIDE", "GL_ARB_vertex_attrib_binding" },
+      { 1, GL_MAX_VERTEX_ATTRIB_BINDINGS, "GL_MAX_VERTEX_ATTRIB_BINDINGS", "GL_ARB_vertex_attrib_binding" },
+#endif
       { 0, (GLenum) 0, NULL, NULL }
    };
    GLint i, max[2];
-- 
1.8.5.3
[Mesa-dev] [PATCH 3/4] glxinfo: Print GL_EXT_texture_array limits
---
 src/xdemos/glxinfo.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/xdemos/glxinfo.c b/src/xdemos/glxinfo.c
index f97ba3e..f8a4e51 100644
--- a/src/xdemos/glxinfo.c
+++ b/src/xdemos/glxinfo.c
@@ -628,6 +628,9 @@ print_limits(const char *extensions, const char *oglstring)
       { 1, GL_MAX_TEXTURE_STACK_DEPTH, "GL_MAX_TEXTURE_STACK_DEPTH", NULL },
       { 1, GL_MAX_TEXTURE_SIZE, "GL_MAX_TEXTURE_SIZE", NULL },
       { 1, GL_MAX_3D_TEXTURE_SIZE, "GL_MAX_3D_TEXTURE_SIZE", NULL },
+#if defined(GL_EXT_texture_array)
+      { 1, GL_MAX_ARRAY_TEXTURE_LAYERS_EXT, "GL_MAX_ARRAY_TEXTURE_LAYERS", "GL_EXT_texture_array" },
+#endif
       { 2, GL_MAX_VIEWPORT_DIMS, "GL_MAX_VIEWPORT_DIMS", NULL },
       { 2, GL_ALIASED_LINE_WIDTH_RANGE, "GL_ALIASED_LINE_WIDTH_RANGE", NULL },
       { 2, GL_SMOOTH_LINE_WIDTH_RANGE, "GL_SMOOTH_LINE_WIDTH_RANGE", NULL },
-- 
1.8.5.3
Re: [Mesa-dev] [PATCH 1/5] mesa: use malloc/free instead of MALLOC/FREE in attrib stack code
On 04/09/2014 06:39 PM, Brian Paul wrote:
> We moved away from MALLOC/FREE in the rest of core Mesa a while ago.
> ---
>  src/mesa/main/attrib.c | 20 ++++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)

Series is:
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
[Mesa-dev] [PATCH 2/2] mesa/st: set min/max texture gather offset to driver-reported value
It was always getting set to -8/7 unconditionally. Use the
driver-reported value instead.

Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---
 src/mesa/state_tracker/st_extensions.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index 845d29c..673a855 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -275,6 +275,9 @@ void st_init_limits(struct st_context *st)
    c->MaxProgramTexelOffset = screen->get_param(screen, PIPE_CAP_MAX_TEXEL_OFFSET);
 
    c->MaxProgramTextureGatherComponents = screen->get_param(screen, PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS);
+   c->MinProgramTextureGatherOffset = screen->get_param(screen, PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET);
+   c->MaxProgramTextureGatherOffset = screen->get_param(screen, PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET);
+
    c->UniformBooleanTrue = ~0;
 
    c->MaxTransformFeedbackBuffers =
-- 
1.8.3.2
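How a driver-reported limit replaces a hard-coded one can be sketched with a reduced, hypothetical model of the screen interface. The enum, structs, and both "drivers" below are invented for illustration and are not the Gallium definitions. Driver A groups the new caps with the existing texel-offset cases so they share the -8/7 range; driver B reports a wider -32/31 gather range, the figure the companion gallium patch quotes for nvc0 (the -8/7 texel offsets for driver B are an assumption).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced capability enum. */
enum demo_cap {
   DEMO_CAP_MIN_TEXEL_OFFSET,
   DEMO_CAP_MAX_TEXEL_OFFSET,
   DEMO_CAP_MIN_TEXTURE_GATHER_OFFSET,
   DEMO_CAP_MAX_TEXTURE_GATHER_OFFSET
};

struct demo_screen {
   int (*get_param)(struct demo_screen *screen, enum demo_cap cap);
};

/* Driver A: gather offsets share the texel-offset range, so the new
 * caps are simply stacked on the existing case labels.
 */
static int
demo_a_get_param(struct demo_screen *screen, enum demo_cap cap)
{
   (void)screen;
   switch (cap) {
   case DEMO_CAP_MIN_TEXTURE_GATHER_OFFSET:
   case DEMO_CAP_MIN_TEXEL_OFFSET:
      return -8;
   case DEMO_CAP_MAX_TEXTURE_GATHER_OFFSET:
   case DEMO_CAP_MAX_TEXEL_OFFSET:
      return 7;
   }
   return 0;
}

/* Driver B: a wider gather range than its texel-offset range. */
static int
demo_b_get_param(struct demo_screen *screen, enum demo_cap cap)
{
   (void)screen;
   switch (cap) {
   case DEMO_CAP_MIN_TEXEL_OFFSET:
      return -8;
   case DEMO_CAP_MAX_TEXEL_OFFSET:
      return 7;
   case DEMO_CAP_MIN_TEXTURE_GATHER_OFFSET:
      return -32;
   case DEMO_CAP_MAX_TEXTURE_GATHER_OFFSET:
      return 31;
   }
   return 0;
}

static struct demo_screen demo_screen_a = { demo_a_get_param };
static struct demo_screen demo_screen_b = { demo_b_get_param };

struct gather_limits {
   int min_gather;
   int max_gather;
};

/* State-tracker side: ask the driver instead of assuming -8/7. */
static struct gather_limits
demo_init_limits(struct demo_screen *screen)
{
   struct gather_limits limits;

   assert(screen != NULL && screen->get_param != NULL);
   limits.min_gather =
      screen->get_param(screen, DEMO_CAP_MIN_TEXTURE_GATHER_OFFSET);
   limits.max_gather =
      screen->get_param(screen, DEMO_CAP_MAX_TEXTURE_GATHER_OFFSET);
   return limits;
}
```

With this split, a driver that has nothing special to report only adds two case labels, and the state tracker needs no per-driver knowledge.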
[Mesa-dev] [PATCH 1/2] gallium: add a way to query min/max texture gather offsets
Defaults to providing the same offsets as MIN/MAX_TEXEL_OFFSET. For
nvc0, the offset can be -32/31.

Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---
 src/gallium/docs/source/screen.rst               | 4 ++++
 src/gallium/drivers/freedreno/freedreno_screen.c | 2 ++
 src/gallium/drivers/i915/i915_screen.c           | 2 ++
 src/gallium/drivers/ilo/ilo_screen.c             | 2 ++
 src/gallium/drivers/llvmpipe/lp_screen.c         | 2 ++
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 2 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 2 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 4 ++++
 src/gallium/drivers/r300/r300_screen.c           | 2 ++
 src/gallium/drivers/r600/r600_pipe.c             | 2 ++
 src/gallium/drivers/radeonsi/si_pipe.c           | 2 ++
 src/gallium/drivers/svga/svga_screen.c           | 2 ++
 src/gallium/include/pipe/p_defines.h             | 2 ++
 13 files changed, 30 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index 943d880..5c255d0 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -193,6 +193,10 @@ The integer capabilities:
   for buffers.
 * ``PIPE_CAP_TEXTURE_QUERY_LOD``: Whether the ``LODQ`` instruction is
   supported.
+* ``PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET``: The minimum offset that can be used
+  in conjunction with a texture gather opcode.
+* ``PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET``: The maximum offset that can be used
+  in conjunction with a texture gather opcode.
 
 
 .. _pipe_capf:
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c b/src/gallium/drivers/freedreno/freedreno_screen.c
index 96c769e..08556a4 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -240,9 +240,11 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
    case PIPE_CAP_QUERY_TIMESTAMP:
       return 0;
 
+   case PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET:
    case PIPE_CAP_MIN_TEXEL_OFFSET:
       return -8;
 
+   case PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET:
    case PIPE_CAP_MAX_TEXEL_OFFSET:
       return 7;
 
diff --git a/src/gallium/drivers/i915/i915_screen.c b/src/gallium/drivers/i915/i915_screen.c
index 892c3ea..b484d36 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -253,6 +253,8 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap cap)
       return I915_MAX_TEXTURE_2D_LEVELS;
    case PIPE_CAP_MIN_TEXEL_OFFSET:
    case PIPE_CAP_MAX_TEXEL_OFFSET:
+   case PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET:
+   case PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET:
    case PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS:
    case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
    case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
diff --git a/src/gallium/drivers/ilo/ilo_screen.c b/src/gallium/drivers/ilo/ilo_screen.c
index 7f2e01f..4bea564 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -361,8 +361,10 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap param)
    case PIPE_CAP_SEAMLESS_CUBE_MAP:
    case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
       return true;
+   case PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET:
    case PIPE_CAP_MIN_TEXEL_OFFSET:
       return -8;
+   case PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET:
    case PIPE_CAP_MAX_TEXEL_OFFSET:
       return 7;
    case PIPE_CAP_CONDITIONAL_RENDER:
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 6eb7d64..8fbc58f 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -176,8 +176,10 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
    case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
       return 1;
    /* this is a lie could support arbitrary large offsets */
+   case PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET:
    case PIPE_CAP_MIN_TEXEL_OFFSET:
       return -8;
+   case PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET:
    case PIPE_CAP_MAX_TEXEL_OFFSET:
       return 7;
    case PIPE_CAP_CONDITIONAL_RENDER:
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index c34b1da..57a2f7d 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -106,6 +106,8 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
    case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
    case PIPE_CAP_MIN_TEXEL_OFFSET:
    case PIPE_CAP_MAX_TEXEL_OFFSET:
+   case PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET:
+   case PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET:
    case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
    case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
    case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES:
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c