Re: [Mesa3d-dev] Gallium double opcodes
Hi,

On Thu, Sep 16, 2010 at 2:37 AM, Jose Fonseca jfons...@vmware.com wrote:

Hi Igor, The overall intent is good, but this creates 4*64bit = 256 bit registers which don't exist. LLVM can split into 128bit instructions, but I found that to be buggy in some cases, and it affects our ability to use sse intrinsics. It is also unnecessary for vertical operations such as multiplication. So I'd prefer that you create a 2*double vector, and issue two multiplications per channel, and do the same for other double opcodes. Is there any double opcode for which this would not work?

No. As we are doing the operation per channel, I believe we can do just one multiplication, and lp_build_vec_type is enough to create the vector type (patch below).

A few minor details: instead of lp_types_to_double/lp_double_to_types, I'd prefer cast_to_double, cast_from_double; and the double type should be computed once and stored in the tgsi build context.

OK. About computing the double type once and storing it in the tgsi build context: I do not think it is a good idea, because TGSI only supports float operations and we are doing a hack to get double operations. The channels are still floats when using double operations; we are just doing something like:

muld result.xy, a.yz, b.yz

so the hsb and msb are moved to x and y respectively.
But as I said before, this is my first experience with llvmpipe :)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index ca8db9c..c9174ce 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -970,6 +970,56 @@ emit_kil(
    lp_build_mask_update(bld->mask, mask);
 }
 
+static LLVMValueRef
+lp_cast_to_double(struct lp_build_context *bld,
+                  LLVMValueRef a,
+                  LLVMValueRef b)
+{
+   struct lp_type type;
+   LLVMValueRef res;
+   LLVMTypeRef vec_type;
+   LLVMTypeRef vec_double_type;
+
+   assert(lp_check_value(bld->type, a));
+   assert(lp_check_value(bld->type, b));
+
+   type = lp_type_uint_vec(64);
+   vec_type = lp_build_vec_type(type);
+
+   a = LLVMBuildBitCast(bld->builder, a, vec_type, "");
+   b = LLVMBuildBitCast(bld->builder, b, vec_type, "");
+
+   res = LLVMBuildShl(bld->builder, a, lp_build_const_int_vec(type, 32),"");
+   res = LLVMBuildOr(bld->builder, res, b, "");
+
+   a = LLVMBuildBitCast(bld->builder, a, bld->vec_type, "");
+   b = LLVMBuildBitCast(bld->builder, b, bld->vec_type, "");
+
+   type = lp_type_float_vec(64);
+   vec_double_type = lp_build_vec_type(type);
+   res = LLVMBuildBitCast(bld->builder, res, vec_double_type, "");
+
+   return res;
+}
+
+static void
+lp_cast_from_double(struct lp_build_context *bld,
+                    LLVMValueRef double_value,
+                    LLVMValueRef a,
+                    LLVMValueRef b)
+{
+   LLVMTypeRef double_type;
+   struct lp_type type = lp_type_uint_vec(64);
+
+   double_type = lp_build_vec_type(type);
+   a = LLVMBuildBitCast(bld->builder, double_value, double_type, "");
+
+   b = LLVMBuildAnd(bld->builder, a, lp_build_const_int_vec(type, 0x), "");
+
+   a = LLVMBuildBitCast(bld->builder, a, bld->vec_type, "");
+   b = LLVMBuildBitCast(bld->builder, b, bld->vec_type, "");
+}
+
 /**
  * Predicated fragment kill.
@@ -1988,6 +2038,34 @@ emit_instruction(
    case TGSI_OPCODE_NOP:
       break;
 
+   case TGSI_OPCODE_DMUL:
+      if (IS_DST0_CHANNEL_ENABLED(inst, CHAN_X) && IS_DST0_CHANNEL_ENABLED(inst, CHAN_Y)) {
+         tmp0 = emit_fetch( bld, inst, 0, CHAN_X );
+         tmp1 = emit_fetch( bld, inst, 0, CHAN_Y );
+
+         tmp2 = emit_fetch( bld, inst, 1, CHAN_X );
+         tmp3 = emit_fetch( bld, inst, 1, CHAN_Y );
+
+         src0 = lp_cast_to_double(&bld->base, tmp0, tmp1);
+         src1 = lp_cast_to_double(&bld->base, tmp2, tmp3);
+         tmp4 = lp_build_mul(&bld->base, src0, src1);
+         lp_cast_from_double(&bld->base, tmp4, dst0[CHAN_X], dst0[CHAN_Y]);
+      }
+
+      if (IS_DST0_CHANNEL_ENABLED(inst, CHAN_Z) && IS_DST0_CHANNEL_ENABLED(inst, CHAN_W)) {
+         tmp0 = emit_fetch( bld, inst, 0, CHAN_Z );
+         tmp1 = emit_fetch( bld, inst, 0, CHAN_W );
+
+         tmp2 = emit_fetch( bld, inst, 1, CHAN_Z );
+         tmp3 = emit_fetch( bld, inst, 1, CHAN_W );
+
+         src0 = lp_cast_to_double(&bld->base, tmp0, tmp1);
+         src1 = lp_cast_to_double(&bld->base, tmp2, tmp3);
+         tmp4 = lp_build_mul(&bld->base, src0, src1);
+         lp_cast_from_double(&bld->base, tmp4, dst0[CHAN_Z], dst0[CHAN_W]);
+      }
+      break;
+
    default:
       return FALSE;
    }
-- 
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
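For readers following the packing logic, here is a minimal scalar sketch in plain C of what the two cast helpers appear to be aiming for: two 32-bit channels are combined into one 64-bit pattern (first channel in the high word, as the shift by 32 in the patch suggests), multiplied as doubles, and split back into two words. The names pack_double, unpack_double and dmul_pair are made up for illustration and are not part of gallivm. The 0xffffffff mask is an assumption, since the constant in the patch is cut off. This is ordinary C on scalars, not LLVM IR, and it assumes IEEE 754 doubles.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a double from a high and a low 32-bit word.
 * Only the bit pattern is combined; no numeric conversion takes place. */
static double pack_double(uint32_t hi, uint32_t lo)
{
   uint64_t bits = ((uint64_t)hi << 32) | (uint64_t)lo;
   double d;
   memcpy(&d, &bits, sizeof d);   /* well-defined type punning in C */
   return d;
}

/* Hypothetical helper: split a double back into its two 32-bit words. */
static void unpack_double(double d, uint32_t *hi, uint32_t *lo)
{
   uint64_t bits;
   memcpy(&bits, &d, sizeof bits);
   *hi = (uint32_t)(bits >> 32);           /* the high word needs a shift */
   *lo = (uint32_t)(bits & 0xffffffffu);   /* assumed mask, see lead-in */
}

/* One DMUL on a channel pair: a[0]/b[0] hold the high words,
 * a[1]/b[1] the low words; the result is written the same way. */
static void dmul_pair(const uint32_t a[2], const uint32_t b[2], uint32_t out[2])
{
   double r = pack_double(a[0], a[1]) * pack_double(b[0], b[1]);
   unpack_double(r, &out[0], &out[1]);
}
```

In this sketch the inputs are widened to 64 bits before shifting, and the high word of the result is recovered with an explicit shift; both steps may be worth checking against the IR the patch generates.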
[Mesa3d-dev] Gallium double opcodes
Hi,

I am reviving the gallium double opcode branch and doing some work in the llvm driver. Previously the tests were done using python/st, but it looks like that is a little outdated. So I am just sending the code below for review and suggestions (I am new to the LLVM API); basically all the other opcodes would be done using the same logic.

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index ca8db9c..29892d9 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -970,6 +970,60 @@ emit_kil(
    lp_build_mask_update(bld->mask, mask);
 }
 
+static LLVMValueRef
+lp_types_to_double(struct lp_build_context *bld,
+                   LLVMValueRef a,
+                   LLVMValueRef b)
+{
+   LLVMValueRef res;
+   struct lp_type type;
+   LLVMTypeRef vec_type;
+   LLVMTypeRef vec_double_type;
+
+   assert(lp_check_value(bld->type, a));
+   assert(lp_check_value(bld->type, b));
+
+   type = lp_type_uint(64);
+   type.length = bld->type.length;
+
+   vec_type = lp_build_vec_type(type);
+   a = LLVMBuildBitCast(bld->builder, a, vec_type, "");
+   b = LLVMBuildBitCast(bld->builder, b, vec_type, "");
+
+   res = LLVMBuildShl(bld->builder, a, lp_build_const_int_vec(type, 32),"");
+   res = LLVMBuildOr(bld->builder, res, b, "");
+
+   a = LLVMBuildBitCast(bld->builder, a, bld->vec_type, "");
+   b = LLVMBuildBitCast(bld->builder, b, bld->vec_type, "");
+
+   type = lp_type_float(64);
+   type.length = bld->type.length;
+   vec_double_type = lp_build_vec_type(type);
+   res = LLVMBuildBitCast(bld->builder, res, vec_double_type, "");
+
+   return res;
+}
+
+static void
+lp_double_to_types(struct lp_build_context *bld,
+                   LLVMValueRef double_value,
+                   LLVMValueRef a,
+                   LLVMValueRef b)
+{
+   LLVMTypeRef double_type;
+   struct lp_type type = lp_type_uint(64);
+   type.length = bld->type.length;
+
+   double_type = lp_build_vec_type(type);
+
+   a = LLVMBuildBitCast(bld->builder, double_value, double_type, "");
+
+   b = LLVMBuildAnd(bld->builder, a, lp_build_const_int_vec(type, 0x), "");
+
+   a = LLVMBuildBitCast(bld->builder, a, bld->vec_type, "");
+   b = LLVMBuildBitCast(bld->builder, b, bld->vec_type, "");
+}
+
 /**
  * Predicated fragment kill.
@@ -1988,6 +2042,34 @@ emit_instruction(
    case TGSI_OPCODE_NOP:
       break;
 
+   case TGSI_OPCODE_DMUL:
+      if (IS_DST0_CHANNEL_ENABLED(inst, CHAN_X) && IS_DST0_CHANNEL_ENABLED(inst, CHAN_Y)) {
+         tmp0 = emit_fetch( bld, inst, 0, CHAN_X );
+         tmp1 = emit_fetch( bld, inst, 0, CHAN_Y );
+
+         tmp2 = emit_fetch( bld, inst, 1, CHAN_X );
+         tmp3 = emit_fetch( bld, inst, 1, CHAN_Y );
+
+         src0 = lp_types_to_double(&bld->base, tmp0, tmp1);
+         src1 = lp_types_to_double(&bld->base, tmp2, tmp3);
+         tmp4 = lp_build_mul(&bld->base, src0, src1);
+         lp_double_to_types(&bld->base, tmp4, dst0[CHAN_X], dst0[CHAN_Y]);
+      }
+
+      if (IS_DST0_CHANNEL_ENABLED(inst, CHAN_Z) && IS_DST0_CHANNEL_ENABLED(inst, CHAN_W)) {
+         tmp0 = emit_fetch( bld, inst, 0, CHAN_Z );
+         tmp1 = emit_fetch( bld, inst, 0, CHAN_W );
+
+         tmp2 = emit_fetch( bld, inst, 1, CHAN_Z );
+         tmp3 = emit_fetch( bld, inst, 1, CHAN_W );
+
+         src0 = lp_types_to_double(&bld->base, tmp0, tmp1);
+         src1 = lp_types_to_double(&bld->base, tmp2, tmp3);
+         tmp4 = lp_build_mul(&bld->base, src0, src1);
+         lp_double_to_types(&bld->base, tmp4, dst0[CHAN_Z], dst0[CHAN_W]);
+      }
+      break;
+
    default:
       return FALSE;
    }
-- 
Re: [Mesa3d-dev] [RFC] Vega state tracker advanced blending
Hi,

On Fri, Feb 12, 2010 at 1:38 PM, Keith Whitwell kei...@vmware.com wrote:

On Fri, 2010-02-12 at 09:34 -0800, Igor Oliveira wrote:

Hello, I am starting a new branch called Vega-advance-blending. The objective of this branch is to implement the Khronos advanced blending specification. If anyone has a different opinion let me know :). Below you can see what I have already done in my local branch (a big amount of code):

This looks great -- only comment is that with larger shaders it becomes difficult to see what formulae you're implementing. A comment containing a textual expression of each advanced blend mode prior to all the ureg calls would help a lot.

Doing that.

Igor

Keith

diff --git a/src/gallium/state_trackers/vega/api_params.c b/src/gallium/state_trackers/vega/api_params.c
index db77fd9..c5ac8f8 100644
--- a/src/gallium/state_trackers/vega/api_params.c
+++ b/src/gallium/state_trackers/vega/api_params.c
@@ -25,6 +25,7 @@
  **/
 #include "VG/openvg.h"
+#include "VG/vgext.h"
 #include "vg_context.h"
 #include "paint.h"
@@ -160,7 +161,7 @@ void vgSeti (VGParamType type, VGint value)
       break;
    case VG_BLEND_MODE:
       if (value < VG_BLEND_SRC ||
-          value > VG_BLEND_ADDITIVE)
+          value > VG_BLEND_XOR_KHR)
          error = VG_ILLEGAL_ARGUMENT_ERROR;
       else {
          ctx->state.dirty |= BLEND_DIRTY;
diff --git a/src/gallium/state_trackers/vega/asm_fill.h b/src/gallium/state_trackers/vega/asm_fill.h
index 9a06982..916af75 100644
--- a/src/gallium/state_trackers/vega/asm_fill.h
+++ b/src/gallium/state_trackers/vega/asm_fill.h
@@ -28,6 +28,7 @@
 #define ASM_FILL_H
 #include "tgsi/tgsi_ureg.h"
+#include <stdio.h>
 typedef void (* ureg_func)( struct ureg_program *ureg,
                             struct ureg_dst *out,
@@ -335,6 +336,647 @@ blend_lighten( struct ureg_program *ureg,
 }
 static INLINE void
+blend_overlay_khr( struct ureg_program *ureg,
+                   struct ureg_dst *out,
+                   struct ureg_src *in,
+                   struct ureg_src *sampler,
+                   struct ureg_dst *temp,
+                   struct ureg_src *constant)
+{
+   unsigned label;
+   ureg_TEX(ureg, temp[1], TGSI_TEXTURE_2D, in[0], sampler[2]);
+
+   ureg_ADD(ureg,
+            ureg_writemask(temp[2], TGSI_WRITEMASK_XYZ),
+            ureg_src(temp[1]), ureg_src(temp[1]));
+
+   ureg_SLT(ureg, temp[2], ureg_src(temp[2]),
+            ureg_scalar(ureg_src(temp[1]), TGSI_SWIZZLE_W));
+
+   ureg_MOV(ureg, ureg_writemask(temp[2], TGSI_WRITEMASK_W),
+            ureg_scalar(constant[1], TGSI_SWIZZLE_Y));
+
+
+   EXTENDED_BLENDER_OVER_FUNC
+
+   label = ureg_get_instruction_number(ureg);
+   label += 2;
+
+   ureg_IF(ureg, ureg_src(temp[2]), &label);
+   ureg_MUL(ureg,
+            ureg_writemask(temp[2], TGSI_WRITEMASK_XYZ),
+            ureg_src(temp[0]), ureg_src(temp[1]));
+   ureg_ADD(ureg, temp[2], ureg_src(temp[2]), ureg_src(temp[2]));
+   ureg_ADD(ureg, temp[1], ureg_src(temp[2]), ureg_src(temp[3]));
+
+   label = ureg_get_instruction_number(ureg);
+   label += 2;
+
+   ureg_ELSE(ureg, &label);
+   ureg_SUB(ureg, temp[2],
+            ureg_scalar(ureg_src(temp[1]), TGSI_SWIZZLE_W),
+            ureg_src(temp[1]));
+   ureg_SUB(ureg, temp[4], ureg_scalar(ureg_src(temp[0]), TGSI_SWIZZLE_W),
+            ureg_src(temp[0]));
+   ureg_MUL(ureg, temp[2], ureg_src(temp[2]), ureg_src(temp[4]));
+   ureg_ADD(ureg, temp[2], ureg_src(temp[2]), ureg_src(temp[2]));
+   ureg_MUL(ureg, temp[4], ureg_scalar(ureg_src(temp[0]), TGSI_SWIZZLE_W),
+            ureg_scalar(ureg_src(temp[1]), TGSI_SWIZZLE_W));
+   ureg_SUB(ureg, ureg_writemask(temp[2], TGSI_WRITEMASK_XYZ),
+            ureg_src(temp[4]), ureg_src(temp[2]));
+   ureg_ADD(ureg, temp[1], ureg_src(temp[2]), ureg_src(temp[3]));
+   ureg_ENDIF(ureg);
+
+   ureg_MUL(ureg, temp[2], ureg_scalar(ureg_src(temp[0]), TGSI_SWIZZLE_W),
+            ureg_scalar(ureg_src(temp[1]), TGSI_SWIZZLE_W));
+   ureg_ADD(ureg, temp[3], ureg_scalar(ureg_src(temp[0]), TGSI_SWIZZLE_W),
+            ureg_scalar(ureg_src(temp[1]), TGSI_SWIZZLE_W));
+   ureg_SUB(ureg, ureg_writemask(temp[1], TGSI_WRITEMASK_W),
+            ureg_src(temp[3]), ureg_src(temp[2]));
+
+   ureg_MOV(ureg, *out, ureg_src(temp[1]));
+}
+
+static INLINE void
+blend_hardlight_khr( struct ureg_program *ureg,
+                     struct ureg_dst *out,
+                     struct ureg_src *in,
+                     struct ureg_src *sampler,
+                     struct ureg_dst *temp,
+                     struct ureg_src *constant)
+{
+   unsigned label;
+
+   ureg_TEX(ureg, temp[1], TGSI_TEXTURE_2D, in[0], sampler[2]);
+
+   ureg_ADD(ureg, temp[2], ureg_src(temp[0]), ureg_src(temp[0]));
+
+   ureg_SLT(ureg, temp[2], ureg_src(temp[2]),
+            ureg_scalar
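Keith's request for a textual expression of each blend mode could be met with a small scalar reference like the one below. It is only a sketch reconstructed from the ureg sequence in blend_overlay_khr above, assuming premultiplied colors, with temp[0] read as the source color and temp[1] as the destination sample. It is not copied from the Khronos text, and the names rgba, overlay_channel and blend_overlay are hypothetical.

```c
/* Hypothetical premultiplied RGBA color, used only for this sketch. */
typedef struct { float r, g, b, a; } rgba;

/* Overlay for one premultiplied color channel.
 * cs/as: source color and alpha, cd/ad: destination color and alpha.
 *
 *   over = cs*(1 - ad) + cd*(1 - as)
 *   if 2*cd < ad:  c = 2*cs*cd + over
 *   else:          c = as*ad - 2*(ad - cd)*(as - cs) + over
 */
static float overlay_channel(float cs, float as, float cd, float ad)
{
   float over = cs * (1.0f - ad) + cd * (1.0f - as);

   if (2.0f * cd < ad)
      return 2.0f * cs * cd + over;

   return as * ad - 2.0f * (ad - cd) * (as - cs) + over;
}

/* Full overlay blend; the result alpha is as + ad - as*ad. */
static rgba blend_overlay(rgba src, rgba dst)
{
   rgba out;
   out.r = overlay_channel(src.r, src.a, dst.r, dst.a);
   out.g = overlay_channel(src.g, src.a, dst.g, dst.a);
   out.b = overlay_channel(src.b, src.a, dst.b, dst.a);
   out.a = src.a + dst.a - src.a * dst.a;
   return out;
}
```

A comment in this form placed above each blend function would let a reviewer compare the ureg calls against the formula one term at a time.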
[Mesa3d-dev] [PATCH] vega st: fix missing texture in mask when setup samplers
This patch fixes segfaults in mask.cpp and mask4.cpp by binding a missing texture in mask_bind_samplers.

Igor

From a8ceac97c316abc56b31b6cadd826eae8f3bcbe8 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Tue, 2 Feb 2010 19:19:52 -0400
Subject: [PATCH] vega: fix missing texture in mask when setup samplers

---
 src/gallium/state_trackers/vega/mask.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/gallium/state_trackers/vega/mask.c b/src/gallium/state_trackers/vega/mask.c
index ba8ecef..0fb5639 100644
--- a/src/gallium/state_trackers/vega/mask.c
+++ b/src/gallium/state_trackers/vega/mask.c
@@ -681,6 +681,7 @@ VGint mask_bind_samplers(struct pipe_sampler_state **samplers,
       samplers[1] = ctx->mask.sampler;
       textures[1] = fb_buffers->alpha_mask;
+      textures[0] = fb_buffers->blend_texture;
       return 1;
    } else
       return 0;
-- 
1.6.3.3
Re: [Mesa3d-dev] [PATCH] vega st: fix missing texture in mask when setup samplers
A new version and a little improvement. It makes more sense to add the texture in paint_bind_samplers than in mask_bind_samplers.

Igor

On Wed, Feb 3, 2010 at 6:21 AM, Igor Oliveira igor.olive...@openbossa.org wrote:

This patch fixes segfaults in mask.cpp and mask4.cpp by binding a missing texture in mask bind samplers.

Igor

From 803e77b0b4d03cf8b17b0a3b2f6f080db7f519c1 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Wed, 3 Feb 2010 10:14:48 -0400
Subject: [PATCH] vega: add dummy texture when setuping paint textures. Making it we do not have problems when using mask operations

---
 src/gallium/state_trackers/vega/paint.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/gallium/state_trackers/vega/paint.c b/src/gallium/state_trackers/vega/paint.c
index d8f6299..46a6930 100644
--- a/src/gallium/state_trackers/vega/paint.c
+++ b/src/gallium/state_trackers/vega/paint.c
@@ -640,7 +640,7 @@ VGint paint_bind_samplers(struct vg_paint *paint, struct pipe_sampler_state **sa
       break;
    default:
       samplers[0] = paint->pattern.sampler; /* dummy */
-      textures[0] = 0;
+      textures[0] = ctx->draw_buffer->alpha_mask; /* dummy */
       return 0;
       break;
    }
-- 
1.6.3.3
Re: [Mesa3d-dev] [PATCH] vega st: fix missing texture in mask when setup samplers
This patch fixes problems in filter and lookup too.

Igor

On Wed, Feb 3, 2010 at 10:19 AM, Igor Oliveira igor.olive...@openbossa.org wrote:

A new version and a little improvement. It makes more sense to add the texture in paint bind samplers than in mask bind samplers.

Igor

On Wed, Feb 3, 2010 at 6:21 AM, Igor Oliveira igor.olive...@openbossa.org wrote:

This patch fixes segfaults in mask.cpp and mask4.cpp by binding a missing texture in mask bind samplers.

Igor
Re: [Mesa3d-dev] [PATCH] vega st: fix missing texture in mask when setup samplers
Hi,

this is the filter backtrace:

Program received signal SIGSEGV, Segmentation fault.
get_sampler_varient (unit=<value optimized out>, sampler=0x8085c80, texture=0x0, processor=0) at sp_state_sampler.c:207
207        key.bits.target = sp_texture->base.target;
(gdb) bt
#0  get_sampler_varient (unit=<value optimized out>, sampler=0x8085c80, texture=0x0, processor=0) at sp_state_sampler.c:207
#1  0x00c12c7c in softpipe_reset_sampler_varients (softpipe=0x805d498) at sp_state_sampler.c:263
#2  0x00c12312 in update_tgsi_samplers (softpipe=0x805d498) at sp_state_derived.c:200
#3  softpipe_update_derived (softpipe=0x805d498) at sp_state_derived.c:249
#4  0x00c0978d in softpipe_draw_range_elements_instanced (pipe=0x805d498, indexBuffer=<value optimized out>, indexSize=0, minIndex=0, maxIndex=4294967295, mode=6, start=0, count=4, startInstance=0, instanceCount=1) at sp_draw_arrays.c:260
#5  0x00c0996c in softpipe_draw_arrays (pipe=0x805d498, mode=6, start=0, count=4) at sp_draw_arrays.c:144
#6  0x004cc40c in util_draw_vertex_buffer (pipe=0x805d498, vbuf=0x808d4c0, offset=0, prim_type=6, num_verts=4, num_attribs=2) at util/u_draw_quad.c:72
#7  0x004acb36 in renderer_texture_quad (r=0x80751b8, tex=0x8086eb8, x1offset=0, y1offset=0, x2offset=32, y2offset=32, x1=10, y1=10, x2=42, y2=10, x3=42, y3=42, x4=10, y4=42) at renderer.c:590
#8  0x004ab036 in image_draw (img=0x8086e60) at image.c:552
#9  0x0048c6e7 in vgDrawImage (image=134770272) at api_images.c:307
#10 0x08049301 in draw () at filter.c:97
#11 0x080498ba in event_loop (argc=1, argv=0xb3c4, init_f=0x8048fd0 <init>, resh_f=0x8048f50 <reshape>, draw_f=0x8049270 <draw>, key_f=0) at eglcommon.c:172
#12 run (argc=1, argv=0xb3c4, init_f=0x8048fd0 <init>, resh_f=0x8048f50 <reshape>, draw_f=0x8049270 <draw>, key_f=0) at eglcommon.c:276
#13 0x08048fc1 in main (argc=1, argv=0xb3c4) at filter.c:105

and this is the output of text_to_tgsi:

VERT
DCL IN[0]
DCL OUT[0], POSITION
DCL TEMP[0]
DCL CONST[0..1]
  0: MUL TEMP[0], IN[0], CONST[0]
  1: ADD TEMP[0], TEMP[0], CONST[1]
  2: MOV OUT[0], TEMP[0]
  3: END
FRAG
DCL IN[0], GENERIC[0], PERSPECTIVE
DCL OUT[0], COLOR, CONSTANT
DCL TEMP[0..4], CONSTANT
DCL ADDR[0], CONSTANT
DCL CONST[0..20], CONSTANT
DCL SAMP[0], CONSTANT
  0: MOV TEMP[0], CONST[0].
  1: MOV TEMP[1], CONST[0].
  2: BGNLOOP :14
  3: SGE TEMP[0].z, TEMP[0]., CONST[1].
  4: IF TEMP[0]. :7
  5: BRK
  6: ENDIF
  7: ARL ADDR[0].x, TEMP[0].
  8: MOV TEMP[3], CONST[ADDR[0]+2]
  9: ADD TEMP[4].xy, IN[0], TEMP[3]
 10: TEX TEMP[2], TEMP[4], SAMP[0], 2D
 11: MOV TEMP[3], CONST[ADDR[0]+11]
 12: MAD TEMP[1], TEMP[2], TEMP[3], TEMP[1]
 13: ADD TEMP[0].y, TEMP[0]., CONST[0].
 14: ENDLOOP :2
 15: MAD OUT[0], TEMP[1], CONST[1]., CONST[1].
 16: END
VERT
DCL IN[0]
DCL IN[1]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC
DCL TEMP[0]
DCL CONST[0..1]
  0: MUL TEMP[0], IN[0], CONST[0]
  1: ADD TEMP[0], TEMP[0], CONST[1]
  2: MOV OUT[0], TEMP[0]
  3: MOV OUT[1], IN[1]
  4: END
Segmentation fault

On Wed, Feb 3, 2010 at 12:03 PM, Zack Rusin za...@vmware.com wrote:

On Wednesday 03 February 2010 09:19:43 Igor Oliveira wrote:

A new version and a little improvement. Makes more sense adding the texture in paint bind samplers than mask bind samplers.

Igor

On Wed, Feb 3, 2010 at 6:21 AM, Igor Oliveira igor.olive...@openbossa.org wrote:

This patch fix segfaults in mask.cpp and mask4.cpp binding a missing texture in mask bind samplers.

What's the stack trace again? I don't have a working vg setup right now. I'm a bit confused why it's crashing when we never sample from those units.

The issue is that while right now alpha_mask texture is unconditionally there that's actually a bug - there's no guarantee that the alpha mask will be always present (obviously for egl configs that didn't ask for it, it shouldn't be there). Meaning that for stuff like filter and lookup it's perfectly ok for it to be null.

I think ideally what we'd do is fix all those static sampler texture assignment. E.g. right now

0 - paint sampler/texture for gradient/pattern
1 - mask sampler/texture
2 - blend sampler/texture
3 - image sampler/texture

meaning that if there's no paint, mask and blend and we only draw image then we have

0 - dummy
1 - dummy
2 - dummy
3 - image

We had to do it this way when we had the hand written text assembly to have some semblance of sanity but now that we use ureg we could fix it properly e.g. in asm_fill.c in mask(...) instead of doing sampler[1], we'd simply do *sampler and change the combine_shaders to pass the correct sampler to the function.

Yep, it is better to do that. I was fixing it because I am porting the filters to ureg, and since we do not have the Khronos conformance tests we need another way to know when the ureg code introduces regressions.

If you need to fix the crash just to move forward on something I can commit it but please just add a /* FIXME: the texture might not be there */ and debug_assert(ctx->draw_buffer->alpha_mask
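One way to picture the proposal above (handing each shader stage the sampler unit it actually gets, instead of fixed slots padded with dummies) is the following sketch. Everything in it is hypothetical: the stage enum, assign_sampler_units and the -1 convention are for illustration only and are not the vega state tracker's code.

```c
/* Hypothetical shader stages that may each need one sampler/texture. */
enum stage {
   STAGE_PAINT,
   STAGE_MASK,
   STAGE_BLEND,
   STAGE_IMAGE,
   STAGE_COUNT
};

/* Give every active stage the next free sampler unit, in stage order.
 * Inactive stages get -1, so no dummy texture has to be bound for them.
 * Returns the number of sampler units actually in use. */
static int assign_sampler_units(const int active[STAGE_COUNT],
                                int unit[STAGE_COUNT])
{
   int next = 0;
   int i;

   for (i = 0; i < STAGE_COUNT; i++)
      unit[i] = active[i] ? next++ : -1;

   return next;
}
```

With only the image stage active, this scheme would bind the image texture at unit 0 and leave nothing else bound, so a driver would never see a NULL texture behind a live sampler.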
[Mesa3d-dev] [PATCH] egl: fix wrong argument in egldriver function
Hi,

this patch fixes a typo in _eglPreloadForEach: the loader callback should receive loader_data (the driver name) as its third argument instead of loader (the loader function itself).

Igor

From 318a82ed3016ef788a41f15b461e88034b5fe64e Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Wed, 3 Feb 2010 17:51:30 -0400
Subject: [PATCH] egl: fix wrong argument. Use loader_data instead of loader

---
 src/egl/main/egldriver.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/egl/main/egldriver.c b/src/egl/main/egldriver.c
index a8a8e30..a87c697 100644
--- a/src/egl/main/egldriver.c
+++ b/src/egl/main/egldriver.c
@@ -392,7 +392,7 @@ _eglPreloadForEach(const char *search_path,
       next = strchr(cur, ':');
       len = (next) ? next - cur : strlen(cur);
-      if (!loader(cur, len, loader))
+      if (!loader(cur, len, loader_data))
          break;
       cur = (next) ? next + 1 : NULL;
-- 
1.6.3.3
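The callback pattern being fixed here can be sketched in isolation. The code below is an illustration under assumptions, not the actual egldriver.c: for_each_path, path_cb and count_entries are hypothetical names. It walks a colon-separated search path and hands every entry to a callback together with the caller's data pointer, which is the argument the original code passed incorrectly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: return 0 to stop the iteration. */
typedef int (*path_cb)(const char *dir, size_t len, void *data);

/* Call cb once per entry of a colon-separated search path.
 * dir is not NUL-terminated at len; it points into search_path. */
static void for_each_path(const char *search_path, path_cb cb, void *cb_data)
{
   const char *cur = search_path;

   while (cur) {
      const char *next = strchr(cur, ':');
      size_t len = next ? (size_t)(next - cur) : strlen(cur);

      /* The third argument is the caller's data, never the callback. */
      if (!cb(cur, len, cb_data))
         break;

      cur = next ? next + 1 : NULL;
   }
}

/* Example callback: count the entries through the data pointer. */
static int count_entries(const char *dir, size_t len, void *data)
{
   (void)dir;
   (void)len;
   ++*(int *)data;
   return 1;   /* keep going */
}
```

Called as for_each_path("/usr/lib:/opt/lib", count_entries, &n) with n initialised to 0, the callback sees each directory in turn and can reach n only because the data pointer is forwarded unchanged.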
[Mesa3d-dev] [PATCH] egl: fix implicit declaration of pipe_texture_reference adding u_inlines.h
Hi,

This patch adds a missing include in the egl state trackers. It is a leftover from the gallium-embedded merge.

Igor

From dcc2584c162a0963fa2044a5440b9d5514ea4c97 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Wed, 3 Feb 2010 18:37:36 -0400
Subject: [PATCH] egl: fix implicit declaration of pipe_texture_reference adding u_inlines.h

---
 src/gallium/state_trackers/egl/common/egl_g3d.c    |    1 +
 src/gallium/state_trackers/egl/x11/native_dri2.c   |    1 +
 src/gallium/state_trackers/egl/x11/native_ximage.c |    1 +
 3 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/src/gallium/state_trackers/egl/common/egl_g3d.c b/src/gallium/state_trackers/egl/common/egl_g3d.c
index 30e2c34..2393199 100644
--- a/src/gallium/state_trackers/egl/common/egl_g3d.c
+++ b/src/gallium/state_trackers/egl/common/egl_g3d.c
@@ -27,6 +27,7 @@
 #include "pipe/p_screen.h"
 #include "util/u_memory.h"
 #include "util/u_rect.h"
+#include "util/u_inlines.h"
 #include "egldriver.h"
 #include "eglcurrent.h"
 #include "eglconfigutil.h"
diff --git a/src/gallium/state_trackers/egl/x11/native_dri2.c b/src/gallium/state_trackers/egl/x11/native_dri2.c
index 07f82d8..b2eba72 100644
--- a/src/gallium/state_trackers/egl/x11/native_dri2.c
+++ b/src/gallium/state_trackers/egl/x11/native_dri2.c
@@ -25,6 +25,7 @@
 #include "util/u_memory.h"
 #include "util/u_math.h"
 #include "util/u_format.h"
+#include "util/u_inlines.h"
 #include "pipe/p_compiler.h"
 #include "pipe/p_screen.h"
 #include "pipe/p_context.h"
diff --git a/src/gallium/state_trackers/egl/x11/native_ximage.c b/src/gallium/state_trackers/egl/x11/native_ximage.c
index 697fd7c..7946415 100644
--- a/src/gallium/state_trackers/egl/x11/native_ximage.c
+++ b/src/gallium/state_trackers/egl/x11/native_ximage.c
@@ -34,6 +34,7 @@
 #include "util/u_format.h"
 #include "pipe/p_compiler.h"
 #include "util/u_simple_screen.h"
+#include "util/u_inlines.h"
 #include "softpipe/sp_winsys.h"
 #include "egllog.h"
-- 
1.6.3.3
[Mesa3d-dev] [PATCH] switch shaders assembly TGSI code by tgsi_ureg
Hello, Theses patchs switch all shaders code implemeted using TGSI assembly by tgsi_ureg. Igor From 6ff667cff0b3d6460a2bb0d6845348b6a2c6e6e2 Mon Sep 17 00:00:00 2001 From: Igor Oliveira igor.olive...@openbossa.org Date: Mon, 1 Feb 2010 11:11:36 -0400 Subject: [PATCH 1/2] vega: change tgsi asm by tgsi_ureg functions --- src/gallium/state_trackers/vega/asm_fill.h | 549 +++- 1 files changed, 378 insertions(+), 171 deletions(-) diff --git a/src/gallium/state_trackers/vega/asm_fill.h b/src/gallium/state_trackers/vega/asm_fill.h index 2f394ad..549e54c 100644 --- a/src/gallium/state_trackers/vega/asm_fill.h +++ b/src/gallium/state_trackers/vega/asm_fill.h @@ -27,166 +27,373 @@ #ifndef ASM_FILL_H #define ASM_FILL_H -static const char solid_fill_asm[] = - MOV %s, CONST[0]\n; - - -static const char linear_grad_asm[] = - MOV TEMP[0].xy, IN[0]\n - MOV TEMP[0].z, CONST[1].\n - DP3 TEMP[1], CONST[2], TEMP[0]\n - DP3 TEMP[2], CONST[3], TEMP[0]\n - DP3 TEMP[3], CONST[4], TEMP[0]\n - RCP TEMP[3], TEMP[3]\n - MUL TEMP[1], TEMP[1], TEMP[3]\n - MUL TEMP[2], TEMP[2], TEMP[3]\n - MOV TEMP[4].x, TEMP[1]\n - MOV TEMP[4].y, TEMP[2]\n - MUL TEMP[0], CONST[0]., TEMP[4].\n - MAD TEMP[1], CONST[0]., TEMP[4]., TEMP[0]\n - MUL TEMP[2], TEMP[1], CONST[0].\n - TEX %s, TEMP[2], SAMP[0], 1D\n; - -static const char radial_grad_asm[] = - MOV TEMP[0].xy, IN[0]\n - MOV TEMP[0].z, CONST[1].\n - DP3 TEMP[1], CONST[2], TEMP[0]\n - DP3 TEMP[2], CONST[3], TEMP[0]\n - DP3 TEMP[3], CONST[4], TEMP[0]\n - RCP TEMP[3], TEMP[3]\n - MUL TEMP[1], TEMP[1], TEMP[3]\n - MUL TEMP[2], TEMP[2], TEMP[3]\n - MOV TEMP[5].x, TEMP[1]\n - MOV TEMP[5].y, TEMP[2]\n - MUL TEMP[0], CONST[0]., TEMP[5].\n - MAD TEMP[1], CONST[0]., TEMP[5]., TEMP[0]\n - ADD TEMP[1], TEMP[1], TEMP[1]\n - MUL TEMP[3], TEMP[5]., TEMP[5].\n - MAD TEMP[4], TEMP[5]., TEMP[5]., TEMP[3]\n - MOV TEMP[4], -TEMP[4]\n - MUL TEMP[2], CONST[0]., TEMP[4]\n - MUL TEMP[0], CONST[1]., TEMP[2]\n - MUL TEMP[3], TEMP[1], TEMP[1]\n - SUB TEMP[2], TEMP[3], TEMP[0]\n - RSQ 
TEMP[2], |TEMP[2]|\n - RCP TEMP[2], TEMP[2]\n - SUB TEMP[1], TEMP[2], TEMP[1]\n - ADD TEMP[0], CONST[0]., CONST[0].\n - RCP TEMP[0], TEMP[0]\n - MUL TEMP[2], TEMP[1], TEMP[0]\n - TEX %s, TEMP[2], SAMP[0], 1D\n; - -static const char pattern_asm[] = - MOV TEMP[0].xy, IN[0]\n - MOV TEMP[0].z, CONST[1].\n - DP3 TEMP[1], CONST[2], TEMP[0]\n - DP3 TEMP[2], CONST[3], TEMP[0]\n - DP3 TEMP[3], CONST[4], TEMP[0]\n - RCP TEMP[3], TEMP[3]\n - MUL TEMP[1], TEMP[1], TEMP[3]\n - MUL TEMP[2], TEMP[2], TEMP[3]\n - MOV TEMP[4].x, TEMP[1]\n - MOV TEMP[4].y, TEMP[2]\n - RCP TEMP[0], CONST[1].zwzw\n - MOV TEMP[1], TEMP[4]\n - MUL TEMP[1].x, TEMP[1], TEMP[0]\n - MUL TEMP[1].y, TEMP[1], TEMP[0]\n - TEX %s, TEMP[1], SAMP[0], 2D\n; - - -static const char mask_asm[] = - TEX TEMP[1], IN[0], SAMP[1], 2D\n - MUL TEMP[0].w, TEMP[0]., TEMP[1].\n - MOV %s, TEMP[0]\n; - - -static const char image_normal_asm[] = - TEX %s, IN[1], SAMP[3], 2D\n; - -static const char image_multiply_asm[] = - TEX TEMP[1], IN[1], SAMP[3], 2D\n - MUL %s, TEMP[0], TEMP[1]\n; - -static const char image_stencil_asm[] = - TEX TEMP[1], IN[1], SAMP[3], 2D\n - MUL %s, TEMP[0], TEMP[1]\n; - - -#define EXTENDED_BLEND_OVER \ - SUB TEMP[3], CONST[1]., TEMP[1].\n \ - SUB TEMP[4], CONST[1]., TEMP[0].\n \ - MUL TEMP[3], TEMP[0], TEMP[3]\n\ - MUL TEMP[4], TEMP[1], TEMP[4]\n\ - ADD TEMP[3], TEMP[3], TEMP[4]\n - -static const char blend_multiply_asm[] = - TEX TEMP[1], IN[0], SAMP[2], 2D\n - EXTENDED_BLEND_OVER - MUL TEMP[4], TEMP[0], TEMP[1]\n - ADD TEMP[1], TEMP[4], TEMP[3]\n/*result.rgb*/ - MUL TEMP[2], TEMP[0]., TEMP[1].\n - ADD TEMP[3], TEMP[0]., TEMP[1].\n - SUB TEMP[1].w, TEMP[3], TEMP[2]\n - MOV %s, TEMP[1]\n; -#if 1 -static const char blend_screen_asm[] = - TEX TEMP[1], IN[0], SAMP[2], 2D\n - ADD TEMP[3], TEMP[0], TEMP[1]\n - MUL TEMP[2], TEMP[0], TEMP[1]\n - SUB %s, TEMP[3], TEMP[2]\n; -#else -static const char blend_screen_asm[] = - TEX TEMP[1], IN[0], SAMP[2], 2D\n - MOV %s, TEMP[1]\n; -#endif - -static const char 
blend_darken_asm[] = - TEX TEMP[1], IN[0], SAMP[2], 2D\n - EXTENDED_BLEND_OVER - MUL TEMP[4], TEMP[0], TEMP[1].\n - MUL TEMP[5], TEMP[1], TEMP[0].\n - MIN TEMP[4], TEMP[4], TEMP[5]\n - ADD TEMP[1], TEMP[3], TEMP[4]\n - MUL TEMP[2], TEMP[0]., TEMP[1].\n - ADD TEMP[3], TEMP[0]., TEMP[1].\n - SUB TEMP[1].w, TEMP[3], TEMP[2]\n - MOV %s, TEMP[1]\n; - -static const char blend_lighten_asm[] = - TEX TEMP[1], IN[0], SAMP[2], 2D\n - EXTENDED_BLEND_OVER - MUL TEMP[4], TEMP[0], TEMP[1].\n - MUL TEMP[5], TEMP[1], TEMP[0].\n - MAX TEMP[4], TEMP[4], TEMP[5]\n - ADD TEMP[1], TEMP[3], TEMP[4
Re: [Mesa3d-dev] [PATCH] switch shaders assembly TGSI code by tgsi_ureg
Hi again, Third version: removing debug messages Igor On Mon, Feb 1, 2010 at 10:08 PM, Igor Oliveira igor.olive...@openbossa.org wrote: Hi, I am resenting the patches, i fixed some bugs in ureg functions and did some clean up in shaders cache code. i tested it with all progs/openvg files. Igor On Mon, Feb 1, 2010 at 1:02 PM, Zack Rusin za...@vmware.com wrote: On Monday 01 February 2010 10:19:53 Igor Oliveira wrote: Hello, Theses patchs switch all shaders code implemeted using TGSI assembly by tgsi_ureg. Hey Igor, very nice work! Since I don't have the conformance framework anymore, did you test your changes with the examples that we have? Before committing it'd be nice to know that we won't get too many obvious regressions. z From 81a644140b7a5185d4b200a0ac6cfa3384bed152 Mon Sep 17 00:00:00 2001 From: Igor Oliveira igor.olive...@openbossa.org Date: Mon, 1 Feb 2010 22:01:51 -0400 Subject: [PATCH 1/2] vega: change tgsi asm by tgsi_ureg --- src/gallium/state_trackers/vega/asm_fill.h | 551 +++- 1 files changed, 380 insertions(+), 171 deletions(-) diff --git a/src/gallium/state_trackers/vega/asm_fill.h b/src/gallium/state_trackers/vega/asm_fill.h index 2f394ad..2777346 100644 --- a/src/gallium/state_trackers/vega/asm_fill.h +++ b/src/gallium/state_trackers/vega/asm_fill.h @@ -27,166 +27,375 @@ #ifndef ASM_FILL_H #define ASM_FILL_H -static const char solid_fill_asm[] = - MOV %s, CONST[0]\n; - - -static const char linear_grad_asm[] = - MOV TEMP[0].xy, IN[0]\n - MOV TEMP[0].z, CONST[1].\n - DP3 TEMP[1], CONST[2], TEMP[0]\n - DP3 TEMP[2], CONST[3], TEMP[0]\n - DP3 TEMP[3], CONST[4], TEMP[0]\n - RCP TEMP[3], TEMP[3]\n - MUL TEMP[1], TEMP[1], TEMP[3]\n - MUL TEMP[2], TEMP[2], TEMP[3]\n - MOV TEMP[4].x, TEMP[1]\n - MOV TEMP[4].y, TEMP[2]\n - MUL TEMP[0], CONST[0]., TEMP[4].\n - MAD TEMP[1], CONST[0]., TEMP[4]., TEMP[0]\n - MUL TEMP[2], TEMP[1], CONST[0].\n - TEX %s, TEMP[2], SAMP[0], 1D\n; - -static const char radial_grad_asm[] = - MOV TEMP[0].xy, IN[0]\n - MOV TEMP[0].z, 
CONST[1].\n - DP3 TEMP[1], CONST[2], TEMP[0]\n - DP3 TEMP[2], CONST[3], TEMP[0]\n - DP3 TEMP[3], CONST[4], TEMP[0]\n - RCP TEMP[3], TEMP[3]\n - MUL TEMP[1], TEMP[1], TEMP[3]\n - MUL TEMP[2], TEMP[2], TEMP[3]\n - MOV TEMP[5].x, TEMP[1]\n - MOV TEMP[5].y, TEMP[2]\n - MUL TEMP[0], CONST[0]., TEMP[5].\n - MAD TEMP[1], CONST[0]., TEMP[5]., TEMP[0]\n - ADD TEMP[1], TEMP[1], TEMP[1]\n - MUL TEMP[3], TEMP[5]., TEMP[5].\n - MAD TEMP[4], TEMP[5]., TEMP[5]., TEMP[3]\n - MOV TEMP[4], -TEMP[4]\n - MUL TEMP[2], CONST[0]., TEMP[4]\n - MUL TEMP[0], CONST[1]., TEMP[2]\n - MUL TEMP[3], TEMP[1], TEMP[1]\n - SUB TEMP[2], TEMP[3], TEMP[0]\n - RSQ TEMP[2], |TEMP[2]|\n - RCP TEMP[2], TEMP[2]\n - SUB TEMP[1], TEMP[2], TEMP[1]\n - ADD TEMP[0], CONST[0]., CONST[0].\n - RCP TEMP[0], TEMP[0]\n - MUL TEMP[2], TEMP[1], TEMP[0]\n - TEX %s, TEMP[2], SAMP[0], 1D\n; - -static const char pattern_asm[] = - MOV TEMP[0].xy, IN[0]\n - MOV TEMP[0].z, CONST[1].\n - DP3 TEMP[1], CONST[2], TEMP[0]\n - DP3 TEMP[2], CONST[3], TEMP[0]\n - DP3 TEMP[3], CONST[4], TEMP[0]\n - RCP TEMP[3], TEMP[3]\n - MUL TEMP[1], TEMP[1], TEMP[3]\n - MUL TEMP[2], TEMP[2], TEMP[3]\n - MOV TEMP[4].x, TEMP[1]\n - MOV TEMP[4].y, TEMP[2]\n - RCP TEMP[0], CONST[1].zwzw\n - MOV TEMP[1], TEMP[4]\n - MUL TEMP[1].x, TEMP[1], TEMP[0]\n - MUL TEMP[1].y, TEMP[1], TEMP[0]\n - TEX %s, TEMP[1], SAMP[0], 2D\n; - - -static const char mask_asm[] = - TEX TEMP[1], IN[0], SAMP[1], 2D\n - MUL TEMP[0].w, TEMP[0]., TEMP[1].\n - MOV %s, TEMP[0]\n; - - -static const char image_normal_asm[] = - TEX %s, IN[1], SAMP[3], 2D\n; - -static const char image_multiply_asm[] = - TEX TEMP[1], IN[1], SAMP[3], 2D\n - MUL %s, TEMP[0], TEMP[1]\n; - -static const char image_stencil_asm[] = - TEX TEMP[1], IN[1], SAMP[3], 2D\n - MUL %s, TEMP[0], TEMP[1]\n; - - -#define EXTENDED_BLEND_OVER \ - SUB TEMP[3], CONST[1]., TEMP[1].\n \ - SUB TEMP[4], CONST[1]., TEMP[0].\n \ - MUL TEMP[3], TEMP[0], TEMP[3]\n\ - MUL TEMP[4], TEMP[1], TEMP[4]\n\ - ADD TEMP[3], TEMP[3], TEMP[4]\n - 
-static const char blend_multiply_asm[] = - TEX TEMP[1], IN[0], SAMP[2], 2D\n - EXTENDED_BLEND_OVER - MUL TEMP[4], TEMP[0], TEMP[1]\n - ADD TEMP[1], TEMP[4], TEMP[3]\n/*result.rgb*/ - MUL TEMP[2], TEMP[0]., TEMP[1].\n - ADD TEMP[3], TEMP[0]., TEMP[1].\n - SUB TEMP[1].w, TEMP[3], TEMP[2]\n - MOV %s, TEMP[1]\n; -#if 1 -static const char blend_screen_asm[] = - TEX TEMP[1], IN[0], SAMP[2], 2D\n - ADD TEMP[3], TEMP[0], TEMP[1]\n - MUL TEMP[2], TEMP[0], TEMP[1]\n - SUB %s, TEMP[3], TEMP[2]\n; -#else -static const char blend_screen_asm[] = - TEX TEMP[1], IN[0], SAMP[2], 2D\n - MOV %s, TEMP[1]\n; -#endif
Re: [Mesa3d-dev] [PATCH] switch shaders assembly TGSI code by tgsi_ureg
Hi, I am resenting the patches, i fixed some bugs in ureg functions and did some clean up in shaders cache code. i tested it with all progs/openvg files. Igor On Mon, Feb 1, 2010 at 1:02 PM, Zack Rusin za...@vmware.com wrote: On Monday 01 February 2010 10:19:53 Igor Oliveira wrote: Hello, Theses patchs switch all shaders code implemeted using TGSI assembly by tgsi_ureg. Hey Igor, very nice work! Since I don't have the conformance framework anymore, did you test your changes with the examples that we have? Before committing it'd be nice to know that we won't get too many obvious regressions. z From 81a644140b7a5185d4b200a0ac6cfa3384bed152 Mon Sep 17 00:00:00 2001 From: Igor Oliveira igor.olive...@openbossa.org Date: Mon, 1 Feb 2010 22:01:51 -0400 Subject: [PATCH 1/2] vega: change tgsi asm by tgsi_ureg --- src/gallium/state_trackers/vega/asm_fill.h | 551 +++- 1 files changed, 380 insertions(+), 171 deletions(-) diff --git a/src/gallium/state_trackers/vega/asm_fill.h b/src/gallium/state_trackers/vega/asm_fill.h index 2f394ad..2777346 100644 --- a/src/gallium/state_trackers/vega/asm_fill.h +++ b/src/gallium/state_trackers/vega/asm_fill.h @@ -27,166 +27,375 @@ #ifndef ASM_FILL_H #define ASM_FILL_H -static const char solid_fill_asm[] = - MOV %s, CONST[0]\n; - - -static const char linear_grad_asm[] = - MOV TEMP[0].xy, IN[0]\n - MOV TEMP[0].z, CONST[1].\n - DP3 TEMP[1], CONST[2], TEMP[0]\n - DP3 TEMP[2], CONST[3], TEMP[0]\n - DP3 TEMP[3], CONST[4], TEMP[0]\n - RCP TEMP[3], TEMP[3]\n - MUL TEMP[1], TEMP[1], TEMP[3]\n - MUL TEMP[2], TEMP[2], TEMP[3]\n - MOV TEMP[4].x, TEMP[1]\n - MOV TEMP[4].y, TEMP[2]\n - MUL TEMP[0], CONST[0]., TEMP[4].\n - MAD TEMP[1], CONST[0]., TEMP[4]., TEMP[0]\n - MUL TEMP[2], TEMP[1], CONST[0].\n - TEX %s, TEMP[2], SAMP[0], 1D\n; - -static const char radial_grad_asm[] = - MOV TEMP[0].xy, IN[0]\n - MOV TEMP[0].z, CONST[1].\n - DP3 TEMP[1], CONST[2], TEMP[0]\n - DP3 TEMP[2], CONST[3], TEMP[0]\n - DP3 TEMP[3], CONST[4], TEMP[0]\n - RCP TEMP[3], 
TEMP[3]\n - MUL TEMP[1], TEMP[1], TEMP[3]\n - MUL TEMP[2], TEMP[2], TEMP[3]\n - MOV TEMP[5].x, TEMP[1]\n - MOV TEMP[5].y, TEMP[2]\n - MUL TEMP[0], CONST[0]., TEMP[5].\n - MAD TEMP[1], CONST[0]., TEMP[5]., TEMP[0]\n - ADD TEMP[1], TEMP[1], TEMP[1]\n - MUL TEMP[3], TEMP[5]., TEMP[5].\n - MAD TEMP[4], TEMP[5]., TEMP[5]., TEMP[3]\n - MOV TEMP[4], -TEMP[4]\n - MUL TEMP[2], CONST[0]., TEMP[4]\n - MUL TEMP[0], CONST[1]., TEMP[2]\n - MUL TEMP[3], TEMP[1], TEMP[1]\n - SUB TEMP[2], TEMP[3], TEMP[0]\n - RSQ TEMP[2], |TEMP[2]|\n - RCP TEMP[2], TEMP[2]\n - SUB TEMP[1], TEMP[2], TEMP[1]\n - ADD TEMP[0], CONST[0]., CONST[0].\n - RCP TEMP[0], TEMP[0]\n - MUL TEMP[2], TEMP[1], TEMP[0]\n - TEX %s, TEMP[2], SAMP[0], 1D\n; - -static const char pattern_asm[] = - MOV TEMP[0].xy, IN[0]\n - MOV TEMP[0].z, CONST[1].\n - DP3 TEMP[1], CONST[2], TEMP[0]\n - DP3 TEMP[2], CONST[3], TEMP[0]\n - DP3 TEMP[3], CONST[4], TEMP[0]\n - RCP TEMP[3], TEMP[3]\n - MUL TEMP[1], TEMP[1], TEMP[3]\n - MUL TEMP[2], TEMP[2], TEMP[3]\n - MOV TEMP[4].x, TEMP[1]\n - MOV TEMP[4].y, TEMP[2]\n - RCP TEMP[0], CONST[1].zwzw\n - MOV TEMP[1], TEMP[4]\n - MUL TEMP[1].x, TEMP[1], TEMP[0]\n - MUL TEMP[1].y, TEMP[1], TEMP[0]\n - TEX %s, TEMP[1], SAMP[0], 2D\n; - - -static const char mask_asm[] = - TEX TEMP[1], IN[0], SAMP[1], 2D\n - MUL TEMP[0].w, TEMP[0]., TEMP[1].\n - MOV %s, TEMP[0]\n; - - -static const char image_normal_asm[] = - TEX %s, IN[1], SAMP[3], 2D\n; - -static const char image_multiply_asm[] = - TEX TEMP[1], IN[1], SAMP[3], 2D\n - MUL %s, TEMP[0], TEMP[1]\n; - -static const char image_stencil_asm[] = - TEX TEMP[1], IN[1], SAMP[3], 2D\n - MUL %s, TEMP[0], TEMP[1]\n; - - -#define EXTENDED_BLEND_OVER \ - SUB TEMP[3], CONST[1]., TEMP[1].\n \ - SUB TEMP[4], CONST[1]., TEMP[0].\n \ - MUL TEMP[3], TEMP[0], TEMP[3]\n\ - MUL TEMP[4], TEMP[1], TEMP[4]\n\ - ADD TEMP[3], TEMP[3], TEMP[4]\n - -static const char blend_multiply_asm[] = - TEX TEMP[1], IN[0], SAMP[2], 2D\n - EXTENDED_BLEND_OVER - MUL TEMP[4], TEMP[0], TEMP[1]\n - 
ADD TEMP[1], TEMP[4], TEMP[3]\n/*result.rgb*/ - MUL TEMP[2], TEMP[0]., TEMP[1].\n - ADD TEMP[3], TEMP[0]., TEMP[1].\n - SUB TEMP[1].w, TEMP[3], TEMP[2]\n - MOV %s, TEMP[1]\n; -#if 1 -static const char blend_screen_asm[] = - TEX TEMP[1], IN[0], SAMP[2], 2D\n - ADD TEMP[3], TEMP[0], TEMP[1]\n - MUL TEMP[2], TEMP[0], TEMP[1]\n - SUB %s, TEMP[3], TEMP[2]\n; -#else -static const char blend_screen_asm[] = - TEX TEMP[1], IN[0], SAMP[2], 2D\n - MOV %s, TEMP[1]\n; -#endif - -static const char blend_darken_asm[] = - TEX TEMP[1], IN[0], SAMP[2], 2D\n - EXTENDED_BLEND_OVER - MUL TEMP[4], TEMP[0], TEMP[1].\n - MUL
Re: [Mesa3d-dev] [PATCH] softpipe: Fix softpipe_reset_sampler_varients, check if the sampler has a texture != NULL. It fixes the bug 25863
Hi,

You are right! The bug is in blend_bind_samplers() in shader.c: textures[0] is NULL because the fallback initializes the wrong slot. The patch below fixes it:

--- a/src/gallium/state_trackers/vega/shader.c
+++ b/src/gallium/state_trackers/vega/shader.c
@@ -135,8 +135,8 @@ static VGint blend_bind_samplers(struct vg_context *ctx,
       textures[2] = stfb->blend_texture;
 
       if (!samplers[0] || !textures[0]) {
-         samplers[1] = samplers[2];
-         textures[1] = textures[2];
+         samplers[0] = samplers[2];
+         textures[0] = textures[2];
       }
       if (!samplers[1] || !textures[1]) {
          samplers[1] = samplers[0];
--
1.6.3.3

On Mon, Jan 25, 2010 at 7:41 AM, José Fonseca jfons...@vmware.com wrote:
> On Sun, 2010-01-24 at 18:36 -0800, Igor Oliveira wrote:
> > The patch fixes bug 25863. The bug happens when I use blend types like multiply, screen and darken in the vega state tracker.
> >
> > --- a/src/gallium/drivers/softpipe/sp_state_sampler.c
> > +++ b/src/gallium/drivers/softpipe/sp_state_sampler.c
> > @@ -244,7 +244,7 @@ softpipe_reset_sampler_varients(struct softpipe_context *softpipe)
> >      * fragment programs.
> >      */
> >     for (i = 0; i <= softpipe->vs->max_sampler; i++) {
> > -      if (softpipe->vertex_samplers[i]) {
> > +      if (softpipe->vertex_samplers[i] && softpipe->vertex_textures[i]) {
> >           softpipe->tgsi.vert_samplers_list[i] =
> >              get_sampler_varient( i,
> >                                   sp_sampler(softpipe->vertex_samplers[i]),
> > @@ -258,7 +258,7 @@ softpipe_reset_sampler_varients(struct softpipe_context *softpipe)
> >     }
> >
> >     for (i = 0; i <= softpipe->fs->info.file_max[TGSI_FILE_SAMPLER]; i++) {
> > -      if (softpipe->sampler[i]) {
> > +      if (softpipe->sampler[i] && softpipe->texture[i]) {
> >           softpipe->tgsi.frag_samplers_list[i] =
> >              get_sampler_varient( i,
> >                                   sp_sampler(softpipe->sampler[i]),
> > --
> > 1.6.3.3
>
> This doesn't seem the best way to fix this: the segfault may happen in softpipe code, but the state tracker has the responsibility to sanitize state. Shouldn't the vega state tracker bind a dummy black texture in this case?
>
> Also, the net effect of this patch is to use the previously bound texture/sampler -- which may be null -- so the bug is potentially still there, but perhaps less likely.
>
> I wonder why changing the blend type causes a null texture to be bound in the first place? I recall from Zack's VG talk that certain kinds of blending ops had to be implemented with texturing, but binding a null texture seems a symptom of some subtle bug.
>
> Jose

From 45e4ae5196d3a040220364c6a8b790b73d7d2ca1 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Mon, 25 Jan 2010 10:32:40 -0400
Subject: [PATCH] vega: Fix null texture when using multiply, screen, darken and light blend

---
 src/gallium/state_trackers/vega/shader.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/vega/shader.c b/src/gallium/state_trackers/vega/shader.c
index bd5ae79..8e59d53 100644
--- a/src/gallium/state_trackers/vega/shader.c
+++ b/src/gallium/state_trackers/vega/shader.c
@@ -135,8 +135,8 @@ static VGint blend_bind_samplers(struct vg_context *ctx,
       textures[2] = stfb->blend_texture;
 
       if (!samplers[0] || !textures[0]) {
-         samplers[1] = samplers[2];
-         textures[1] = textures[2];
+         samplers[0] = samplers[2];
+         textures[0] = textures[2];
       }
       if (!samplers[1] || !textures[1]) {
          samplers[1] = samplers[0];
--
1.6.3.3

--
Throughout its 18-year history, RSA Conference consistently attracts the world's best and brightest in the field, creating opportunities for Conference attendees to learn about information security's most important issues through interactions with peers, luminaries and emerging and established companies.
http://p.sf.net/sfu/rsaconf-dev2dev
_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
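The slot fallback in the vega patch can be exercised on its own. A minimal sketch, assuming hypothetical stand-in types (`struct sampler`, `struct texture`) and a hypothetical helper name `fill_missing_slots`; the `textures[1] = textures[0]` line is also an assumption, since the diff context ends before it:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Gallium sampler and texture objects. */
struct sampler { int id; };
struct texture { int id; };

/*
 * Mirrors the fallback in the patched blend_bind_samplers(): slot 2 holds
 * the blend texture and is expected to be valid; an empty slot 0 borrows
 * from slot 2, and an empty slot 1 borrows from slot 0.
 */
static void
fill_missing_slots(struct sampler *samplers[3], struct texture *textures[3])
{
   assert(samplers[2] && textures[2]);

   if (!samplers[0] || !textures[0]) {
      samplers[0] = samplers[2];
      textures[0] = textures[2];
   }
   if (!samplers[1] || !textures[1]) {
      samplers[1] = samplers[0];
      textures[1] = textures[0];   /* assumed: not visible in the diff */
   }
}
```

With the pre-patch code the first branch wrote to slot 1, so slot 0 stayed NULL; here every slot is non-NULL after the call as long as slot 2 is.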
[Mesa3d-dev] Vega advanced blending
Hello,

This is just a report on the work I am doing. Two days ago I began studying the vega state tracker to understand it better and to help out a bit; I have already fixed some bugs there. Right now I am implementing the advanced blending extension [1]. The extension includes many blending methods supported by authoring tools and file formats such as SVG 1.2. So far I have implemented two blend operations; I will finish the remaining ones in the next few days.

[1] http://www.khronos.org/registry/vg/extensions/KHR/advanced_blending.txt

Igor

diff --git a/include/VG/openvg.h b/include/VG/openvg.h
index 60167e4..2a6c510 100644
--- a/include/VG/openvg.h
+++ b/include/VG/openvg.h
@@ -444,6 +444,8 @@ typedef enum {
   VG_BLEND_DARKEN                             = 0x2007,
   VG_BLEND_LIGHTEN                            = 0x2008,
   VG_BLEND_ADDITIVE                           = 0x2009,
+  VG_BLEND_SUBTRACT_KHR                       = 0x2017,
+  VG_BLEND_INVERT_KHR                         = 0x2018,
 
   VG_BLEND_MODE_FORCE_SIZE                    = VG_MAX_ENUM
 } VGBlendMode;
diff --git a/src/gallium/state_trackers/vega/api_params.c b/src/gallium/state_trackers/vega/api_params.c
index db77fd9..0bc5e09 100644
--- a/src/gallium/state_trackers/vega/api_params.c
+++ b/src/gallium/state_trackers/vega/api_params.c
@@ -160,7 +160,7 @@ void vgSeti (VGParamType type, VGint value)
       break;
    case VG_BLEND_MODE:
       if (value < VG_BLEND_SRC ||
-          value > VG_BLEND_ADDITIVE)
+          value > VG_BLEND_INVERT_KHR)
          error = VG_ILLEGAL_ARGUMENT_ERROR;
       else {
          ctx->state.dirty |= BLEND_DIRTY;
diff --git a/src/gallium/state_trackers/vega/asm_fill.h b/src/gallium/state_trackers/vega/asm_fill.h
index 2f394ad..47c8b9d 100644
--- a/src/gallium/state_trackers/vega/asm_fill.h
+++ b/src/gallium/state_trackers/vega/asm_fill.h
@@ -164,6 +164,30 @@ static const char blend_lighten_asm[] =
    "SUB TEMP[1].w, TEMP[3], TEMP[2]\n"
    "MOV %s, TEMP[1]\n";
 
+static const char blend_subtract_khr_asm[] =
+   "TEX TEMP[1], IN[0], SAMP[2], 2D\n"
+   "SUB TEMP[1], TEMP[1], TEMP[0]\n"
+   "STR TEMP[2]\n"
+   "NOT TEMP[2]\n"
+   "MAX TEMP[1], TEMP[1], TEMP[2]\n"
+   "MUL TEMP[2], TEMP[0]., TEMP[1].\n"
+   "ADD TEMP[3], TEMP[0]., TEMP[1].\n"
+   "SUB TEMP[1].w, TEMP[3], TEMP[2]\n"
+   "MOV %s, TEMP[0]\n";
+
+static const char blend_invert_khr_asm[] =
+   "TEX TEMP[1], IN[0], SAMP[2], 2D\n"
+   "SUB TEMP[2], CONST[0]., TEMP[0].\n"
+   "SUB TEMP[3], CONST[0]., TEMP[1]\n"
+   "MUL TEMP[2].xyz, TEMP[1], TEMP[2].\n"
+   "MUL TEMP[3].xyz, TEMP[0]., TEMP[3]\n"
+   "ADD TEMP[0], TEMP[2], TEMP[3]\n"
+   "MUL TEMP[2], TEMP[0]., TEMP[1].\n"
+   "ADD TEMP[3], TEMP[0]., TEMP[1].\n"
+   "SUB TEMP[1].w, TEMP[3], TEMP[2]\n"
+   "MOV %s, TEMP[0]\n";
+
+
 static const char premultiply_asm[] =
    "MUL TEMP[0].xyz, TEMP[0], TEMP[0].\n";
 
@@ -224,14 +248,18 @@ static const struct shader_asm_info shaders_asm[] = {
     VG_TRUE,  0, 0, 1, 1, 0, 2},
 
    /* extra blend modes */
-   {VEGA_BLEND_MULTIPLY_SHADER, 200, blend_multiply_asm,
+   {VEGA_BLEND_MULTIPLY_SHADER,     200, blend_multiply_asm,
     VG_TRUE,  1, 1, 2, 1, 0, 5},
-   {VEGA_BLEND_SCREEN_SHADER,200, blend_screen_asm,
+   {VEGA_BLEND_SCREEN_SHADER,       200, blend_screen_asm,
     VG_TRUE,  0, 0, 2, 1, 0, 4},
-   {VEGA_BLEND_DARKEN_SHADER,200, blend_darken_asm,
+   {VEGA_BLEND_DARKEN_SHADER,       200, blend_darken_asm,
     VG_TRUE,  1, 1, 2, 1, 0, 6},
-   {VEGA_BLEND_LIGHTEN_SHADER, 200, blend_lighten_asm,
+   {VEGA_BLEND_LIGHTEN_SHADER,      200, blend_lighten_asm,
     VG_TRUE,  1, 1, 2, 1, 0, 6},
+   {VEGA_BLEND_SUBTRACT_KHR_SHADER, 200, blend_subtract_khr_asm,
+    VG_TRUE,  0, 0, 2, 1, 0, 4},
+   {VEGA_BLEND_INVERT_KHR_SHADER,   200, blend_invert_khr_asm,
+    VG_TRUE,  1, 1, 2, 1, 0, 4},
 
    /* premultiply */
    {VEGA_PREMULTIPLY_SHADER,   100, premultiply_asm,
diff --git a/src/gallium/state_trackers/vega/shader.c b/src/gallium/state_trackers/vega/shader.c
index 8e59d53..b25cbf2 100644
--- a/src/gallium/state_trackers/vega/shader.c
+++ b/src/gallium/state_trackers/vega/shader.c
@@ -126,7 +126,9 @@ static VGint blend_bind_samplers(struct vg_context *ctx,
    if (bmode == VG_BLEND_MULTIPLY ||
        bmode == VG_BLEND_SCREEN ||
        bmode == VG_BLEND_DARKEN ||
-       bmode == VG_BLEND_LIGHTEN) {
+       bmode == VG_BLEND_LIGHTEN ||
+       bmode == VG_BLEND_SUBTRACT_KHR ||
+       bmode == VG_BLEND_INVERT_KHR) {
       struct st_framebuffer *stfb = ctx->draw_buffer;
 
       vg_prepare_blend_surface(ctx);
@@ -261,6 +263,12 @@ static void setup_shader_program(struct shader *shader)
    case VG_BLEND_LIGHTEN:
       shader_id |= VEGA_BLEND_LIGHTEN_SHADER;
       break;
+   case VG_BLEND_SUBTRACT_KHR:
+      shader_id |= VEGA_BLEND_SUBTRACT_KHR_SHADER;
+      break;
+   case VG_BLEND_INVERT_KHR:
+      shader_id |= VEGA_BLEND_INVERT_KHR_SHADER;
+      break;
    default:
       /* handled by pipe_blend_state */
       break;
diff --git
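One detail in the vgSeti() hunk may deserve a second look: the new enum values 0x2017 and 0x2018 are not contiguous with VG_BLEND_ADDITIVE (0x2009), so a single range test up to VG_BLEND_INVERT_KHR would also accept 0x200A through 0x2016, which this patch does not implement. A minimal sketch of a stricter check, assuming VG_BLEND_SRC is 0x2000 as in the Khronos header; the `EX_` names and `blend_mode_is_valid` are hypothetical and not part of vega:

```c
#include <stdbool.h>

/* Values taken from the patch; EX_BLEND_SRC = 0x2000 is assumed from the
 * Khronos openvg.h header. The EX_ prefix avoids clashing with real names. */
enum {
   EX_BLEND_SRC          = 0x2000,
   EX_BLEND_ADDITIVE     = 0x2009,
   EX_BLEND_SUBTRACT_KHR = 0x2017,
   EX_BLEND_INVERT_KHR   = 0x2018
};

/* Accept the core range, then only the two extension modes added here. */
static bool
blend_mode_is_valid(int value)
{
   if (value >= EX_BLEND_SRC && value <= EX_BLEND_ADDITIVE)
      return true;
   return value == EX_BLEND_SUBTRACT_KHR || value == EX_BLEND_INVERT_KHR;
}
```

Listing the extension modes explicitly keeps the check correct as further KHR modes are added one at a time.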
Re: [Mesa3d-dev] TGSI build and sse2 removal
Hi Michal,

Could you keep me informed about those changes? I am setting up an environment to try writing TGSI NEON optimizations, using a BeagleBoard with the Angstrom distribution. I am also thinking about doing something for the DSP.

Igor

On Mon, Jan 25, 2010 at 1:02 PM, michal mic...@vmware.com wrote:
> Brian Paul wrote on 2010-01-25 16:09:
> > José Fonseca wrote:
> > > Michal,
> > >
> > > On Mon, 2010-01-25 at 06:27 -0800, michal wrote:
> > > > I would like to have those two modules go away, as they are a maintenance pain with no real benefits.
> > > >
> > > > The build module has been superseded by the ureg module, and apparently all third-party code has already migrated or is in the process of porting to the new interface. I would like to nuke it if nobody minds.
> > >
> > > I'm fine with this.
> >
> > We can't remove this until we switch to ureg in the draw code. The draw_pipe_pstipple.c, draw_pipe_aaline.c and draw_pipe_aapoint.c files still haven't been converted to use tgsi_ureg. There may be some other uses elsewhere.
> >
> > Michal, can you update that code first?
>
> That's the plan.
>
> > > > For sse2, I am looking at simplifying it enough to be able to accelerate pass-thru fragment shaders and simple vertex shaders. That's it. For more sophisticated stuff we already have llvmpipe.
> > >
> > > I agree with this in principle, but I think it's better not to get too much ahead of ourselves here: drivers are using tgsi_exec/sse2 for software vertex processing fallbacks. And while the plan is indeed to move the LLVM JIT code generation out of llvmpipe into the auxiliary modules so that all pipe drivers can use that for fallbacks, the fact is we're not there yet.
> > >
> > > So for tgsi_sse2 I think it's better not to introduce any performance regressions in vertex processing until llvm code generation is in place and working for everybody.
> >
> > I agree. It's too early to remove the sse2 code.
>
> OK, that makes sense, I will leave it alone for the time being.
>
> Thanks, guys.
Re: [Mesa3d-dev] Vega advanced blending
Hello,

On Mon, Jan 25, 2010 at 5:51 PM, Zack Rusin za...@vmware.com wrote:
> On Monday 25 January 2010 16:38:46 Igor Oliveira wrote:
> > Hello,
> >
> > This is just a report on the work I am doing. Two days ago I began studying the vega state tracker to understand it better and to help out a bit; I have already fixed some bugs there. Right now I am implementing the advanced blending extension [1]. The extension includes many blending methods supported by authoring tools and file formats such as SVG 1.2. So far I have implemented two blend operations; I will finish the remaining ones in the next few days.
>
> Sounds great Igor. Feel free to add some tests to progs/openvg/trivial or such that show the extended blending. When I was working on this code I used to use the Khronos conformance framework but I don't have access to it anymore and it'd be a good idea for us to build up more of a testing infrastructure for this stuff.
>
> > diff --git a/include/VG/openvg.h b/include/VG/openvg.h
> > index 60167e4..2a6c510 100644
> > --- a/include/VG/openvg.h
> > +++ b/include/VG/openvg.h
> > @@ -444,6 +444,8 @@ typedef enum {
> >    VG_BLEND_DARKEN                             = 0x2007,
> >    VG_BLEND_LIGHTEN                            = 0x2008,
> >    VG_BLEND_ADDITIVE                           = 0x2009,
> > +  VG_BLEND_SUBTRACT_KHR                       = 0x2017,
> > +  VG_BLEND_INVERT_KHR                         = 0x2018,
> >
> >    VG_BLEND_MODE_FORCE_SIZE                    = VG_MAX_ENUM
> >  } VGBlendMode;
>
> This change isn't right, we'd like to keep the openvg.h header like it is provided from Khronos (exactly the same way we do for GL). We just need to add the new vgext.h header from Khronos to account for all those new extensions. If you'd like I can do that soon.

Yep, that would be cool! My first idea is to create a branch.

> > +static const char blend_subtract_khr_asm[] =
> > +   "TEX TEMP[1], IN[0], SAMP[2], 2D\n"
> > +   "SUB TEMP[1], TEMP[1], TEMP[0]\n"
> > +   "STR TEMP[2]\n"
> > +   "NOT TEMP[2]\n"
> > +   "MAX TEMP[1], TEMP[1], TEMP[2]\n"
> > +   "MUL TEMP[2], TEMP[0]., TEMP[1].\n"
> > +   "ADD TEMP[3], TEMP[0]., TEMP[1].\n"
> > +   "SUB TEMP[1].w, TEMP[3], TEMP[2]\n"
> > +   "MOV %s, TEMP[0]\n";
> > +
> > +static const char blend_invert_khr_asm[] =
> > +   "TEX TEMP[1], IN[0], SAMP[2], 2D\n"
> > +   "SUB TEMP[2], CONST[0]., TEMP[0].\n"
> > +   "SUB TEMP[3], CONST[0]., TEMP[1]\n"
> > +   "MUL TEMP[2].xyz, TEMP[1], TEMP[2].\n"
> > +   "MUL TEMP[3].xyz, TEMP[0]., TEMP[3]\n"
> > +   "ADD TEMP[0], TEMP[2], TEMP[3]\n"
> > +   "MUL TEMP[2], TEMP[0]., TEMP[1].\n"
> > +   "ADD TEMP[3], TEMP[0]., TEMP[1].\n"
> > +   "SUB TEMP[1].w, TEMP[3], TEMP[2]\n"
> > +   "MOV %s, TEMP[0]\n";
>
> Looks good. Ideally we'd switch all of this hand assembly to tgsi_ureg code. It'd be a lot more flexible and more readable than manual assembling of semi-completed assembly fragments.

Hmm, I do not know tgsi_ureg, but I can have a look at it.

> > diff --git a/src/gallium/state_trackers/vega/shaders_cache.h b/src/gallium/state_trackers/vega/shaders_cache.h
> > index feca58b..5bbb724 100644
> > --- a/src/gallium/state_trackers/vega/shaders_cache.h
> > +++ b/src/gallium/state_trackers/vega/shaders_cache.h
> > @@ -48,11 +48,13 @@ enum VegaShaderType {
> >     VEGA_BLEND_SCREEN_SHADER       = 1 <<  9,
> >     VEGA_BLEND_DARKEN_SHADER       = 1 << 10,
> >     VEGA_BLEND_LIGHTEN_SHADER      = 1 << 11,
> > +   VEGA_BLEND_SUBTRACT_KHR_SHADER = 1 << 12,
> > +   VEGA_BLEND_INVERT_KHR_SHADER   = 1 << 13,
> >
> > -   VEGA_PREMULTIPLY_SHADER        = 1 << 12,
> > -   VEGA_UNPREMULTIPLY_SHADER      = 1 << 13,
> > +   VEGA_PREMULTIPLY_SHADER        = 1 << 14,
> > +   VEGA_UNPREMULTIPLY_SHADER      = 1 << 15,
> >
> > -   VEGA_BW_SHADER                 = 1 << 14
> > +   VEGA_BW_SHADER                 = 1 << 16
> >  };
>
> We'll probably also need to do something about the caching. With 20+ extra blend modes we'll run out of the bits in our 32bit key that we're using to lookup shaders in our cache right now.

Yep, I was looking at that too.

> z

I will try to keep you posted on the advanced blending implementation.

Igor
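The key-space concern raised in this thread can be made concrete. A minimal sketch, assuming a hypothetical layout in which the mutually exclusive blend shaders share a small numeric field instead of taking one bit each; the field widths and every `EX_`/function name below are illustrative and not the layout vega uses:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: bits 0..7 keep one-hot flags for stages that can be
 * combined; bits 8..13 hold a blend-shader index (0 = no blend shader). */
#define EX_FLAG_BITS   8u
#define EX_BLEND_BITS  6u
#define EX_BLEND_MASK  ((((uint32_t)1) << EX_BLEND_BITS) - 1u)

/* Build a cache key from combinable stage flags and one blend index. */
static uint32_t
make_shader_key(uint32_t stage_flags, uint32_t blend_index)
{
   assert(stage_flags < (((uint32_t)1) << EX_FLAG_BITS));
   assert(blend_index <= EX_BLEND_MASK);
   return stage_flags | (blend_index << EX_FLAG_BITS);
}

/* Recover the blend index from a key. */
static uint32_t
key_blend_index(uint32_t key)
{
   return (key >> EX_FLAG_BITS) & EX_BLEND_MASK;
}
```

Six bits distinguish up to 63 blend shaders, where a one-hot encoding would spend one bit per shader. This relies on at most one blend shader being selected per draw, which matches the switch statement in setup_shader_program() shown earlier in the thread.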
[Mesa3d-dev] [PATCH] docs: add documentation to double opcodes
Hello,

This patch applies to the double-opcode branch. It adds documentation for the double opcodes.

Igor

From 135fd9c851e9b50149508a3e7c04203623cc297b Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Mon, 25 Jan 2010 19:23:04 -0400
Subject: [PATCH] docs: add documentation to double opcodes

---
 src/gallium/docs/source/tgsi.rst |  111 ++
 1 files changed, 111 insertions(+), 0 deletions(-)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index ebee490..2d311eb 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -1090,6 +1090,117 @@ BREAKC - Break Conditional
   TBD
 
 
+Double Opcodes
+^^^^^^^^^^^^^^
+
+DADD - Add Double
+
+.. math::
+
+  dst.xy = src0.xy + src1.xy
+
+  dst.zw = src0.zw + src1.zw
+
+
+DDIV - Divide Double
+
+.. math::
+
+  dst.xy = src0.xy / src1.xy
+
+  dst.zw = src0.zw / src1.zw
+
+DSEQ - Set Double on Equal
+
+.. math::
+
+  dst.xy = src0.xy == src1.xy ? 1.0F : 0.0F
+
+  dst.zw = src0.zw == src1.zw ? 1.0F : 0.0F
+
+DSLT - Set Double on Less than
+
+.. math::
+
+  dst.xy = src0.xy < src1.xy ? 1.0F : 0.0F
+
+  dst.zw = src0.zw < src1.zw ? 1.0F : 0.0F
+
+DFRAC - Double Fraction
+
+.. math::
+
+  dst.xy = src.xy - \lfloor src.xy\rfloor
+
+  dst.zw = src.zw - \lfloor src.zw\rfloor
+
+
+DFRACEXP - Convert Double Number to Fractional and Integral Components
+
+.. math::
+
+  dst0.xy = frexp(src.xy, dst1.xy)
+
+  dst0.zw = frexp(src.zw, dst1.zw)
+
+DLDEXP - Multiply Double Number by Integral Power of 2
+
+.. math::
+
+  dst.xy = ldexp(src0.xy, src1.xy)
+
+  dst.zw = ldexp(src0.zw, src1.zw)
+
+DMIN - Minimum Double
+
+.. math::
+
+  dst.xy = min(src0.xy, src1.xy)
+
+  dst.zw = min(src0.zw, src1.zw)
+
+DMAX - Maximum Double
+
+.. math::
+
+  dst.xy = max(src0.xy, src1.xy)
+
+  dst.zw = max(src0.zw, src1.zw)
+
+DMUL - Multiply Double
+
+.. math::
+
+  dst.xy = src0.xy \times src1.xy
+
+  dst.zw = src0.zw \times src1.zw
+
+
+DMAD - Multiply And Add Doubles
+
+.. math::
+
+  dst.xy = src0.xy \times src1.xy + src2.xy
+
+  dst.zw = src0.zw \times src1.zw + src2.zw
+
+
+DRCP - Reciprocal Double
+
+.. math::
+
+  dst.xy = \frac{1}{src.xy}
+
+  dst.zw = \frac{1}{src.zw}
+
+DSQRT - Square Root Double
+
+.. math::
+
+  dst.xy = \sqrt{src.xy}
+
+  dst.zw = \sqrt{src.zw}
+
 Explanation of symbols used
 ---------------------------
--
1.6.3.3
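The `.xy` and `.zw` notation in these formulas means that each double occupies a pair of 32-bit channels, so one four-channel register carries two doubles. A minimal sketch of that packing, assuming the low word sits in the first channel of the pair (the channel order is an assumption here, and the function names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Split a double into two 32-bit channel values (lo, hi). */
static void
double_to_channels(double d, uint32_t *lo, uint32_t *hi)
{
   uint64_t bits;

   assert(sizeof bits == sizeof d);
   memcpy(&bits, &d, sizeof bits);
   *lo = (uint32_t)(bits & 0xffffffffu);
   *hi = (uint32_t)(bits >> 32);
}

/* Rebuild a double from its two 32-bit channel values. */
static double
channels_to_double(uint32_t lo, uint32_t hi)
{
   uint64_t bits = ((uint64_t)hi << 32) | lo;
   double d;

   memcpy(&d, &bits, sizeof d);
   return d;
}

/* DADD on one channel pair: dst.xy = src0.xy + src1.xy */
static void
dadd_pair(const uint32_t src0[2], const uint32_t src1[2], uint32_t dst[2])
{
   double a = channels_to_double(src0[0], src0[1]);
   double b = channels_to_double(src1[0], src1[1]);

   double_to_channels(a + b, &dst[0], &dst[1]);
}
```

The same unpack, operate, repack shape applies to the `.zw` pair and to the other opcodes in the list.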
[Mesa3d-dev] [PATCH] egl: check if driver_name is null
Running EGL with the egl_x11_swrast driver, I get a segfault because x11_screen_probe_dri2() returns a NULL driver_name. This patch checks whether driver_name is NULL before duplicating it.

From 7ed634c927dc388ae2475766edf48b9ca88fb07f Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Sun, 24 Jan 2010 12:26:31 -0400
Subject: [PATCH] egl: check if driver_name is null

---
 src/gallium/state_trackers/egl/x11/native_x11.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/src/gallium/state_trackers/egl/x11/native_x11.c b/src/gallium/state_trackers/egl/x11/native_x11.c
index 695ab88..dd3c9f8 100644
--- a/src/gallium/state_trackers/egl/x11/native_x11.c
+++ b/src/gallium/state_trackers/egl/x11/native_x11.c
@@ -70,7 +70,8 @@ native_create_probe(EGLNativeDisplayType dpy)
    if (xscr) {
       if (x11_screen_support(xscr, X11_SCREEN_EXTENSION_DRI2)) {
          driver_name = x11_screen_probe_dri2(xscr);
-         nprobe->data = strdup(driver_name);
+         if (driver_name)
+            nprobe->data = strdup(driver_name);
       }
 
       x11_screen_destroy(xscr);
--
1.6.3.3
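The crash comes from handing NULL to strdup(), which has undefined behaviour. A minimal sketch of the guarded pattern, assuming a hypothetical helper `dup_or_null` that is not part of the EGL state tracker; it uses malloc() and memcpy() so that it builds as plain ISO C:

```c
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of name, or NULL when name is NULL or the allocation
 * fails. The caller owns the result and releases it with free(). */
static char *
dup_or_null(const char *name)
{
   size_t len;
   char *copy;

   if (!name)
      return NULL;

   len = strlen(name) + 1;          /* include the terminating NUL */
   copy = malloc(len);
   if (copy)
      memcpy(copy, name, len);
   return copy;
}
```

Wrapping the check in a helper keeps call sites to one line and makes the NULL case explicit for whoever reads the probe data later.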
Re: [Mesa3d-dev] [PATCH] add dfrac, dfracexp, dldexp opcodes to gallium
Hello,

Yes, I see my mistake; thanks for pinging me. I am resending the patches with the fix, plus a new patch that adds a test for dfrac so there are no more mistakes there :). The test has the same output as the frc test.

Igor

On Wed, Jan 20, 2010 at 3:53 AM, michal mic...@vmware.com wrote:
> Igor Oliveira wrote on 2010-01-20 00:37:
> > Hi,
> >
> > These patches add support for the dfrac, dldexp and dfracexp opcodes. I think dfracexp is the only opcode that uses two DST registers: the first stores the fractional part (a double) and the second stores the exponent (an int). The tests show it working.
> >
> >  static void
> > +micro_dfrac(union tgsi_double_channel *dst,
> > +            const union tgsi_double_channel *src)
> > +{
> > +   dst->d[0] = src->d[0] - floor(src->d[0]);
> > +   dst->d[1] = src->d[1] - floor(src->d[0]);
> > +   dst->d[2] = src->d[2] - floor(src->d[0]);
> > +   dst->d[3] = src->d[3] - floor(src->d[0]);
>
> Igor,
>
> Shouldn't the second line have floor(src->d[1]), and so on?

From c7731a40ab136fd98848f9155f9ce5ae3436 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Tue, 19 Jan 2010 17:01:50 -0400
Subject: [PATCH 1/6] gallium: add dfrac and dldexp opcodes

---
 src/gallium/include/pipe/p_shader_tokens.h |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index 15f8b0d..9ab18c4 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -334,7 +334,9 @@ struct tgsi_property_data {
 #define TGSI_OPCODE_DRCP                157
 #define TGSI_OPCODE_DSQRT               158
 #define TGSI_OPCODE_DMAD                159
-#define TGSI_OPCODE_LAST                160
+#define TGSI_OPCODE_DFRAC               160
+#define TGSI_OPCODE_DLDEXP              161
+#define TGSI_OPCODE_LAST                162
 
 #define TGSI_SAT_NONE            0  /* do not saturate */
 #define TGSI_SAT_ZERO_ONE        1  /* clamp to [0,1] */
--
1.6.3.3

From 8d92fc49c50f815fa6075cdfe0c340c5bcd7a7dd Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Tue, 19 Jan 2010 17:02:34 -0400 Subject: [PATCH 2/6] tgsi: implement DFRAC and DLDEXP opcode --- src/gallium/auxiliary/tgsi/tgsi_exec.c | 28 src/gallium/auxiliary/tgsi/tgsi_info.c |4 +++- 2 files changed, 31 insertions(+), 1 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index f8a4468..efb9ea1 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -290,6 +290,26 @@ micro_dmad(union tgsi_double_channel *dst, } static void +micro_dfrac(union tgsi_double_channel *dst, +const union tgsi_double_channel *src) +{ + dst-d[0] = src-d[0] - floor(src-d[0]); + dst-d[1] = src-d[1] - floor(src-d[1]); + dst-d[2] = src-d[2] - floor(src-d[2]); + dst-d[3] = src-d[3] - floor(src-d[3]); +} + +static void +micro_dldexp(union tgsi_double_channel *dst, + const union tgsi_double_channel *src) +{ + dst-d[0] = ldexp(src[0].d[0], src[1].d[0]); + dst-d[1] = ldexp(src[0].d[1], src[1].d[1]); + dst-d[2] = ldexp(src[0].d[2], src[1].d[2]); + dst-d[3] = ldexp(src[0].d[3], src[1].d[3]); +} + +static void micro_exp2(union tgsi_exec_channel *dst, const union tgsi_exec_channel *src) { @@ -3907,6 +3927,14 @@ exec_instruction( exec_double_trinary(mach, inst, micro_dmad); break; + case TGSI_OPCODE_DFRAC: + exec_double_unary(mach, inst, micro_dfrac); + break; + + case TGSI_OPCODE_DLDEXP: + exec_double_binary(mach, inst, micro_dldexp); + break; + default: printf(%d, inst-Instruction.Opcode); assert( 0 ); diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c b/src/gallium/auxiliary/tgsi/tgsi_info.c index 269ef73..8340b07 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_info.c +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c @@ -190,7 +190,9 @@ static const struct tgsi_opcode_info opcode_info[TGSI_OPCODE_LAST] = { 1, 2, 0, 0, 0, 0, DSEQ, TGSI_OPCODE_DSEQ }, { 1, 1, 0, 0, 0, 0, DRCP, TGSI_OPCODE_DRCP }, { 1, 1, 0, 0 ,0, 0, DSQRT, TGSI_OPCODE_DSQRT }, - { 1, 3, 0, 0 ,0, 0, DMAD, TGSI_OPCODE_DMAD } + { 1, 3, 0, 0 
,0, 0, DMAD, TGSI_OPCODE_DMAD }, + { 1, 1, 0, 0, 0, 0, DFRAC, TGSI_OPCODE_DFRAC}, + { 1, 2, 0, 0, 0, 0, DLDEXP, TGSI_OPCODE_DLDEXP} }; const struct tgsi_opcode_info * -- 1.6.3.3 From bf1c40a2bc8d365d41e9a1c8149283a0addc7acf Mon Sep 17 00:00:00 2001 From: Igor Oliveira igor.olive...@openbossa.org Date: Tue, 19 Jan 2010 19:20:53 -0400 Subject: [PATCH 3/6] gallium: DFRACEXP opcode to tgsi --- src/gallium/include/pipe/p_shader_tokens.h |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h index 9ab18c4..e0c191c
Re: [Mesa3d-dev] [PATCH] Implement double opcodes: ddiv, dmul, dmax, dmin, dslt, dsge, dseq, drcp, dsqrt and dmad
Cool! I saw that I left two files out of the commit (tgsi_info.c and p_shader_tokens.h), but it looks like you fixed that in the branch. Thanks!

Igor

On Tue, Jan 19, 2010 at 8:34 AM, michal mic...@vmware.com wrote:
> Igor Oliveira wrote on 2010-01-18 19:55:
> > The patches implement the gallium opcodes ddiv, dmul, dmax, dmin, dslt, dsge, dseq, drcp, dsqrt and dmad, and add tests for them. They apply to the gallium-double-opcode branch.
> >
> > In the next patches I will add documentation and the missing double opcodes such as dfrac, dldexp and dfracexp.
>
> Excellent, committed with cosmetic changes. Thanks!
[Mesa3d-dev] [PATCH] add dfrac, dfracexp, dldexp opcodes to gallium
Hi,

These patches add support for the dfrac, dldexp and dfracexp opcodes. I think dfracexp is the only opcode that uses two DST registers: the first stores the fractional part (as a double) and the second stores the exponent (as an int). The tests show it working.

Igor

PS: the next patch will add the opcodes to the documentation file.

From c7731a40ab136fd98848f9155f9ce5ae3436 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Tue, 19 Jan 2010 17:01:50 -0400
Subject: [PATCH 1/5] gallium: add dfrac and dldexp opcodes

---
 src/gallium/include/pipe/p_shader_tokens.h |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index 15f8b0d..9ab18c4 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -334,7 +334,9 @@ struct tgsi_property_data {
 #define TGSI_OPCODE_DRCP                157
 #define TGSI_OPCODE_DSQRT               158
 #define TGSI_OPCODE_DMAD                159
-#define TGSI_OPCODE_LAST                160
+#define TGSI_OPCODE_DFRAC               160
+#define TGSI_OPCODE_DLDEXP              161
+#define TGSI_OPCODE_LAST                162
 
 #define TGSI_SAT_NONE            0  /* do not saturate */
 #define TGSI_SAT_ZERO_ONE        1  /* clamp to [0,1] */
-- 
1.6.3.3

From b6dccfaf8ee3941c341f09c6749a398bf684f196 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Tue, 19 Jan 2010 17:02:34 -0400
Subject: [PATCH 2/5] tgsi: implement DFRAC and DLDEXP opcode

---
 src/gallium/auxiliary/tgsi/tgsi_exec.c |   28 ++++++++++++++++++++++++++++
 src/gallium/auxiliary/tgsi/tgsi_info.c |    4 +++-
 2 files changed, 31 insertions(+), 1 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index f8a4468..b6abbeb 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -290,6 +290,26 @@ micro_dmad(union tgsi_double_channel *dst,
 }
 
 static void
+micro_dfrac(union tgsi_double_channel *dst,
+            const union tgsi_double_channel *src)
+{
+   dst->d[0] = src->d[0] - floor(src->d[0]);
+   dst->d[1] = src->d[1] - floor(src->d[1]);
+   dst->d[2] = src->d[2] - floor(src->d[2]);
+   dst->d[3] = src->d[3] - floor(src->d[3]);
+}
+
+static void
+micro_dldexp(union tgsi_double_channel *dst,
+             const union tgsi_double_channel *src)
+{
+   dst->d[0] = ldexp(src[0].d[0], src[1].d[0]);
+   dst->d[1] = ldexp(src[0].d[1], src[1].d[1]);
+   dst->d[2] = ldexp(src[0].d[2], src[1].d[2]);
+   dst->d[3] = ldexp(src[0].d[3], src[1].d[3]);
+}
+
+static void
 micro_exp2(union tgsi_exec_channel *dst,
            const union tgsi_exec_channel *src)
 {
@@ -3907,6 +3927,14 @@ exec_instruction(
       exec_double_trinary(mach, inst, micro_dmad);
       break;
 
+   case TGSI_OPCODE_DFRAC:
+      exec_double_unary(mach, inst, micro_dfrac);
+      break;
+
+   case TGSI_OPCODE_DLDEXP:
+      exec_double_binary(mach, inst, micro_dldexp);
+      break;
+
    default:
       printf("%d", inst->Instruction.Opcode);
       assert( 0 );
diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c b/src/gallium/auxiliary/tgsi/tgsi_info.c
index 269ef73..8340b07 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -190,7 +190,9 @@ static const struct tgsi_opcode_info opcode_info[TGSI_OPCODE_LAST] =
    { 1, 2, 0, 0, 0, 0, "DSEQ", TGSI_OPCODE_DSEQ },
    { 1, 1, 0, 0, 0, 0, "DRCP", TGSI_OPCODE_DRCP },
    { 1, 1, 0, 0 ,0, 0, "DSQRT", TGSI_OPCODE_DSQRT },
-   { 1, 3, 0, 0 ,0, 0, "DMAD", TGSI_OPCODE_DMAD }
+   { 1, 3, 0, 0 ,0, 0, "DMAD", TGSI_OPCODE_DMAD },
+   { 1, 1, 0, 0, 0, 0, "DFRAC", TGSI_OPCODE_DFRAC},
+   { 1, 2, 0, 0, 0, 0, "DLDEXP", TGSI_OPCODE_DLDEXP}
 };
 
 const struct tgsi_opcode_info *
-- 
1.6.3.3

From 28f586f4a525043e3496086534d3b7c2cb657b77 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Tue, 19 Jan 2010 19:20:53 -0400
Subject: [PATCH 3/5] gallium: DFRACEXP opcode to tgsi

---
 src/gallium/include/pipe/p_shader_tokens.h |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index 9ab18c4..e0c191c 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -336,7 +336,8 @@ struct tgsi_property_data {
 #define TGSI_OPCODE_DMAD                159
 #define TGSI_OPCODE_DFRAC               160
 #define TGSI_OPCODE_DLDEXP              161
-#define TGSI_OPCODE_LAST                162
+#define TGSI_OPCODE_DFRACEXP            162
+#define TGSI_OPCODE_LAST                163
 
 #define TGSI_SAT_NONE            0  /* do not saturate */
 #define TGSI_SAT_ZERO_ONE        1  /* clamp to [0,1] */
-- 
1.6.3.3

From 8b7b6484c495f4d0b0b1a51942b73ad88deecece Mon Sep 17 00:00:00
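A minimal sketch of the two-destination behaviour described for dfracexp, and of dldexp as its inverse. It is not taken from the patch series: the helper names sketch_dfracexp and sketch_dldexp are hypothetical, and it assumes the opcode follows the C library's frexp() convention (mantissa in [0.5, 1) for positive inputs), which the thread does not confirm.

```c
#include <math.h>

/*
 * Hypothetical scalar model of DFRACEXP: one source, two results.
 * The return value plays the role of the first DST register (a double),
 * *exponent plays the role of the second DST register (an int).
 */
static double sketch_dfracexp(double value, int *exponent)
{
    /* frexp() splits value into mantissa * 2^exponent. */
    return frexp(value, exponent);
}

/* Hypothetical scalar model of DLDEXP: mantissa * 2^exponent. */
static double sketch_dldexp(double mantissa, int exponent)
{
    return ldexp(mantissa, exponent);
}
```

Feeding the two results of sketch_dfracexp back into sketch_dldexp should reproduce the original value, which is a convenient round-trip check for a test.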
[Mesa3d-dev] [PATCH] Implement double opcodes: ddiv, dmul, dmax, dmin, dslt, dsge, dseq, drcp, dsqrt and dmad
These patches implement the gallium opcodes ddiv, dmul, dmax, dmin, dslt, dsge, dseq, drcp, dsqrt and dmad, and add tests for them. They apply to the gallium-double-opcode branch. In the next patches I will add documentation and the missing double opcodes: dfrac, dldexp and dfracexp.

Igor

From 0762c16db13543aa79c77c1c8ebbcfdc581fc8b1 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Mon, 18 Jan 2010 13:54:19 -0400
Subject: [PATCH 1/6] gallium: add double opcodes ddiv, dmul, dmax, dmin, dslt, dsge, dseq, drcp and dsqrt

---
 src/gallium/include/pipe/p_shader_tokens.h |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index 5975146..ccfe41c 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -323,7 +323,16 @@ struct tgsi_property_data {
 #define TGSI_OPCODE_D2F                 146
 #define TGSI_OPCODE_DMOV                147
 #define TGSI_OPCODE_DADD                148
-#define TGSI_OPCODE_LAST                149
+#define TGSI_OPCODE_DDIV                149
+#define TGSI_OPCODE_DMUL                150
+#define TGSI_OPCODE_DMAX                151
+#define TGSI_OPCODE_DMIN                152
+#define TGSI_OPCODE_DSLT                153
+#define TGSI_OPCODE_DSGE                154
+#define TGSI_OPCODE_DSEQ                155
+#define TGSI_OPCODE_DRCP                156
+#define TGSI_OPCODE_DSQRT               157
+#define TGSI_OPCODE_LAST                158
 
 #define TGSI_SAT_NONE            0  /* do not saturate */
 #define TGSI_SAT_ZERO_ONE        1  /* clamp to [0,1] */
-- 
1.6.3.3

From 1139cb4c0cdfa7e8b1a1af7e6876f8f4ed3098df Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Mon, 18 Jan 2010 13:56:58 -0400
Subject: [PATCH 2/6] tgsi: implement double opcodes

---
 src/gallium/auxiliary/tgsi/tgsi_exec.c |  127
 src/gallium/auxiliary/tgsi/tgsi_info.c |   11 +++-
 2 files changed, 137 insertions(+), 1 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index b9ea761..c66c41a 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -180,6 +180,96 @@ micro_dmov(union tgsi_double_channel *dst,
 }
 
 static void
+micro_ddiv(union tgsi_double_channel *dst,
+           const union tgsi_double_channel *src)
+{
+   dst->d[0] = src[0].d[0]/src[1].d[0];
+   dst->d[1] = src[0].d[1]/src[1].d[1];
+   dst->d[2] = src[0].d[2]/src[1].d[2];
+   dst->d[3] = src[0].d[3]/src[1].d[3];
+}
+
+static void
+micro_dmul(union tgsi_double_channel *dst,
+           const union tgsi_double_channel *src)
+{
+   dst->d[0] = src[0].d[0]*src[1].d[0];
+   dst->d[1] = src[0].d[1]*src[1].d[1];
+   dst->d[2] = src[0].d[2]*src[1].d[2];
+   dst->d[3] = src[0].d[3]*src[1].d[3];
+}
+
+static void
+micro_dmax(union tgsi_double_channel *dst,
+           const union tgsi_double_channel *src)
+{
+   dst->d[0] = src[0].d[0] > src[1].d[0] ? src[0].d[0] : src[1].d[0];
+   dst->d[1] = src[0].d[1] > src[1].d[1] ? src[0].d[1] : src[1].d[1];
+   dst->d[2] = src[0].d[2] > src[1].d[2] ? src[0].d[2] : src[1].d[2];
+   dst->d[3] = src[0].d[3] > src[1].d[3] ? src[0].d[3] : src[1].d[3];
+}
+
+static void
+micro_dmin(union tgsi_double_channel *dst,
+           const union tgsi_double_channel *src)
+{
+   dst->d[0] = src[0].d[0] < src[1].d[0] ? src[0].d[0] : src[1].d[0];
+   dst->d[1] = src[0].d[1] < src[1].d[1] ? src[0].d[1] : src[1].d[1];
+   dst->d[2] = src[0].d[2] < src[1].d[2] ? src[0].d[2] : src[1].d[2];
+   dst->d[3] = src[0].d[3] < src[1].d[3] ? src[0].d[3] : src[1].d[3];
+}
+
+static void
+micro_dslt(union tgsi_double_channel *dst,
+           const union tgsi_double_channel *src)
+{
+   dst->d[0] = src[0].d[0] < src[1].d[0] ? 1.0F : 0.0F;
+   dst->d[1] = src[0].d[1] < src[1].d[1] ? 1.0F : 0.0F;
+   dst->d[2] = src[0].d[2] < src[1].d[2] ? 1.0F : 0.0F;
+   dst->d[3] = src[0].d[3] < src[1].d[3] ? 1.0F : 0.0F;
+}
+
+static void
+micro_dsge(union tgsi_double_channel *dst,
+           const union tgsi_double_channel *src)
+{
+   dst->d[0] = src[0].d[0] >= src[1].d[0] ? 1.0F : 0.0F;
+   dst->d[1] = src[0].d[1] >= src[1].d[1] ? 1.0F : 0.0F;
+   dst->d[2] = src[0].d[2] >= src[1].d[2] ? 1.0F : 0.0F;
+   dst->d[3] = src[0].d[3] >= src[1].d[3] ? 1.0F : 0.0F;
+}
+
+static void
+micro_dseq(union tgsi_double_channel *dst,
+           const union tgsi_double_channel *src)
+{
+   dst->d[0] = src[0].d[0] == src[1].d[0] ? 1.0F : 0.0F;
+   dst->d[1] = src[0].d[1] == src[1].d[1] ? 1.0F : 0.0F;
+   dst->d[2] = src[0].d[2] == src[1].d[2] ? 1.0F : 0.0F;
+   dst->d[3] = src[0].d[3] == src[1].d[3] ? 1.0F : 0.0F;
+}
+
+static void
+micro_drcp(union tgsi_double_channel *dst,
+           const union tgsi_double_channel *src)
+{
+   dst->d[0] = 1.0F/src->d[0];
+   dst->d[1] = 1.0F/src->d[1];
+   dst->d[2] = 1.0F/src->d[2];
+   dst->d[3] = 1.0F/src->d[3];
+}
+
+static void
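The micro ops above repeat the same expression once per channel. A minimal sketch of the same pattern written as a loop follows; struct double_channel is a hypothetical stand-in for union tgsi_double_channel (four doubles, one per pixel of the quad), and the sketch_ names are not from the patch.

```c
/* Hypothetical stand-in for union tgsi_double_channel: one double per pixel. */
struct double_channel {
    double d[4];
};

/* Per-channel maximum of src[0] and src[1], as in micro_dmax. */
static void sketch_dmax(struct double_channel *dst,
                        const struct double_channel src[2])
{
    for (int i = 0; i < 4; i++)
        dst->d[i] = src[0].d[i] > src[1].d[i] ? src[0].d[i] : src[1].d[i];
}

/* Per-channel "set on greater or equal": 1.0 when true, 0.0 otherwise. */
static void sketch_dsge(struct double_channel *dst,
                        const struct double_channel src[2])
{
    for (int i = 0; i < 4; i++)
        dst->d[i] = src[0].d[i] >= src[1].d[i] ? 1.0 : 0.0;
}
```

The comparison opcodes return their result as a double 1.0 or 0.0 rather than as an integer mask, which matches what the patch stores in dst.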
[Mesa3d-dev] [PATCH] Add double opcodes to TGSI Revision 2
These patches add support for double opcodes, as discussed on the mailing list. The opcodes created are: movd, ddiv, dadd, dseq, dmax, dmin, dmul, dmuladd, drcp and dslt. They are used as suggested by Zack:

    MOVD A.xy, C.xy, c.xy

where x is the lsb and y is the msb. Some opcodes are still missing (I will send the code soon): dfrac, dfracexp, dldexp and the conversions between float and double.

Revision 2 update: revision 2 removes the unused create_double function, renames the MULADD opcode to DMAD, and adds documentation for the new opcodes.

Michal: I am looking at the double opcode branch; I can move the opcode code over to exec_double_binary/unary.

Igor

From 83f895a235e76d8d556411fd0154650a2598acd0 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Tue, 12 Jan 2010 07:40:50 -0400
Subject: [PATCH 1/3] tgsi: add double opcodes

---
 src/gallium/include/pipe/p_shader_tokens.h |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index 550e2ab..789edaa 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -319,7 +319,18 @@ struct tgsi_property_data {
 #define TGSI_OPCODE_CASE                142
 #define TGSI_OPCODE_DEFAULT             143
 #define TGSI_OPCODE_ENDSWITCH           144
-#define TGSI_OPCODE_LAST                145
+
+#define TGSI_OPCODE_MOVD                145
+#define TGSI_OPCODE_DDIV                146
+#define TGSI_OPCODE_DADD                147
+#define TGSI_OPCODE_DSEQ                148
+#define TGSI_OPCODE_DMAX                149
+#define TGSI_OPCODE_DMIN                150
+#define TGSI_OPCODE_DMUL                151
+#define TGSI_OPCODE_DMAD                152
+#define TGSI_OPCODE_DRCP                153
+#define TGSI_OPCODE_DSLT                154
+#define TGSI_OPCODE_LAST                155
 
 #define TGSI_SAT_NONE            0  /* do not saturate */
 #define TGSI_SAT_ZERO_ONE        1  /* clamp to [0,1] */
-- 
1.6.3.3

From 91d50bdbd6f35af9a0e342c46c8ee5fbe0910421 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Tue, 12 Jan 2010 07:41:08 -0400
Subject: [PATCH 2/3] tgsi: implement double opcodes

---
 src/gallium/auxiliary/tgsi/tgsi_exec.c       |  230 +-
 src/gallium/auxiliary/tgsi/tgsi_info.c       |   10 +
 src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h |   11 +-
 3 files changed, 249 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index f43233b..4f2b29c 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -69,6 +69,15 @@
 #define TILE_BOTTOM_LEFT  2
 #define TILE_BOTTOM_RIGHT 3
 
+union tgsi_double {
+   struct int_double {
+      int lsb;
+      int msb;
+      double d;
+   } id;
+   double d;
+};
+
 static void
 micro_abs(union tgsi_exec_channel *dst,
           const union tgsi_exec_channel *src)
@@ -380,6 +389,184 @@ micro_trunc(union tgsi_exec_channel *dst,
    dst->f[3] = (float)(int)src->f[3];
 }
 
+static void
+micro_movd(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src)
+{
+   union tgsi_double dsrc, ddst;
+
+   dsrc.id.lsb = src->u[0];
+   dsrc.id.msb = src->u[1];
+
+   ddst.d = dsrc.d;
+
+   dst->u[0] = ddst.id.lsb;
+   dst->u[1] = ddst.id.msb;
+}
+
+static void
+micro_dadd(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src)
+{
+   union tgsi_double dsrc0, dsrc1, ddst;
+
+   dsrc0.id.lsb = src[0].u[0];
+   dsrc0.id.msb = src[0].u[1];
+
+   dsrc1.id.lsb = src[1].u[0];
+   dsrc1.id.msb = src[1].u[1];
+
+   ddst.d = dsrc0.d + dsrc1.d;
+
+   dst->u[0] = ddst.id.lsb;
+   dst->u[1] = ddst.id.msb;
+}
+
+static void
+micro_ddiv(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src)
+{
+   union tgsi_double dsrc0, dsrc1, ddst;
+
+   dsrc0.id.lsb = src[0].u[0];
+   dsrc0.id.msb = src[0].u[1];
+
+   dsrc1.id.lsb = src[1].u[0];
+   dsrc1.id.msb = src[1].u[1];
+
+   if (dsrc1.d != 0) {
+      ddst.d = dsrc0.d/dsrc1.d;
+      dst->u[0] = ddst.id.lsb;
+      dst->u[1] = ddst.id.msb;
+   }
+}
+
+static void
+micro_dseq(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src)
+{
+   union tgsi_double dsrc0, dsrc1, ddst;
+
+   dsrc0.id.lsb = src[0].u[0];
+   dsrc0.id.msb = src[0].u[1];
+
+   dsrc1.id.lsb = src[1].u[0];
+   dsrc1.id.msb = src[1].u[1];
+
+   ddst.d = dsrc0.d == dsrc1.d ? 1.0F : 0.0F;
+
+   dst->u[0] = ddst.id.lsb;
+   dst->u[1] = ddst.id.msb;
+}
+
+static void
+micro_dslt(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src)
+{
+   union tgsi_double dsrc0, dsrc1, ddst;
+
+   dsrc0.id.lsb = src[0].u[0];
+   dsrc0.id.msb = src[0].u[1];
+
+   dsrc1.id.lsb = src[1].u[0];
+   dsrc1.id.msb = src[1].u[1];
+
+   ddst.d = dsrc0.d < dsrc1.d ? 1.0F : 0.0F;
+
+   dst->u[0
[Mesa3d-dev] [PATCH] add double opcodes to tgsi
These patches add support for double opcodes, as discussed on the mailing list. The opcodes created are: movd, ddiv, dadd, dseq, dmax, dmin, dmul, dmuladd, drcp and dslt. They are used as suggested by Zack:

    MOVD A.xy, C.xy, c.xy

where x is the lsb and y is the msb. Some opcodes are still missing (I will send the code soon): dfrac, dfracexp, dldexp and the conversions between float and double.

Igor

From 4eebdbbd2822157f063a84b3dcb425ddbab84104 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Mon, 11 Jan 2010 09:31:27 -0400
Subject: [PATCH 1/2] tgsi: add double opcodes to gallium

---
 src/gallium/include/pipe/p_shader_tokens.h |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index 550e2ab..27125fc 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -319,7 +319,18 @@ struct tgsi_property_data {
 #define TGSI_OPCODE_CASE                142
 #define TGSI_OPCODE_DEFAULT             143
 #define TGSI_OPCODE_ENDSWITCH           144
-#define TGSI_OPCODE_LAST                145
+
+#define TGSI_OPCODE_MOVD                145
+#define TGSI_OPCODE_DDIV                146
+#define TGSI_OPCODE_DADD                147
+#define TGSI_OPCODE_DSEQ                148
+#define TGSI_OPCODE_DMAX                149
+#define TGSI_OPCODE_DMIN                150
+#define TGSI_OPCODE_DMUL                151
+#define TGSI_OPCODE_DMULADD             152
+#define TGSI_OPCODE_DRCP                153
+#define TGSI_OPCODE_DSLT                154
+#define TGSI_OPCODE_LAST                155
 
 #define TGSI_SAT_NONE            0  /* do not saturate */
 #define TGSI_SAT_ZERO_ONE        1  /* clamp to [0,1] */
-- 
1.6.3.3

From 63048b005ffcba83064069619d1bd19145d5d515 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Mon, 11 Jan 2010 09:31:57 -0400
Subject: [PATCH 2/2] tgsi: implement double opcodes

---
 src/gallium/auxiliary/tgsi/tgsi_exec.c       |  274 +-
 src/gallium/auxiliary/tgsi/tgsi_info.c       |   10 +
 src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h |   11 +-
 3 files changed, 293 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index f43233b..3c37931 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -69,6 +69,15 @@
 #define TILE_BOTTOM_LEFT  2
 #define TILE_BOTTOM_RIGHT 3
 
+union tgsi_double {
+   struct int_double {
+      int lsb;
+      int msb;
+      double d;
+   } id;
+   double d;
+};
+
 static void
 micro_abs(union tgsi_exec_channel *dst,
           const union tgsi_exec_channel *src)
@@ -380,6 +389,228 @@ micro_trunc(union tgsi_exec_channel *dst,
    dst->f[3] = (float)(int)src->f[3];
 }
 
+static double create_double(unsigned int lsb,
+                            unsigned int msb)
+{
+   long long int value;
+   long long int f;
+   int e,s;
+   double dst;
+
+   value = ((long long int)msb << 32) +
+           (long long int)lsb;
+
+   s = (int) ((value & 0x8000) >> 63);
+   e = (int) ((value & 0x7FE0) >> 52);
+   f = (value & 0x001F);
+
+
+   e = e?(e - 1023 - 51):(1022 - 52);
+   dst = ldexp((double)f, e);
+
+   return (s?-dst:dst);
+}
+
+static void
+micro_movd(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src)
+{
+   union tgsi_double dsrc, ddst;
+
+   dsrc.id.lsb = src->u[0];
+   dsrc.id.msb = src->u[1];
+
+   ddst.d = dsrc.d;
+
+   dst->u[0] = ddst.id.lsb;
+   dst->u[1] = ddst.id.msb;
+}
+
+static void
+micro_dadd(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src)
+{
+   union tgsi_double dsrc0, dsrc1, ddst;
+
+   dsrc0.id.lsb = src[0].u[0];
+   dsrc0.id.msb = src[0].u[1];
+
+   dsrc1.id.lsb = src[1].u[0];
+   dsrc1.id.msb = src[1].u[1];
+
+   dsrc0.d = create_double(dsrc0.id.lsb, dsrc0.id.msb);
+   dsrc1.d = create_double(dsrc1.id.lsb, dsrc1.id.msb);
+
+   ddst.d = dsrc0.d + dsrc1.d;
+
+   dst->u[0] = ddst.id.lsb;
+   dst->u[1] = ddst.id.msb;
+}
+
+static void
+micro_ddiv(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src)
+{
+   union tgsi_double dsrc0, dsrc1, ddst;
+
+   dsrc0.id.lsb = src[0].u[0];
+   dsrc0.id.msb = src[0].u[1];
+
+   dsrc1.id.lsb = src[1].u[0];
+   dsrc1.id.msb = src[1].u[1];
+
+   if (dsrc1.d != 0) {
+      ddst.d = dsrc0.d/dsrc1.d;
+      dst->u[0] = ddst.id.lsb;
+      dst->u[1] = ddst.id.msb;
+   }
+}
+
+static void
+micro_dseq(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src)
+{
+   union tgsi_double dsrc0, dsrc1, ddst;
+
+   dsrc0.id.lsb = src[0].u[0];
+   dsrc0.id.msb = src[0].u[1];
+
+   dsrc1.id.lsb = src[1].u[0];
+   dsrc1.id.msb = src[1].u[1];
+
+   dsrc0.d = create_double(dsrc0.id.lsb, dsrc0.id.msb);
+   dsrc1.d = create_double(dsrc1.id.lsb, dsrc1.id.msb
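For reference, a minimal sketch of what a create_double-style decoder has to do, written from the IEEE 754 binary64 layout (1 sign bit, 11 exponent bits, 52 fraction bits) rather than from the patch. The name decode_double and the exact masks are my assumptions, not the author's constants, and the sketch deliberately ignores infinities and NaNs.

```c
#include <math.h>
#include <stdint.h>

/*
 * Hypothetical helper: rebuild a double from its low and high 32-bit
 * halves using integer field extraction and ldexp().
 * Assumes IEEE 754 binary64; infinities and NaNs are not handled.
 */
static double decode_double(uint32_t lsb, uint32_t msb)
{
    uint64_t bits = ((uint64_t)msb << 32) | lsb;
    int sign = (int)(bits >> 63);                     /* bit 63 */
    int exponent = (int)((bits >> 52) & 0x7FF);       /* bits 62..52 */
    uint64_t fraction = bits & 0x000FFFFFFFFFFFFFULL; /* bits 51..0 */
    double magnitude;

    if (exponent == 0) {
        /* Zero or subnormal: no implicit leading 1, exponent fixed at -1022. */
        magnitude = ldexp((double)fraction, -1022 - 52);
    } else {
        /* Normal number: restore the implicit leading 1 at bit 52. */
        magnitude = ldexp((double)(fraction | (1ULL << 52)),
                          exponent - 1023 - 52);
    }
    return sign ? -magnitude : magnitude;
}
```

On a platform whose double already is binary64, copying the 64 bits with memcpy gives the same result with less work, which is consistent with revision 2 dropping create_double.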
Re: [Mesa3d-dev] [PATCH] add double opcodes to tgsi
Right, doing it.

On Mon, Jan 11, 2010 at 10:15 AM, Keith Whitwell kei...@vmware.com wrote:
> On Mon, 2010-01-11 at 05:37 -0800, Igor Oliveira wrote:
> > These patches add support to double opcodes as discussed in mail list.
> >
> > The opcodes create are: movd, ddiv, dadd, dseq, dmax, dmin, dmul,
> > dmuladd, drcp and dslt.
> > They are used like suggested by Zack:
> >
> > MOVD A.xy, C.xy, c.xy
> >
> > where x is the lsb and y is the msb.
> >
> > There are still missing some opcodes being implemented(i will send the
> > code soon), they are: dfrac, dfracexp, dldexp and convert between float
> > and double.
> >
> > Igor
>
> Igor,
>
> This looks good to me. I'll let others comment on the content, but in
> keeping with the new policy on gallium interface changes, please extend
> these patches with the documentation changes for the new opcodes in
> gallium/docs and resubmit.
>
> Keith

--
This SF.Net email is sponsored by the Verizon Developer Community
Take advantage of Verizon's best-in-class app development support
A streamlined, 14 day to market process makes app distribution fast and easy
Join now and get one step closer to millions of Verizon customers
http://p.sf.net/sfu/verizon-dev2dev
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
Re: [Mesa3d-dev] [PATCH] add double opcodes to tgsi
Ok

Igor

On Mon, Jan 11, 2010 at 10:37 AM, Keith Whitwell kei...@vmware.com wrote:
> On Mon, 2010-01-11 at 05:37 -0800, Igor Oliveira wrote:
> > +OP13(DMULADD)
>
> For consistency with the existing opcodes, would it be better to have
> DMAD here?
>
> Keith
Re: [Mesa3d-dev] [PATCH] add double opcodes to tgsi
Hello,

On Mon, Jan 11, 2010 at 1:54 PM, michal mic...@vmware.com wrote:
> Igor Oliveira wrote on 2010-01-11 14:37:
> > These patches add support to double opcodes as discussed in mail list.
> >
> > The opcodes create are: movd, ddiv, dadd, dseq, dmax, dmin, dmul,
> > dmuladd, drcp and dslt.
> > They are used like suggested by Zack:
> >
> > MOVD A.xy, C.xy, c.xy
> >
> > where x is the lsb and y is the msb.
> >
> > There are still missing some opcodes being implemented(i will send the
> > code soon), they are: dfrac, dfracexp, dldexp and convert between float
> > and double.
>
> Igor,
>
> There are some bits and pieces in your patch that I am not sure if they
> are correct. To understand that, let me first create a new feature
> branch (gallium-double-opcodes) and add a few basic opcodes (F2D, D2F,
> DMOV, DADD). Also, since there is no API state tracker that supports
> doubles, I will add a test to the python state tracker to see how well
> things are going.
>
> Once done, it will be a lot easier for us to read your patches that
> introduce new opcodes.
>
> What do you think?

I agree. In the meantime I will work on the fixes suggested by Keith.

Igor
Re: [Mesa3d-dev] [RFC] add support to double opcodes
Right, so in the d_add micro code we could have:

    union tgsi_double dsrc0, dsrc1, ddst;

    dsrc0.lsb = src0->f[0];
    dsrc0.msb = src0->f[1];
    dsrc1.lsb = src1->f[0];
    dsrc1.msb = src1->f[1];
    ddst.reg = dsrc0.reg + dsrc1.reg;
    dst->f[0] = ddst.lsb;
    dst->f[1] = ddst.msb;

and tgsi_double would be something like:

    union tgsi_double {
       struct {
          float lsb;
          float msb;
       };
       double reg;
    };

On Wed, Jan 6, 2010 at 7:56 PM, Zack Rusin za...@vmware.com wrote:
> On Wednesday 06 January 2010 14:56:35 Igor Oliveira wrote:
> > Hi,
> >
> > the patches add support to double opcodes in gallium/tgsi.
> > It just implement some opcodes i like to know if someone has
> > suggestion about the patches.
>
> Hi Igor, first of all this should probably go into a feature branch
> because it'll be a bit of work before it's usable.
>
> The patches that you've proposed are unlikely to be what we'll want for
> doubles. Keith, Michal and I discussed this on the phone a few days back
> and the biggest issue with doubles is that, unlike the switch between
> integers and floats, they actually need bigger registers to accommodate
> them. Given that the registers in TGSI are untyped and it's up to
> instructions to define the type, it becomes hard for drivers to figure
> out the size of the registers beforehand. The solution that I personally
> like, and what seems to be becoming the de facto standard when dealing
> with double support, is having double precision values represented by a
> pair of registers. Outputs are either to the pair yx or to the pair wz,
> where the msb is stored in y/w. For example:
>
> Idata 3.0 = (0x4008000000000000) in register r looks like:
>     r.w = 0x40080000 ;high dword
>     r.z = 0x00000000 ;low dword
> Or:
>     r.y = 0x40080000 ;high dword
>     r.x = 0x00000000 ;low dword
>
> All source double inputs must be in xy (after swizzle operations). For
> example:
>     d_add r1.xy, r2.xy, r2.xy
> Or
>     d_add r1.zw, r2.xy, r2.xy
> Each computes twice the value in r2.xy, and places the result in either
> xy or zw.
> This assures that the register size stays constant. Of course the
> instruction semantics are different to the typical 4-component wide TGSI
> instructions, but that, I think, is a lot less of an issue.
>
> z
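The register-pair layout described above can be sketched in a few lines of C. This is only an illustration, assuming an IEEE 754 binary64 double; struct dword_pair, split_double and join_double are hypothetical names, not TGSI or Mesa API.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: one double carried in two 32-bit channels. */
struct dword_pair {
    uint32_t lo; /* low dword, stored in x (or z) */
    uint32_t hi; /* high dword, stored in y (or w) */
};

/* Split a double into its low and high 32-bit halves. */
static struct dword_pair split_double(double value)
{
    uint64_t bits;
    struct dword_pair pair;

    memcpy(&bits, &value, sizeof bits); /* copy the raw 64 bits */
    pair.lo = (uint32_t)(bits & 0xFFFFFFFFu);
    pair.hi = (uint32_t)(bits >> 32);
    return pair;
}

/* Rebuild the double from the two halves. */
static double join_double(struct dword_pair pair)
{
    uint64_t bits = ((uint64_t)pair.hi << 32) | pair.lo;
    double value;

    memcpy(&value, &bits, sizeof value);
    return value;
}
```

For 3.0 this gives a high dword of 0x40080000 and a low dword of 0x00000000. Going through a 64-bit integer and memcpy keeps the result independent of host byte order, which a union of two 32-bit members and a double would not be.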
Re: [Mesa3d-dev] [RFC] add support to double opcodes
Hi,

so basically, instead of adding a double to exec_channel, it would be something like this?

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index f43233b..3c90677 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -380,6 +380,33 @@ micro_trunc(union tgsi_exec_channel *dst,
    dst->f[3] = (float)(int)src->f[3];
 }
 
+static void
+micro_movd(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src)
+{
+   union tgsi_double dsrc, ddst;
+
+   TGSI_CREATE_DOUBLE(dsrc, src);
+
+   ddst = dsrc; /* we do not need it */
+   TGSI_DOUBLE_TO_EXECCHANELL(dst, ddst);
+}
+
+static void
+micro_ddiv(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src0,
+           const union tgsi_exec_channel *src1)
+{
+   union tgsi_double dsrc0, dsrc1, ddst;
+
+   TGSI_CREATE_DOUBLE(dsrc0, src0);
+   TGSI_CREATE_DOUBLE(dsrc1, src1);
+
+   if (dsrc1.reg != 0) {
+      ddst.reg = dsrc0.reg/dsrc1.reg;
+      TGSI_DOUBLE_TO_EXECCHANELL(dst, ddst);
+   }
+}
 
 #define CHAN_X 0
 #define CHAN_Y 1
@@ -3491,6 +3518,14 @@ exec_instruction(
       exec_vector_binary(mach, inst, micro_usne, TGSI_EXEC_DATA_UINT, TGSI_EXEC_DATA_UINT);
       break;
 
+   case TGSI_OPCODE_MOVD:
+      exec_vector_unary(mach, inst, micro_movd, TGSI_EXEC_DATA_DOUBLE, TGSI_EXEC_DATA_DOUBLE);
+      break;
+
+   case TGSI_OPCODE_DDIV:
+      exec_vector_binary(mach, inst, micro_ddiv, TGSI_EXEC_DATA_DOUBLE, TGSI_EXEC_DATA_DOUBLE);
+      break;
+
    case TGSI_OPCODE_SWITCH:
       exec_switch(mach, inst);
       break;
@@ -3503,7 +3538,8 @@ exec_instruction(
       exec_default(mach);
       break;
 
-   case TGSI_OPCODE_ENDSWITCH:
+
+   case TGSI_OPCODE_ENDSWITCH:
       exec_endswitch(mach);
       break;
 
diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.h b/src/gallium/auxiliary/tgsi/tgsi_exec.h
index aa3a98d..7da44d9 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.h
@@ -50,6 +50,20 @@ union tgsi_exec_channel
    unsigned u[QUAD_SIZE];
 };
 
+union tgsi_double
+{
+   struct {
+      float lsb;
+      float msb;
+   };
+   double reg;
+};
+
+#define TGSI_CREATE_DOUBLE(dsrc, src) dsrc.lsb = src->f[0], \
+                                      dsrc.msb = src->f[1]
+
+#define TGSI_DOUBLE_TO_EXECCHANELL(channel, dvalue) channel->f[0] = dvalue.lsb, \
+                                                    channel->f[1] = dvalue.msb
 /**
  * A vector[RGBA] of channels[4 pixels]
  */
diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c b/src/gallium/auxiliary/tgsi/tgsi_info.c
index de0e09c..acbd7dc 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -171,6 +171,8 @@ static const struct tgsi_opcode_info opcode_info[TGSI_OPCODE_LAST] =
    { 1, 2, 0, 0, 0, 0, "USGE", TGSI_OPCODE_USGE },
    { 1, 2, 0, 0, 0, 0, "USHR", TGSI_OPCODE_USHR },
    { 1, 2, 0, 0, 0, 0, "USLT", TGSI_OPCODE_USLT },
+   { 1, 1, 0, 0, 0, 0, "MOVD", TGSI_OPCODE_MOVD },
+   { 1, 2, 0, 0, 0, 0, "DDIV", TGSI_OPCODE_DDIV },
    { 1, 2, 0, 0, 0, 0, "USNE", TGSI_OPCODE_USNE },
    { 0, 1, 0, 0, 0, 0, "SWITCH", TGSI_OPCODE_SWITCH },
    { 0, 1, 0, 0, 0, 0, "CASE", TGSI_OPCODE_CASE },
diff --git a/src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h b/src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h
index e4af15c..f441636 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h
@@ -167,6 +167,8 @@ OP12(USGE)
 OP12(USHR)
 OP12(USLT)
 OP12(USNE)
+OP11(MOVD)
+OP12(DDIV)
 
 
 #undef OP00
diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index 550e2ab..939d02f 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -319,7 +319,10 @@ struct tgsi_property_data {
 #define TGSI_OPCODE_CASE                142
 #define TGSI_OPCODE_DEFAULT             143
 #define TGSI_OPCODE_ENDSWITCH           144
-#define TGSI_OPCODE_LAST                145
+
+#define TGSI_OPCODE_MOVD                145
+#define TGSI_OPCODE_DDIV                146
+#define TGSI_OPCODE_LAST                147
 
 #define TGSI_SAT_NONE            0  /* do not saturate */
 #define TGSI_SAT_ZERO_ONE        1  /* clamp to [0,1] */

On Wed, Jan 6, 2010 at 7:56 PM, Zack Rusin za...@vmware.com wrote:
> On Wednesday 06 January 2010 14:56:35 Igor Oliveira wrote:
> > Hi,
> >
> > the patches add support to double opcodes in gallium/tgsi.
> > It just implement some opcodes i like to know if someone has
> > suggestion about the patches.
>
> Hi Igor, first of all this should probably go into a feature branch
> because it'll be a bit of work before it's usable.
>
> The patches that you've proposed are unlikely to be what we'll want for
> doubles. Keith, Michal and I discussed this on the phone a few days back
> and the biggest issue with doubles is that, unlike the switch between
> integers and floats, they actually need bigger registers to accommodate
> them. Given that the registers in TGSI
Re: [Mesa3d-dev] [RFC] add support to double opcodes
Hi,

We could use the same idea to create int64 opcodes. By the way, will a branch be created for the gallium double opcodes?

Igor

On Thu, Jan 7, 2010 at 11:23 AM, Zack Rusin za...@vmware.com wrote:
> On Thursday 07 January 2010 09:11:11 michal wrote:
> > Zack,
> >
> > 1. Do I understand correctly that while
> >
> >     D_ADD dst.xy, src1.xy, src2.zw
> >
> > will add one double, is the following code
> >
> >     D_ADD dst, src1, src2.zwxy
> >
> > also valid, and results in two doubles being added together?
>
> Good question. I guess that would be up to us to define. The DX/AMD CAL
> don't allow that because they define inputs as being in the xy component
> only, so all the double instructions operate on exclusively one double.
> We could allow it but simply not use it right away from the state
> tracker side.
>
> > 2. Is the list of double-precision opcodes proposed by Igor roughly
> > enough for OpenCL implementation?
>
> Another good question. It will largely depend on how our implementation
> of math functions for CL 1.1 will look. CL 1.1 defines double support
> for such math functions as acos, acosh, acospi, cos, cosh, cospi (same
> with sin and tan), ceil, copysign, exp, exp2, exp10, fabs, fdim, floor,
> fmax, fmin, fmod, fract, frexp, hypot(x, y) [computes the value of the
> square root of x^2+y^2], log, log2, log10, mad, mod, pow, pown,
> remainder, rint, round, rsqrt, sqrt, trunc (and various permutations of
> those and some that are obviously implementable with the above), so it
> all boils down to how we'll implement those functions.
> I think that a minimal set that could be enough would be: dadd, ddiv,
> deq, dlt, dfrac, dfracexp, dldexp, dmax, dmin, dmov, dmul, dmuladd, drcp
> and dsqrt, plus conversion instructions that convert between float and
> double and back. (This is assuming table or some other fixed
> implementation of trigonometric functions and in general assumes that we
> trade performance for simplicity at least for the time being.)
>
> z
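To make michal's first question concrete, here is a minimal sketch of a "wide" D_ADD that treats a four-channel register as two doubles (xy and zw) and honours a source swizzle such as .zwxy. Whether TGSI should allow this is exactly what the thread leaves open; struct reg4 and every function name here are hypothetical, and an IEEE 754 binary64 double is assumed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one register: channels x, y, z, w. */
struct reg4 {
    uint32_t c[4];
};

/* Read the double stored in channels lo_index (low) and lo_index + 1 (high). */
static double reg_get(struct reg4 r, int lo_index)
{
    uint64_t bits = ((uint64_t)r.c[lo_index + 1] << 32) | r.c[lo_index];
    double value;

    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Store a double into channels lo_index (low) and lo_index + 1 (high). */
static void reg_set(struct reg4 *r, int lo_index, double value)
{
    uint64_t bits;

    memcpy(&bits, &value, sizeof bits);
    r->c[lo_index] = (uint32_t)bits;
    r->c[lo_index + 1] = (uint32_t)(bits >> 32);
}

/* Apply a swizzle: sel = {2, 3, 0, 1} corresponds to .zwxy. */
static struct reg4 reg_swizzle(struct reg4 src, const int sel[4])
{
    struct reg4 out;

    for (int i = 0; i < 4; i++)
        out.c[i] = src.c[sel[i]];
    return out;
}

/* Hypothetical wide D_ADD: adds the xy pair and the zw pair independently. */
static struct reg4 sketch_dadd_wide(struct reg4 a, struct reg4 b)
{
    struct reg4 out;

    reg_set(&out, 0, reg_get(a, 0) + reg_get(b, 0)); /* xy */
    reg_set(&out, 2, reg_get(a, 2) + reg_get(b, 2)); /* zw */
    return out;
}
```

With src1 holding (1.0, 2.0) and src2 holding (10.0, 20.0), swizzling src2 with .zwxy and adding yields 21.0 in xy and 12.0 in zw. Note that a swizzle which splits a pair, for example .xzyw, would mix halves of different doubles, which may be one reason to restrict inputs to xy as DX/AMD CAL do.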
[Mesa3d-dev] [RFC] add support to double opcodes
Hi,

these patches add support for double opcodes in gallium/tgsi. Only some of the opcodes are implemented so far; I would like to know if anyone has suggestions about the patches.

Igor

From e856e2aa3b801fc6f386a88df646a396c27d8ee8 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Wed, 6 Jan 2010 15:50:00 -0400
Subject: [PATCH 1/2] gallium: Add double opcodes

---
 src/gallium/include/pipe/p_shader_tokens.h |   10 +-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
index 550e2ab..efbbafe 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -319,7 +319,15 @@ struct tgsi_property_data {
 #define TGSI_OPCODE_CASE                142
 #define TGSI_OPCODE_DEFAULT             143
 #define TGSI_OPCODE_ENDSWITCH           144
-#define TGSI_OPCODE_LAST                145
+
+#define TGSI_OPCODE_MOVD                145
+#define TGSI_OPCODE_DDIV                146
+#define TGSI_OPCODE_DMAX                147
+#define TGSI_OPCODE_DMIN                148
+#define TGSI_OPCODE_DNEG                149
+#define TGSI_OPCODE_DSGE                150
+#define TGSI_OPCODE_DSLT                151
+#define TGSI_OPCODE_LAST                152
 
 #define TGSI_SAT_NONE            0  /* do not saturate */
 #define TGSI_SAT_ZERO_ONE        1  /* clamp to [0,1] */
-- 
1.6.3.3

From 8be48c41cfc988279832d60f02c2cf496590e162 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Wed, 6 Jan 2010 15:51:09 -0400
Subject: [PATCH 2/2] tgsi: implement double opcodes

---
 src/gallium/auxiliary/tgsi/tgsi_exec.c       |  113 +-
 src/gallium/auxiliary/tgsi/tgsi_exec.h       |    1 +
 src/gallium/auxiliary/tgsi/tgsi_info.c       |    7 ++
 src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h |    7 ++
 4 files changed, 126 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index f43233b..ddcc829 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -380,6 +380,85 @@ micro_trunc(union tgsi_exec_channel *dst,
    dst->f[3] = (float)(int)src->f[3];
 }
 
+static void
+micro_movd(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src)
+{
+   dst->d[0] = src->d[0];
+   dst->d[1] = src->d[1];
+   dst->d[2] = src->d[2];
+   dst->d[3] = src->d[3];
+}
+
+static void
+micro_ddiv(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src0,
+           const union tgsi_exec_channel *src1)
+{
+   if (src1->d[0] != 0)
+      dst->d[0] = src0->d[0]/src1->d[0];
+   if (src1->d[1] != 0)
+      dst->d[1] = src0->d[1]/src1->d[1];
+   if (src1->d[2] != 0)
+      dst->d[2] = src0->d[2]/src1->d[2];
+   if (src1->d[3] != 0)
+      dst->d[3] = src0->d[3]/src1->d[3];
+}
+
+static void
+micro_dmax(
+   union tgsi_exec_channel *dst,
+   const union tgsi_exec_channel *src0,
+   const union tgsi_exec_channel *src1)
+{
+   dst->d[0] = src0->d[0] > src1->d[0] ? src0->d[0] : src1->d[0];
+   dst->d[1] = src0->d[1] > src1->d[1] ? src0->d[1] : src1->d[1];
+   dst->d[2] = src0->d[2] > src1->d[2] ? src0->d[2] : src1->d[2];
+   dst->d[3] = src0->d[3] > src1->d[3] ? src0->d[3] : src1->d[3];
+}
+
+static void
+micro_dmin(
+   union tgsi_exec_channel *dst,
+   const union tgsi_exec_channel *src0,
+   const union tgsi_exec_channel *src1)
+{
+   dst->d[0] = src0->d[0] < src1->d[0] ? src0->d[0] : src1->d[0];
+   dst->d[1] = src0->d[1] < src1->d[1] ? src0->d[1] : src1->d[1];
+   dst->d[2] = src0->d[2] < src1->d[2] ? src0->d[2] : src1->d[2];
+   dst->d[3] = src0->d[3] < src1->d[3] ? src0->d[3] : src1->d[3];
+}
+
+static void
+micro_dneg(
+   union tgsi_exec_channel *dst,
+   const union tgsi_exec_channel *src )
+{
+   dst->d[0] = -src->d[0];
+   dst->d[1] = -src->d[1];
+   dst->d[2] = -src->d[2];
+   dst->d[3] = -src->d[3];
+}
+
+static void
+micro_dsge(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src)
+{
+   dst->d[0] = src[0].d[0] >= src[1].d[0] ? 1.0f : 0.0f;
+   dst->d[1] = src[0].d[1] >= src[1].d[1] ? 1.0f : 0.0f;
+   dst->d[2] = src[0].d[2] >= src[1].d[2] ? 1.0f : 0.0f;
+   dst->d[3] = src[0].d[3] >= src[1].d[3] ? 1.0f : 0.0f;
+}
+
+static void
+micro_dslt(union tgsi_exec_channel *dst,
+           const union tgsi_exec_channel *src)
+{
+   dst->d[0] = src[0].d[0] < src[1].d[0] ? 1.0f : 0.0f;
+   dst->d[1] = src[0].d[1] < src[1].d[1] ? 1.0f : 0.0f;
+   dst->d[2] = src[0].d[2] < src[1].d[2] ? 1.0f : 0.0f;
+   dst->d[3] = src[0].d[3] < src[1].d[3] ? 1.0f : 0.0f;
+}
 
 #define CHAN_X 0
 #define CHAN_Y 1
@@ -389,7 +468,8 @@ micro_trunc(union tgsi_exec_channel *dst,
 enum tgsi_exec_datatype {
    TGSI_EXEC_DATA_FLOAT,
    TGSI_EXEC_DATA_INT,
-   TGSI_EXEC_DATA_UINT
+   TGSI_EXEC_DATA_UINT,
+   TGSI_EXEC_DATA_DOUBLE
 };
 
 /*
@@ -3491,6 +3571,34 @@ exec_instruction(
       exec_vector_binary(mach, inst, micro_usne, TGSI_EXEC_DATA_UINT, TGSI_EXEC_DATA_UINT);
       break;
 
+   case TGSI_OPCODE_MOVD
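
For readers following the micro-ops in the patch above, here is a minimal standalone sketch of the same per-channel pattern, written with loops instead of unrolled statements. It is not the code from the patch: `union exec_channel` is a hypothetical stand-in for `union tgsi_exec_channel` (assuming it gains a `double d[4]` member), and the `sketch_` function names are invented for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for union tgsi_exec_channel; the real union is
 * declared in tgsi_exec.h and is assumed here to carry a d[4] member. */
union exec_channel {
   float  f[4];
   double d[4];
};

/* Per-channel maximum of two sources. */
static void
sketch_dmax(union exec_channel *dst,
            const union exec_channel *src0,
            const union exec_channel *src1)
{
   for (size_t i = 0; i < 4; i++)
      dst->d[i] = src0->d[i] > src1->d[i] ? src0->d[i] : src1->d[i];
}

/* Per-channel minimum of two sources. */
static void
sketch_dmin(union exec_channel *dst,
            const union exec_channel *src0,
            const union exec_channel *src1)
{
   for (size_t i = 0; i < 4; i++)
      dst->d[i] = src0->d[i] < src1->d[i] ? src0->d[i] : src1->d[i];
}

/* Set-on-greater-or-equal: 1.0 where src0 >= src1, otherwise 0.0. */
static void
sketch_dsge(union exec_channel *dst,
            const union exec_channel *src0,
            const union exec_channel *src1)
{
   for (size_t i = 0; i < 4; i++)
      dst->d[i] = src0->d[i] >= src1->d[i] ? 1.0 : 0.0;
}

/* Set-on-less-than: 1.0 where src0 < src1, otherwise 0.0. */
static void
sketch_dslt(union exec_channel *dst,
            const union exec_channel *src0,
            const union exec_channel *src1)
{
   for (size_t i = 0; i < 4; i++)
      dst->d[i] = src0->d[i] < src1->d[i] ? 1.0 : 0.0;
}
```

The sketch passes the two sources as separate pointers for the comparison ops, whereas the patch indexes a single `src` pointer as an array; which calling convention fits `exec_instruction` best is left open here.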
[Mesa3d-dev] [PATH]OpenCL: fix segfault in context and make tests work
Hi,

The first patch changes cl_uint to cl_device_type in the Device class, which fixes some test errors. The second one fixes a segfault in context creation and implements some error reporting.

Igor

From 6ea02fdfe3e69bafcfa04e693dfd2469b3b76386 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Tue, 5 Jan 2010 11:19:13 -0400
Subject: [PATCH 1/2] fix Device::type type, change cl_uint by cl_device_type and fix return information

---
 src/core/device.cpp |   21 +
 src/core/device.h   |   10 +-
 2 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/src/core/device.cpp b/src/core/device.cpp
index 20e6f2b..6c7e7d5 100644
--- a/src/core/device.cpp
+++ b/src/core/device.cpp
@@ -11,7 +11,7 @@
 #include "softpipe/sp_winsys.h"
 
 
-Device * Device::create(cl_uint type)
+Device * Device::create(cl_device_type type)
 {
    switch(type) {
    case CL_DEVICE_TYPE_CPU: {
@@ -54,15 +54,16 @@ cl_int Device::info(cl_device_info opcode,
                     void * paramValue,
                     size_t * paramValueSizeRet) const
 {
+   size_t sizeRet = 0;
+
    if (!paramValue)
       return CL_SUCCESS;
 
    switch (opcode) {
    case CL_DEVICE_TYPE:
-      if (paramValueSizeRet)
-         *paramValueSizeRet = sizeof(type());
-      ((cl_int*)paramValue)[0] = type();
+      sizeRet = sizeof(type());
+      ((cl_device_type*)paramValue)[0] = type();
       break;
    case CL_DEVICE_VENDOR_ID:
       break;
@@ -168,19 +169,23 @@ cl_int Device::info(cl_device_info opcode,
       break;
    }
 
-   if (paramValueSizeRet && paramValueSize != *paramValueSizeRet)
-      return CL_INVALID_VALUE;
+   if (paramValueSizeRet)
+      *paramValueSizeRet = sizeRet;
+
+   if (paramValueSize != sizeRet) {
+      return CL_INVALID_VALUE;
+   }
 
    return CL_SUCCESS;
 }
 
-Device::Device(cl_uint type, struct pipe_screen *screen)
+Device::Device(cl_device_type type, struct pipe_screen *screen)
    : m_screen(screen)
 {
    fillInfo(type);
 }
 
-void Device::fillInfo(cl_uint type)
+void Device::fillInfo(cl_device_type type)
 {
    m_info.type = type;
    m_info.vendorId = 0;//should be a PCIe ID
diff --git a/src/core/device.h b/src/core/device.h
index 5a3d43f..5ba0a43 100644
--- a/src/core/device.h
+++ b/src/core/device.h
@@ -11,9 +11,9 @@ struct pipe_screen;
 class Device
 {
 public:
-   static Device *create(cl_uint type);
+   static Device *create(cl_device_type type);
 public:
-   inline cl_uint type() const;
+   inline cl_device_type type() const;
    inline struct pipe_screen *screen() const;
 
    cl_int info(cl_device_info opcode,
@@ -22,8 +22,8 @@ public:
                size_t *paramValueSizeRet) const;
 
 private:
-   Device(cl_uint type, struct pipe_screen *screen);
-   void fillInfo(cl_uint type);
+   Device(cl_device_type type, struct pipe_screen *screen);
+   void fillInfo(cl_device_type type);
 
 private:
    DeviceInfo m_info;
@@ -31,7 +31,7 @@ private:
    struct pipe_screen *m_screen;
 };
 
-inline cl_uint Device::type() const
+inline cl_device_type Device::type() const
 {
    return m_info.type;
 }
-- 
1.6.3.3

From cdb70b3ee991304db3888a710d1c43817dfb4841 Mon Sep 17 00:00:00 2001
From: Igor Oliveira igor.olive...@openbossa.org
Date: Tue, 5 Jan 2010 11:23:25 -0400
Subject: [PATCH 2/2] context: fix segfault and returns

---
 src/api/api_context.cpp |   17 ++---
 1 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/src/api/api_context.cpp b/src/api/api_context.cpp
index 8393dcb..1703fde 100644
--- a/src/api/api_context.cpp
+++ b/src/api/api_context.cpp
@@ -16,15 +16,26 @@ clCreateContext(cl_context_properties properties,
 {
     cl_context ret_context = NULL;
     cl_device_type type;
-    cl_device_id device = devices[0];
+    cl_device_id device = devices?devices[0]:NULL;
     cl_int device_info;
 
+    if (num_devices <= 0) {
+        if (errcode_ret)
+            *errcode_ret = CL_INVALID_VALUE;
+        goto fail;
+    }
+
     device_info = clGetDeviceInfo(device, CL_DEVICE_TYPE, sizeof(type), &type, NULL);
-    if (device_info != CL_INVALID_DEVICE) {
+    if (device_info == CL_SUCCESS) {
         ret_context = clCreateContextFromType(properties, type,
                                               pfn_notify, user_data, errcode_ret);
+    } else {
+        if (device_info == CL_INVALID_DEVICE) {
+            if (errcode_ret)
+                *errcode_ret = CL_INVALID_VALUE;
+        }
     }
-
+fail:
     return ret_context;
 }
-- 
1.6.3.3

--
This SF.Net email is sponsored by the Verizon Developer Community
Take advantage of Verizon's best-in-class app development support
A streamlined, 14 day to market process makes app distribution fast and easy
Join now and get one step closer to millions of Verizon customers
http://p.sf.net/sfu/verizon-dev2dev
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists
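
The size bookkeeping that the first patch touches in Device::info can be illustrated with a small standalone helper. This is only a sketch under stated assumptions: `sketch_return_param` and the `SKETCH_` constants are hypothetical names, and the size check follows the OpenCL specification's wording for clGetDeviceInfo (an error when the caller's buffer is smaller than the value), which is looser than the exact-match `!=` test used in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes; -30 mirrors the value of CL_INVALID_VALUE. */
enum { SKETCH_SUCCESS = 0, SKETCH_INVALID_VALUE = -30 };

/* Copy a queried value into the caller's buffer and report its size.
 *
 * - param_value may be NULL: the caller only wants the size.
 * - param_value_size_ret may be NULL: the caller only wants the value.
 * - If a buffer is given, it must be at least value_size bytes long.
 */
static int
sketch_return_param(const void *value, size_t value_size,
                    size_t param_value_size, void *param_value,
                    size_t *param_value_size_ret)
{
   if (param_value) {
      if (param_value_size < value_size)
         return SKETCH_INVALID_VALUE;
      memcpy(param_value, value, value_size);
   }

   /* The size is reported whether or not a buffer was supplied. */
   if (param_value_size_ret)
      *param_value_size_ret = value_size;

   return SKETCH_SUCCESS;
}
```

Routing every case of the switch through one helper like this keeps the NULL handling and the size check in a single place, which is the same direction the patch takes with its `sizeRet` variable.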
Re: [Mesa3d-dev] [PATCH] [OpenCL] fix device bugs found by unit tests
Hi,

On Mon, Jan 4, 2010 at 10:27 AM, Zack Rusin za...@vmware.com wrote:
> On Wednesday 30 December 2009 09:07:48 Igor Oliveira wrote:
>> This patch fixes some bugs found by the unit tests: when a wrong device
>> type was passed, all the devices (GPU, CPU and accelerator) were being
>> created; paramValue is now ignored if it is NULL; and CL_INVALID_VALUE
>> is returned if paramValueSize != paramValueSizeRet.
>
> Hey Igor,
>
> thanks for the patches, I just pushed them.

Cool!

> For the future, could you maybe send your patches using git format-patch?
> Otherwise I have to recreate commit messages from your emails while
> remembering to commit with --author to preserve ownership. Thanks!

Right, I will do that.

> z

Igor
[Mesa3d-dev] [PATCH] fix missing semantic name in tgsi_text.c
Hi,

I found a TGSI bug while running the vega state tracker. The bug happens because of the loop in tgsi_text.c, line 991:

   for (i = 0; i < TGSI_SEMANTIC_COUNT; i++)

TGSI_SEMANTIC_COUNT is bigger than the number of entries in semantic_names, declared in tgsi_text.c starting at line 936:

static const char *semantic_names[TGSI_SEMANTIC_COUNT] =
{
   "POSITION",
   "COLOR",
   "BCOLOR",
   "FOG",
   "PSIZE",
   "GENERIC",
   "NORMAL",
   "FACE",
   "PRIM_ID"
};

TGSI_SEMANTIC_COUNT is 10, but the array has only 9 initializers. Looking at other files, I see that one semantic name is missing: EDGEFLAG. The patch below adds EDGEFLAG to semantic_names.

diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c b/src/gallium/auxiliary/tgsi/tgsi_text.c
index 2e3f9a9..9fcffed 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_text.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
@@ -932,6 +932,7 @@ static const char *semantic_names[TGSI_SEMANTIC_COUNT] =
    "GENERIC",
    "NORMAL",
    "FACE",
+   "EDGEFLAG",
    "PRIM_ID"
 };
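
One way to keep this class of bug from coming back is a compile-time check that the name table and the enum stay the same length. The sketch below assumes a C11 compiler (for `_Static_assert`) and uses hypothetical `SKETCH_SEM_` and `sketch_` names rather than the real TGSI identifiers; the table is declared without an explicit size so that `sizeof` counts the actual initializers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the TGSI semantic enum, in the patched order. */
enum sketch_semantic {
   SKETCH_SEM_POSITION,
   SKETCH_SEM_COLOR,
   SKETCH_SEM_BCOLOR,
   SKETCH_SEM_FOG,
   SKETCH_SEM_PSIZE,
   SKETCH_SEM_GENERIC,
   SKETCH_SEM_NORMAL,
   SKETCH_SEM_FACE,
   SKETCH_SEM_EDGEFLAG,
   SKETCH_SEM_PRIM_ID,
   SKETCH_SEM_COUNT
};

/* No explicit array size: a missing entry changes sizeof() instead of
 * silently leaving a NULL slot at the end of the table. */
static const char *sketch_semantic_names[] = {
   "POSITION",
   "COLOR",
   "BCOLOR",
   "FOG",
   "PSIZE",
   "GENERIC",
   "NORMAL",
   "FACE",
   "EDGEFLAG",
   "PRIM_ID"
};

/* The build fails if the enum and the table drift apart. */
_Static_assert(sizeof(sketch_semantic_names) / sizeof(sketch_semantic_names[0])
                  == SKETCH_SEM_COUNT,
               "semantic name table out of sync with enum");

/* Return the semantic index for a name, or -1 if it is unknown. */
static int
sketch_lookup_semantic(const char *name)
{
   for (int i = 0; i < SKETCH_SEM_COUNT; i++) {
      if (strcmp(sketch_semantic_names[i], name) == 0)
         return i;
   }
   return -1;
}
```

With the original declaration (`[TGSI_SEMANTIC_COUNT]` and one initializer short), the last slot is a NULL pointer, so any loop that compares strings across the whole table dereferences NULL.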
[Mesa3d-dev] [PATCH] [OpenCL] fix device bugs found by unit tests
This patch fixes some bugs found by the unit tests: when a wrong device type was passed, all the devices (GPU, CPU and accelerator) were being created; paramValue is now ignored if it is NULL; and CL_INVALID_VALUE is returned if paramValueSize != paramValueSizeRet.

PS: the patch is also attached.

diff --git a/src/api/api_device.cpp b/src/api/api_device.cpp
index f91ad29..4d6bf19 100644
--- a/src/api/api_device.cpp
+++ b/src/api/api_device.cpp
@@ -73,13 +73,13 @@ clGetDeviceIDs(cl_device_type device_type,
 
     gpu = (device_type & CL_DEVICE_TYPE_DEFAULT) ||
           (device_type & CL_DEVICE_TYPE_GPU) ||
-          (device_type & CL_DEVICE_TYPE_ALL);
+          !(device_type ^ CL_DEVICE_TYPE_ALL);
 
     cpu = (device_type & CL_DEVICE_TYPE_CPU) ||
-          (device_type & CL_DEVICE_TYPE_ALL);
+          !(device_type ^ CL_DEVICE_TYPE_ALL);
 
     accelerator = (device_type & CL_DEVICE_TYPE_ACCELERATOR) ||
-                  (device_type & CL_DEVICE_TYPE_ALL);
+                  !(device_type ^ CL_DEVICE_TYPE_ALL);
 
     if (!gpu && !cpu && !accelerator)
         return CL_INVALID_DEVICE_TYPE;
diff --git a/src/core/device.cpp b/src/core/device.cpp
index c300f79..20e6f2b 100644
--- a/src/core/device.cpp
+++ b/src/core/device.cpp
@@ -39,9 +39,12 @@ Device * Device::create(cl_uint type)
 
 static void stringToParam(const std::string str,
                           void * paramValue,
+                          size_t paramValueSize,
                           size_t * paramValueSizeRet)
 {
-   strcpy((char*)paramValue, str.c_str());
+   char *paramCharValue = (char *)paramValue;
+   paramCharValue[paramValueSize - 1] = 0;
+   strncpy(paramCharValue, str.c_str(), paramValueSize - 1);
    if (paramValueSizeRet)
       *paramValueSizeRet = str.size();
 }
@@ -51,8 +54,14 @@ cl_int Device::info(cl_device_info opcode,
                     void * paramValue,
                     size_t * paramValueSizeRet) const
 {
+   if (!paramValue)
+      return CL_SUCCESS;
+
    switch (opcode) {
    case CL_DEVICE_TYPE:
+      if (paramValueSizeRet)
+         *paramValueSizeRet = sizeof(type());
+      ((cl_int*)paramValue)[0] = type();
       break;
    case CL_DEVICE_VENDOR_ID:
@@ -140,10 +149,10 @@ cl_int Device::info(cl_device_info opcode,
    case CL_DEVICE_QUEUE_PROPERTIES:
       break;
    case CL_DEVICE_NAME:
-      stringToParam(m_info.name, paramValue, paramValueSizeRet);
+      stringToParam(m_info.name, paramValue, paramValueSize, paramValueSizeRet);
       break;
    case CL_DEVICE_VENDOR:
-      stringToParam(m_info.name, paramValue, paramValueSizeRet);
+      stringToParam(m_info.name, paramValue, paramValueSize, paramValueSizeRet);
       break;
    case CL_DRIVER_VERSION:
       break;
@@ -159,6 +168,9 @@ cl_int Device::info(cl_device_info opcode,
       break;
    }
 
+   if (paramValueSizeRet && paramValueSize != *paramValueSizeRet)
+      return CL_INVALID_VALUE;
+
    return CL_SUCCESS;
 }
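
The change from `device_type & CL_DEVICE_TYPE_ALL` to `!(device_type ^ CL_DEVICE_TYPE_ALL)` matters because CL_DEVICE_TYPE_ALL has every low bit set, so the bitwise AND is non-zero for any valid device type. The sketch below shows the difference in isolation. The `SK_` constants repeat the values from the OpenCL 1.0 headers under hypothetical names, `uint64_t` stands in for `cl_device_type`, and the two helper functions are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as the CL_DEVICE_TYPE_* macros in the OpenCL 1.0 headers,
 * redefined under hypothetical names so the sketch needs no CL headers. */
#define SK_DEVICE_TYPE_DEFAULT     (1u << 0)
#define SK_DEVICE_TYPE_CPU         (1u << 1)
#define SK_DEVICE_TYPE_GPU         (1u << 2)
#define SK_DEVICE_TYPE_ACCELERATOR (1u << 3)
#define SK_DEVICE_TYPE_ALL         0xFFFFFFFFu

/* Old test: "& ALL" is true for every non-zero request, so asking for
 * the GPU alone also selects the CPU. */
static bool
sketch_wants_cpu_old(uint64_t device_type)
{
   return (device_type & SK_DEVICE_TYPE_CPU) ||
          (device_type & SK_DEVICE_TYPE_ALL);
}

/* New test: "^ ALL" is zero only when the request is exactly ALL. */
static bool
sketch_wants_cpu_new(uint64_t device_type)
{
   return (device_type & SK_DEVICE_TYPE_CPU) ||
          !(device_type ^ SK_DEVICE_TYPE_ALL);
}
```

Writing the second operand as `device_type == SK_DEVICE_TYPE_ALL` would express the same condition more directly.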
[Mesa3d-dev] PATCH:OpenCL: create tests
Hi guys,

There are two patches: the first one (cmake_test.patch) creates the CMake test infrastructure and the second one (files_test.patch) creates the device and context tests.

I am using the Check unit testing framework [1] because it is simple and it is used in many projects.

The tests have already found some bugs (such as segfaults and unimplemented features). All the tests were written against the OpenCL specification.

[1] http://check.sourceforge.net/

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 2b26770..681ced3 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -21,3 +21,9 @@ Find_Package(Clang REQUIRED)
 
 add_subdirectory(src)
 add_subdirectory(examples)
+
+IF (BUILD_TESTS)
+    ENABLE_TESTING()
+    Find_Package(Check REQUIRED)
+    add_subdirectory(tests)
+ENDIF (BUILD_TESTS)
diff --git a/cmake/modules/FindCheck.cmake b/cmake/modules/FindCheck.cmake
new file mode 100644
index 000..d7a5bcd
--- /dev/null
+++ b/cmake/modules/FindCheck.cmake
@@ -0,0 +1,57 @@
+# - Try to find the CHECK libraries
+#  Once done this will define
+#
+#  Note: This module is originally found in opensync project
+#
+#  CHECK_FOUND - system has check
+#  CHECK_INCLUDE_DIRS - the check include directory
+#  CHECK_LIBRARIES - check library
+#
+#  Copyright (c) 2007 Daniel Gollub gol...@b1-systems.de
+#  Copyright (c) 2007-2009 Bjoern Ricks bjoern.ri...@gmail.com
+#
+#  Redistribution and use is allowed according to the terms of the New
+#  BSD license.
+#  For details see the accompanying COPYING-CMAKE-SCRIPTS file.
+
+
+INCLUDE( FindPkgConfig )
+
+IF ( Check_FIND_REQUIRED )
+    SET( _pkgconfig_REQUIRED "REQUIRED" )
+ELSE( Check_FIND_REQUIRED )
+    SET( _pkgconfig_REQUIRED "" )
+ENDIF ( Check_FIND_REQUIRED )
+
+IF ( CHECK_MIN_VERSION )
+    PKG_SEARCH_MODULE( CHECK ${_pkgconfig_REQUIRED} check>=${CHECK_MIN_VERSION} )
+ELSE ( CHECK_MIN_VERSION )
+    PKG_SEARCH_MODULE( CHECK ${_pkgconfig_REQUIRED} check )
+ENDIF ( CHECK_MIN_VERSION )
+
+# Look for CHECK include dir and libraries
+IF( NOT CHECK_FOUND AND NOT PKG_CONFIG_FOUND )
+
+    FIND_PATH( CHECK_INCLUDE_DIRS check.h )
+
+    FIND_LIBRARY( CHECK_LIBRARIES NAMES check )
+
+    IF ( CHECK_INCLUDE_DIRS AND CHECK_LIBRARIES )
+        SET( CHECK_FOUND 1 )
+        IF ( NOT Check_FIND_QUIETLY )
+            MESSAGE ( STATUS "Found CHECK: ${CHECK_LIBRARIES}" )
+        ENDIF ( NOT Check_FIND_QUIETLY )
+    ELSE ( CHECK_INCLUDE_DIRS AND CHECK_LIBRARIES )
+        IF ( Check_FIND_REQUIRED )
+            MESSAGE( FATAL_ERROR "Could NOT find CHECK" )
+        ELSE ( Check_FIND_REQUIRED )
+            IF ( NOT Check_FIND_QUIETLY )
+                MESSAGE( STATUS "Could NOT find CHECK" )
+            ENDIF ( NOT Check_FIND_QUIETLY )
+        ENDIF ( Check_FIND_REQUIRED )
+    ENDIF ( CHECK_INCLUDE_DIRS AND CHECK_LIBRARIES )
+ENDIF( NOT CHECK_FOUND AND NOT PKG_CONFIG_FOUND )
+
+# Hide advanced variables from CMake GUIs
+MARK_AS_ADVANCED( CHECK_INCLUDE_DIRS CHECK_LIBRARIES )
+
diff --git a/tests/CMakeLists.txt b/tests/CMakeLists.txt
new file mode 100644
index 000..6660be0
--- /dev/null
+++ b/tests/CMakeLists.txt
@@ -0,0 +1,18 @@
+INCLUDE_DIRECTORIES(${Clover_SOURCE_DIR}/include ${CHECK_INCLUDE_DIRS})
+LINK_DIRECTORIES(${Clover_BINARY_DIR}/src ${CHECK_LIBRARY_DIRS})
+
+set(OPENCL_TESTS_SOURCE
+    tests.c
+    test_device.cpp
+    test_context.cpp
+)
+
+add_executable(tests ${OPENCL_TESTS_SOURCE})
+target_link_libraries(tests OpenCL ${CHECK_LIBRARIES})
+
+MACRO(OPENCL_TEST EXECUTABLE_NAME TEST_NAME)
+    add_test(${TEST_NAME} ${EXECUTABLE_NAME} ${TEST_NAME})
+ENDMACRO(OPENCL_TEST)
+
+OPENCL_TEST(tests device)
+OPENCL_TEST(tests context)
diff --git a/tests/test_context.cpp b/tests/test_context.cpp
new file mode 100644
index 000..1e7d2e1
--- /dev/null
+++ b/tests/test_context.cpp
@@ -0,0 +1,52 @@
+#include "test_context.h"
+
+#include <OpenCL/cl.h>
+
+START_TEST (test_create_context)
+{
+    cl_context context = NULL;
+    cl_device_id device;
+    cl_int result = -1;
+    cl_int err_code;
+    cl_int result_context;
+
+    result = clGetDeviceIDs(CL_DEVICE_TYPE_CPU, 1, &device, NULL);
+    if(result == CL_SUCCESS) {
+        //in clover we need platform argument?
+        context = clCreateContext(0, 1, &device, NULL, NULL, NULL);
+        fail_if(context == NULL, "It should work, context not created");
+        result_context = clReleaseContext(context);
+        fail_if((result_context != CL_SUCCESS) || (context != NULL),
+                "It should work, context not released");
+
+        context = clCreateContext(0, 1, NULL, NULL, NULL, &err_code);
+        fail_if((context != NULL) || (err_code != CL_INVALID_VALUE),
+                "It should not work, context or err_code returning wrong");
+        result_context = clReleaseContext(context);
+
+        context = clCreateContext(0, 0, &device, NULL, NULL, &err_code);
+        fail_if((context != NULL) || (err_code != CL_INVALID_VALUE),
+                "It should not work, context or err_code returning wrong");
+        result_context = clReleaseContext(context);
+    }
+}
+END_TEST
+
+START_TEST (test_create_context_from_type)
+{
+    cl_int result = -1;
+    cl_context
[Mesa3d-dev] PATCH[0/1]: OpenCL: create and implement stub context methods
This patch creates and implements stub context methods in OpenCL. Almost all operations in OpenCL use a context. The patch implements the gallium3d context and the methods below:

- clCreateContext
- clCreateContextFromType
- clRetainContext
- clReleaseContext

PS: I should probably break it into 2 patches.

--
Return on Information:
Google Enterprise Search pays you back
Get the facts.
http://p.sf.net/sfu/google-dev2dev
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
[Mesa3d-dev] PATCH[1/1]: OpenCL: create and implement stub context methods
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index b8ff200..c29f7c6 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -16,6 +16,7 @@ set(CLOVER_SRC_FILES api/api_memory.cpp
     api/api_profiling.cpp
     api/api_sampler.cpp
     api/api_gl.cpp
     core/device.cpp
+    core/context.cpp
     compiler/compiler.cpp
     cpuwinsys/cpuwinsys.c)
 
diff --git a/src/api/api_context.cpp b/src/api/api_context.cpp
index fbf3af9..8393dcb 100644
--- a/src/api/api_context.cpp
+++ b/src/api/api_context.cpp
@@ -1,5 +1,8 @@
 #include <OpenCL/cl.h>
 
+#include "core/context.h"
+#include "core/device.h"
+#include "cpuwinsys/cpuwinsys.h"
 
 // Context APIs
 
@@ -11,7 +14,18 @@ clCreateContext(cl_context_properties properties,
                 void * user_data,
                 cl_int *errcode_ret)
 {
-    return 0;
+    cl_context ret_context = NULL;
+    cl_device_type type;
+    cl_device_id device = devices[0];
+    cl_int device_info;
+
+    device_info = clGetDeviceInfo(device, CL_DEVICE_TYPE, sizeof(type), &type, NULL);
+    if (device_info != CL_INVALID_DEVICE) {
+        ret_context = clCreateContextFromType(properties, type,
+                                              pfn_notify, user_data, errcode_ret);
+    }
+
+    return ret_context;
 }
 
 cl_context
@@ -21,19 +35,57 @@ clCreateContextFromType(cl_context_properties properties,
                         void * user_data,
                         cl_int *errcode_ret)
 {
-    return 0;
+    struct pipe_context *context = NULL;
+
+    switch (device_type) {
+    case CL_DEVICE_TYPE_CPU:
+        context =
+            cl_create_context(cpu_winsys());
+
+        break;
+    default:
+        if (errcode_ret) {
+            *errcode_ret = CL_INVALID_DEVICE_TYPE;
+        }
+        goto fail;
+    }
+
+fail:
+    return cl_convert_context(context);
 }
 
 cl_int
 clRetainContext(cl_context context)
 {
-    return 0;
+    cl_int ret;
+
+    if (context) {
+        context->id++;
+        ret = CL_SUCCESS;
+    } else {
+        ret = CL_INVALID_CONTEXT;
+    }
+
+    return ret;
 }
 
 cl_int
 clReleaseContext(cl_context context)
 {
-    return 0;
+    cl_uint ret;
+
+    if (context) {
+        if( !context->id ) {
+            context->pipe.destroy(&context->pipe);
+        } else {
+            context->id--;
+        }
+        ret = CL_SUCCESS;
+    } else {
+        ret = CL_INVALID_CONTEXT;
+    }
+
+    return ret;
 }
 
 cl_int
diff --git a/src/core/context.cpp b/src/core/context.cpp
new file mode 100644
index 000..891a96e
--- /dev/null
+++ b/src/core/context.cpp
@@ -0,0 +1,19 @@
+#include "context.h"
+#include "util/u_memory.h"
+
+void cl_destroy_context( struct pipe_context *context )
+{
+    struct _cl_context *clcontext = cl_convert_context(context);
+
+    FREE(clcontext);
+}
+
+struct pipe_context *cl_create_context( struct pipe_winsys *winsys )
+{
+    struct _cl_context *cl_context = CALLOC_STRUCT(_cl_context);
+
+    cl_context->pipe.winsys = winsys;
+    cl_context->pipe.destroy = cl_destroy_context;
+
+    return &cl_context->pipe;
+}
diff --git a/src/core/context.h b/src/core/context.h
index f74bcdb..00b0f33 100644
--- a/src/core/context.h
+++ b/src/core/context.h
@@ -2,15 +2,22 @@
 #define CONTEXT_H
 
 #include <OpenCL/cl.h>
-
 #include "pipe/p_context.h"
 
 struct _cl_context {
-    struct pipe_context *pipe;
+    struct pipe_context pipe;
     cl_uint id;
 };
 
-void cl_set_current_context(struct _cl_context *ctx);
-struct _cl_context *cl_current_context(void);
+void cl_set_current_context( struct _cl_context *ctx);
+struct _cl_context *cl_current_context( void);
+
+struct pipe_context *cl_create_context( struct pipe_winsys *winsys );
+
+static INLINE struct _cl_context *
+cl_convert_context( struct pipe_context *pipe )
+{
+    return (struct _cl_context *)pipe;
+}
 
 #endif
diff --git a/src/core/device.cpp b/src/core/device.cpp
index 4553d1b..c300f79 100644
--- a/src/core/device.cpp
+++ b/src/core/device.cpp
@@ -219,8 +219,8 @@ void Device::fillInfo(cl_uint type)
    m_info.queueProperties = ;
 #endif
 
-   m_info.name = m_screen->get_name(m_screen);
-   m_info.vendor = m_screen->get_vendor(m_screen);
+   //m_info.name = m_screen->get_name(m_screen);
+   //m_info.vendor = m_screen->get_vendor(m_screen);
    //m_info.driverVersion = ;
    m_info.profile = "FULL_PROFILE";
    //m_info.version = ;
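
The clRetainContext and clReleaseContext stubs above keep a counter in the context's `id` field. For comparison, here is a minimal sketch of one common reference-counting convention: the count starts at 1 when the object is created, retain increments it, and release decrements it and destroys the object when it reaches zero. This is an illustration under that assumption, not the behaviour of the patch; `struct sketch_object`, the `sketch_` functions and the -1 error value are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical reference-counted object with a destroy callback. */
struct sketch_object {
   unsigned refcount;                        /* 1 right after creation */
   void (*destroy)(struct sketch_object *);  /* called when count hits 0 */
};

/* Counter used by the example destroy callback below. */
static int sketch_destroy_calls = 0;

/* Example destroy callback that only records that it ran. */
static void
sketch_count_destroy(struct sketch_object *obj)
{
   (void)obj;
   sketch_destroy_calls++;
}

/* Take an additional reference. Returns 0 on success, -1 for NULL. */
static int
sketch_retain(struct sketch_object *obj)
{
   if (!obj)
      return -1;
   obj->refcount++;
   return 0;
}

/* Drop one reference; destroy the object when the last one goes away. */
static int
sketch_release(struct sketch_object *obj)
{
   if (!obj)
      return -1;
   if (--obj->refcount == 0)
      obj->destroy(obj);
   return 0;
}
```

Under this convention every successful create or retain is matched by exactly one release, and the destroy callback runs exactly once.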
Re: [Mesa3d-dev] PATCH[0/1]: OpenCL: create and implement stub context methods
Hi Zack,

1) Agreed. OpenCL is a completely different project and should live in a different repository.

1.1) Using Gallium as the CPU backend is a software dilemma: "All problems in computer science can be solved by another level of indirection... except for the problem of too many layers of indirection." But in my opinion we can use Gallium for CPU operations too; using Gallium as the backend for all device types keeps the code consistent.

2) Since the project lives in a different repository, we could use CMake instead of SCons. Personally I prefer CMake: it is a lot faster at generating Makefiles and it is easy to use.

3) I prefer C++, but as you said it could cause a militant schism, because it is a complex language and many people do not like it. But I do not mind if we use C instead of C++ :).

Igor

On Wed, Dec 9, 2009 at 3:08 PM, Zack Rusin za...@vmware.com wrote:
> On Wednesday 09 December 2009 13:29:05 Igor Oliveira wrote:
>> These patches create and implement stub context methods in OpenCL.
>> Almost all operations in OpenCL use a context. The patch implements
>> the gallium3d context and the methods below:
>> - clCreateContext
>> - clCreateContextFromType
>> - clRetainContext
>> - clReleaseContext
>>
>> PS: I should probably break it into 2 patches.
>
> Hi Igor,
>
> the patch looks ok. Thanks. We're just working on adding support for
> compute to Gallium, so I'd probably wait a bit so that we can nail down
> the framework underneath before anything else.
>
> Also, we have to decide the following issues before really doing anything:
>
> 1) Should the OpenCL state tracker live in a repository of its own, or
> should it live within the mesa3d repo like the other state trackers? The
> thing that makes the OpenCL state tracker a bit different is that it has
> to work on a raw CPU (which is really a subquestion of 1: should the
> OpenCL state tracker be able to work without Gallium when working on top
> of a CPU?). I didn't really feel like creating another branch of mesa and
> merging things in initially, which is why there is a separate repo right
> now.
>
> 2) Should the OpenCL state tracker be using CMake or SCons? Originally I
> picked CMake because it generates actual makefiles and that's incredibly
> useful.
>
> 3) The language selection. It's using C++ because LLVM is using C++ and
> because I dig C++, which is good enough for me, but I guess it might
> cause a militant schism.
>
> I'd also like to rename it to coal to make it fit better with the mesa
> and gallium naming.
>
> Opinions?
>
> z