Re: [Mesa-dev] [PATCH 2/2] i965: handle gl_PointCoord for Gen4 and Gen5 platform
On Thu, Feb 23, 2012 at 02:37:06PM +0800, Yuanhan Liu wrote: On Tue, Feb 21, 2012 at 11:59:17AM -0800, Eric Anholt wrote: On Sun, 19 Feb 2012 13:31:33 +0800, Liu Aleaxander aleaxan...@gmail.com wrote: On Sun, Feb 19, 2012 at 8:54 AM, Eric Anholt e...@anholt.net wrote: On Sat, 18 Feb 2012 23:07:32 +0800, Liu Aleaxander aleaxan...@gmail.com wrote: + /* + * gl_PointCoord is a FS instead of VS builtin variable, thus is not + * included in c-nr_setup_regs. But FS need SF do the interpolation, + * so that here padding the interpolation for gl_PointCoord in last. + */ + if (c-key.do_point_coord) + c-nr_setup_regs++; So you're writing an extra attribute of setup, but you haven't increased the size of the URB entry. If the URB full except for pointcoord, you'd end up writing over the next URB entry and probably hanging the GPU. If you correctly allocate the URB size and that gets reflected in urb_read_length, you should be able to read your new attribute. That's maybe the place I don't understand quite well so far. Say, you write attributes into URB from SF thread to FS. I did increase the urb_read_length in wm_state. Is that the allocation you mean? Should I, say, allocate size for extra attributes just for FS in SF stage? If so, would you please tell me how? Since if I understand correctly, the urb_read_length in SF_STATE is counted from the attributes from VF stage. Thus at least urb_read_length in SF stage is not the right place for me to touch, right? urb_entry_size in the SF is the size of what the SF outputs and is what determines how much space is allocated by brw_urb.c Thanks for the info. Well, I was trying to figure out why this patch doesn't work when the urb read lenght per each attribute is set to 2 at calculate_urb_setup(), but it does work when set to 4. I may not quite understand the urb layout now; and I'm trying to figure it out. OK, finally, I guess I understand that now. And I wrote another patch to fix this issue. And this time we don't need to write a single PS shader to handle it thus it's much cleaner than this patch. Thanks, Yuanhan Liu ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: handle gl_PointCoord for Gen4 and Gen5 platform
On Tue, Feb 21, 2012 at 11:59:17AM -0800, Eric Anholt wrote: On Sun, 19 Feb 2012 13:31:33 +0800, Liu Aleaxander aleaxan...@gmail.com wrote: On Sun, Feb 19, 2012 at 8:54 AM, Eric Anholt e...@anholt.net wrote: On Sat, 18 Feb 2012 23:07:32 +0800, Liu Aleaxander aleaxan...@gmail.com wrote: + /* + * gl_PointCoord is a FS instead of VS builtin variable, thus is not + * included in c-nr_setup_regs. But FS need SF do the interpolation, + * so that here padding the interpolation for gl_PointCoord in last. + */ + if (c-key.do_point_coord) + c-nr_setup_regs++; So you're writing an extra attribute of setup, but you haven't increased the size of the URB entry. If the URB full except for pointcoord, you'd end up writing over the next URB entry and probably hanging the GPU. If you correctly allocate the URB size and that gets reflected in urb_read_length, you should be able to read your new attribute. That's maybe the place I don't understand quite well so far. Say, you write attributes into URB from SF thread to FS. I did increase the urb_read_length in wm_state. Is that the allocation you mean? Should I, say, allocate size for extra attributes just for FS in SF stage? If so, would you please tell me how? Since if I understand correctly, the urb_read_length in SF_STATE is counted from the attributes from VF stage. Thus at least urb_read_length in SF stage is not the right place for me to touch, right? urb_entry_size in the SF is the size of what the SF outputs and is what determines how much space is allocated by brw_urb.c Thanks for the info. Well, I was trying to figure out why this patch doesn't work when the urb read lenght per each attribute is set to 2 at calculate_urb_setup(), but it does work when set to 4. I may not quite understand the urb layout now; and I'm trying to figure it out. Thanks, Yuanhan Liu ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: handle gl_PointCoord for Gen4 and Gen5 platform
On Sun, 19 Feb 2012 13:31:33 +0800, Liu Aleaxander aleaxan...@gmail.com wrote: On Sun, Feb 19, 2012 at 8:54 AM, Eric Anholt e...@anholt.net wrote: On Sat, 18 Feb 2012 23:07:32 +0800, Liu Aleaxander aleaxan...@gmail.com wrote: + /* + * gl_PointCoord is a FS instead of VS builtin variable, thus is not + * included in c-nr_setup_regs. But FS need SF do the interpolation, + * so that here padding the interpolation for gl_PointCoord in last. + */ + if (c-key.do_point_coord) + c-nr_setup_regs++; So you're writing an extra attribute of setup, but you haven't increased the size of the URB entry. If the URB full except for pointcoord, you'd end up writing over the next URB entry and probably hanging the GPU. If you correctly allocate the URB size and that gets reflected in urb_read_length, you should be able to read your new attribute. That's maybe the place I don't understand quite well so far. Say, you write attributes into URB from SF thread to FS. I did increase the urb_read_length in wm_state. Is that the allocation you mean? Should I, say, allocate size for extra attributes just for FS in SF stage? If so, would you please tell me how? Since if I understand correctly, the urb_read_length in SF_STATE is counted from the attributes from VF stage. Thus at least urb_read_length in SF stage is not the right place for me to touch, right? urb_entry_size in the SF is the size of what the SF outputs and is what determines how much space is allocated by brw_urb.c pgpWdNMPE6Jeq.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] i965: handle gl_PointCoord for Gen4 and Gen5 platform
This patch add the support of gl_PointCoord gl builtin var for platform gen4 and gen5(ILK). We can get the point start coord and the current pixel coord from PS payload, and the only left element needed is the point size. Then handle that in SF thread. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45975 Piglit: glsl-fs-pointcoord and fbo-gl_pointcoord NOTE: This is a candidate for stable release branches. v2: comments from Eric, handle FRAG_ATTRIB_PNTC at brw_emit_point_sprite_setup(). Signed-off-by: Yuanhan Liu yuanhan@linux.intel.com --- src/mesa/drivers/dri/i965/brw_fs.cpp | 39 ++ src/mesa/drivers/dri/i965/brw_fs.h |1 + src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |2 + src/mesa/drivers/dri/i965/brw_sf.c |2 + src/mesa/drivers/dri/i965/brw_sf.h |1 + src/mesa/drivers/dri/i965/brw_sf_emit.c | 23 -- 6 files changed, 64 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index cbbef11..6e247f7 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -421,6 +421,37 @@ fs_visitor::emit_fragcoord_interpolation(ir_variable *ir) } fs_reg * +fs_visitor::emit_pointcoord_interpolation_gen4(ir_variable *ir) +{ + fs_reg *reg = new(this-mem_ctx) fs_reg(this, ir-type); + fs_reg pntcoord = *reg; + int urb_start = c-nr_payload_regs + c-prog_data.curb_read_length; + fs_reg pnt_size_rcp = fs_reg(brw_vec1_grf(urb_start + + urb_setup[FRAG_ATTRIB_PNTC] * + urb_entries_per_attr, 0)); + bool render_to_fbo = ctx-DrawBuffer-Name != 0; + + /* gl_PointCoord.x */ + emit(BRW_OPCODE_MUL, pntcoord, pnt_size_rcp, +this-delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC]); + pntcoord.reg_offset++; + + /* gl_PointCoord.y */ + emit(BRW_OPCODE_MUL, pntcoord, pnt_size_rcp, +this-delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC]); + /* +* Window coordinates in an FBO are inverted, which means coord in Y +* must be inverted, too. +*/ + if ((ctx-Point.SpriteOrigin == GL_LOWER_LEFT) != render_to_fbo) { + pntcoord.negate = true; + emit(BRW_OPCODE_ADD, pntcoord, pntcoord, fs_reg(1.0f)); + } + + return reg; +} + +fs_reg * fs_visitor::emit_general_interpolation(ir_variable *ir) { fs_reg *reg = new(this-mem_ctx) fs_reg(this, ir-type); @@ -710,6 +741,14 @@ fs_visitor::calculate_urb_setup() urb_setup[fp_index] = urb_next++; } } + /* + * It's a FS only attribute, and we did interpolation for this attribute + * in SF thread. So, count it here, too. + * + * See brw_emit_point_sprite_setup() for more info. + */ + if (brw-fragment_program-Base.InputsRead BITFIELD64_BIT(FRAG_ATTRIB_PNTC)) + urb_setup[FRAG_ATTRIB_PNTC] = urb_next++; } c-prog_data.urb_read_length = urb_next * urb_entries_per_attr; diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 1b4f2e3..06afeb4 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -547,6 +547,7 @@ public: void emit_dummy_fs(); fs_reg *emit_fragcoord_interpolation(ir_variable *ir); fs_reg *emit_frontfacing_interpolation(ir_variable *ir); + fs_reg *emit_pointcoord_interpolation_gen4(ir_variable *ir); fs_reg *emit_general_interpolation(ir_variable *ir); void emit_interpolation_setup_gen4(); void emit_interpolation_setup_gen6(); diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index bf1da9d..399d8b8 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -63,6 +63,8 @@ fs_visitor::visit(ir_variable *ir) reg = emit_fragcoord_interpolation(ir); } else if (!strcmp(ir-name, gl_FrontFacing)) { reg = emit_frontfacing_interpolation(ir); + } else if (!strcmp(ir-name, gl_PointCoord) intel-gen 6) { + reg = emit_pointcoord_interpolation_gen4(ir); } else { reg = emit_general_interpolation(ir); } diff --git a/src/mesa/drivers/dri/i965/brw_sf.c b/src/mesa/drivers/dri/i965/brw_sf.c index 54c27f9..d4544fe 100644 --- a/src/mesa/drivers/dri/i965/brw_sf.c +++ b/src/mesa/drivers/dri/i965/brw_sf.c @@ -167,6 +167,8 @@ brw_upload_sf_prog(struct brw_context *brw) key.point_sprite_coord_replace |= (1 i); } } + if (brw-fragment_program-Base.InputsRead BITFIELD64_BIT(FRAG_ATTRIB_PNTC)) + key.do_point_coord = 1; key.sprite_origin_lower_left = (ctx-Point.SpriteOrigin == GL_LOWER_LEFT); /* _NEW_LIGHT */ key.do_flat_shading = (ctx-Light.ShadeModel == GL_FLAT); diff --git a/src/mesa/drivers/dri/i965/brw_sf.h b/src/mesa/drivers/dri/i965/brw_sf.h index 4ef0240..f908fc0 100644
Re: [Mesa-dev] [PATCH 2/2] i965: handle gl_PointCoord for Gen4 and Gen5 platform
On Sat, 18 Feb 2012 23:07:32 +0800, Liu Aleaxander aleaxan...@gmail.com wrote: This patch add the support of gl_PointCoord gl builtin var for platform gen4 and gen5(ILK). We can get the point start coord and the current pixel coord from PS payload, and the only left element needed is the point size. Then handle that in SF thread. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45975 Piglit: glsl-fs-pointcoord and fbo-gl_pointcoord NOTE: This is a candidate for stable release branches. + /* +* Window coordinates in an FBO are inverted, which means coord in Y +* must be inverted, too. +*/ + if ((ctx-Point.SpriteOrigin == GL_LOWER_LEFT) != render_to_fbo) { + pntcoord.negate = true; + emit(BRW_OPCODE_ADD, pntcoord, pntcoord, fs_reg(1.0f)); + } In the FS, it should be looking at the data in the key instead of ctx-whatever, to be sure that statechanges are handled correctly. (Though, given that point sprite coordinate flippinng is handled in the SF code, I'd think we would handle it the same way for pointcoord too). + /* +* gl_PointCoord is a FS instead of VS builtin variable, thus is not +* included in c-nr_setup_regs. But FS need SF do the interpolation, +* so that here padding the interpolation for gl_PointCoord in last. +*/ + if (c-key.do_point_coord) + c-nr_setup_regs++; So you're writing an extra attribute of setup, but you haven't increased the size of the URB entry. If the URB full except for pointcoord, you'd end up writing over the next URB entry and probably hanging the GPU. If you correctly allocate the URB size and that gets reflected in urb_read_length, you should be able to read your new attribute. pgpU0q3gK9inb.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: handle gl_PointCoord for Gen4 and Gen5 platform
On Sun, Feb 19, 2012 at 8:54 AM, Eric Anholt e...@anholt.net wrote: On Sat, 18 Feb 2012 23:07:32 +0800, Liu Aleaxander aleaxan...@gmail.com wrote: This patch add the support of gl_PointCoord gl builtin var for platform gen4 and gen5(ILK). We can get the point start coord and the current pixel coord from PS payload, and the only left element needed is the point size. Then handle that in SF thread. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45975 Piglit: glsl-fs-pointcoord and fbo-gl_pointcoord NOTE: This is a candidate for stable release branches. + /* + * Window coordinates in an FBO are inverted, which means coord in Y + * must be inverted, too. + */ + if ((ctx-Point.SpriteOrigin == GL_LOWER_LEFT) != render_to_fbo) { + pntcoord.negate = true; + emit(BRW_OPCODE_ADD, pntcoord, pntcoord, fs_reg(1.0f)); + } In the FS, it should be looking at the data in the key instead of ctx-whatever, to be sure that statechanges are handled correctly. Yes, I tried this way first, but I have no idea why the value of origin_upper_left and render_to_fbo in key are both equal to 0 whatever the origin you set and render to fbo or not. I don't check this issue deeply but that's why I changed to using the value in ctx. (Though, given that point sprite coordinate flippinng is handled in the SF code, I'd think we would handle it the same way for pointcoord too). It's somehow easy for me to implement this by doing in FS code generation. If you prefer way do it in SF stage, Ok, I will try to do it in that way. + /* + * gl_PointCoord is a FS instead of VS builtin variable, thus is not + * included in c-nr_setup_regs. But FS need SF do the interpolation, + * so that here padding the interpolation for gl_PointCoord in last. + */ + if (c-key.do_point_coord) + c-nr_setup_regs++; So you're writing an extra attribute of setup, but you haven't increased the size of the URB entry. If the URB full except for pointcoord, you'd end up writing over the next URB entry and probably hanging the GPU. If you correctly allocate the URB size and that gets reflected in urb_read_length, you should be able to read your new attribute. That's maybe the place I don't understand quite well so far. Say, you write attributes into URB from SF thread to FS. I did increase the urb_read_length in wm_state. Is that the allocation you mean? Should I, say, allocate size for extra attributes just for FS in SF stage? If so, would you please tell me how? Since if I understand correctly, the urb_read_length in SF_STATE is counted from the attributes from VF stage. Thus at least urb_read_length in SF stage is not the right place for me to touch, right? Thanks, Yuanhan Liu ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev