Re: [Mesa-dev] [PATCH 2/2] i965: handle gl_PointCoord for Gen4 and Gen5 platform

2012-02-26 Thread Yuanhan Liu
On Thu, Feb 23, 2012 at 02:37:06PM +0800, Yuanhan Liu wrote:
 On Tue, Feb 21, 2012 at 11:59:17AM -0800, Eric Anholt wrote:
  On Sun, 19 Feb 2012 13:31:33 +0800, Liu Aleaxander aleaxan...@gmail.com 
  wrote:
   On Sun, Feb 19, 2012 at 8:54 AM, Eric Anholt e...@anholt.net wrote:
On Sat, 18 Feb 2012 23:07:32 +0800, Liu Aleaxander 
aleaxan...@gmail.com wrote:
+   /*
+    * gl_PointCoord is a FS instead of VS builtin variable, thus is 
not
+    * included in c-nr_setup_regs. But FS need SF do the 
interpolation,
+    * so that here padding the interpolation for gl_PointCoord in 
last.
+    */
+   if (c-key.do_point_coord)
+      c-nr_setup_regs++;
   
So you're writing an extra attribute of setup, but you haven't increased
the size of the URB entry.  If the URB full except for pointcoord, you'd
end up writing over the next URB entry and probably hanging the GPU.  If
you correctly allocate the URB size and that gets reflected in
urb_read_length, you should be able to read your new attribute.
   
   That's maybe the place I don't understand quite well so far. Say, you
   write attributes into URB from SF thread to FS. I did increase the
   urb_read_length in wm_state. Is that the allocation you mean? Should
   I, say, allocate size for extra attributes just for FS in SF stage? If
   so, would you please tell me how? Since if I understand correctly, the
   urb_read_length in SF_STATE is counted from the attributes from VF
   stage. Thus at least urb_read_length in SF stage is not the right
   place for me to touch, right?
  
  urb_entry_size in the SF is the size of what the SF outputs and is what
  determines how much space is allocated by brw_urb.c
 
 Thanks for the info.
 
 Well, I was trying to figure out why this patch doesn't work when the
 urb read lenght per each attribute is set to 2 at calculate_urb_setup(),
 but it does work when set to 4. I may not quite understand the urb
 layout now; and I'm trying to figure it out.

OK, finally, I guess I understand that now. And I wrote another
patch to fix this issue. And this time we don't need to write a single
PS shader to handle it thus it's much cleaner than this patch.

Thanks,
Yuanhan Liu

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965: handle gl_PointCoord for Gen4 and Gen5 platform

2012-02-22 Thread Yuanhan Liu
On Tue, Feb 21, 2012 at 11:59:17AM -0800, Eric Anholt wrote:
 On Sun, 19 Feb 2012 13:31:33 +0800, Liu Aleaxander aleaxan...@gmail.com 
 wrote:
  On Sun, Feb 19, 2012 at 8:54 AM, Eric Anholt e...@anholt.net wrote:
   On Sat, 18 Feb 2012 23:07:32 +0800, Liu Aleaxander aleaxan...@gmail.com 
   wrote:
   +   /*
   +    * gl_PointCoord is a FS instead of VS builtin variable, thus is not
   +    * included in c-nr_setup_regs. But FS need SF do the interpolation,
   +    * so that here padding the interpolation for gl_PointCoord in last.
   +    */
   +   if (c-key.do_point_coord)
   +      c-nr_setup_regs++;
  
   So you're writing an extra attribute of setup, but you haven't increased
   the size of the URB entry.  If the URB full except for pointcoord, you'd
   end up writing over the next URB entry and probably hanging the GPU.  If
   you correctly allocate the URB size and that gets reflected in
   urb_read_length, you should be able to read your new attribute.
  
  That's maybe the place I don't understand quite well so far. Say, you
  write attributes into URB from SF thread to FS. I did increase the
  urb_read_length in wm_state. Is that the allocation you mean? Should
  I, say, allocate size for extra attributes just for FS in SF stage? If
  so, would you please tell me how? Since if I understand correctly, the
  urb_read_length in SF_STATE is counted from the attributes from VF
  stage. Thus at least urb_read_length in SF stage is not the right
  place for me to touch, right?
 
 urb_entry_size in the SF is the size of what the SF outputs and is what
 determines how much space is allocated by brw_urb.c

Thanks for the info.

Well, I was trying to figure out why this patch doesn't work when the
urb read lenght per each attribute is set to 2 at calculate_urb_setup(),
but it does work when set to 4. I may not quite understand the urb
layout now; and I'm trying to figure it out.

Thanks,
Yuanhan Liu 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965: handle gl_PointCoord for Gen4 and Gen5 platform

2012-02-21 Thread Eric Anholt
On Sun, 19 Feb 2012 13:31:33 +0800, Liu Aleaxander aleaxan...@gmail.com wrote:
 On Sun, Feb 19, 2012 at 8:54 AM, Eric Anholt e...@anholt.net wrote:
  On Sat, 18 Feb 2012 23:07:32 +0800, Liu Aleaxander aleaxan...@gmail.com 
  wrote:
  +   /*
  +    * gl_PointCoord is a FS instead of VS builtin variable, thus is not
  +    * included in c-nr_setup_regs. But FS need SF do the interpolation,
  +    * so that here padding the interpolation for gl_PointCoord in last.
  +    */
  +   if (c-key.do_point_coord)
  +      c-nr_setup_regs++;
 
  So you're writing an extra attribute of setup, but you haven't increased
  the size of the URB entry.  If the URB full except for pointcoord, you'd
  end up writing over the next URB entry and probably hanging the GPU.  If
  you correctly allocate the URB size and that gets reflected in
  urb_read_length, you should be able to read your new attribute.
 
 That's maybe the place I don't understand quite well so far. Say, you
 write attributes into URB from SF thread to FS. I did increase the
 urb_read_length in wm_state. Is that the allocation you mean? Should
 I, say, allocate size for extra attributes just for FS in SF stage? If
 so, would you please tell me how? Since if I understand correctly, the
 urb_read_length in SF_STATE is counted from the attributes from VF
 stage. Thus at least urb_read_length in SF stage is not the right
 place for me to touch, right?

urb_entry_size in the SF is the size of what the SF outputs and is what
determines how much space is allocated by brw_urb.c


pgpWdNMPE6Jeq.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] i965: handle gl_PointCoord for Gen4 and Gen5 platform

2012-02-18 Thread Liu Aleaxander
This patch add the support of gl_PointCoord gl builtin var for platform
gen4 and gen5(ILK).

We can get the point start coord and the current pixel coord from PS
payload, and the only left element needed is the point size. Then handle
that in SF thread.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45975
Piglit: glsl-fs-pointcoord and fbo-gl_pointcoord

NOTE: This is a candidate for stable release branches.

v2: comments from Eric, handle FRAG_ATTRIB_PNTC at
brw_emit_point_sprite_setup().

Signed-off-by: Yuanhan Liu yuanhan@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |   39 ++
 src/mesa/drivers/dri/i965/brw_fs.h   |1 +
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |2 +
 src/mesa/drivers/dri/i965/brw_sf.c   |2 +
 src/mesa/drivers/dri/i965/brw_sf.h   |1 +
 src/mesa/drivers/dri/i965/brw_sf_emit.c  |   23 --
 6 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index cbbef11..6e247f7 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -421,6 +421,37 @@ fs_visitor::emit_fragcoord_interpolation(ir_variable *ir)
 }

 fs_reg *
+fs_visitor::emit_pointcoord_interpolation_gen4(ir_variable *ir)
+{
+   fs_reg *reg = new(this-mem_ctx) fs_reg(this, ir-type);
+   fs_reg pntcoord = *reg;
+   int urb_start = c-nr_payload_regs + c-prog_data.curb_read_length;
+   fs_reg pnt_size_rcp = fs_reg(brw_vec1_grf(urb_start +
+ urb_setup[FRAG_ATTRIB_PNTC] *
+ urb_entries_per_attr, 0));
+   bool render_to_fbo = ctx-DrawBuffer-Name != 0;
+
+   /* gl_PointCoord.x */
+   emit(BRW_OPCODE_MUL, pntcoord, pnt_size_rcp,
+this-delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC]);
+   pntcoord.reg_offset++;
+
+   /* gl_PointCoord.y */
+   emit(BRW_OPCODE_MUL, pntcoord, pnt_size_rcp,
+this-delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC]);
+   /*
+* Window coordinates in an FBO are inverted, which means coord in Y
+* must be inverted, too.
+*/
+   if ((ctx-Point.SpriteOrigin == GL_LOWER_LEFT) != render_to_fbo) {
+  pntcoord.negate = true;
+  emit(BRW_OPCODE_ADD, pntcoord, pntcoord, fs_reg(1.0f));
+   }
+
+   return reg;
+}
+
+fs_reg *
 fs_visitor::emit_general_interpolation(ir_variable *ir)
 {
fs_reg *reg = new(this-mem_ctx) fs_reg(this, ir-type);
@@ -710,6 +741,14 @@ fs_visitor::calculate_urb_setup()
   urb_setup[fp_index] = urb_next++;
 }
   }
+  /*
+   * It's a FS only attribute, and we did interpolation for this attribute
+   * in SF thread. So, count it here, too.
+   *
+   * See brw_emit_point_sprite_setup() for more info.
+   */
+  if (brw-fragment_program-Base.InputsRead 
BITFIELD64_BIT(FRAG_ATTRIB_PNTC))
+ urb_setup[FRAG_ATTRIB_PNTC] = urb_next++;
}

c-prog_data.urb_read_length = urb_next * urb_entries_per_attr;
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h
b/src/mesa/drivers/dri/i965/brw_fs.h
index 1b4f2e3..06afeb4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -547,6 +547,7 @@ public:
void emit_dummy_fs();
fs_reg *emit_fragcoord_interpolation(ir_variable *ir);
fs_reg *emit_frontfacing_interpolation(ir_variable *ir);
+   fs_reg *emit_pointcoord_interpolation_gen4(ir_variable *ir);
fs_reg *emit_general_interpolation(ir_variable *ir);
void emit_interpolation_setup_gen4();
void emit_interpolation_setup_gen6();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index bf1da9d..399d8b8 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -63,6 +63,8 @@ fs_visitor::visit(ir_variable *ir)
 reg = emit_fragcoord_interpolation(ir);
   } else if (!strcmp(ir-name, gl_FrontFacing)) {
 reg = emit_frontfacing_interpolation(ir);
+  } else if (!strcmp(ir-name, gl_PointCoord)  intel-gen  6) {
+ reg = emit_pointcoord_interpolation_gen4(ir);
   } else {
 reg = emit_general_interpolation(ir);
   }
diff --git a/src/mesa/drivers/dri/i965/brw_sf.c
b/src/mesa/drivers/dri/i965/brw_sf.c
index 54c27f9..d4544fe 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -167,6 +167,8 @@ brw_upload_sf_prog(struct brw_context *brw)
key.point_sprite_coord_replace |= (1  i);
   }
}
+   if (brw-fragment_program-Base.InputsRead 
BITFIELD64_BIT(FRAG_ATTRIB_PNTC))
+  key.do_point_coord = 1;
key.sprite_origin_lower_left = (ctx-Point.SpriteOrigin == GL_LOWER_LEFT);
/* _NEW_LIGHT */
key.do_flat_shading = (ctx-Light.ShadeModel == GL_FLAT);
diff --git a/src/mesa/drivers/dri/i965/brw_sf.h
b/src/mesa/drivers/dri/i965/brw_sf.h
index 4ef0240..f908fc0 100644

Re: [Mesa-dev] [PATCH 2/2] i965: handle gl_PointCoord for Gen4 and Gen5 platform

2012-02-18 Thread Eric Anholt
On Sat, 18 Feb 2012 23:07:32 +0800, Liu Aleaxander aleaxan...@gmail.com wrote:
 This patch add the support of gl_PointCoord gl builtin var for platform
 gen4 and gen5(ILK).
 
 We can get the point start coord and the current pixel coord from PS
 payload, and the only left element needed is the point size. Then handle
 that in SF thread.
 
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45975
 Piglit: glsl-fs-pointcoord and fbo-gl_pointcoord
 
 NOTE: This is a candidate for stable release branches.


 +   /*
 +* Window coordinates in an FBO are inverted, which means coord in Y
 +* must be inverted, too.
 +*/
 +   if ((ctx-Point.SpriteOrigin == GL_LOWER_LEFT) != render_to_fbo) {
 +  pntcoord.negate = true;
 +  emit(BRW_OPCODE_ADD, pntcoord, pntcoord, fs_reg(1.0f));
 +   }

In the FS, it should be looking at the data in the key instead of
ctx-whatever, to be sure that statechanges are handled correctly.

(Though, given that point sprite coordinate flippinng is handled in the
SF code, I'd think we would handle it the same way for pointcoord too).

 +   /*
 +* gl_PointCoord is a FS instead of VS builtin variable, thus is not
 +* included in c-nr_setup_regs. But FS need SF do the interpolation,
 +* so that here padding the interpolation for gl_PointCoord in last.
 +*/
 +   if (c-key.do_point_coord)
 +  c-nr_setup_regs++;

So you're writing an extra attribute of setup, but you haven't increased
the size of the URB entry.  If the URB full except for pointcoord, you'd
end up writing over the next URB entry and probably hanging the GPU.  If
you correctly allocate the URB size and that gets reflected in
urb_read_length, you should be able to read your new attribute.


pgpU0q3gK9inb.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965: handle gl_PointCoord for Gen4 and Gen5 platform

2012-02-18 Thread Liu Aleaxander
On Sun, Feb 19, 2012 at 8:54 AM, Eric Anholt e...@anholt.net wrote:
 On Sat, 18 Feb 2012 23:07:32 +0800, Liu Aleaxander aleaxan...@gmail.com 
 wrote:
 This patch add the support of gl_PointCoord gl builtin var for platform
 gen4 and gen5(ILK).

 We can get the point start coord and the current pixel coord from PS
 payload, and the only left element needed is the point size. Then handle
 that in SF thread.

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45975
 Piglit: glsl-fs-pointcoord and fbo-gl_pointcoord

 NOTE: This is a candidate for stable release branches.


 +   /*
 +    * Window coordinates in an FBO are inverted, which means coord in Y
 +    * must be inverted, too.
 +    */
 +   if ((ctx-Point.SpriteOrigin == GL_LOWER_LEFT) != render_to_fbo) {
 +      pntcoord.negate = true;
 +      emit(BRW_OPCODE_ADD, pntcoord, pntcoord, fs_reg(1.0f));
 +   }

 In the FS, it should be looking at the data in the key instead of
 ctx-whatever, to be sure that statechanges are handled correctly.

Yes, I tried this way first, but I have no idea why the value of
origin_upper_left and render_to_fbo in key are both equal to 0
whatever the origin you set and render to fbo or not. I don't check
this issue deeply but that's why I changed to using the value in ctx.


 (Though, given that point sprite coordinate flippinng is handled in the
 SF code, I'd think we would handle it the same way for pointcoord too).

It's somehow easy for me to implement this by doing in FS code
generation. If you prefer way do it in SF stage, Ok, I will try to do
it in that way.


 +   /*
 +    * gl_PointCoord is a FS instead of VS builtin variable, thus is not
 +    * included in c-nr_setup_regs. But FS need SF do the interpolation,
 +    * so that here padding the interpolation for gl_PointCoord in last.
 +    */
 +   if (c-key.do_point_coord)
 +      c-nr_setup_regs++;

 So you're writing an extra attribute of setup, but you haven't increased
 the size of the URB entry.  If the URB full except for pointcoord, you'd
 end up writing over the next URB entry and probably hanging the GPU.  If
 you correctly allocate the URB size and that gets reflected in
 urb_read_length, you should be able to read your new attribute.

That's maybe the place I don't understand quite well so far. Say, you
write attributes into URB from SF thread to FS. I did increase the
urb_read_length in wm_state. Is that the allocation you mean? Should
I, say, allocate size for extra attributes just for FS in SF stage? If
so, would you please tell me how? Since if I understand correctly, the
urb_read_length in SF_STATE is counted from the attributes from VF
stage. Thus at least urb_read_length in SF stage is not the right
place for me to touch, right?


Thanks,
Yuanhan Liu
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev