The PRM suggests a larger layout, mostly to support having gl_ClipDistance[] somewhere predictable for the fixed-function clipper -- but it didn't actually arrive in Gen5.
Just use the same layout for both Gen4 and Gen5. No Piglit regressions. Improves performance in CS:S Video Stress Test by ~3%. Signed-off-by: Chris Forbes <[email protected]> --- src/mesa/drivers/dri/i965/brw_sf_state.c | 5 +---- src/mesa/drivers/dri/i965/brw_vs.c | 23 +++-------------------- 2 files changed, 4 insertions(+), 24 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_sf_state.c b/src/mesa/drivers/dri/i965/brw_sf_state.c index 7c29ba2..e9b7e66 100644 --- a/src/mesa/drivers/dri/i965/brw_sf_state.c +++ b/src/mesa/drivers/dri/i965/brw_sf_state.c @@ -131,10 +131,7 @@ const struct brw_tracked_state brw_sf_vp = { int brw_sf_compute_urb_entry_read_offset(struct intel_context *intel) { - if (intel->gen == 5) - return 3; - else - return 1; + return 1; } static void upload_sf_unit( struct brw_context *brw ) diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index 720325d..d173d2e 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -85,34 +85,17 @@ brw_compute_vue_map(struct brw_context *brw, struct brw_vue_map *vue_map, */ switch (intel->gen) { case 4: + case 5: /* There are 8 dwords in VUE header pre-Ironlake: * dword 0-3 is indices, point width, clip flags. * dword 4-7 is ndc position * dword 8-11 is the first vertex data. - */ - assign_vue_slot(vue_map, VARYING_SLOT_PSIZ); - assign_vue_slot(vue_map, BRW_VARYING_SLOT_NDC); - assign_vue_slot(vue_map, VARYING_SLOT_POS); - break; - case 5: - /* There are 20 DWs (D0-D19) in VUE header on Ironlake: - * dword 0-3 of the header is indices, point width, clip flags. - * dword 4-7 is the ndc position - * dword 8-11 of the vertex header is the 4D space position - * dword 12-19 of the vertex header is the user clip distance. - * dword 20-23 is a pad so that the vertex element data is aligned - * dword 24-27 is the first vertex data we fill. * - * Note: future pipeline stages expect 4D space position to be - * contiguous with the other varyings, so we make dword 24-27 a - * duplicate copy of the 4D space position. + * On Ironlake the VUE header is nominally 20 dwords, but the hardware + * will accept the same header layout as Gen4 [and should be a bit faster] */ assign_vue_slot(vue_map, VARYING_SLOT_PSIZ); assign_vue_slot(vue_map, BRW_VARYING_SLOT_NDC); - assign_vue_slot(vue_map, BRW_VARYING_SLOT_POS_DUPLICATE); - assign_vue_slot(vue_map, VARYING_SLOT_CLIP_DIST0); - assign_vue_slot(vue_map, VARYING_SLOT_CLIP_DIST1); - assign_vue_slot(vue_map, BRW_VARYING_SLOT_PAD); assign_vue_slot(vue_map, VARYING_SLOT_POS); break; case 6: -- 1.8.3 _______________________________________________ mesa-dev mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/mesa-dev
