Re: [Mesa-dev] [PATCH 3/3] i965/gen6-7/sol: Bump primitive counter BO size.

2017-12-01 Thread Francisco Jerez
Eero Tamminen  writes:

> Hi,
>
> Tested-By: Eero Tamminen 
>
> On 18.11.2017 00:28, Francisco Jerez wrote:
>> Improves performance of SynMark2 OglGSCloth by a further 9.65%±0.59%
>> due to the reduction in overwraps of the primitive count buffer that
>> lead to a CPU stall on previous rendering.  Cumulative performance
>> improvement from the series is 81.50% ±0.96% (data gathered on VLV).
>
> I tested the patch series with tests that use transform feedback on SNB
> GT2, BYT, HSW GT2 and BSW, using git versions of Mesa, drm-tip kernel
> and X server.
>
>
> SNB GT2:
> * No noticeable perf impact on GfxBench Manhattan
> * Mesa unfortunately renders GSCloth incorrectly on SNB,
>   but that also happens without this patch series:
>   https://bugs.freedesktop.org/show_bug.cgi?id=103824
>
> BYT:
> * 1-2% perf improvement in GfxBench Manhattan 3.0 & 3.1
> * 30% perf improvement in GSCloth
>   - Device is a single-channel one; was your VLV a dual-channel one?
>

I don't have access to the VLV system today to verify, but your system
is likely hitting its bandwidth limits sooner than mine (either because
of slower memory clocks or because of single- vs dual-channel memory).
Past that point performance doesn't improve further for you because the
workload is fully bandwidth-bound.

> HSW GT2:
> * No noticeable perf impact
>

This is also expected: HSW uses the hsw_sol.c XFB implementation, which
this patch has no effect on.

> BSW:
> * No noticeable perf impact (as expected)
>
>
>   - Eero
>
>> ---
>>   src/mesa/drivers/dri/i965/gen6_sol.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>> 
>> diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
>> index b1baf01bcd9..355acd42189 100644
>> --- a/src/mesa/drivers/dri/i965/gen6_sol.c
>> +++ b/src/mesa/drivers/dri/i965/gen6_sol.c
>> @@ -197,7 +197,7 @@ brw_new_transform_feedback(struct gl_context *ctx, GLuint name)
>>     brw_obj->offset_bo =
>>        brw_bo_alloc(brw->bufmgr, "transform feedback offsets", 16, 64);
>>     brw_obj->prim_count_bo =
>> -      brw_bo_alloc(brw->bufmgr, "xfb primitive counts", 4096, 64);
>> +      brw_bo_alloc(brw->bufmgr, "xfb primitive counts", 16384, 64);
>>  
>>     return &brw_obj->base;
>>  }
>> @@ -287,7 +287,8 @@ brw_save_primitives_written_counters(struct brw_context *brw,
>>     assert(obj->prim_count_bo != NULL);
>>  
>>     /* Check if there's enough space for a new pair of four values. */
>> -   if ((obj->counter.bo_end + 2) * streams * sizeof(uint64_t) >= 4096) {
>> +   if ((obj->counter.bo_end + 2) * streams * sizeof(uint64_t) >=
>> +       obj->prim_count_bo->size) {
>>        aggregate_transform_feedback_counter(brw, obj->prim_count_bo,
>>                                             &obj->previous_counter);
>>        aggregate_transform_feedback_counter(brw, obj->prim_count_bo,
>> 
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev




Re: [Mesa-dev] [PATCH 3/3] i965/gen6-7/sol: Bump primitive counter BO size.

2017-11-20 Thread Eero Tamminen

Hi,

Tested-By: Eero Tamminen 

On 18.11.2017 00:28, Francisco Jerez wrote:

> Improves performance of SynMark2 OglGSCloth by a further 9.65%±0.59%
> due to the reduction in overwraps of the primitive count buffer that
> lead to a CPU stall on previous rendering.  Cumulative performance
> improvement from the series is 81.50% ±0.96% (data gathered on VLV).


I tested the patch series with tests that use transform feedback on SNB
GT2, BYT, HSW GT2 and BSW, using git versions of Mesa, drm-tip kernel
and X server.



SNB GT2:
* No noticeable perf impact on GfxBench Manhattan
* Mesa unfortunately renders GSCloth incorrectly on SNB,
  but that also happens without this patch series:
  https://bugs.freedesktop.org/show_bug.cgi?id=103824

BYT:
* 1-2% perf improvement in GfxBench Manhattan 3.0 & 3.1
* 30% perf improvement in GSCloth
  - Device is a single-channel one; was your VLV a dual-channel one?

HSW GT2:
* No noticeable perf impact

BSW:
* No noticeable perf impact (as expected)


- Eero


> ---
>   src/mesa/drivers/dri/i965/gen6_sol.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
> index b1baf01bcd9..355acd42189 100644
> --- a/src/mesa/drivers/dri/i965/gen6_sol.c
> +++ b/src/mesa/drivers/dri/i965/gen6_sol.c
> @@ -197,7 +197,7 @@ brw_new_transform_feedback(struct gl_context *ctx, GLuint name)
>     brw_obj->offset_bo =
>        brw_bo_alloc(brw->bufmgr, "transform feedback offsets", 16, 64);
>     brw_obj->prim_count_bo =
> -      brw_bo_alloc(brw->bufmgr, "xfb primitive counts", 4096, 64);
> +      brw_bo_alloc(brw->bufmgr, "xfb primitive counts", 16384, 64);
>  
>     return &brw_obj->base;
>  }
> @@ -287,7 +287,8 @@ brw_save_primitives_written_counters(struct brw_context *brw,
>     assert(obj->prim_count_bo != NULL);
>  
>     /* Check if there's enough space for a new pair of four values. */
> -   if ((obj->counter.bo_end + 2) * streams * sizeof(uint64_t) >= 4096) {
> +   if ((obj->counter.bo_end + 2) * streams * sizeof(uint64_t) >=
> +       obj->prim_count_bo->size) {
>        aggregate_transform_feedback_counter(brw, obj->prim_count_bo,
>                                             &obj->previous_counter);
>        aggregate_transform_feedback_counter(brw, obj->prim_count_bo,





[Mesa-dev] [PATCH 3/3] i965/gen6-7/sol: Bump primitive counter BO size.

2017-11-17 Thread Francisco Jerez
Improves performance of SynMark2 OglGSCloth by a further 9.65%±0.59%
due to the reduction in overwraps of the primitive count buffer that
lead to a CPU stall on previous rendering.  Cumulative performance
improvement from the series is 81.50% ±0.96% (data gathered on VLV).
---
 src/mesa/drivers/dri/i965/gen6_sol.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index b1baf01bcd9..355acd42189 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -197,7 +197,7 @@ brw_new_transform_feedback(struct gl_context *ctx, GLuint name)
    brw_obj->offset_bo =
       brw_bo_alloc(brw->bufmgr, "transform feedback offsets", 16, 64);
    brw_obj->prim_count_bo =
-      brw_bo_alloc(brw->bufmgr, "xfb primitive counts", 4096, 64);
+      brw_bo_alloc(brw->bufmgr, "xfb primitive counts", 16384, 64);
 
    return &brw_obj->base;
 }
@@ -287,7 +287,8 @@ brw_save_primitives_written_counters(struct brw_context *brw,
    assert(obj->prim_count_bo != NULL);
 
    /* Check if there's enough space for a new pair of four values. */
-   if ((obj->counter.bo_end + 2) * streams * sizeof(uint64_t) >= 4096) {
+   if ((obj->counter.bo_end + 2) * streams * sizeof(uint64_t) >=
+       obj->prim_count_bo->size) {
       aggregate_transform_feedback_counter(brw, obj->prim_count_bo,
                                            &obj->previous_counter);
       aggregate_transform_feedback_counter(brw, obj->prim_count_bo,
-- 
2.14.2
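
For a rough sense of what the size bump buys, the check in the second hunk
can be turned into a count of counter snapshots that fit in the BO before
the aggregation path is taken. This is only a sketch under stated
assumptions: snapshots_before_wrap() is a hypothetical helper that is not
part of Mesa, and it assumes bo_end starts at zero and advances by one per
saved snapshot, which the diff context does not show. The stream count is a
parameter because the hunk does not show its value.

```c
#include <stdint.h>

/* Hypothetical helper, not part of Mesa.
 *
 * Counts how many snapshots can be saved before the condition from the
 * patch,
 *
 *    (bo_end + 2) * streams * sizeof(uint64_t) >= bo_size
 *
 * becomes true.  Assumes bo_end starts at 0 and is incremented once per
 * saved snapshot (an assumption; the hunk does not show the increment).
 */
static unsigned
snapshots_before_wrap(uint64_t bo_size, unsigned streams)
{
   unsigned bo_end = 0;

   /* With zero streams the condition could never become true; bail out
    * instead of looping forever.
    */
   if (streams == 0)
      return 0;

   /* Same expression as in the patch, with the comparison inverted. */
   while ((bo_end + 2) * streams * sizeof(uint64_t) < bo_size)
      bo_end++;

   return bo_end;
}
```

Under those assumptions, and taking four streams as an example value,
snapshots_before_wrap(4096, 4) gives 126 and snapshots_before_wrap(16384, 4)
gives 510, so the wrap (and the stall described in the commit message)
would be hit roughly four times less often.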
