Re: [Beignet] [PATCH 1/3] Benchmark: Evaluate math performance on intervals

Lupescu, Grigore Sun, 01 May 2016 21:32:41 -0700

Regarding the first question: I wrote the math benchmarks to measure the
performance gap between the native functions and the different code paths of
the internal implementation, so that I can see where to focus optimization.


I never meant to write a general, all-purpose benchmark for any driver. I find
that quite difficult, since I don't think simply iterating over an interval
reflects real-world performance. If you have any ideas here, though, that
would be great :)

-----Original Message-----
From: Song, Ruiling 
Sent: Monday, May 2, 2016 5:10 AM
To: Lupescu, Grigore <[email protected]>; [email protected]
Subject: RE: [Beignet] [PATCH 1/3] Benchmark: Evaluate math performance on 
intervals



> -----Original Message-----
> From: Beignet [mailto:[email protected]] On Behalf 
> Of Grigore Lupescu
> Sent: Monday, May 2, 2016 3:04 AM
> To: [email protected]
> Subject: [Beignet] [PATCH 1/3] Benchmark: Evaluate math performance on 
> intervals
> 
> From: Grigore Lupescu <grigore.lupescu at intel.com>
> 
> Functions to benchmark math functions on intervals.
> Tests: sin, cos, exp2, exp, exp10, log2, log, log10
> 
> Signed-off-by: Grigore Lupescu <grigore.lupescu at intel.com>
> ---
>  benchmark/CMakeLists.txt     |   3 +-
>  benchmark/benchmark_math.cpp | 126 ++++++++++++++++++++
>  kernels/bench_math.cl        | 272 +++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 400 insertions(+), 1 deletion(-)
>  create mode 100644 benchmark/benchmark_math.cpp
>  create mode 100644 kernels/bench_math.cl
> 
> diff --git a/benchmark/CMakeLists.txt b/benchmark/CMakeLists.txt
> index dd33829..4c3c933 100644
> --- a/benchmark/CMakeLists.txt
> +++ b/benchmark/CMakeLists.txt
> @@ -18,7 +18,8 @@ set (benchmark_sources
>    benchmark_copy_buffer_to_image.cpp
>    benchmark_copy_image_to_buffer.cpp
>    benchmark_copy_buffer.cpp
> -  benchmark_copy_image.cpp)
> +  benchmark_copy_image.cpp
> +  benchmark_math.cpp)
> 
> +/* calls internal fast (native) if (x > -0x1.6p1 && x < 0x1.6p1) */ 
> +kernel void bench_math_exp(
> +  global float *src,
> +  global float *dst,
> +  float pwr,
> +  uint loop)
> +{
> +  float result = src[get_global_id(0)];
> +
> +  for(; loop > 0; loop--)
> +  {
> +#if defined(BENCHMARK_NATIVE)
> +    result = native_exp(-0x1.6p1 - result); /* calls native */
> +#elif defined(BENCHMARK_INTERNAL_FAST)
> +    result = exp(-0x1.6p1 + result); /* calls internal fast */
> +#else
> +    result = exp(-0x1.6p1 - result); /* calls internal slow */
> +#endif

I think we should separate the benchmark test from the real implementation.
Then we can easily compare against other drivers' implementations. Also, the
implementation in Beignet may change in the future.
What do you think?

> +  }
> +
> +  dst[get_global_id(0)] = result;
> +}
> +

> +/* benchmark sin performance */
> +kernel void bench_math_sin(
> +  global float *src,
> +  global float *dst,
> +  float pwr,
> +  uint loop)
> +{
> +  float result = src[get_global_id(0)];
> +
> +  for(; loop > 0; loop--)
> +  {
> +#if defined(BENCHMARK_NATIVE)
> +    result = native_sin(result); /* calls native */
> +#else
> +    result = sin(result);    /* calls internal, random complexity */

What's the range of 'result'? It seems very small. I think we need to make
sure the input argument to sin() covers a large range, since we need to
optimize for the general case.

Thanks!
Ruiling
> +    //result = sin(0.1f + result); /* calls internal, (1) no reduction */
> +    //result = sin(2.f + result); /* calls internal, (2) fast reduction */
> +    //result = sin(4001 + result); /* calls internal, (3) slow reduction */
> +    result *= 0x1p-16;
> +#endif
> +  }
> +
> +  dst[get_global_id(0)] = result;
> +}
> +

_______________________________________________
Beignet mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/beignet
