Hi Nanley,

It's cool that you're doing this; it's important work.

About your GPU performance, keep in mind that:

1) GEGL is tile-based, so there is a memcpy over the whole input when we
use the GPU, to linearize the image data. Our tile sizes are quite small
(128x64 last time I checked), so calling a CL kernel on such small data
carries too much overhead. That's why the CL iterator has to go over much
larger regions (such as 2048x4096).

2) Besides that, there's the PCIe 2.0/3.0 bus overhead, which is
especially a problem for bandwidth-limited filters such as yours. With a
compute-heavy filter such as a convolution, for example, performance is
much better on the GPU than on the CPU.
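
To put rough numbers on point 1, here is a minimal sketch in C. It is not
GEGL code, and tiles_in_region is a made-up helper. It counts how many tiles
of a given size cover a region, which is also how many kernel launches a
per-tile approach would need:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GEGL: number of tiles needed to cover
 * a region, rounding up on each axis so partial tiles are counted. */
size_t
tiles_in_region (size_t region_w, size_t region_h,
                 size_t tile_w,   size_t tile_h)
{
  size_t cols;
  size_t rows;

  assert (tile_w > 0 && tile_h > 0);

  cols = (region_w + tile_w - 1) / tile_w;
  rows = (region_h + tile_h - 1) / tile_h;

  return cols * rows;
}
```

With the sizes mentioned above, tiles_in_region (2048, 4096, 128, 64) gives
16 * 64 = 1024, so a single launch over the large region stands in for 1024
per-tile launches.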

Even so, there's a lot of speedup to be gained by chaining multiple
operations/filters so that intermediate data never leaves the GPU; that's
where the real performance gain is. Consider also that many filters have
adjustable parameters where real-time feedback to the user is important.
In this case, input data will be uploaded to the GPU only once, as long as
the image fits completely in GPU memory. Our CL code has a caching system
that uses the whole GPU memory as a pool.
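
A back-of-the-envelope model of why chaining helps, as a sketch only. It
assumes every filter's output is the same size as its input and that nothing
is cached between filters in the unchained case. Both function names are
hypothetical, not GEGL API:

```c
#include <assert.h>

/* Hypothetical model: each of n_ops filters uploads its input and
 * downloads its output, so the buffer crosses the bus twice per filter. */
unsigned long long
bus_bytes_unchained (unsigned long long image_bytes, unsigned int n_ops)
{
  return 2ULL * image_bytes * n_ops;
}

/* Hypothetical model: the chain uploads once and downloads once;
 * intermediate results stay in GPU memory, so n_ops does not matter. */
unsigned long long
bus_bytes_chained (unsigned long long image_bytes)
{
  return 2ULL * image_bytes;
}
```

Under these assumptions a three-filter chain on a 100 MB buffer moves 600 MB
over the bus when each filter round-trips, versus 200 MB when chained.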

In the perf folder there are some cool utilities that can be used for
simple profiling. On my system I have an Intel GPU, which is not a
particularly fast GPU:

victorolivs-MacBook-Pro:perf victoroliv$ export GEGL_USE_OPENCL=no

victorolivs-MacBook-Pro:perf victoroliv$ ./test-bcontrast

@ bcontrast: 971.37 megabytes/second

victorolivs-MacBook-Pro:perf victoroliv$ export GEGL_USE_OPENCL=yes

victorolivs-MacBook-Pro:perf victoroliv$ ./test-bcontrast

@ bcontrast: 364.53 megabytes/second

victorolivs-MacBook-Pro:perf victoroliv$ export GEGL_USE_OPENCL=no

victorolivs-MacBook-Pro:perf victoroliv$ ./test-blur

@ gaussian-blur: 163.04 megabytes/second

victorolivs-MacBook-Pro:perf victoroliv$ export GEGL_USE_OPENCL=yes

victorolivs-MacBook-Pro:perf victoroliv$ ./test-blur

@ gaussian-blur: 248.08 megabytes/second
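
To read those four numbers as ratios, a small sketch (opencl_speedup is a
made-up name; the inputs are just the megabytes/second figures printed
above):

```c
#include <assert.h>

/* Hypothetical helper: throughput with OpenCL divided by throughput
 * without it. Values above 1.0 mean OpenCL was faster, below 1.0 slower. */
double
opencl_speedup (double no_opencl_mb_s, double opencl_mb_s)
{
  assert (no_opencl_mb_s > 0.0);

  return opencl_mb_s / no_opencl_mb_s;
}
```

opencl_speedup (971.37, 364.53) comes out at roughly 0.38, so bcontrast runs
at well under half speed with OpenCL, while opencl_speedup (163.04, 248.08)
is roughly 1.5, so the blur gets faster.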

I haven't really tried much OpenCL on the CPU, but Pippin should have more
information about the performance measurements he did. You can submit
patches to the mailing list.

Victor

On Thu, Nov 20, 2014 at 7:59 PM, Nanley Chery <nanleych...@gmail.com> wrote:

> Thanks for the quick fix, it's working on my system.
>
> I noticed that you've enabled GPUs by default due to some testing. Where
> can I find these results? According to my tests on two operations,
> Edge-laplace and Video-degradation (currently on my bitbucket branches:
> edge_upstrm, vid_upstrm), OpenCL on my GPU performs 7.8x slower than on
> my CPU.
>
> The following are some results for the Video-Degradation operation. I took
> the average over 5 trials and the units are seconds. On my CPU, I was
> able to achieve a 37.6x average speedup with OpenCL compared to without it.
> Enabling multiple threads increases my times by ~0.01s.
>
> ImgSize    Intel Core i7-2675QM  No OpenCL     Radeon HD 6450  Speedup from No OpenCL  GPU Slowdown from CPU
> 32x32      0.000650425           0.0668745008  0.0014333608    102.8166211323          2.2037295614
> 64x64      0.0007871332          0.0678992536  0.0021703034    86.2614530806           2.7572250796
> 128x128    0.0007964746          0.0676906748  0.0048043746    84.9878637687           6.0320499863
> 256x256    0.0014093932          0.0683310326  0.0154375408    48.4825899543           10.9533243101
> 512x512    0.0080697294          0.074770076   0.0621315552    9.2654997824            7.6993356432
> 1024x1024  0.0287260124          0.0886687912  0.2647257696    3.0867072661            9.215541855
> 2048x2048  0.1051937164          0.1571853124  0.976372136     1.4942462134            9.2816583482
> 4096x4096  0.4144775556          0.3923948302  4.07693664      0.9467215411            9.8363266838
> 5197x5543  0.5391386664          0.6064622732  6.9632312602    1.1248725254            12.9154736882
>
> Average                                                        37.6073972516           7.8771850173
>
> Please let me know if you spot anything wrong with my measuring
> methodology or OpenCL implementation. Also, are the mailing list and
> Bugzilla both suitable places to submit patches?
>
> Thanks,
> Nanley
>
> On Thu, Nov 20, 2014 at 2:00 AM, Victor Oliveira <victormath...@gmail.com>
> wrote:
>
>> I put it back, hopefully everything is alright now.
>>
>> Victor
>>
>> On Wed, Nov 19, 2014 at 2:41 PM, Nanley Chery <nanleych...@gmail.com>
>> wrote:
>> > Thanks for the question Victor. I'm actually running a custom perl
>> > script to automate the process. Your question led me to find a bug in
>> > the script.
>> >
>> > Cheers,
>> > Nanley
>> >
>> > On Wed, Nov 19, 2014 at 5:33 PM, Victor Oliveira <victormath...@gmail.com>
>> > wrote:
>> >>
>> >> Have you tried GEGL_DEBUG=opencl ?
>> >>
>> >> On Wed, Nov 19, 2014 at 2:32 PM, Nanley Chery <nanleych...@gmail.com>
>> >> wrote:
>> >> > I'm glad we could find this bug. Rolling back to the older version of
>> >> > gegl-operation-point-filter.c and adding support for enums in
>> >> > gegl-operation.c allows my opencl kernel to run (among other changes).
>> >> > I will rebase my repo on top of master once it's updated. The last
>> >> > issue that I'm having is that I get no entry for
>> >> > gegl:video-degradation when I have instrumentation enabled
>> >> > (GEGL_DEBUG_TIME=1). I've been parsing the output to determine the
>> >> > speed of other opencl implementations. Any suggestions?
>> >> >
>> >> > Thanks,
>> >> > Nanley
>> >> >
>> >> > On Wed, Nov 19, 2014 at 2:26 PM, Nanley Chery <nanleych...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> It seems like the code to initialize and run the opencl kernel was
>> >> >> lost in this commit:
>> >> >>
>> >> >> https://git.gnome.org/browse/gegl/commit/gegl?id=a206f032f77064cf9bff8590ac83ca5b086b53fd
>> >> >>
>> >> >> I'm not familiar enough with the codebase to understand the commit
>> >> >> message. Why was this functionality removed?
>> >> >> Should I add the deleted code into video degradation's process
>> >> >> function?
>> >> >>
>> >> >> Thanks,
>> >> >> Nanley
>> >> >>
>> >> >> On Wed, Nov 19, 2014 at 12:57 AM, Nanley Chery <nanleych...@gmail.com>
>> >> >> wrote:
>> >> >>>
>> >> >>> I noticed there was more to the brightness-contrast example. I made
>> >> >>> the adjustments concerning the kernel name and parameter values.
>> >> >>> The code compiles now. The current problem that I'm experiencing is
>> >> >>> that the run-compositions.py test for video-degradation passes with
>> >> >>> an empty kernel.
>> >> >>> I'm not sure which code paths are executing to make this work. Any
>> >> >>> pointers? I'll do some grepping of the source tree in the meantime.
>> >> >>>
>> >> >>> Thanks,
>> >> >>> Nanley
>> >> >>>
>> >> >>> On Tue, Nov 18, 2014 at 8:22 PM, Nanley Chery <nanleych...@gmail.com>
>> >> >>> wrote:
>> >> >>>>
>> >> >>>> Wow. Thank you for the tip, CL_CHECK is now giving me an output.
>> >> >>>>
>> >> >>>> This is the error message:
>> >> >>>> (lt-gegl:10486): GEGL-video-degradation.c-WARNING **: Error in
>> >> >>>> video-degradation.c:236@cl_process - invalid kernel
>> >> >>>>
>> >> >>>> I thought that I had followed the kernel compilation process
>> >> >>>> correctly.
>> >> >>>> Do you notice any mistake? I have pushed my latest change to the
>> >> >>>> branch.
>> >> >>>>
>> >> >>>> Nanley
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> On Tue, Nov 18, 2014 at 8:06 PM, Victor Oliveira
>> >> >>>> <victormath...@gmail.com> wrote:
>> >> >>>>>
>> >> >>>>> Hi Nanley,
>> >> >>>>>
>> >> >>>>> I'd recommend you follow operations/common/brightness-contrast.c
>> >> >>>>> file for a point-filter operation (i.e. a pixel-wise filter)
>> >> >>>>> instead of doing what you did.
>> >> >>>>>
>> >> >>>>> Notice that in operations/common/brightness-contrast.c#n153
>> >> >>>>> there's a string brightness_contrast_cl_source which is a string
>> >> >>>>> in opencl/brightness-contrast.cl.h, these are auto-generated
>> >> >>>>> files from the kernels in the opencl folder.
>> >> >>>>>
>> >> >>>>> Let me know what happens from that.
>> >> >>>>>
>> >> >>>>> Victor
>> >> >>>>>
>> >> >>>>> On Tue, Nov 18, 2014 at 4:45 PM, Nanley Chery
>> >> >>>>> <nanleych...@gmail.com>
>> >> >>>>> wrote:
>> >> >>>>> > Hi Victor,
>> >> >>>>> >
>> >> >>>>> > Thank you very much for taking a look. I understand about the
>> >> >>>>> > time.
>> >> >>>>> >
>> >> >>>>> > Here's the link to my bitbucket branch:
>> >> >>>>> > https://bitbucket.org/nanoman281/gegl-cse6230/branch/vid_upstrm
>> >> >>>>> >
>> >> >>>>> > The latest commit is what's causing the video-degradation.xml
>> >> >>>>> > test to fail (I'm testing using run-compositions.py).
>> >> >>>>> >
>> >> >>>>> > Nanley
>> >> >>>>> >
>> >> >>>>> > On Tue, Nov 18, 2014 at 5:11 PM, Victor Oliveira
>> >> >>>>> > <victormath...@gmail.com>
>> >> >>>>> > wrote:
>> >> >>>>> >>
>> >> >>>>> >> Hi Nanley,
>> >> >>>>> >>
>> >> >>>>> >> Just to let you know, I'll need some time to answer that
>> >> >>>>> >> because I'll need to build GIMP on my new laptop.
>> >> >>>>> >>
>> >> >>>>> >> Can you share your code so I can give a look?
>> >> >>>>> >>
>> >> >>>>> >> Victor
>> >> >>>>> >>
>> >> >>>>> >> On Tue, Nov 18, 2014 at 12:49 PM, Nanley Chery
>> >> >>>>> >> <nanleych...@gmail.com>
>> >> >>>>> >> wrote:
>> >> >>>>> >> > Hi Victor,
>> >> >>>>> >> >
>> >> >>>>> >> > I'm a student working on OpenCL porting work for my High
>> >> >>>>> >> > Performance Computing class. I'm trying to implement an
>> >> >>>>> >> > OpenCL port for the newly-committed video-degradation
>> >> >>>>> >> > operation. Are you willing to provide guidance on the
>> >> >>>>> >> > following roadblock?
>> >> >>>>> >> >
>> >> >>>>> >> >
>> >> >>>>> >> > The issue that I'm finding is that creating a cl_process
>> >> >>>>> >> > method and setting the following variables in
>> >> >>>>> >> > gegl_op_class_init is not enough to get the cl_process
>> >> >>>>> >> > method called:
>> >> >>>>> >> >
>> >> >>>>> >> > operation_class->opencl_support = TRUE;
>> >> >>>>> >> > point_filter_class->cl_process = cl_process;
>> >> >>>>> >> >
>> >> >>>>> >> > If I manually try to call the cl_process function in the
>> >> >>>>> >> > process method (like in edge-laplace.c), the program
>> >> >>>>> >> > terminates in the gegl_cl_set_kernel_args method without
>> >> >>>>> >> > an error from CL_CHECK.
>> >> >>>>> >> >
>> >> >>>>> >> > Is there something I'm missing? I apologize for mailing you
>> >> >>>>> >> > directly instead of writing to the mailing list. I'm a
>> >> >>>>> >> > little pressed for time, so I opted for this option.
>> >> >>>>> >> >
>> >> >>>>> >> > Regards,
>> >> >>>>> >> > Nanley
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>
>> >> >>>>
>> >> >>>
>> >> >>
>> >> >
>> >
>> >
>>
>
>
_______________________________________________
gegl-developer-list mailing list
List address:    gegl-developer-list@gnome.org
List membership: https://mail.gnome.org/mailman/listinfo/gegl-developer-list
