Hi,

I am working on an OpenCL port of pixman (compositing functions only). It works, but performance is very slow.

The main reason is that compositing is done line by line:

1) I think launching a kernel per scanline adds too much overhead.
2) I think copying memory from host to device one scanline at a time is also overhead. (The general recommendation in GPGPU computing is to copy all the data once and then do the work.)
3) The kernels themselves are not optimized much. They are just the combine functions from pixman_composite_32.c.

I then changed the code to launch the kernel once for the whole rectangle (width * height), basically removing the height for-loop in general_composite_rect: fetch all the src, dest, and mask data for the full rectangle, then composite. Performance is now much better than in my last attempt, though still not very promising.

How can I handle the case where the store variable is not NULL, i.e. fetch dest, composite the line, and store dest?

Thanks & Regards,
Pachauri
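
For illustration, the whole-rectangle approach and the fetch/composite/store case described above can be sketched in plain C. This is only a sketch under assumptions, not pixman or OpenCL code: the r5g6b5 destination format, the OVER operator, the absence of a mask, and all function names (mul_un8x4, fetch_r5g6b5, store_r5g6b5, combine_over, composite_pixel, composite_rect) are hypothetical choices made for the example. The idea it shows is that the destination fetch and store could be folded into the per-pixel work, so that a single pass over the full rectangle covers fetch, combine, and store.

```c
#include <stdint.h>

/* Multiply each 8-bit channel of an a8r8g8b8 pixel by a/255, with rounding. */
static uint32_t mul_un8x4(uint32_t p, uint8_t a)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t c = ((p >> shift) & 0xff) * a + 0x80;
        c = (c + (c >> 8)) >> 8;          /* rounded division by 255 */
        out |= c << shift;
    }
    return out;
}

/* OVER on premultiplied a8r8g8b8: result = src + dst * (255 - src_alpha) / 255. */
static uint32_t combine_over(uint32_t src, uint32_t dst)
{
    uint8_t ia = (uint8_t)(255 - (src >> 24));
    uint32_t d = mul_un8x4(dst, ia);
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t c = ((src >> shift) & 0xff) + ((d >> shift) & 0xff);
        if (c > 255)
            c = 255;                      /* saturate each channel */
        out |= c << shift;
    }
    return out;
}

/* "Fetch": expand one r5g6b5 destination pixel to opaque a8r8g8b8. */
static uint32_t fetch_r5g6b5(uint16_t p)
{
    uint32_t r = (p >> 11) & 0x1f, g = (p >> 5) & 0x3f, b = p & 0x1f;
    r = (r << 3) | (r >> 2);
    g = (g << 2) | (g >> 4);
    b = (b << 3) | (b >> 2);
    return 0xff000000u | (r << 16) | (g << 8) | b;
}

/* "Store": pack an a8r8g8b8 pixel back into r5g6b5 (alpha is dropped). */
static uint16_t store_r5g6b5(uint32_t p)
{
    return (uint16_t)(((p >> 8) & 0xf800) | ((p >> 5) & 0x07e0) | ((p >> 3) & 0x001f));
}

/* Work for one pixel: fetch dest, composite, store dest.
 * Strides are in pixels, not bytes (an assumption of this sketch). */
static void composite_pixel(const uint32_t *src, int src_stride,
                            uint16_t *dst, int dst_stride, int x, int y)
{
    uint32_t s = src[y * src_stride + x];
    uint32_t d = fetch_r5g6b5(dst[y * dst_stride + x]);
    dst[y * dst_stride + x] = store_r5g6b5(combine_over(s, d));
}

/* Whole rectangle in one call, instead of one call per scanline. */
void composite_rect(const uint32_t *src, int src_stride,
                    uint16_t *dst, int dst_stride, int width, int height)
{
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            composite_pixel(src, src_stride, dst, dst_stride, x, y);
}
```

In an OpenCL version, the body of composite_pixel would presumably become the kernel, with x and y taken from get_global_id(0) and get_global_id(1), and the two loops replaced by a 2D global work size of width by height. Whether this matches what pixman's destination handling does for a given format would need checking against the pixman source.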
_______________________________________________ Pixman mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/pixman
