I just finished the first performance tests of my gdalwarp OpenCL code. It's doing better than I expected. I used this command: "time gdalwarp -q -r lanczos -t_srs '+proj=merc +a=6378137.0 +b=6378137.0 +nadgri...@null +wktext +units=m' big_test.tif big_test.out.tif"

I can compile the OpenCL code two different ways. Selecting the CPU as the device compiles a multithreaded version of the code that is distributed across the processors. Selecting the GPU as the device compiles the code to run on my Mac Pro's graphics card, a GeForce GTX 285. To test, I used an 80 MB RGB raster with 8 bits per channel.

With the original lanczos resampler code I get 5:31; with OpenCL on my Mac Pro's 16 cores, 0:39; and with OpenCL on my GTX 285, 0:10. Going by those timings, that's roughly a 33x speedup on the GPU.

Using cubicspline resampling, the original code takes 0:59, the OpenCL CPU code takes 0:13, and the OpenCL GPU code takes 0:08. Still a significant speedup.

And with cubic resampling, the original code takes 0:19, OpenCL CPU takes 0:09, and OpenCL GPU takes 0:07. Still better than twice as fast.
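
For anyone who wants to check the ratios, here is a small Python sketch that turns the timings quoted above into speedup factors. The timings are copied from this message; the helper names (to_seconds, speedups) are made up for illustration and are not part of the GDAL code.

```python
# Timings from this post as (original, OpenCL CPU, OpenCL GPU), in m:ss.
timings = {
    "lanczos": ("5:31", "0:39", "0:10"),
    "cubicspline": ("0:59", "0:13", "0:08"),
    "cubic": ("0:19", "0:09", "0:07"),
}


def to_seconds(mmss):
    """Convert an 'm:ss' string into whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedups(table):
    """Return {resampler: (cpu_speedup, gpu_speedup)} relative to the original code."""
    result = {}
    for name, (orig, cpu, gpu) in table.items():
        base = to_seconds(orig)
        result[name] = (
            round(base / to_seconds(cpu), 1),
            round(base / to_seconds(gpu), 1),
        )
    return result


for name, (cpu_x, gpu_x) in speedups(timings).items():
    print(f"{name:12s} CPU {cpu_x:4.1f}x  GPU {gpu_x:4.1f}x")
```

The GPU wall time only moves between 0:07 and 0:10 across the three resamplers, while the original code ranges from 0:19 to 5:31, so the speedup factor grows with the cost of the kernel.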

Basically, the OpenCL GPU code in all cases is I/O bound. The GPU is laughing and requesting more difficult work.

I haven't tested every type of data or every warping command. If anyone has samples and warping commands for testing, now would be the time to send them to me. I don't know of any GPU bugs in the current code.

Here is my current code:
http://github.com/mailseth/OpenCL-integration-for-GRASS---GDAL

~Seth
_______________________________________________
gdal-dev mailing list
http://lists.osgeo.org/mailman/listinfo/gdal-dev