Hi all,

I finished the last missing pieces of the surface optimization I started a while back. The changes are not that invasive, but they could use some real-life testing before going into master. If no issues are found, I'd like them to be part of the MyPaint 1.1 release.
The code is in the "surface-optimizations" branch of the mainline repository:
http://gitorious.org/mypaint/mypaint/commits/surface-optimizations

Please test! (Check out the branch, then build and run MyPaint as normal.)

== Changes ==

The optimizations follow a three-pronged strategy:

1. Reordering of data access to minimize fetching and updating of tiles.
2. Coarse-grained parallelism using multithreading via OpenMP directives.
3. Fine-grained parallelism using SSE via GCC auto-vectorization.

The MyPaint surface API has a concept of an atomic transaction: surface.begin_atomic() and surface.end_atomic(). Inside such a transaction, we call brush.stroke_to(surface, ...) each time there is a motion event on the canvas. Depending on the brush configuration and current state, this may result in 0 to N surface.draw_dab() calls. N can be on the order of 10-100.

Previously, each draw_dab() call would fetch the affected tiles, process the draw_dab operation and update the tiles with the results. When subsequent draw_dab() calls affected the same tiles, the tiles would be fetched and updated up to N-1 more times than needed.

Now, each time draw_dab() is called, an operation struct is added to a queue for each of the affected tiles before returning. No processing is done at this point. When end_atomic() is called to complete the transaction, the tiles that have pending operations are distributed evenly among the processing threads. The processing of a tile is completely independent of other tiles, allowing it to be done in a lock-free manner.

When a get_color() request is made by the brush engine during a surface transaction, the pending draw_dab operations on the affected tiles must be flushed to return the correct value. Both the flushing and the calculation of the color are done multi-threaded in the same way as above.

Within each thread, SSE-based vectorization is used to process a tile.
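To make the queue-then-flush scheme described above concrete, here is a minimal C sketch. It is not the code from the branch: every name in it (SketchSurface, sketch_draw_dab(), sketch_end_atomic(), sketch_get_alpha()), the fixed-size queue, the single alpha channel and the hard-edged round dab are assumptions made purely for illustration. The "fetches" counter stands in for the cost of fetching and updating a tile.

```c
/* Illustrative sketch only: all names and sizes here are hypothetical,
 * not MyPaint's actual surface code. */
#include <assert.h>
#include <string.h>

#define TILE_SIZE 64
#define TILES_X 4
#define TILES_Y 4
#define MAX_OPS 128

typedef struct { float x, y, radius, opacity; } DabOp;

typedef struct {
    float alpha[TILE_SIZE * TILE_SIZE]; /* one channel, for brevity */
    DabOp ops[MAX_OPS];                 /* dabs queued during the transaction */
    int n_ops;
    int fetches;                        /* times the tile was fetched/updated */
} SketchTile;

typedef struct { SketchTile tiles[TILES_Y][TILES_X]; } SketchSurface;

void sketch_surface_init(SketchSurface *s) { memset(s, 0, sizeof *s); }

static int clamp_tile(int t, int n) { return t < 0 ? 0 : (t >= n ? n - 1 : t); }

/* Queue the dab on every tile its bounding box touches; no pixel work yet.
 * Out-of-range coordinates are simply clamped to the edge tiles. */
void sketch_draw_dab(SketchSurface *s, float x, float y, float radius, float opacity)
{
    int tx0 = clamp_tile((int)((x - radius) / TILE_SIZE), TILES_X);
    int tx1 = clamp_tile((int)((x + radius) / TILE_SIZE), TILES_X);
    int ty0 = clamp_tile((int)((y - radius) / TILE_SIZE), TILES_Y);
    int ty1 = clamp_tile((int)((y + radius) / TILE_SIZE), TILES_Y);
    DabOp op = { x, y, radius, opacity };
    for (int ty = ty0; ty <= ty1; ty++) {
        for (int tx = tx0; tx <= tx1; tx++) {
            SketchTile *t = &s->tiles[ty][tx];
            assert(t->n_ops < MAX_OPS); /* a real queue would grow or flush */
            t->ops[t->n_ops++] = op;
        }
    }
}

/* Apply all pending dabs to one tile, then clear its queue. */
static void process_tile(SketchTile *t, int tx, int ty)
{
    t->fetches++; /* one fetch/update, however many dabs are pending */
    for (int i = 0; i < t->n_ops; i++) {
        const DabOp *op = &t->ops[i];
        for (int py = 0; py < TILE_SIZE; py++) {
            for (int px = 0; px < TILE_SIZE; px++) {
                float dx = tx * TILE_SIZE + px + 0.5f - op->x;
                float dy = ty * TILE_SIZE + py + 0.5f - op->y;
                if (dx * dx + dy * dy <= op->radius * op->radius) {
                    float *a = &t->alpha[py * TILE_SIZE + px];
                    *a += op->opacity * (1.0f - *a); /* simple "over" blend */
                }
            }
        }
    }
    t->n_ops = 0;
}

/* Flush every tile with pending work. Tiles are independent, so no locks are
 * needed. Without -fopenmp the pragma is ignored and the loop runs serially. */
void sketch_end_atomic(SketchSurface *s)
{
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < TILES_X * TILES_Y; i++) {
        SketchTile *t = &s->tiles[i / TILES_X][i % TILES_X];
        if (t->n_ops > 0)
            process_tile(t, i % TILES_X, i / TILES_X);
    }
}

/* Reading a pixel mid-transaction must flush that tile's queue first. */
float sketch_get_alpha(SketchSurface *s, int px, int py)
{
    int tx = px / TILE_SIZE, ty = py / TILE_SIZE;
    assert(px >= 0 && py >= 0 && tx < TILES_X && ty < TILES_Y);
    SketchTile *t = &s->tiles[ty][tx];
    if (t->n_ops > 0)
        process_tile(t, tx, ty);
    return t->alpha[(py % TILE_SIZE) * TILE_SIZE + (px % TILE_SIZE)];
}
```

With this layout, ten dabs landing on the same tile inside one transaction cost a single tile fetch at sketch_end_atomic() instead of ten. For simplicity the sketch iterates over all tiles and skips the idle ones, whereas the text above describes distributing only the tiles that have pending operations.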
Currently the vectorization is limited to part of the brush mask calculation, as the run-length encoding of the masks makes it difficult to auto-vectorize all of the mask calculation and the blending/compositing.

== Results ==

These results are from my laptop, running Arch Linux current.
CPU: Dual-core Intel i5 M520 @ 2.4 GHz, 6 GB RAM

Note: this benchmarks the *raw* surface rendering performance. The user *may* experience speed-ups similar to what is shown here, but only if layer compositing and rendering to screen are not a bottleneck.

http://jonnor.com/files/temp/mypaint-brushengine-opt.png
http://jonnor.com/files/temp/mypaint-brushengine-opt.txt

Take-aways:

* 20% to 50% performance improvements for larger brushes (16 px+) on the currently used Python-based backend.
* Performance does not regress significantly for small brushes; the largest slowdown found was 4%.
* After the changes, the GEGL-based backend is circa 30% faster than the Python-based backend with 1 thread, and twice as fast with 2 threads. A quad-core CPU with 4 threads should see an even higher speedup.

To reproduce:

    scons enable_gegl=true enable_openmp=true # to enable GEGL backend, requires babl+gegl git
    cd brushlib/tests
    export PYTHONPATH=../../lib:../..
    export LD_LIBRARY_PATH=../..
    export GEGL_SWAP=RAM
    export OMP_NUM_THREADS=2
    ../../lib/test-python-surface # current python-based backend
    ./test-gegl-surface # GEGL backend

Look inside mypaint-test-surface.c to see/change the different test cases.

== Future ==

Given that the GEGL backend has significantly higher raw performance, I hope that after we release MyPaint 1.1 we can start the transition to use it instead of our current backend.

I have some more ideas for further improving performance, and am working to document these now.

-- 
Jon Nordby - www.jonnor.com