On 7 October 2013 00:36, brdavs <brd...@yahoo.com> wrote:
> Hi Jon,
>
> Thank you very much for your kind reply.

You are welcome. I'm keeping the mypaint-discuss mailing list CC'ed so
others can follow as well.

>
> I have several questions / comments:
> 1. Why do you think that OpenCL is a better solution
> than doing it purely in OpenGL?
>
> One problem I see is that if you are drawing a lot of
> small dabs, OpenCL will probably be very inefficient,
> because you won't be doing much work for each dab.
> You'll be launching a kernel a lot which is not negligible.

OpenCL is a more general programming model, hence it will probably be
easier to translate existing code and concepts in a meaningful way.
Also, I've programmed GPUs in this manner before (using CUDA), whereas
I've never done non-trivial OpenGL.
If one can make it work in OpenGL/GLSL, that would probably be ideal,
both from a performance and an availability perspective. Any ideas on
how to realize the algorithm there would be very welcome!

>
> In addition to a surface backend, you will likely want a way to
> display the surface on screen - so that is the other thing that needs
> implementing. This could/should be OpenGL based, using the
> OpenCL+OpenGL interoperability if the surf backend is in OpenCL.
>
> For making full advantage of such a backend in MyPaint itself, one
> would also have to implement layers etc. on the GPU side.
>
> 2. Even if you can batch a bunch of dabs together, the dabs will overlap
> and you would have to resort to atomic operations (slower) for proper
> blending.

The CPU backend has an operations queue in which dabs are batched, in
a tile-wise manner. See operationsqueue.c and mypaint-tiledsurface.c.
No concurrency is attempted between individual dabs; ordering is
preserved by the queue. Threads work on individual tiles, with no
synchronization necessary between them.
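
To make the tile-wise batching concrete, here is a minimal sketch in C.
It is not the actual brushlib code: the names (QueuedDab, TileQueue,
DabQueue, queue_add_dab), the 64-pixel tile size, the 4x4 tile surface
and the fixed per-tile capacity are all assumptions made for
illustration. Each dab is appended, in arrival order, to the queue of
every tile its bounding box touches, so a worker can later drain one
tile without looking at any other.

```c
#include <assert.h>
#include <stddef.h>

#define TILE_SIZE 64   /* assumed tile edge length in pixels */
#define TILES_X 4      /* assumed surface size: 4x4 tiles */
#define TILES_Y 4
#define MAX_OPS 16     /* assumed fixed capacity per tile */

/* Hypothetical, heavily simplified dab: position and radius only. */
typedef struct { float x, y, radius; } QueuedDab;

typedef struct {
    QueuedDab ops[MAX_OPS];  /* dabs in arrival order */
    size_t count;
} TileQueue;

typedef struct {
    TileQueue tiles[TILES_Y][TILES_X];
} DabQueue;

/* Index of the tile containing coordinate c (floor division, also for c < 0). */
static int tile_index(float c)
{
    int i = (int)(c / TILE_SIZE);
    if ((float)i * TILE_SIZE > c)
        i--;
    return i;
}

/* Append the dab to the queue of every tile its bounding box touches.
 * Returns the number of tile queues the dab was added to. A full tile
 * queue simply drops the dab here; a real queue would grow or flush. */
int queue_add_dab(DabQueue *q, QueuedDab dab)
{
    assert(q != NULL && dab.radius >= 0.0f);

    int tx0 = tile_index(dab.x - dab.radius);
    int tx1 = tile_index(dab.x + dab.radius);
    int ty0 = tile_index(dab.y - dab.radius);
    int ty1 = tile_index(dab.y + dab.radius);

    /* Clip the tile range to the surface; an empty range adds nothing. */
    if (tx0 < 0) tx0 = 0;
    if (ty0 < 0) ty0 = 0;
    if (tx1 > TILES_X - 1) tx1 = TILES_X - 1;
    if (ty1 > TILES_Y - 1) ty1 = TILES_Y - 1;

    int touched = 0;
    for (int ty = ty0; ty <= ty1; ty++) {
        for (int tx = tx0; tx <= tx1; tx++) {
            TileQueue *t = &q->tiles[ty][tx];
            if (t->count < MAX_OPS) {
                t->ops[t->count++] = dab;
                touched++;
            }
        }
    }
    return touched;
}
```

Since every tile owns its queue, the per-tile order is the stroke
order, and two threads draining different tiles never write to the same
pixels.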

On the GPU one would ideally like every (output) pixel to be
computed concurrently without sync. One could perhaps apply the same
queuing principle, but probably use a vector for the depth - and store
the queue in a form suitable for consumption by individual thread
warps (16/32 threads on the GPUs I am used to).

But starting with the naive one-kernel-per-dab approach is probably
best. Once we have that we can think of smarter ways of doing
things.
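
As a rough picture of the naive approach, here is a sketch in plain C
where one call to draw_dab stands in for one kernel launch and the loop
body stands in for one work item. Everything in it is an assumption for
illustration: the Dab struct, the function names, the RGB float surface
layout and the simple (1 - rr) opacity falloff are not the brushlib dab
formula. The point is only that each pixel reads the dab parameters and
its own old value, so all pixels of one dab could be computed
concurrently, while ordering between dabs comes from launching them one
after another.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dab parameters, passed to every "work item". */
typedef struct {
    float x, y;       /* center in pixel coordinates */
    float radius;     /* in pixels, must be > 0 */
    float opaque;     /* peak opacity in [0, 1] */
    float color[3];   /* RGB in [0, 1] */
} Dab;

/* Body of one work item: blend the dab into a single pixel.
 * It touches only its own pixel, so no synchronization is needed
 * between the pixels of the same dab. */
static void dab_pixel(float *rgb, int px, int py, const Dab *dab)
{
    float dx = ((float)px + 0.5f) - dab->x;
    float dy = ((float)py + 0.5f) - dab->y;
    float rr = (dx * dx + dy * dy) / (dab->radius * dab->radius);
    if (rr >= 1.0f)
        return;  /* pixel center lies outside the dab */

    /* Assumed falloff: full opacity at the center, zero at the edge. */
    float opa = dab->opaque * (1.0f - rr);
    for (int c = 0; c < 3; c++)
        rgb[c] = opa * dab->color[c] + (1.0f - opa) * rgb[c];
}

/* Stand-in for one kernel launch over the whole surface. On a GPU each
 * (x, y) iteration would be a separate work item. */
void draw_dab(float *surface, int width, int height, const Dab *dab)
{
    assert(surface != NULL && dab != NULL && dab->radius > 0.0f);
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            dab_pixel(&surface[((size_t)y * width + x) * 3], x, y, dab);
}
```

A first improvement would be to launch only over the bounding box of
the dab instead of the whole surface, which matters a lot for the many
small dabs mentioned above.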

>
> 3. Is there a super simple working example of the brushlib
> painting a (hardwired) stroke with a configurable brush?
>
> I would be interested in trying a few simple things in
> OpenCL (no tiling, etc.) and would be good to have
> a simple starting point in C/C++ and a reference
> implementation.

Sadly the "minimal.c" example is broken right now. The API usage as
shown is fine; it's just the simple surface that is borked - wrong
buffer stride handling in the backend. I will have a look at fixing it
up later, but I would not wait for it.
For GPU things, you'll need a lot of additional scaffolding (OpenCL
setup etc), so maybe copy minimal.c and start adding that. I suggest
you put the code in a subdirectory as it will have extra dependencies,
for instance "brushlib/cl/".

> BlackInk seems to have a very impressive paint engine implemented
> on the GPU:
> http://www.bleank.com/BlackInk-a115.html
>
>
> Marko
>



-- 
Jon Nordby - www.jonnor.com
