On Thursday 22 December 2011 07:07:13, Dan Dennedy wrote:
> > Yes, OpenCL has some interesting features. But at the moment, my first
> > experiments are quite disappointing.
> > First, it's really difficult to optimize for speed. It really requires
> > intimate knowledge of the targeted hardware to get the best from it.
> > I've written a bicubic scaling filter for both OpenGL and OpenCL, and
> > after several hours of reading, adjusting and testing using all
> > available NVIDIA GPU documentation (mostly found in CUDA), the OpenCL
> > version is still far from OpenGL performance. To upscale a 720x576
> > image to 1920x1080, OpenCL is about 3x slower: 323 fps vs 1150.
>
> Interesting findings, and maybe this is due to immaturity of OpenCL
> implementations. Thank you for the update. GLSL is interesting too. I
> believe both avenues may face inconsistent levels of support across
> vendors and chips.

Yes, OpenCL implementations are probably immature. NVIDIA's is more or less a CUDA wrapper, and I've discovered that CUDA can't write to textures (image2d_t in OpenCL), so I guess NVIDIA's OpenCL image writing is nothing but a big (and slow) hack. Using global (GPU) memory instead runs a bit faster (400 vs 320 fps).

The good news is that OpenCL and OpenGL can share buffers (on GPU implementations), so OpenCL could be used for general-purpose algorithms while GLSL would give its best on pixels.

Anyway, I now have to try this inside MLT. But first I have a few questions and suggestions.

- Is it possible to constrain the list of available effects (filters, transitions...) based on a consumer property or a factory_init option? The idea is to avoid mixing GPU and CPU effects, because even with PBOs, downloading data from the GPU is really slow. That would probably negate any performance gain. I understand that this requires a full set of GLSL effects to be implemented, but most would just be a port.

- How does MLT make use of multithreading? I guess it runs producers in separate threads, but is this also the case for filters? Multithreading is not really an option with OpenGL: binding/releasing a context is incredibly slow (40 fps vs 2200), so thread synchronization must happen right before entering the GL path.

- Should we have a GL consumer that takes care of display in a window provided by the frontend (similar to the SDL consumer), or should the frontend take care of context creation and GL initialisation? The latter may have some advantages. For example, a Qt frontend could make sure to have a context compatible with QtopenglPaintEngine and take advantage of QPainter to do overpainting (e.g. for title or image placement).

- How do you retrieve the native image data format of a frame? GLSL can upload any uchar data and do colorspace conversion (csc) in shaders, so it would be a waste of resources to have the CPU do the conversion first.

- Is it possible to disable an effect based on runtime checks? E.g. if an effect requires GLSL 4.2 and the running platform has GLSL 1.5, it should be disabled.

P.S. Since I'm still not sure I understand MLT's framework well, some of the above may make no sense :)

-- 
Christophe Thommeret

_______________________________________________
Mlt-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mlt-devel
