Thanks for investigating this topic. Here are some comments from me:
1) I support an attempt to reduce unnecessary image format conversions. Your 
focus is on track compositing. But it would also be good if we could find a 
general pattern that would work for filters as well. Some filters can operate 
on multiple image formats. But there is not a good way for the filter to know 
what image format to request from get_image to result in the fewest conversions.
2) I am not excited about adding more parameters to the get_image function. But 
I understand this might be a necessary concession to improve efficiency.
3) I have a long-term vision to replace the current get_image function which 
takes multiple parameters with a new version that takes a single mlt_image 
structure. In general, I would like to increase the use of mlt_image objects in 
the framework. In my current vision, we could add a new get_image function that 
passes an mlt_image and then create a wrapper that allows legacy services to 
continue to work until they are converted to use mlt_image. Maybe your 
suggestion could help to move us in that direction.
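To make that a bit more concrete, here is a very rough sketch of the kind of thing I am imagining (the names below are hypothetical, not existing API): the new entry point would take a single mlt_image, and a wrapper could adapt legacy services until they are converted.

    #include <framework/mlt.h>
    #include <stdint.h>

    /* Hypothetical new-style callback: one mlt_image instead of five parameters. */
    typedef int (*mlt_get_image_v2)(mlt_frame frame, mlt_image image, int writable);

    /* Sketch of a wrapper that lets a legacy service keep working: it calls the
     * existing multi-parameter get_image and copies the results into the struct. */
    static int legacy_get_image_wrapper(mlt_frame frame, mlt_image image, int writable)
    {
        uint8_t *buffer = NULL;
        mlt_image_format format = image->format;
        int width = image->width;
        int height = image->height;
        int error = mlt_frame_get_image(frame, &buffer, &format, &width, &height, writable);
        if (!error)
            mlt_image_set_values(image, buffer, format, width, height);
        return error;
    }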
4) I had previously thought of an idea to add a new function called 
get_image_dry_run() (or something like that) which would allow a service to 
query the next service/producer to find out what format would be returned. I 
think the function could return the image type (or maybe a list of possible 
types) that would be returned when the service calls get_image(). The downside 
to this idea is that someone would need to implement this new function for all 
services. But maybe there could be a default implementation that would just be 
a pass-through and we would only need to implement the new function for some 
key services.
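Roughly, I picture something like this (purely hypothetical names; the "next" pointer just stands in for whatever chaining mechanism we would use):

    #include <framework/mlt.h>

    /* Hypothetical: report the format a later get_image() call would return. */
    typedef mlt_image_format (*get_image_dry_run_fn)(mlt_frame frame);

    /* Default implementation: a pure pass-through to the next service/producer. */
    static mlt_image_format default_get_image_dry_run(mlt_frame frame, get_image_dry_run_fn next)
    {
        return next ? next(frame) : mlt_image_none;
    }

    /* A key service such as an image producer could answer directly, e.g. from a
     * hypothetical "has_alpha" property it set when loading the file. */
    static mlt_image_format image_producer_dry_run(mlt_frame frame)
    {
        return mlt_properties_get_int(MLT_FRAME_PROPERTIES(frame), "has_alpha")
            ? mlt_image_rgba
            : mlt_image_rgb;
    }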
5) As you mentioned, our current method to find out if an image has any 
non-zero alpha pixels is to request the alpha mask and then scan the returned 
mask for such values. I wonder if an image should have a variable 
called "alpha_status" that can have one of three values: "alpha", "no_alpha" 
and "unknown". If the framework is certain that there is no alpha (for example, 
because it just converted RGB to RGBA), then it would set the status to 
"no_alpha". In the case of unknown, a service could request a full scan of the 
alpha channel to determine the status. Some accounting would be required to 
make sure that services are setting the variable appropriately when modifying 
an image.
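To sketch the idea (the enum and property name are only illustrative, not existing API):

    #include <framework/mlt.h>
    #include <stdint.h>

    /* Hypothetical status values; stored in a frame property for illustration. */
    typedef enum {
        alpha_status_unknown = 0,  /* nobody has checked yet */
        alpha_status_no_alpha,     /* framework is certain every pixel is opaque */
        alpha_status_alpha         /* at least one pixel is not fully opaque */
    } alpha_status;

    /* Example of the accounting: a conversion that cannot introduce transparency
     * (e.g. rgb promoted to rgba) can assert "no_alpha" with no pixel scan. */
    static void mark_opaque(mlt_frame frame)
    {
        mlt_properties_set_int(MLT_FRAME_PROPERTIES(frame), "alpha_status",
            alpha_status_no_alpha);
    }

    /* A service that needs certainty requests a full scan only when the status is
     * unknown, then caches the answer for everyone downstream. */
    static int frame_has_alpha(mlt_frame frame, const uint8_t *rgba, int width, int height)
    {
        mlt_properties properties = MLT_FRAME_PROPERTIES(frame);
        int status = mlt_properties_get_int(properties, "alpha_status");
        if (status == alpha_status_unknown) {
            status = alpha_status_no_alpha;
            for (int i = 0; i < width * height; i++) {
                if (rgba[4 * i + 3] != 0xff) {
                    status = alpha_status_alpha;
                    break;
                }
            }
            mlt_properties_set_int(properties, "alpha_status", status);
        }
        return status == alpha_status_alpha;
    }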
6) It would be good if we had a better way to measure the number of conversions 
that are occurring on an image/frame for a given configuration. My current 
thought is that if we can make the use of mlt_image more widespread, then we 
could add a "conversion_count" variable that gets incremented every time a 
conversion occurs on the image. Maybe there are other ideas that could help us 
better understand how many conversions are occurring.
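As an illustration, the counting itself could be as small as this (property name hypothetical), called from wherever a conversion actually runs:

    #include <framework/mlt.h>

    /* Hypothetical accounting hook: invoked by whatever code performs an actual
     * image format conversion; a consumer or test can then read the total. */
    static void count_image_conversion(mlt_frame frame)
    {
        mlt_properties properties = MLT_FRAME_PROPERTIES(frame);
        mlt_properties_set_int(properties, "conversion_count",
            mlt_properties_get_int(properties, "conversion_count") + 1);
    }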
Those are some comments from me. I would be interested in continued development 
of this optimization since it could have a huge impact on the framework 
performance.
Regards,
~Brian


    On Friday, September 2, 2022 at 12:38:40 PM CDT, jb <j...@kdenlive.org> 
wrote:  
 
 Hi all,

In Kdenlive, we often have projects with multiple tracks, and track 
compositing is one of the bottlenecks we have with MLT.

Shotcut uses the frei0r.cairoblend transition for compositing. cairoblend first 
requests an rgba image for the top (b_frame), and checks if it really has 
requests an rgba image for the top (b_frame), and checks if it really has 
transparency with an "is_opaque" function that checks the alpha value of every 
pixel in the image. The downside is that if we have an opaque video on the top 
track, this causes a noticeable slowdown when playing.

In the qtblend transition, I tried to work around it by first requesting the 
top (b_frame) image in the consumer-requested format (usually yuv). Then we 
check if the frame has an alpha channel using the "mlt_frame_get_alpha" 
function. If there is no alpha, we directly return the frame, otherwise we 
do another request for an rgba image to process. The advantage over cairoblend 
is that there is almost no overhead when the top track contains an opaque 
video. The downside is that there is a noticeable overhead when there is an 
alpha channel, as we first request a yuv frame and then an rgba one.
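In simplified form, the pattern looks roughly like this (a sketch of the shape of it, not the actual qtblend source):

    #include <framework/mlt.h>
    #include <stdint.h>

    static int transition_get_image(mlt_frame a_frame, uint8_t **image,
        mlt_image_format *format, int *width, int *height, int writable)
    {
        (void) writable; /* unused in this sketch */
        mlt_frame b_frame = mlt_frame_pop_frame(a_frame);

        /* 1. Request the top frame in the consumer's format (usually yuv). */
        uint8_t *b_image = NULL;
        int error = mlt_frame_get_image(b_frame, &b_image, format, width, height, 0);
        if (error)
            return error;

        /* 2. No alpha channel on the frame: the top track is opaque, so we can
         *    return it directly with no rgba round trip. */
        if (!mlt_frame_get_alpha(b_frame)) {
            *image = b_image;
            return 0;
        }

        /* 3. Otherwise pay for a second request in rgba and blend normally.
         *    This is the extra conversion we would like to avoid. */
        mlt_image_format rgba = mlt_image_rgba;
        error = mlt_frame_get_image(b_frame, &b_image, &rgba, width, height, 0);
        /* ... composite b over a here ... */
        return error;
    }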

So to improve this and try to reduce yuv <> rgb conversions, I was thinking of 
adding a property in all producers' "producer_get_frame" method (that is defined 
before we attempt to fetch an image) to inform the framework about the 
default mlt_image_format produced by the producer. For example, the image 
producer would set the "format" property to mlt_image_rgba if it is an image 
with an alpha channel, and mlt_image_rgb if there is no alpha channel. This 
information could then be retrieved by the framework to optimize the frame 
conversion processes.

In the case of my qtblend transition, if we know that the producer creates 
rgba images by default, we can directly retrieve the rgba frame, leading to 
much better performance.
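A rough sketch of what I mean (simplified; exact property names such as "has_alpha" are only illustrative and open for discussion):

    #include <framework/mlt.h>

    /* Producer side: announce the default format on the frame before any image
     * is requested. */
    static int producer_get_frame(mlt_producer producer, mlt_frame_ptr frame, int index)
    {
        (void) index;
        *frame = mlt_frame_init(MLT_PRODUCER_SERVICE(producer));
        /* ... usual frame setup and get_image stacking ... */
        int has_alpha = mlt_properties_get_int(MLT_PRODUCER_PROPERTIES(producer), "has_alpha");
        mlt_properties_set_int(MLT_FRAME_PROPERTIES(*frame), "format",
            has_alpha ? mlt_image_rgba : mlt_image_rgb);
        return 0;
    }

    /* Transition side (e.g. qtblend): if the producer says rgba, request rgba
     * right away instead of fetching yuv first and converting later. */
    static mlt_image_format choose_request_format(mlt_frame b_frame, mlt_image_format consumer_format)
    {
        int hint = mlt_properties_get_int(MLT_FRAME_PROPERTIES(b_frame), "format");
        return hint == mlt_image_rgba ? mlt_image_rgba : consumer_format;
    }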

Of course, effects can change the format of a frame, adding or removing an 
alpha channel, but in many cases this would allow us to optimize the 
performance.

What do you think of this? If the idea seems OK, I volunteer to produce a 
patch implementing the feature (I already tested it for qimage and 
kdenlivetitle producers).

If you don't like the idea, do you have a better one for implementing a 
high-performance transition that would work efficiently for both cases (top 
frames with and without an alpha channel)?

Thanks in advance and best regards,

Jean-Baptiste






_______________________________________________
Mlt-devel mailing list
Mlt-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mlt-devel
  