The new backend is an extension of the existing Torch backend rather than a 
separate implementation.

Inference in CLIP differs from other models in that it encodes (embeds) both 
images and tokenized text labels, then computes the similarity between the 
resulting embedding vectors. As a result, its forward pass takes two inputs 
and produces two outputs.
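
To illustrate the similarity step, here is a minimal, self-contained C++ 
sketch. It is not code from the patch: the names cosine_similarity and 
label_probabilities are hypothetical, the embeddings are plain float vectors 
rather than Torch tensors, and the default logit scale of 100 is an assumption 
based on commonly published CLIP checkpoints.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical alias: one embedding as produced by an image or text encoder.
using Embedding = std::vector<float>;

// Cosine similarity between two embeddings of equal length.
float cosine_similarity(const Embedding &a, const Embedding &b)
{
    assert(a.size() == b.size());
    float dot = 0.0f, norm_a = 0.0f, norm_b = 0.0f;
    for (std::size_t i = 0; i < a.size(); i++) {
        dot    += a[i] * b[i];
        norm_a += a[i] * a[i];
        norm_b += b[i] * b[i];
    }
    // Small epsilon guards against division by zero for all-zero vectors.
    return dot / (std::sqrt(norm_a) * std::sqrt(norm_b) + 1e-12f);
}

// Turn the similarities between one image embedding and N label embeddings
// into N probabilities with a numerically stable softmax.
// logit_scale is an assumed value, not taken from the patch.
std::vector<float> label_probabilities(const Embedding &image,
                                       const std::vector<Embedding> &labels,
                                       float logit_scale = 100.0f)
{
    std::vector<float> probs;
    probs.reserve(labels.size());
    float max_logit = std::numeric_limits<float>::lowest();
    for (const Embedding &label : labels) {
        float logit = logit_scale * cosine_similarity(image, label);
        probs.push_back(logit);
        max_logit = std::max(max_logit, logit);
    }
    float sum = 0.0f;
    for (float &p : probs) {
        p = std::exp(p - max_logit); // subtract the maximum for stability
        sum += p;
    }
    for (float &p : probs)
        p /= sum;
    return probs;
}
```

Calling label_probabilities with one image embedding and one embedding per 
text label returns one probability per label. The label whose embedding points 
in the direction closest to the image embedding gets the highest score.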

To ensure clarity and modularity, I have created a separate 
dnn_torch_backend_clip file instead of expanding dnn_torch_backend. This keeps 
the main file manageable and allows for easy exclusion from the build when the 
tokenizer-cpp library is not included.

If preferred, I can implement a standalone tokenizer class, integrate it into 
e.g. libavutil, and move the remaining code to the backend.

-----Original Message-----
From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of Guo, Yejun
Sent: Tuesday, 18 February 2025 11:09
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH] avfilter/dnn: add zero-shot image 
classification using CLIP models



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of 
> Maximilian Kaindl
> Sent: Tuesday, February 18, 2025 12:29 AM
> To: FFmpeg development discussions and patches <ffmpeg-de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] avfilter/dnn: add zero-shot image 
> classification using CLIP models
> 
> Hello Yejun Guo,
> 
> Yes, I can do that and submit it in another patch. Do you also have 
> some feedback on the CLIP backend? I have already made some small 
> changes (CUDA acceleration and new preprocessing) that I will submit 
> along with the other patch, but I would like to hear your thoughts.
> 
Could you share why we need a new backend?

> Thanks
> 
> ________________________________
> From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> on behalf of Guo, 
> Yejun <yejun.guo-at-intel....@ffmpeg.org>
> Sent: Sunday, February 16, 2025 7:09 AM
> To: FFmpeg development discussions and patches <ffmpeg-de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] avfilter/dnn: add zero-shot image 
> classification using CLIP models
> 
> 
> 
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of 
> > m.kaindl0...@gmail.com
> > Sent: Thursday, January 30, 2025 4:33 AM
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: [FFmpeg-devel] [PATCH] avfilter/dnn: add zero-shot image 
> > classification using CLIP models
> >
> > Add a new filter 'dnn_clip' that performs zero-shot image 
> > classification using CLIP (Contrastive Language-Image Pre-Training) models.
> > The filter supports:
> 
> For image classification with new dnn models, we'd better add the new 
> model support to dnn_classify; see 
> https://ffmpeg.org/ffmpeg-filters.html#dnn_005fclassify
> 
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email 
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".