The new backend is an extension of the existing Torch backend rather than a separate implementation.
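For reference, zero-shot classification with CLIP comes down to comparing one image embedding against one text embedding per label. The sketch below is only an illustration of that comparison step, not code from the patch: the embeddings are made-up toy values, the names l2_normalize and zero_shot_scores are hypothetical, and the scale factor of 100 is an assumed value for the model's learned temperature.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_scores(image_embedding, text_embeddings, logit_scale=100.0):
    """Return one probability per text label for a single image embedding.

    logit_scale is an assumed temperature; a real model supplies its own value.
    """
    image = l2_normalize(image_embedding)
    logits = []
    for text in text_embeddings:
        text = l2_normalize(text)
        cosine = sum(a * b for a, b in zip(image, text))
        logits.append(logit_scale * cosine)

    # Numerically stable softmax over the scaled similarities.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy stand-ins for the two outputs of the forward pass:
# one image embedding and one text embedding per label.
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
image_embedding = [0.9, 0.1, 0.2]
text_embeddings = [
    [1.0, 0.0, 0.1],
    [0.1, 1.0, 0.0],
    [0.0, 0.2, 1.0],
]

probs = zero_shot_scores(image_embedding, text_embeddings)
best = max(range(len(labels)), key=lambda i: probs[i])
print(labels[best], round(probs[best], 3))
```

In a real pipeline the two embeddings would come from the model's image and text encoders; only the comparison step is shown here.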
CLIP inference differs from the other models because it encodes (embeds) both the image and the tokenized text labels, then computes the similarity between the resulting vectors. Its forward pass therefore takes two inputs and produces two outputs. To keep this clear and modular, I created a separate dnn_torch_backend_clip file instead of expanding dnn_torch_backend. This keeps the main file manageable and makes it easy to exclude the CLIP code from the build when the tokenizer-cpp library is not available. If preferred, I can implement a standalone tokenizer class, integrate it into e.g. libavutil, and move the remaining code into the Torch backend.

-----Original Message-----
From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of Guo, Yejun
Sent: Tuesday, 18 February 2025 11:09
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH] avfilter/dnn: add zero-shot image classification using CLIP models

> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of Maximilian Kaindl
> Sent: Tuesday, February 18, 2025 12:29 AM
> To: FFmpeg development discussions and patches <ffmpeg-de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] avfilter/dnn: add zero-shot image classification using CLIP models
>
> Hello Yejun Guo,
>
> yes i can do that and submit it in another patch. Do you also have
> some feedback for the clip backend? I have already made some small
> changes (cuda accel and new preprocessing) that i will submit along
> with the other patch, but i would like to hear your thoughts.
>
> Could you share why we need a new backend?
> Thanks
>
> ________________________________
> From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> on behalf of Guo, Yejun <yejun.guo-at-intel....@ffmpeg.org>
> Sent: Sunday, February 16, 2025 7:09 AM
> To: FFmpeg development discussions and patches <ffmpeg-de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] avfilter/dnn: add zero-shot image classification using CLIP models
>
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of m.kaindl0...@gmail.com
> > Sent: Thursday, January 30, 2025 4:33 AM
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: [FFmpeg-devel] [PATCH] avfilter/dnn: add zero-shot image classification using CLIP models
> >
> > Add a new filter 'dnn_clip' that performs zero-shot image
> > classification using CLIP (Contrastive Language-Image Pre-Training) models.
> > The filter supports:
>
> For image classification with new dnn models, we'd better add the new
> model support with dnn_classify at https://ffmpeg.org/ffmpeg-filters.html#dnn_005fclassify
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".