https://bugs.kde.org/show_bug.cgi?id=515936

            Bug ID: 515936
           Summary: Support for Semantic Image Search using CLIP-ViT-H-14
                    models
    Classification: Applications
           Product: digikam
Version First Reported In: 9.0.0
          Platform: Other
                OS: Other
            Status: REPORTED
          Severity: wishlist
          Priority: NOR
         Component: Searches-Engine
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

SUMMARY
I would like to request the integration of the CLIP-ViT-H-14 multimodal model
into digiKam to enable advanced semantic search and automated image tagging.

RATIONALE
Currently, digiKam relies on metadata (EXIF/IPTC) and basic AI tools for face
detection and quality analysis. Adding a CLIP (Contrastive Language-Image
Pre-training) backbone would allow users to:
- Search by Natural Language: Search for images using descriptive phrases
  (e.g., "sunset over mountains with a red car") without needing manual tags.
- Improved Visual Similarity: Find "more images like this" with much higher
  accuracy than current color-based histograms.
- Automated Keyword Suggestion: Use the ViT-H-14 model to generate high-quality
  semantic keywords for a collection.
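
To illustrate the search and similarity use cases listed above, here is a
minimal sketch of how ranking by embedding similarity could work. It is not
digiKam code: the function names, file names, and the 4-dimensional vectors are
all hypothetical stand-ins, and a real setup would obtain the embeddings from
the CLIP text and image encoders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def rank_images(query_embedding, image_embeddings, top_k=3):
    """Return the top_k (image_id, score) pairs, best match first."""
    scored = [
        (image_id, cosine_similarity(query_embedding, emb))
        for image_id, emb in image_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy data: real CLIP embeddings are much longer vectors
# produced by the model, not hand-written numbers.
toy_index = {
    "sunset_mountains.jpg": [0.9, 0.1, 0.0, 0.2],
    "city_street.jpg": [0.1, 0.8, 0.3, 0.0],
    "red_car_sunset.jpg": [0.8, 0.2, 0.1, 0.5],
}
# Stand-in for the text embedding of "sunset over mountains with a red car".
toy_query = [0.85, 0.15, 0.05, 0.4]

results = rank_images(toy_query, toy_index, top_k=2)
for image_id, score in results:
    print(f"{image_id}: {score:.3f}")
```

The same ranking function would cover "more images like this" if the query
vector were an image embedding instead of a text embedding, since CLIP places
both in a shared space.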

TECHNICAL SUGGESTIONS
- Model: CLIP-ViT-H-14-laion2B-s32B-b79K, an OpenCLIP checkpoint trained on
  LAION-2B, is widely used for open-source semantic embeddings.
- Implementation: This could be integrated into the existing "Maintenance" or
  "Search" sidebar. Since digiKam already uses OpenCV and deep learning engines
  for face recognition, this model could leverage the same GPU acceleration
  infrastructure.
- Performance: While ViT-H-14 is large, it provides significantly better
  "zero-shot" understanding than the smaller ViT-B models, making it well
  suited to professional photography management.
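
As a rough illustration of the "zero-shot" behaviour mentioned above, the
sketch below scores an image against a list of candidate keywords by taking a
softmax over scaled cosine similarities. Everything in it is an assumption for
illustration: the function names, the labels, the toy 4-dimensional vectors,
and the threshold. CLIP-style models apply a learned scale before the softmax;
the value 10 used here was picked only to keep the toy numbers readable.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def suggest_keywords(image_embedding, label_embeddings, scale=10.0, threshold=0.1):
    """Return (label, probability) pairs at or above threshold, best first."""
    labels = list(label_embeddings)
    logits = [
        scale * cosine_similarity(image_embedding, label_embeddings[label])
        for label in labels
    ]
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return [(label, p) for label, p in ranked if p >= threshold]


# Hypothetical toy data standing in for real CLIP image and text embeddings.
toy_image = [0.8, 0.2, 0.1, 0.5]
toy_labels = {
    "sunset": [0.9, 0.1, 0.0, 0.2],
    "car": [0.3, 0.1, 0.1, 0.9],
    "street": [0.1, 0.8, 0.3, 0.0],
}

suggestions = suggest_keywords(toy_image, toy_labels)
for label, prob in suggestions:
    print(f"{label}: {prob:.2f}")
```

With these toy vectors, "sunset" and "car" pass the threshold and "street" does
not. No keyword-specific training is involved, only a comparison between the
image embedding and the embedding of each candidate label.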

ADDITIONAL CONTEXT
Other open-source photo managers (such as Immich, PhotoPrism, and Photochat AI)
have successfully implemented CLIP-based search. Bringing this to digiKam would
help maintain its position as the premier advanced photo management suite for
the KDE community.
