https://bugs.kde.org/show_bug.cgi?id=514908

            Bug ID: 514908
           Summary: Wishlist: Integrate local LLM/Vision Model support for
                    AI-powered image captioning and tagging
    Classification: Applications
           Product: digikam
Version First Reported In: unspecified
          Platform: Other
                OS: Other
            Status: REPORTED
          Severity: wishlist
          Priority: NOR
         Component: Tags-Engine
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

Created attachment 188757
  --> https://bugs.kde.org/attachment.cgi?id=188757&action=edit
URL for the Github Repo

Feature Goal:
Integrate automated, high-quality image captioning and keyword generation using
local Vision-Language Models (VLM), similar to the functionality in the
ImageIndexer tool by jabberjabberjabber.

Specific Features to Adopt:
Local LLM Integration: Support for backends such as KoboldCPP, Ollama, or a
similar local model runner, so images are processed locally without privacy
concerns.
Automated Captioning: Use AI to generate natural language descriptions of
images (e.g., "A golden retriever playing with a blue ball in a sunny park").
Advanced Tagging: Extract specific keywords from the AI-generated captions to
populate the digiKam Tags hierarchy automatically.
Batch Processing: The ability to run this "indexing" over a selection of images
or an entire album as a background task.
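
To make the requested workflow concrete, here is a minimal Python sketch of
the caption-then-tag flow, assuming an Ollama server on its default local
port with a vision-capable model already pulled. The model name "llava", the
prompt, the stopword list, and all function names are illustrative
assumptions, not part of digiKam or ImageIndexer. The keyword step is a
deliberately naive word filter and only stands in for whatever extraction
digiKam would actually use.

```python
import base64
import json
import re
import urllib.request

# Assumption: Ollama running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: any vision-capable model that is already pulled locally.
MODEL = "llava"

# Hypothetical, tiny stopword list for the naive keyword step.
STOPWORDS = {"a", "an", "the", "with", "in", "on", "of", "and", "at", "is", "to"}


def build_request(image_bytes, prompt="Describe this image in one sentence."):
    """Build the JSON payload for a single non-streaming caption request."""
    return {
        "model": MODEL,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def caption_image(path):
    """Send one image to the local server and return the caption text."""
    with open(path, "rb") as f:
        payload = json.dumps(build_request(f.read())).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"].strip()


def caption_to_keywords(caption):
    """Naive extraction: lowercase words minus stopwords, order kept, no repeats."""
    keywords = []
    for word in re.findall(r"[a-z]+", caption.lower()):
        if word not in STOPWORDS and word not in keywords:
            keywords.append(word)
    return keywords


def index_images(paths):
    """Batch step: map each image path to its caption and candidate tags."""
    results = {}
    for path in paths:
        caption = caption_image(path)
        results[path] = {"caption": caption, "tags": caption_to_keywords(caption)}
    return results
```

With the example caption from above, caption_to_keywords() would yield
golden, retriever, playing, blue, ball, sunny, park as candidate tags. In
digiKam itself the batch loop would presumably run as a background job and
write the results to the Tags hierarchy and caption fields rather than
return a dictionary.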

Why this is needed:
Current AI tagging in digiKam is often limited to basic object detection (e.g.,
"dog," "car"). Modern VLMs can provide context, mood, and detailed descriptions
that significantly enhance the searchability of large photo collections.
