https://bugs.kde.org/show_bug.cgi?id=384444
Bug ID: 384444
Summary: Wish: support "remote metadata services" (eg. AI based
image tagging like Clarifai.com)
Product: digikam
Version: 5.6.0
Platform: Appimage
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: NOR
Component: Metadata-Hub
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: ---
See discussion on the mailing list today:
Original post: (by me)
"I want to use Clarifai to do some automatic tagging (www.clarifai.com)
experiments since my tagging requirements take *way* too much time. (Actually,
I think this would be a great feature for Digikam by default - automated AI
based tagging ... ;) )
Basically I'm looking for a way to select some images and say "Send these
images to Clarifai via API and save the returned image tags to the image
metadata".
Is this possible at all without resorting to C++ and hacking the Digikam source
itself?
I know I can create a batch operation with a custom shell script but this shell
script expects a *different* image as the output - while I just want to update
the metadata."
Reply (Gilles):
"This is a very interesting subject, but there is no simple answer to your
question.
You cannot easily connect the digiKam database to this kind of remote web
service. Only C++ code can do it. ..."
Reply (Andrey Goreev):
"I second this one.
Large corporations such as Google and Microsoft have similar services embedded
in their products (e.g. MS OneDrive and Google Photos), but none of them let you
download your data because they want you to stay hooked on their services. If
digiKam were capable of getting keywords from a server via an API and writing
them to the database/metadata/sidecars using Exiv2, that would be a great
feature."
My reply:
"There are several such services that allow AI operations on images via API.
Clarifai is just the (currently) most popular and best one - here's a
comparison:
https://www.quora.com/Which-company-has-the-best-image-recognition-APIs-in-the-market-place-today-What-are-they-charging
The data that these services return varies. Some return tags, some return
descriptions, some descriptions are multilingual, and some handle videos as well
(frame by frame or second by second). Most services are asynchronous (i.e. you
upload a bunch of images and then check later for the metadata, in a background
job). We need to decide what to do with the returned metadata (for text:
overwrite or append? For tags: create them in a subtree? Allow all tags or only
a whitelist? Detect and reuse renamed/moved tags? etc).
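
To make the tag questions above concrete, here is a minimal sketch in Python of
one possible policy. Everything in it is an assumption for illustration only:
the names TagPolicy and merge_tags, the "Auto" subtree, and the choice that
"overwrite" replaces only earlier automatic tags and leaves manual tags alone.
Nothing here is existing digiKam code, and it does not cover detecting
renamed/moved tags.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class TagPolicy:
    """Hypothetical per-service options for handling returned tags."""
    subtree: str = "Auto"                  # parent tag for all automatic tags
    whitelist: Optional[Set[str]] = None   # lowercase names; None allows all
    overwrite: bool = False                # True: replace earlier automatic tags


def merge_tags(existing: Iterable[str], returned: Iterable[str],
               policy: TagPolicy) -> List[str]:
    """Combine the tags already on an image with tags returned by a service."""
    prefix = policy.subtree + "/"

    # Filter the returned tags and place them under the subtree.
    accepted = []
    for tag in returned:
        name = tag.strip()
        if not name:
            continue
        if policy.whitelist is not None and name.lower() not in policy.whitelist:
            continue
        accepted.append(prefix + name)

    # In overwrite mode drop old automatic tags, but keep manual ones.
    if policy.overwrite:
        base = [t for t in existing if not t.startswith(prefix)]
    else:
        base = list(existing)

    # Preserve order and drop case-insensitive duplicates.
    seen = set()
    merged = []
    for tag in base + accepted:
        key = tag.lower()
        if key not in seen:
            seen.add(key)
            merged.append(tag)
    return merged
```

With the default policy, tags are appended under "Auto/" and existing tags are
left untouched; with overwrite=True, only tags under "Auto/" are replaced.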
I think a generic 'upload image and then download metadata' concept in Digikam
which allows plugging in many of these services makes sense. The interface is
probably always HTTP(S)-based, so most of the required code probably already
exists. We just need a way to use it in Digikam and an options dialog for each
service (for the API key, maybe POST and GET URLs, supported file formats,
returned data, etc).
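
As a rough illustration of that concept, here is a minimal sketch in Python of a
pluggable service interface with an asynchronous upload-then-poll loop. All
names (ServiceOptions, TaggingService, tag_images, FakeService) and the example
URLs are hypothetical; no real service API is called, and a real backend would
do its HTTP(S) requests inside submit() and fetch().

```python
import time
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Optional, Protocol, Tuple


@dataclass
class ServiceOptions:
    """Hypothetical contents of a per-service options dialog."""
    api_key: str
    post_url: str
    get_url: str
    supported_formats: Tuple[str, ...] = (".jpg", ".png")


class TaggingService(Protocol):
    """What each plugged-in service would have to provide (assumed interface)."""
    options: ServiceOptions

    def submit(self, image: Path) -> str: ...                 # returns a job id
    def fetch(self, job_id: str) -> Optional[List[str]]: ...  # None: not ready


def tag_images(service: TaggingService, images: Iterable[Path],
               poll_interval: float = 1.0, max_polls: int = 10,
               sleep: Callable[[float], None] = time.sleep) -> Dict[Path, List[str]]:
    """Upload all supported images, then poll until results arrive."""
    pending: Dict[str, Path] = {}
    for image in images:
        if image.suffix.lower() in service.options.supported_formats:
            pending[service.submit(image)] = image

    results: Dict[Path, List[str]] = {}
    for _ in range(max_polls):
        if not pending:
            break
        for job_id in list(pending):
            tags = service.fetch(job_id)
            if tags is not None:
                results[pending.pop(job_id)] = tags
        if pending:
            sleep(poll_interval)
    return results  # images still pending after max_polls are left out


class FakeService:
    """Stand-in backend for trying the loop; answers on the second poll."""

    def __init__(self, options: ServiceOptions) -> None:
        self.options = options
        self._polls: Dict[str, int] = {}

    def submit(self, image: Path) -> str:
        job_id = f"job-{len(self._polls)}"
        self._polls[job_id] = 0
        return job_id

    def fetch(self, job_id: str) -> Optional[List[str]]:
        self._polls[job_id] += 1
        return ["example-tag"] if self._polls[job_id] >= 2 else None
```

The loop never touches service-specific details, so adding another service
would only mean writing another class with submit() and fetch().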
Oh, and PS: Google and MS provide Vision APIs to do the same thing they do in
their own photo apps. We could plug them in too. The APIs are just not free
forever; there is a quota. :-)"
--------
I'm going to put a bounty of €50 on this bug (as a donation to the Digikam
project) if it gets implemented. It would be a huge time saver if I could use
this.