https://bugs.kde.org/show_bug.cgi?id=518026
Bug ID: 518026
Summary: Wish: cluster unrecognized ("Unknown") faces by embedding similarity
Classification: Applications
Product: digikam
Version First Reported In: 9.0.0
Platform: Other
OS: Linux
Status: REPORTED
Severity: normal
Priority: NOR
Component: Faces-Engine
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: ---
OVERVIEW
--------
digiKam detects faces and compares them against trained identities, but faces
that do not match any known person are all filed under the single "Unknown"
tag.
With a large photo collection this tag can accumulate tens or hundreds of
thousands of face regions, with no way to discover that many of them depict the
same unidentified person.
The request is to add a clustering step that groups Unknown faces by the
similarity of their SFace embeddings, and surfaces each group either as a
tentative sub-tag (e.g. "Unknown Person 1", "Unknown Person 2") or as a
dedicated "Clusters" view inside the People sidebar, so that the user can
review a group of visually similar faces and confirm or name them in bulk.
MOTIVATION
----------
The face-scan pipeline already computes a 128-dimensional SFace embedding for
every detected face (stored transiently as a cv::Mat in the extractor thread).
The embedding is then compared against the trained identities in FaceMatrices.
If it matches, the face is assigned to that identity; if not, the face is filed
under Unknown and the embedding is simply discarded.
Retaining and clustering these embeddings would unlock a substantial usability
improvement at very low extra cost, since the heavy DNN work is already done.
PROPOSED APPROACH
-----------------
1. During face scan, persist the SFace embedding for every face that is not
assigned to a known identity into a new DB table, e.g.:
   CREATE TABLE UnknownFaceEmbeddings (
       id        INTEGER PRIMARY KEY,
       imageid   INTEGER NOT NULL,
       tagid     INTEGER NOT NULL,
       embedding BLOB    NOT NULL  -- 128 × float32, L2-normalised
   );
2. Add a "Cluster Unknown Faces" action (Maintenance menu or People sidebar).
A reasonable default algorithm is online centroid clustering (O(n × c),
where c = number of emerging clusters), followed by centroid-merge and a
configurable minimum-cluster-size noise filter. The cosine distance
threshold (epsilon) and minimum cluster size should be exposed in the
Face Recognition settings panel.
3. Surface each cluster as a tentative tag under "Unknown", named
"Unknown Person 1" etc., so existing confirmation workflows (bulk-confirm
from the People sidebar) apply without UI changes.
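As a rough illustration of step 2, the sketch below shows one possible reading
of "online centroid clustering, followed by centroid-merge and a minimum-size
filter". It is not taken from digiKam or from the prototype. The function names
(normalise, cosine_distance, cluster_online), the greedy nearest-centroid
assignment, and the use of a running vector sum as the centroid are all
assumptions made for the example. Embeddings are expected to be L2-normalised.

```python
import math


def normalise(v):
    """Return v scaled to unit L2 length (a zero vector is returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return list(v) if n == 0.0 else [x / n for x in v]


def cosine_distance(a, b):
    """1 - dot(a, b); this equals the cosine distance only for unit vectors."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def cluster_online(embeddings, epsilon, min_size):
    """Hypothetical single-pass clustering; returns lists of input indices."""
    sums = []     # per-cluster running sum of member vectors
    members = []  # per-cluster indices into `embeddings`

    # Pass 1: assign each face to the nearest centroid within epsilon,
    # or open a new cluster. Cost is O(n x c) for c emerging clusters.
    for i, e in enumerate(embeddings):
        best, best_d = None, epsilon
        for c, s in enumerate(sums):
            d = cosine_distance(normalise(s), e)
            if d <= best_d:
                best, best_d = c, d
        if best is None:
            sums.append(list(e))
            members.append([i])
        else:
            sums[best] = [x + y for x, y in zip(sums[best], e)]
            members[best].append(i)

    # Pass 2: merge clusters whose centroids ended up within epsilon,
    # repeating until no pair qualifies.
    merged = True
    while merged:
        merged = False
        for a in range(len(sums)):
            for b in range(a + 1, len(sums)):
                ca, cb = normalise(sums[a]), normalise(sums[b])
                if cosine_distance(ca, cb) <= epsilon:
                    sums[a] = [x + y for x, y in zip(sums[a], sums[b])]
                    members[a].extend(members[b])
                    del sums[b], members[b]
                    merged = True
                    break
            if merged:
                break

    # Pass 3: drop clusters below the minimum size (treated as noise).
    return [sorted(m) for m in members if len(m) >= min_size]
```

Calling cluster_online(embeddings, epsilon, min_size) with the two values from
the settings panel would yield one index list per candidate "Unknown Person N"
tag. The result of a single-pass scheme like this depends on input order, which
is one reason a merge pass is useful afterwards.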
PRIOR ART
---------
I prototyped this outside digiKam using Kotlin + ONNX Runtime + SQLite JDBC,
operating directly on a copy of digikam4.db. The schema queries (finding
Unknown tag IDs, cursor-paginated face retrieval, aux table creation) work
correctly against a real digiKam 9.0.0 database. The anticipated bottleneck for
large collections is the ONNX inference step; persisting embeddings during the
existing scan would eliminate that bottleneck entirely, since the DNN work is
already done at that point.
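The prototype's queries are not reproduced in this report. Purely to illustrate
the cursor-paginated retrieval pattern mentioned above, the sketch below pages
through the proposed UnknownFaceEmbeddings table by primary key instead of
using OFFSET. It is written in Python rather than Kotlin, the function name
iter_embeddings is hypothetical, and it does not touch any of digiKam's
existing tables.

```python
import sqlite3


def iter_embeddings(db, page_size=1000):
    """Yield (id, imageid, embedding) rows in id order, one page at a time.

    Assumes ids are positive, as they are when SQLite assigns them itself.
    """
    last_id = 0
    while True:
        rows = db.execute(
            "SELECT id, imageid, embedding FROM UnknownFaceEmbeddings "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not rows:
            return
        yield from rows
        # Resume after the last id seen; cost per page stays constant,
        # unlike OFFSET, which rescans the skipped rows.
        last_id = rows[-1][0]
```

Memory use is bounded by page_size rows, which matters when the Unknown tag
holds hundreds of thousands of face regions.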
TECHNICAL NOTES
---------------
* SFace embedding: 128 × float32, raw little-endian, L2-normalised
(matches the layout already used in FaceMatrices for trained identities).
* Distance metric: cosine distance = 1 − dot(a, b) on L2-normalised vectors.
Same metric used in FaceClassifier::featureSFaceCompare().
* The KNN/SVM threshold already chosen by the user maps naturally to an
epsilon for clustering: same face, same slider.
* The change to the scan pipeline is minimal: one INSERT per Unknown face
per scan. Re-scans can skip faces that are already stored by using
INSERT OR IGNORE, provided the table has a UNIQUE constraint that identifies
a face.
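The storage notes above can be sketched as follows. This is a minimal example
under stated assumptions, not digiKam code. INSERT OR IGNORE only skips a row
when it would violate a uniqueness constraint, so the sketch adds a
hypothetical `region` column and a UNIQUE (imageid, region) constraint that are
not part of the proposed schema. The region format and the helper names
(pack_embedding, unpack_embedding, open_store, store_embedding) are likewise
assumptions.

```python
import sqlite3
import struct

DIM = 128
FMT = "<%df" % DIM  # 128 little-endian float32 values = 512 bytes


def pack_embedding(vec):
    """Serialise a 128-float embedding as a raw little-endian float32 BLOB."""
    if len(vec) != DIM:
        raise ValueError("expected %d values, got %d" % (DIM, len(vec)))
    return struct.pack(FMT, *vec)


def unpack_embedding(blob):
    """Inverse of pack_embedding; values come back at float32 precision."""
    return list(struct.unpack(FMT, blob))


def open_store(path=":memory:"):
    """Create the table if needed and return the connection."""
    db = sqlite3.connect(path)
    db.execute(
        """
        CREATE TABLE IF NOT EXISTS UnknownFaceEmbeddings (
            id        INTEGER PRIMARY KEY,
            imageid   INTEGER NOT NULL,
            tagid     INTEGER NOT NULL,
            region    TEXT    NOT NULL,  -- ASSUMPTION: not in the proposed schema
            embedding BLOB    NOT NULL,  -- 128 x float32, L2-normalised
            UNIQUE (imageid, region)     -- ASSUMPTION: lets OR IGNORE dedupe
        )
        """
    )
    return db


def store_embedding(db, imageid, tagid, region, vec):
    """Insert one face; a re-scan of the same (imageid, region) is a no-op."""
    db.execute(
        "INSERT OR IGNORE INTO UnknownFaceEmbeddings "
        "(imageid, tagid, region, embedding) VALUES (?, ?, ?, ?)",
        (imageid, tagid, region, pack_embedding(vec)),
    )
```

Without some uniqueness key of this kind, each re-scan would add a fresh row
per face, because the auto-assigned `id` never conflicts.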