https://bugs.kde.org/show_bug.cgi?id=518026

            Bug ID: 518026
           Summary: Wish: cluster unrecognized ("Unknown") faces by
                    embedding similarity
    Classification: Applications
           Product: digikam
Version First Reported In: 9.0.0
          Platform: Other
                OS: Linux
            Status: REPORTED
          Severity: normal
          Priority: NOR
         Component: Faces-Engine
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

OVERVIEW
--------
digiKam detects faces and compares them against trained identities, but faces
that do not match any known person are all filed under the single "Unknown"
tag.
With a large photo collection this tag can accumulate tens or hundreds of
thousands of face regions, with no way to discover that many of them depict the
same unidentified person.

The request is to add a clustering step that groups Unknown faces by the
similarity of their SFace embeddings, and surfaces each group either as a
tentative sub-tag (e.g. "Unknown Person 1", "Unknown Person 2") or as a
dedicated "Clusters" view inside the People sidebar, so that the user can
review a group of visually similar faces and confirm or name them in bulk.

MOTIVATION
----------
The face-scan pipeline already computes a 128-dimensional SFace embedding for
every detected face (stored transiently as a cv::Mat in the extractor thread).
The embedding is then compared against FaceMatrices: on a match the face is
assigned to the known identity, and otherwise the face is filed under Unknown
and its embedding is simply discarded.
Retaining and clustering these embeddings would unlock a substantial usability
improvement at very low extra cost, since the heavy DNN work is already done.

PROPOSED APPROACH
-----------------
1. During face scan, persist the SFace embedding for every face that is not
   assigned to a known identity into a new DB table, e.g.:

     CREATE TABLE UnknownFaceEmbeddings (
         id        INTEGER PRIMARY KEY,
         imageid   INTEGER NOT NULL,
         tagid     INTEGER NOT NULL,
         embedding BLOB NOT NULL   -- 128 × float32, L2-normalised
     );

2. Add a "Cluster Unknown Faces" action (Maintenance menu or People sidebar).
   A reasonable default algorithm is online centroid clustering (O(n × c),
   where c = number of emerging clusters), followed by centroid-merge and a
   configurable minimum-cluster-size noise filter. The cosine distance
   threshold (epsilon) and minimum cluster size should be exposed in the
   Face Recognition settings panel.

3. Surface each cluster as a tentative tag under "Unknown", named
   "Unknown Person 1" etc., so existing confirmation workflows (bulk-confirm
   from the People sidebar) apply without UI changes.
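
A minimal sketch of the clustering in step 2, written in Python for brevity
rather than in digiKam's C++. It assumes the embeddings are already
L2-normalised, and the names cluster_online, epsilon and min_cluster_size are
illustrative, not existing digiKam identifiers. The default threshold of 0.4
is a placeholder, not a tuned value, and the centroid-merge pass is left out
to keep the example short.

```python
import math


def l2_normalise(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_distance(a, b):
    """1 - dot(a, b); only valid for L2-normalised inputs."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def cluster_online(embeddings, epsilon=0.4, min_cluster_size=2):
    """Single-pass centroid clustering, O(n * c) distance evaluations.

    Each embedding joins the nearest centroid closer than epsilon, or starts
    a new cluster. Returns lists of input indices; clusters smaller than
    min_cluster_size are treated as noise and dropped.
    """
    sums = []       # running sum of member vectors, one per cluster
    centroids = []  # L2-normalised centroid, one per cluster
    members = []    # indices into `embeddings`, one list per cluster

    for idx, emb in enumerate(embeddings):
        best, best_dist = None, epsilon
        for c, centroid in enumerate(centroids):
            d = cosine_distance(emb, centroid)
            if d < best_dist:
                best, best_dist = c, d

        if best is None:
            # No centroid is close enough: open a new cluster.
            sums.append(list(emb))
            centroids.append(l2_normalise(emb))
            members.append([idx])
        else:
            # Fold the embedding into the running sum and renormalise.
            sums[best] = [s + x for s, x in zip(sums[best], emb)]
            centroids[best] = l2_normalise(sums[best])
            members[best].append(idx)

    return [m for m in members if len(m) >= min_cluster_size]
```

Because the pass is order-dependent, two clusters for the same person can
emerge early on; that is the case the centroid-merge step is meant to repair.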

PRIOR ART
---------
I prototyped this outside digiKam using Kotlin + ONNX Runtime + SQLite JDBC,
operating directly on a copy of digikam4.db. The schema queries (finding
Unknown tag IDs, cursor-paginated face retrieval, aux table creation) work
correctly against a real digiKam 9.0.0 database. The anticipated bottleneck for
large collections is the ONNX inference step. Persisting embeddings during the
existing scan would remove that step entirely, because the scan has already
computed them.
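
For illustration, cursor (keyset) pagination over the proposed
UnknownFaceEmbeddings table could look like the sketch below. This is not the
prototype's Kotlin code; it is a Python approximation of the same idea, and
the function name iter_embedding_rows and the page size are assumptions.

```python
import sqlite3


def iter_embedding_rows(conn, page_size=1000):
    """Yield (id, imageid, embedding) rows in id order, one page at a time.

    Keyset pagination: each query resumes after the last id seen, so the
    cost per page stays constant, unlike OFFSET-based paging.
    """
    last_id = 0  # automatically assigned INTEGER PRIMARY KEY values start at 1
    while True:
        rows = conn.execute(
            "SELECT id, imageid, embedding FROM UnknownFaceEmbeddings "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not rows:
            return
        yield from rows
        last_id = rows[-1][0]
```

Only one page of BLOBs is held in memory at a time, which matters when the
Unknown tag holds hundreds of thousands of face regions.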

TECHNICAL NOTES
---------------
* SFace embedding: 128 × float32, raw little-endian, L2-normalised
  (matches the layout already used in FaceMatrices for trained identities).
* Distance metric: cosine distance = 1 − dot(a, b) on L2-normalised vectors.
  Same metric used in FaceClassifier::featureSFaceCompare().
* The KNN/SVM threshold already chosen by the user maps naturally to an
  epsilon for clustering — same face, same slider.
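* The change to the scan pipeline is minimal: one INSERT per Unknown face
  per scan. Re-scans can skip existing rows with INSERT OR IGNORE, provided
  the table has a UNIQUE constraint that identifies a face (the schema
  sketched above has none yet).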
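
The notes above can be sketched with Python's standard library as follows.
The region column and the UNIQUE (imageid, region) constraint are assumptions
added here so that INSERT OR IGNORE has a conflict to detect; they are not
part of the schema proposed earlier, and the helper names are hypothetical.

```python
import math
import sqlite3
import struct

DIM = 128
FMT = "<%df" % DIM  # 128 little-endian float32 values, 512 bytes in total


def pack_embedding(vec):
    """L2-normalise a 128-element vector and serialise it as raw float32."""
    norm = math.sqrt(sum(x * x for x in vec))
    if len(vec) != DIM or norm == 0.0:
        raise ValueError("expected a non-zero vector of length %d" % DIM)
    return struct.pack(FMT, *(x / norm for x in vec))


def unpack_embedding(blob):
    """Inverse of pack_embedding: BLOB to list of floats."""
    return list(struct.unpack(FMT, blob))


def cosine_distance(a, b):
    """1 - dot(a, b); valid because stored vectors are L2-normalised."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE UnknownFaceEmbeddings (
        id        INTEGER PRIMARY KEY,
        imageid   INTEGER NOT NULL,
        tagid     INTEGER NOT NULL,
        region    TEXT NOT NULL,      -- assumption: identifies the face in the image
        embedding BLOB NOT NULL,      -- 128 x float32, L2-normalised
        UNIQUE (imageid, region)      -- assumption: needed for INSERT OR IGNORE
    )
""")


def store_embedding(conn, imageid, tagid, region, vec):
    """Insert one embedding; a re-scan of the same face is silently skipped."""
    conn.execute(
        "INSERT OR IGNORE INTO UnknownFaceEmbeddings "
        "(imageid, tagid, region, embedding) VALUES (?, ?, ?, ?)",
        (imageid, tagid, region, pack_embedding(vec)),
    )
```

Without some uniqueness constraint beyond the id column, every re-scan would
append a duplicate row, because an automatically assigned id never conflicts.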
