When do you consider two documents to be duplicates? When one field has
the same value, when multiple fields have the same value, or when all
fields match?
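
Once that is decided, here is a rough sketch of one way to approach it,
not a tested recipe. It assumes the duplicate key is one or more
single-valued fields, that "id" is the uniqueKey, and that "updated" is a
sortable field as in your SQL; "name" in the usage note is a made-up field.
Faceting with facet.mincount=2 lists the values that occur more than once,
and the helper below picks every document except the newest in each group
so that the ids can be sent to the update handler as a JSON delete.

```python
from collections import defaultdict


def duplicate_facet_params(field):
    """Query parameters for /select that list values of `field` seen 2+ times."""
    return {
        "q": "*:*",
        "rows": 0,              # only the facet counts are needed
        "facet": "true",
        "facet.field": field,
        "facet.mincount": 2,    # hide values that occur only once
        "facet.limit": -1,      # do not truncate the list of values
    }


def ids_to_delete(docs, key_fields, id_field="id", order_field="updated"):
    """Return the ids of all but the newest document in each duplicate group.

    `docs` is a list of dicts as returned by a Solr query. The field names
    are assumptions; key fields must be single-valued.
    """
    groups = defaultdict(list)
    for doc in docs:
        key = tuple(doc.get(f) for f in key_fields)
        groups[key].append(doc)

    doomed = []
    for group in groups.values():
        # Newest first, like ORDER BY updated DESC; keep row 1, drop the rest.
        group.sort(key=lambda d: d[order_field], reverse=True)
        doomed.extend(d[id_field] for d in group[1:])
    return doomed


def delete_payload(ids):
    """JSON body for the update handler: delete the listed documents by id."""
    return {"delete": list(ids)}
```

For example, ids_to_delete(docs, ["name"]) returns the ids to drop, and
delete_payload(...) of that list is the body to POST to the core's /update
handler with commit=true. Run it against a copy of the core first.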


From: Vince McMahon
Sent: Sunday, October 22, 2023 3:22 PM
To: users@solr.apache.org
Subject: what is SOLR syntax to remove duplicated documents

I have a Solr 8.x installation.  I suspect one of the cores has duplicates
and want to remove the duplicated documents.  The signature-based
de-duplication described in the Solr guide has not been set up:
https://solr.apache.org/guide/6_6/de-duplication.html

In SQL, a query that does this without a hash column would look like:
;WITH CTE AS
(
    SELECT  cols,
            RN = ROW_NUMBER() OVER( PARTITION BY cols
                                    ORDER BY updated DESC)
    FROM [table]
)
DELETE FROM CTE
WHERE RN > 1

What would be the equivalent syntax for a Solr query?
