: Is it possible in solr to have multivalued "id"? Or I need to make my
: own "mv_ID" for this? Any ideas how to achieve this efficiently?
This isn't something the SignatureUpdateProcessor is going to be able to help you with -- it does the deduplication by changing the low level "update" (implemented as a delete then add) so that the key used to delete the older documents is based on the signature field instead of the id field.

In order to do what you are describing, you would need to query the index for matching signatures, then add the resulting ids to your document before doing that "update".

You could possibly do this in a custom UpdateProcessor, but you'd have to do something tricky to ensure you didn't overlook docs that had been added but not yet committed when checking for dups.

I don't have a good suggestion for how to do this internally in Solr -- it seems like the type of bulk processing logic that would be better suited for an external process before you ever start indexing (much like link analysis for back references).

-Hoss
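
For what it's worth, here is a rough sketch of what such an external pre-indexing pass might look like. It does not use any Solr API; it just groups documents by a signature computed over some fields and collects the ids of all duplicates into a multivalued field on the first document seen. The function names, the "signature" field, the choice of MD5, and the example field names are all assumptions made for illustration ("mv_ID" is the name from the question), so adapt them to your schema.

```python
import hashlib


def signature(doc, fields):
    """Hash the chosen fields of a doc (hypothetical stand-in for your signature)."""
    h = hashlib.md5()
    for name in fields:
        h.update(str(doc.get(name, "")).encode("utf-8"))
        h.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") differ
    return h.hexdigest()


def merge_duplicates(docs, fields, id_field="id", mv_field="mv_ID"):
    """Keep the first doc per signature and list every duplicate's id on it."""
    merged = {}  # signature -> kept doc; dicts preserve insertion order
    for doc in docs:
        sig = signature(doc, fields)
        if sig not in merged:
            kept = dict(doc)  # copy so the input docs are not modified
            kept["signature"] = sig
            kept[mv_field] = []
            merged[sig] = kept
        merged[sig][mv_field].append(doc[id_field])
    return list(merged.values())


if __name__ == "__main__":
    sample = [
        {"id": "1", "body": "hello"},
        {"id": "2", "body": "world"},
        {"id": "3", "body": "hello"},
    ]
    for doc in merge_duplicates(sample, ["body"]):
        print(doc["id"], doc["mv_ID"])
```

The merged documents can then be indexed normally. Because everything is grouped before indexing starts, the problem of added-but-uncommitted docs does not come up in this sketch.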