If I understand correctly what you're trying to do, docValues for a
number of field types are (at least in their multivalued incarnation)
backed by SortedSetDocValues, which inherently deduplicate values
per-document. In your case, it sounds like you might be able to rely on
that behavior as a feature: setting stored=false, docValues=true, and
useDocValuesAsStored=true may achieve the desired behavior.
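
To illustrate the per-document deduplication described above, here is a
minimal, self-contained Java sketch. It does not call Lucene or Solr; it
only mimics the effect with a TreeSet, and the class and method names
(SortedSetDedupSketch, asReturnedFromDocValues) are made up for
illustration. Treat it as an approximation of the behavior, not as the
actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class SortedSetDedupSketch {

    // Hypothetical helper: models what a sorted-set-backed multivalued
    // field would hand back for one document. Duplicates collapse and
    // the values come out in sorted order, not insertion order.
    static List<String> asReturnedFromDocValues(List<String> submitted) {
        SortedSet<String> unique = new TreeSet<>(submitted);
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        // Values as they might accumulate through repeated atomic "add" updates.
        List<String> submitted = Arrays.asList("red", "blue", "red", "green", "blue");
        System.out.println(asReturnedFromDocValues(submitted));
        // prints [blue, green, red]
    }
}
```

One consequence visible in the sketch: the original insertion order is
lost, so this approach only fits if the order of values does not matter.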
Michael

On Thu, Oct 29, 2020 at 6:17 AM Srinivas Kashyap
<srini...@bamboorose.com.invalid> wrote:
>
> Thanks Dwane,
>
> I have a question: according to the Javadoc, the duplicates still continue
> to exist in the field. Maybe at query time the field returns only unique
> values? Am I right in my assumption?
>
> Also, what is the performance overhead of UniqFieldsUpdateProcessorFactory?
>
> Thanks,
> Srinivas
>
> From: Dwane Hall <dwaneh...@hotmail.com>
> Sent: 29 October 2020 14:33
> To: solr-user@lucene.apache.org
> Subject: Re: Avoiding duplicate entry for a multivalued field
>
> Srinivas, this is possible by adding a unique-fields update processor to the
> update processor chain you are using to perform your updates (/update,
> /update/json, /update/json/docs, .../a_custom_one).
>
> The Javadoc explains its use nicely
> (https://lucene.apache.org/solr/8_6_0//solr-core/org/apache/solr/update/processor/UniqFieldsUpdateProcessorFactory.html),
> and there are answers on Stack Overflow addressing this exact problem
> (https://stackoverflow.com/questions/37005747/how-to-remove-duplicates-from-multivalued-fields-in-solr#37006655).
>
> Thanks,
>
> Dwane
> ________________________________
> From: Srinivas Kashyap <srini...@bamboorose.com.INVALID>
> Sent: Thursday, 29 October 2020 3:49 PM
> To: solr-user@lucene.apache.org
> Subject: Avoiding duplicate entry for a multivalued field
>
> Hello,
>
> Say I have a schema field that is multivalued. Is there a way to keep only
> distinct values in that field even though I continue to add duplicate
> values through atomic updates via SolrJ?
>
> Is there a property setting that allows only unique values in a multivalued
> field?
>
> Thanks,
> Srinivas
