Re: How to remove duplicate tokens from solr

2020-09-18 Thread Rajdeep Sahoo
Hi all, I found the details below on Stack Overflow but am not sure how to include the jar. Can anyone help with this? I've created a new filter class from "FilteringTokenFilter". The task is pretty simple. I would check before adding into the list. I have created a simple plugin Eliminate

Re: How to remove duplicate tokens from solr

2020-09-17 Thread Rajdeep Sahoo
But I'm not sure why this type of search string causes high CPU utilization.

On Fri, 18 Sep, 2020, 12:49 am Rahul Goswami wrote:
> Is this for a phrase search? If yes then the position of the token would
> matter too and not sure which token would you want to remove. "eg
> "tshirt hat

Re: How to remove duplicate tokens from solr

2020-09-17 Thread Rahul Goswami
Is this for a phrase search? If so, the position of each token matters too, and it is not clear which token you would want to remove, e.g. "tshirt hat tshirt". Also, are you looking to save space and want this at index time? Or do you just want to remove duplicates from the search string? If this is at

Re: How to remove duplicate tokens from solr

2020-09-17 Thread Rajdeep Sahoo
If someone searches with "tshirt tshirt tshirt tshirt tshirt tshirt", we need to remove the duplicates and search with just "tshirt".

On Fri, 18 Sep, 2020, 12:19 am Alexandre Rafalovitch wrote:
> This is not quite enough information.
> There is
>
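
One way to read the request above is to collapse repeated terms on the client side before the query string is sent to Solr. The following is only a minimal sketch under stated assumptions: the query is a plain whitespace-separated keyword query (no phrases, field prefixes, or boolean operators), terms are compared case-insensitively, and the function name `dedupe_query_terms` is hypothetical, not part of Solr.

```python
def dedupe_query_terms(query: str) -> str:
    """Drop repeated terms, keeping the first occurrence of each.

    Assumes a plain keyword query split on whitespace; comparison is
    case-insensitive (an assumption, not something the thread specifies).
    """
    seen = set()
    unique_terms = []
    for term in query.split():
        key = term.lower()
        if key not in seen:
            seen.add(key)
            unique_terms.append(term)
    return " ".join(unique_terms)


print(dedupe_query_terms(" tshirt tshirt tshirt tshirt tshirt tshirt"))  # tshirt
```

Because this changes token positions, it would not be safe for phrase queries, where "tshirt hat tshirt" and "tshirt hat" are different phrases.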

Re: How to remove duplicate tokens from solr

2020-09-17 Thread Alexandre Rafalovitch
This is not quite enough information. There is https://lucene.apache.org/solr/guide/8_6/filter-descriptions.html#remove-duplicates-token-filter but it has specific limitations. What is the problem that you are trying to solve that you feel is due to duplicate tokens? Why are they duplicates? Is
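
As far as the linked guide page describes it, the Remove Duplicates Token Filter treats two tokens as duplicates only when they have the same text and the same position. The sketch below is a plain-Python model of that rule, not Lucene code: tokens are represented as hypothetical `(text, position_increment)` pairs, and the function name is made up for illustration.

```python
def remove_duplicates_same_position(tokens):
    """Model of 'same text AND same position' duplicate removal.

    tokens: list of (text, position_increment) pairs. An increment of 0
    means the token is stacked on the previous token's position.
    """
    result = []
    seen_at_position = set()
    for text, pos_inc in tokens:
        if pos_inc > 0:
            # Moved to a new position: forget what was seen before.
            seen_at_position = set()
        if text in seen_at_position:
            continue  # same text at the same position: drop it
        seen_at_position.add(text)
        result.append((text, pos_inc))
    return result


# Repeated words in a query sit at different positions, so nothing is removed.
repeated = [("tshirt", 1), ("tshirt", 1), ("tshirt", 1)]
print(remove_duplicates_same_position(repeated))

# A duplicate stacked at the same position (increment 0) is removed.
stacked = [("tshirt", 1), ("tshirt", 0), ("hat", 1)]
print(remove_duplicates_same_position(stacked))
```

Under this model, a query such as "tshirt tshirt tshirt" passes through unchanged, which would explain why the built-in filter alone may not solve the problem described in this thread.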

How to remove duplicate tokens from solr

2020-09-17 Thread Rajdeep Sahoo
Hi team, is there any way to remove duplicate tokens from Solr? Is there a filter for this?