Dear Community,
I have a Solr cluster with an index containing more than 100 million addresses. I need to run a bulk search over the same set of addresses, one query per address, in order to find near-duplicate entities. Is there anything specific I should watch out for before doing so? At the moment I am simply querying the index through the Solr client, which means executing more than 100 million HTTP requests against the cluster. That sounds very time-consuming and far from optimal. Is there an offline way to query Solr? Thanks for your help!
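
For context, here is a rough back-of-envelope sketch in Python of the per-address approach and why it worries me. The endpoint URL, the collection name `addresses`, the field name `address`, and the 10 ms per-request latency are all assumptions made up for illustration, not values measured on my cluster.

```python
from urllib.parse import urlencode

# All values below are hypothetical and for illustration only.
SOLR_SELECT_URL = "http://localhost:8983/solr/addresses/select"  # assumed host and collection
NUM_ADDRESSES = 100_000_000   # "100+ million" from the post, rounded down
ASSUMED_LATENCY_S = 0.010     # assumed 10 ms round trip per request


def build_query_url(address: str, rows: int = 5) -> str:
    """Build the URL for one lookup against the /select handler.

    Escaping of Solr query-syntax characters in `address` is omitted here;
    only URL encoding is applied. The field name "address" is an assumption.
    """
    params = {"q": address, "df": "address", "rows": rows, "wt": "json"}
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


def sequential_days(num_requests: int, latency_s: float) -> float:
    """Wall-clock time in days if the requests run strictly one after another."""
    return num_requests * latency_s / 86_400


def parallel_days(num_requests: int, latency_s: float, workers: int) -> float:
    """Idealized time in days with `workers` concurrent clients and perfect scaling."""
    return sequential_days(num_requests, latency_s) / workers


if __name__ == "__main__":
    print(build_query_url("12 Main St"))
    print(f"sequential: {sequential_days(NUM_ADDRESSES, ASSUMED_LATENCY_S):.1f} days")
    print(f"16 workers: {parallel_days(NUM_ADDRESSES, ASSUMED_LATENCY_S, 16):.2f} days")
```

Under these assumed numbers, the sequential run would take roughly 11.6 days, and even perfectly scaled concurrency only divides that by the number of workers, which is why I am asking about alternatives.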