On 4/12/2018 5:53 AM, girish.vignesh wrote:
Solr returns old data when faceting, taken from documents that have been
deleted or updated.

For example, we facet on name, and the name changes frequently in our
application. When we reindex a document after changing its name, we get both
the old name and the new name in the search results. After digging into this,
I learned that a Solr index is composed of write-once segments, each of which
contains a set of documents. Whenever a hard commit happens, these segments
are closed, and if a document is deleted after that, the segment still
contains it (marked as deleted). Such documents are not cleared immediately.
They are not displayed in the search results, but somehow faceting is still
able to access their data.

If all documents containing that term have been deleted, then adding a facet.mincount=1 parameter to your facet URL will fix this: the old term stays in the index until its segment is merged away, but its facet count is zero, so mincount filters it out. If you are using the JSON facet API, there is a mincount parameter that you can place in your JSON request. I've never actually used the JSON facet API, but there is documentation:

https://lucene.apache.org/solr/guide/7_2/json-facet-api.html#TermsFacet
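
As a rough sketch of both forms, the Python snippet below builds a classic facet URL with facet.mincount=1 and a JSON request body with mincount on a terms facet. The host, the collection name "mycollection", the facet label "names", and the helper function names are placeholders for illustration, not something from the original post; only the "name" field comes from the question.

```python
import json
from urllib.parse import urlencode

# Hypothetical Solr location and collection name; adjust for your setup.
SOLR_BASE = "http://localhost:8983/solr/mycollection"


def classic_facet_url(field, mincount=1):
    """Build a /select URL that facets on `field` and hides zero-count terms."""
    params = [
        ("q", "*:*"),
        ("rows", 0),                   # only facet counts are wanted, no documents
        ("facet", "true"),
        ("facet.field", field),
        ("facet.mincount", mincount),  # drop terms whose count is below this
    ]
    return SOLR_BASE + "/select?" + urlencode(params)


def json_facet_body(field, mincount=1):
    """Build a JSON request body with a terms facet on `field`."""
    return {
        "query": "*:*",
        "limit": 0,
        "facet": {
            # "names" is an arbitrary label for the facet in the response.
            "names": {"type": "terms", "field": field, "mincount": mincount}
        },
    }


if __name__ == "__main__":
    print(classic_facet_url("name"))
    print(json.dumps(json_facet_body("name"), indent=2))
```

The JSON body would be sent as a POST to the collection's query handler with a Content-Type of application/json; the snippet only builds the request and does not contact Solr.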

The mincount parameter might make it unnecessary to optimize.  But if you are updating a LOT of your documents on a regular basis, you might find that optimizing gives you better performance, so running an optimize once a day, at a time when traffic is low, might be useful.
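
If you do decide to optimize on a schedule, one possible way to trigger it is a request to the update handler with optimize=true, for example from a nightly cron job. This is only a sketch: the host and the collection name "mycollection" are placeholders, the helper names are made up for illustration, and the long timeout is an assumption, since an optimize on a large index can take a while.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical Solr location and collection name; adjust for your setup.
SOLR_BASE = "http://localhost:8983/solr/mycollection"


def optimize_url(max_segments=None):
    """Build an update-handler URL that asks Solr to optimize the index."""
    params = {"optimize": "true"}
    if max_segments is not None:
        # Optional: merge down to at most this many segments.
        params["maxSegments"] = max_segments
    return SOLR_BASE + "/update?" + urlencode(params)


def run_optimize(timeout=3600):
    """Send the optimize request and return the HTTP status code."""
    with urlopen(optimize_url(), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Print the URL only; call run_optimize() from a scheduled job instead.
    print(optimize_url())
```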

Thanks,
Shawn
