On 4/12/2018 5:53 AM, girish.vignesh wrote:
Solr returns stale data when faceting: values from deleted or updated
documents still appear. For example, we facet on name, and name changes
frequently in our application. When we reindex a document after changing the
name, we get both the old name and the new name in the search results. After
digging into this, I learned that a Solr index is composed of write-once
segments, each containing a set of documents. Whenever a hard commit happens,
the open segments are closed, and a document deleted after that stays in its
segment, merely marked as deleted. These documents are not purged
immediately. They are not displayed in the search results, but somehow
faceting is still able to access their data.
If all documents with that term are deleted, then this will be fixed by
adding a facet.mincount=1 parameter to your facet URL. If you are using
the JSON facet API, then there is a mincount parameter that you can
place into your JSON request. I've never actually used the JSON facet
API, but there is documentation:
https://lucene.apache.org/solr/guide/7_2/json-facet-api.html#TermsFacet
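As a rough illustration of the two options above, here is a minimal Python sketch that only builds the requests and does not send them. The host, the collection name "mycollection", and the facet label "names" are assumptions made up for the example; the field "name" comes from the question. Adjust all of them to your setup.

```python
import json
from urllib.parse import urlencode

# Assumed values for illustration only: host and collection name.
SOLR_BASE = "http://localhost:8983/solr/mycollection"
FACET_FIELD = "name"  # the field being faceted on in the question


def classic_facet_url(base, field):
    """Build a /select URL that facets on `field` and drops zero-count terms."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),            # only the facet counts are of interest
        ("facet", "true"),
        ("facet.field", field),
        ("facet.mincount", "1"),  # hide terms whose only documents are deleted
    ]
    return base + "/select?" + urlencode(params)


def json_facet_body(field):
    """Build a JSON request body with a terms facet that sets mincount."""
    return {
        "query": "*:*",
        "limit": 0,
        "facet": {
            # "names" is an arbitrary label chosen for this example
            "names": {"type": "terms", "field": field, "mincount": 1}
        },
    }


print(classic_facet_url(SOLR_BASE, FACET_FIELD))
print(json.dumps(json_facet_body(FACET_FIELD), indent=2))
```

The first function produces a URL you can paste into a browser or curl; the second produces a body that would be POSTed to the collection's query handler as JSON.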
The mincount parameter might make it unnecessary to optimize. But if
you are updating a LOT of your documents on a regular basis, you might
find that optimizing gives you better performance, so running an optimize
once a day during a time when traffic is low might be useful.
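For the daily optimize, one possible approach is to have a scheduled job (cron or similar) hit the update handler with optimize=true. The sketch below only builds that URL; the host and the collection name "mycollection" are assumptions for the example, and the scheduling itself is left to whatever tool you already use.

```python
from urllib.parse import urlencode


def optimize_url(base):
    """Build an update-handler URL that asks Solr to optimize the index."""
    return base + "/update?" + urlencode({"optimize": "true"})


# Assumed host and collection name, for illustration only.
print(optimize_url("http://localhost:8983/solr/mycollection"))
```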
Thanks,
Shawn