Hi,

We are planning to use Lucene 4.8.1 over Oracle (1 to 2 TB of data) and are seeking information on how Lucene conducts housekeeping or maintenance of its indexes over time. *Is merging a blocking operation for writes and searches, or does nothing block while a merge is in progress?*

I found the following:

*"Since Lucene adds the updated document to the index and marks all previous versions as deleted. So to get rid of deleted documents Lucene needs to do some housekeeping over a period of time. Under the hood is that from time to time segments are merged into (usually) bigger segments using configurable MergePolicy <http://lucene.apache.org/java/3_4_0/api/core/org/apache/lucene/index/MergePolicy> (TieredMergePolicy)."*

1- Is it a blocking operation for both writes and searches, or does nothing block while merging is going on?
2- What is the best practice to avoid any blocking on production servers? I am not sure how Solr or Elasticsearch handles it.

Should we control merging ourselves by calling *forceMerge(int) at low-traffic times* to avoid unpredictable blocking? Is that recommended, or does Lucene merge intelligently without blocking anything (updates and searches)? Alternatively, are there ways to reduce the blocking time to a very short duration (1-2 minutes) using some API, a daemon thread, etc.?

Looking for your professional guidance on this.

Regards,
Gaurav