Hi Rahul,
If I understood your mail correctly, there is a misconception about
SolrCloud: nodes are the infrastructure of the cloud, and the collection
is the "unit". So when you commit, you are committing the changes you made
to the collection, and SolrCloud will handle the nodes. When you commit to
3 nodes, it is actually 3 commits to a single collection.
It is not considered good practice to have a script that does commits.
Solr has autocommit functionality. You should also read up on soft vs.
hard commits. The following article is a good starting point:
https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
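As one alternative to an external commit script, the update request itself can carry a commitWithin parameter, so Solr schedules the commit within the given window. Below is a minimal sketch that only builds such a request and does not send it. The host, the collection name "mycollection", the helper name build_update_request, and the 3-minute window (taken from your script's interval) are assumptions for illustration; adjust them to your setup.

```python
import json
import urllib.parse
import urllib.request


def build_update_request(base_url, collection, docs, commit_within_ms=180000):
    """Build (but do not send) a Solr JSON update request.

    commitWithin asks Solr to commit the added documents within the
    given number of milliseconds, so no explicit commit call is needed.
    """
    query = urllib.parse.urlencode({"commitWithin": commit_within_ms})
    url = f"{base_url}/{collection}/update?{query}"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and collection name, for illustration only.
req = build_update_request(
    "http://localhost:8983/solr", "mycollection", [{"id": "doc-1"}]
)
print(req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Since the request goes to the collection rather than to a particular node, it does not matter which node receives it. Autocommit settings in solrconfig.xml achieve a similar effect on the server side.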
Regards,
Emir
On 25.01.2016 12:02, Rahul Ramesh wrote:
We are facing an issue and are finding it difficult to debug the
problem. We want to understand how Solr commit works.
A background on our setup:
We have a 3-node Solr cluster running version 5.3.1. It's an index-heavy
use case. At peak load, we index 400-500 documents/second.
We also want these documents to be visible as quickly as possible, hence we
run an external script which commits every 3 mins.
Consider the three nodes as N1, N2, N3. Commit is a synchronous operation,
so we do not get control back until the commit operation is complete.
Consider the following scenario. It looks like a basic scenario in a
distributed system :-) but we just wanted to eliminate this possibility.
Step 1: At time T1, a commit happens on node N1.
Step 2: At the same time T1, we search for all the documents inserted on
node N2.
My questions are:
1. Is commit an atomic operation? I mean, will commit happen on all the
nodes at the same time?
2. Can we say that the search result will always contain either the
documents from before the commit or those from after it? Or can it happen
that we get new documents from N1, N2 but old documents (i.e., before
commit) from N3?
Thank you,
Rahul
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/