I am currently using Solr 4.4, but I am not planning to use SolrCloud in
the very near future.
I have a 3 master / 3 slave setup. Each master is linked to its
corresponding slave. I have disabled auto polling.
We do both push indexing (using MQ) and pull indexing, the latter through
a SolrJ indexing program.
I have enabled soft commit on the slave (so that changes pushed by the
queue are visible immediately).
I am thinking of doing the batch indexing on the master (with optimize and
hard commit) and the push indexing on both master and slave.
I am still testing my configuration, but I wanted to get some answers
before diving too deep.
Since the queue pushes the docs to both master and slave, there is a
possibility of the slave having more records than the master (when the
master is busy doing batch indexing). What would happen if the slave has
additional segments compared to the master? Will they be deleted when the
replication happens?
If a message is pushed from the queue to both master and slave during
replication, will there be a latency in seeing that document even if we
use soft commit on the slave?
We want to make sure that we are not missing any documents from the queue
(since it's updated via the UI and we don't really store that data
anywhere except in the index).
If you are doing replication, then all updates must go to the master
server. You cannot update the slave directly. When the replication
happens, the slave will be identical to the master. Any documents sent
only to the slave will be lost.
Replication will happen according to the interval you have configured, or,
since you say you have disabled polling, according to whatever schedule
you use to trigger a replication manually.
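As a rough illustration of a manual trigger: the replication handler on a
slave accepts a "fetchindex" command over HTTP. The sketch below builds
that URL and sends it using only the JDK. The class name, host, port and
core name are made up for the example, and it assumes the handler is
registered at the usual /replication path on the slave core, so adjust it
to your own setup.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of Solr or SolrJ.
public class ReplicationTrigger {

    // Builds the fetchindex URL for one slave core.
    // Assumes the replication handler is mapped to /replication.
    static String fetchIndexUrl(String slaveBaseUrl, String core) {
        String base = slaveBaseUrl.endsWith("/")
                ? slaveBaseUrl.substring(0, slaveBaseUrl.length() - 1)
                : slaveBaseUrl;
        return base + "/" + core + "/replication?command=fetchindex";
    }

    // Sends the request to the slave and returns the HTTP status code.
    static int triggerFetch(String slaveBaseUrl, String core) throws IOException {
        URL url = new URL(fetchIndexUrl(slaveBaseUrl, core));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(30000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            // No network call without explicit arguments.
            System.out.println("usage: ReplicationTrigger <slaveBaseUrl> <core>");
            return;
        }
        System.out.println("HTTP status: " + triggerFetch(args[0], args[1]));
    }
}
```

You would run something like this from your own scheduler once the batch
indexing and commit on the master have finished. A successful HTTP status
should be read as "the slave accepted the request", not as proof that the
index copy has completed.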
SolrCloud would probably be a better fit for you. With a properly
configured SolrCloud you just index to any host in the cloud and documents
end up exactly where they need to go, and all replicas get updated.
Thanks,
Shawn