Hi,
I am considering SolrCloud for our applications but I have run into the
limitation of not being able to use Join Queries in distributed searches.
Our requirements are the following:
- SolrCloud will serve many applications, where each application's
"index" is separate from the others. Each application is really a
customer deployment, and we need to isolate each customer's data.
- Join queries are required. Each query will only look at one customer
at a time.
- Since the data volume for each customer is small by Solr/Lucene
standards (1-2 million documents is small, right?), we are more
interested in the replication aspect of SolrCloud than in distributed
search.
I am considering the following SolrCloud design with questions:
- Start SolrCloud with 1 shard only. This should allow join queries to
work correctly, since all documents will be available in the same shard
(index). Is this a correct assumption?
- Each customer will have its own collection in SolrCloud. Do
collections provide data isolation between customers?
- Add more nodes as replicas of the single shard to achieve
replication and fault tolerance.
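
To make the design concrete, here is a rough sketch of the two requests
I have in mind: creating one single-shard collection per customer via
the Collections API, and running a join query against that customer's
collection only. The host/port, the collection name "customer_a", and
the field names "parent_id" and "id" are placeholders I made up for
illustration, not anything from a real deployment.

```python
from urllib.parse import urlencode

# Assumed base URL; replace with the address of any node in the cluster.
SOLR_BASE = "http://localhost:8983/solr"


def create_collection_url(customer, replicas=3):
    """Build a Collections API CREATE request for one customer.

    numShards=1 keeps all of the customer's documents in a single index;
    replicationFactor controls how many copies of that shard exist.
    """
    params = {
        "action": "CREATE",
        "name": customer,
        "numShards": 1,
        "replicationFactor": replicas,
    }
    return f"{SOLR_BASE}/admin/collections?{urlencode(params)}"


def join_query_url(customer, from_field, to_field, inner_query):
    """Build a select request using the {!join} query parser.

    The request targets a single customer's collection, so the join
    only ever sees that customer's documents.
    """
    q = f"{{!join from={from_field} to={to_field}}}{inner_query}"
    return f"{SOLR_BASE}/{customer}/select?{urlencode({'q': q})}"


if __name__ == "__main__":
    # Hypothetical customer name and fields, for illustration only.
    print(create_collection_url("customer_a", replicas=2))
    print(join_query_url("customer_a", "parent_id", "id", "type:order"))
```

The script only prints the URLs; it does not contact Solr. If the
single-shard assumption above is wrong, the join request is the part I
would expect to break.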
Thank you,
Hs