Hi Shawn,

Thanks for your response!

Yes, 500 collections.
Each collection/core holds roughly 50k to 50 lakh (5 million) JSON
documents, depending on the client. We created one core per client, and
each document has 15 fields.
It is already in production as a standalone Solr server.
We now want to move it to SolrCloud so that it stays scalable as the data
grows. How do I do that?

From your response, I understand that I need to create 3 ZooKeeper
instances plus some machines that each house 1 Solr node.
Is that the optimal setup? *And how many machines do I need to house the
Solr nodes, keeping in mind the 500 collections?*

Thanks in advance!

On Fri, Dec 6, 2019 at 11:44 AM Shawn Heisey <apa...@elyograg.org> wrote:

> On 12/5/2019 12:28 PM, Vignan Malyala wrote:
> > I currently have 500 collections in my standalone Solr. Because of the
> > day-by-day increase in data, I want to convert it to SolrCloud.
> > Can you suggest how to do it successfully?
> > How many shards should there be?
> > How many nodes should there be?
> > Are the so-called nodes separate machines that I should provision?
> > How many ZooKeeper nodes should there be?
> > Are the so-called ZooKeeper nodes separate machines that I should
> > provision?
> > In total, how many machines do I need to implement a scalable SolrCloud?
>
> 500 collections is large enough that running it in SolrCloud is likely
> to encounter scalability issues.  SolrCloud's design does not do well
> with that many collections in the cluster, even if there are a lot of
> machines.
>
> There's a lot of comment history on this issue:
>
> https://issues.apache.org/jira/browse/SOLR-7191
>
> Generally speaking, each machine should only house one Solr node,
> whether you're running cloud or not.  If each one requires a really huge
> heap, it might be worthwhile to split it, but that's the only time I
> would do so.  And I would generally prefer to add more machines than to
> run multiple Solr nodes on one machine.
>
> One thing you might do, if the way your data is divided will permit it,
> is to run multiple SolrCloud clusters.  Multiple clusters can all use
> one ZooKeeper ensemble.
>
> ZooKeeper requires a minimum of three machines for fault tolerance.
> With 3 or 4 machines in the ensemble, you can survive one machine
> failure.  To survive two failures requires at least 5 machines.
>
> Thanks,
> Shawn
>
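
For reference, the ZooKeeper sizing quoted above follows from majority
quorum arithmetic: the ensemble stays available only while a strict
majority of its servers is up. The sketch below is a minimal illustration
of that rule, not anything taken from ZooKeeper or Solr; the function names
are hypothetical and exist only for this example.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with this many servers."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can fail while a strict majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


def min_ensemble_for(failures: int) -> int:
    """Smallest ensemble that survives the given number of failures."""
    if failures < 0:
        raise ValueError("failures must not be negative")
    return 2 * failures + 1


if __name__ == "__main__":
    # Print quorum size and failure tolerance for small ensembles.
    for n in range(1, 8):
        print(f"{n} servers: quorum {quorum_size(n)}, "
              f"survives {tolerated_failures(n)} failure(s)")
```

This matches the numbers in the quoted message: 3 and 4 servers both
survive one failure, and surviving two failures needs at least 5. It is
also why an even-sized ensemble adds a machine without adding tolerance.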
