Re: From solr to solr cloud

2019-12-06 Thread Erick Erickson
Because you use individual collections, you really don’t have to care 
about getting it all right up front.

Each collection can be created on a specified set of nodes; see the
“createNodeSet” parameter of the Collections API “CREATE” command.
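
As a rough sketch of what such a CREATE call could look like, the snippet below only builds the request URL. The host, collection name, node names, and shard/replica counts are made-up placeholders, not values from this thread; node names normally take the form host:port_solr as shown in the cluster's live nodes.

```python
from urllib.parse import urlencode


def create_collection_url(solr_base, name, nodes, num_shards=1, replication_factor=1):
    """Build a Collections API CREATE URL that restricts a collection to given nodes."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        # createNodeSet is a comma-separated list of node names
        "createNodeSet": ",".join(nodes),
    }
    return f"{solr_base}/admin/collections?{urlencode(params)}"


# Hypothetical values, for illustration only.
url = create_collection_url(
    "http://localhost:8983/solr",
    "client_0001",
    ["solr1.example.com:8983_solr", "solr2.example.com:8983_solr"],
)
print(url)
```

Sending a GET request to the printed URL (for example with curl) would issue the command; the sketch stops short of that so it can be run without a cluster.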

And let’s say you figure out later that you need more hardware and want to move
some of your existing collections to new hardware. Use the MOVEREPLICA API
command.

So say you start out with 1 machine hosting 500 collections.
You get more and more and more clients and your machine gets overloaded. Or
one of your collections grows disproportionately to the others. You spin up a
new machine and use MOVEREPLICA to move some number of replicas from your
original machine to the new hardware.
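
A minimal sketch of the MOVEREPLICA step, again only building the request URLs. The collection names, replica names (core_node2 and so on), and target node are hypothetical; the real replica names would come from CLUSTERSTATUS or the admin UI.

```python
from urllib.parse import urlencode


def move_replica_url(solr_base, collection, replica, target_node):
    """Build a Collections API MOVEREPLICA URL for one replica."""
    params = {
        "action": "MOVEREPLICA",
        "collection": collection,
        "replica": replica,          # replica name, e.g. as listed by CLUSTERSTATUS
        "targetNode": target_node,   # node that should host the replica afterwards
    }
    return f"{solr_base}/admin/collections?{urlencode(params)}"


# Hypothetical: move two clients' replicas from the overloaded machine to a new one.
to_move = [("client_0001", "core_node2"), ("client_0002", "core_node2")]
urls = [
    move_replica_url("http://localhost:8983/solr", coll, rep, "solr2.example.com:8983_solr")
    for coll, rep in to_move
]
for u in urls:
    print(u)
```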

Also consider that at some point it may be desirable to have multiple “pods”.
By that I mean it can get awkward to have thousands of collections hosted on
a single ZooKeeper ensemble. Again, because you have individual collections,
you can declare one “pod” (ZooKeeper + Solr nodes) full and spin up
another one, i.e. totally separate hardware and separate ZK ensembles. The pods
don’t know about each other at all.
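
One way to picture the “pod” idea is a small client-side routing table that records which pod hosts which collection. This is only an illustrative sketch: the Pod class, the per-pod limit, and the ZooKeeper addresses are all invented for the example, and nothing like this ships with Solr.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """One independent deployment: its own ZooKeeper ensemble plus Solr nodes."""
    zk_hosts: str            # hypothetical ZooKeeper connection string
    max_collections: int     # assumed operator-chosen limit for "full"
    collections: set = field(default_factory=set)

    def is_full(self) -> bool:
        return len(self.collections) >= self.max_collections


def assign_collection(pods, collection, max_collections):
    """Place a collection in the first pod with room; open a new pod if all are full."""
    for pod in pods:
        if not pod.is_full():
            pod.collections.add(collection)
            return pod
    # Every existing pod is full: start a new, totally separate pod.
    n = len(pods) + 1
    new_pod = Pod(zk_hosts=f"zk1.pod{n}.example.com:2181/solr",
                  max_collections=max_collections)
    new_pod.collections.add(collection)
    pods.append(new_pod)
    return new_pod


pods = []
routing = {}  # collection name -> ZooKeeper address of the pod that hosts it
for i in range(5):
    name = f"client_{i:04d}"
    routing[name] = assign_collection(pods, name, max_collections=2).zk_hosts

print(len(pods))               # 3 pods are needed for 5 collections at 2 per pod
print(routing["client_0004"])  # zk1.pod3.example.com:2181/solr
```

Because the pods don’t know about each other, the application has to keep this mapping itself and connect to the right ZooKeeper ensemble per client.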

Best,
Erick




Re: From solr to solr cloud

2019-12-06 Thread Vignan Malyala
Hi Shawn,

Thanks for your response!

Yes! 500 collections.
Each collection/core has around 50k to 50 lakh (5 million) JSON documents
(depending on the client). We made one core for each client. Each JSON
document has 15 fields.
It is already in production as a standalone Solr server.
We want to move it to SolrCloud now, to make it scalable for the future.
How do I make that possible?

As I understand your response, I have to create 3 ZooKeeper instances and
some machines that house 1 Solr node each.
Is that the optimal solution? *And how many machines do I need to build
to house the Solr nodes, keeping in mind the 500 collections?*

Thanks in advance!



Re: From solr to solr cloud

2019-12-06 Thread Vignan Malyala
Yes! 500 collections.
Each collection/core has around 50k to 50 lakh (5 million) JSON documents
(depending on the client). We made one core for each client. Each JSON
document has 15 fields.
It is already in production as a standalone Solr server.

We want to move it to SolrCloud now, to make it scalable for the future.
How do I make that possible (obviously at minimum cost)?



Re: From solr to solr cloud

2019-12-05 Thread Shawn Heisey

On 12/5/2019 12:28 PM, Vignan Malyala wrote:
> I currently have 500 collections in my standalone Solr. Because the data
> grows day by day, I want to convert it to SolrCloud.
> Can you suggest how to do this successfully?
> How many shards should there be?
> How many nodes should there be?
> Are the so-called nodes separate machines that I need to provision?
> How many ZooKeeper nodes should there be?
> Are the so-called ZooKeeper nodes separate machines that I need to provision?
> In total, how many machines do I need to implement a scalable SolrCloud?


500 collections is large enough that running it in SolrCloud is likely 
to encounter scalability issues.  SolrCloud's design does not do well 
with that many collections in the cluster, even if there are a lot of 
machines.


There's a lot of comment history on this issue:

https://issues.apache.org/jira/browse/SOLR-7191

Generally speaking, each machine should only house one Solr node, 
whether you're running cloud or not.  If each one requires a really huge 
heap, it might be worthwhile to split it, but that's the only time I 
would do so.  And I would generally prefer to add more machines than to 
run multiple Solr nodes on one machine.


One thing you might do, if the way your data is divided will permit it, 
is to run multiple SolrCloud clusters.  Multiple clusters can all use 
one ZooKeeper ensemble.


ZooKeeper requires a minimum of three machines for fault tolerance. 
With 3 or 4 machines in the ensemble, you can survive one machine 
failure.  To survive two failures requires at least 5 machines.
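
The numbers above follow from ZooKeeper needing a strict majority (quorum) of the ensemble to stay up. A small sketch of that arithmetic, with a function name chosen just for this example:

```python
def tolerated_failures(ensemble_size: int) -> int:
    """Number of ZooKeeper servers that can fail while a majority quorum survives."""
    quorum = ensemble_size // 2 + 1   # strict majority of the ensemble
    return ensemble_size - quorum


for n in (3, 4, 5):
    print(n, "servers ->", tolerated_failures(n), "failure(s) tolerated")
```

This is also why an even-sized ensemble buys nothing extra: 4 servers tolerate the same single failure as 3.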


Thanks,
Shawn


Re: From solr to solr cloud

2019-12-05 Thread Paras Lehana
Do you mean 500 cores? Tell us more about the data. How many documents per
core do you have, and what performance issues are you facing?



Regards,
Paras Lehana


Re: From solr to solr cloud

2019-12-05 Thread David Hastings
Are you noticing performance decreases in standalone Solr as of now?



From solr to solr cloud

2019-12-05 Thread Vignan Malyala
Hi,
I currently have 500 collections in my standalone Solr. Because the data
grows day by day, I want to convert it to SolrCloud.
Can you suggest how to do this successfully?
How many shards should there be?
How many nodes should there be?
Are the so-called nodes separate machines that I need to provision?
How many ZooKeeper nodes should there be?
Are the so-called ZooKeeper nodes separate machines that I need to provision?
In total, how many machines do I need to implement a scalable SolrCloud?

Please answer these questions in detail. None of the documents on the web are
clear about production environments.
Thanks in advance.