Also, if you go with custom code using MOVEREPLICA, use the docs from 8.6,
which are much more complete and accurate (I tripped on some features that
were not well described, got irritated and took out my frustrations on the
documentation :) ). There is probably nothing in 8.6 that differs from 8.5
for this feature aside from better docs.
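
For what it's worth, here is a rough sketch of the "inspect the cluster
state, then issue MOVEREPLICA" idea Jan describes below. It assumes the
JSON shape returned by the Collections API CLUSTERSTATUS action
(cluster -> collections -> shards -> replicas -> node_name) and the
MOVEREPLICA parameters collection, replica and targetNode. The function
names, the "fewest replicas first" target choice and the node names in the
usage note are my own illustration, not anything Solr ships, and it only
builds the request URLs rather than sending them.

```python
from urllib.parse import urlencode


def misplaced_replicas(cluster_status, collection, dedicated_nodes):
    """Return sorted (replica_name, node_name) pairs for replicas of
    `collection` that live on a node outside `dedicated_nodes`."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    found = []
    for shard in shards.values():
        for replica_name, replica in shard["replicas"].items():
            if replica["node_name"] not in dedicated_nodes:
                found.append((replica_name, replica["node_name"]))
    return sorted(found)


def plan_moves(cluster_status, collection, dedicated_nodes):
    """Pair each misplaced replica with a target node, always picking the
    dedicated node that currently holds the fewest replicas of the
    collection (an assumed, deliberately simple balancing rule)."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    counts = {node: 0 for node in dedicated_nodes}
    for shard in shards.values():
        for replica in shard["replicas"].values():
            if replica["node_name"] in counts:
                counts[replica["node_name"]] += 1
    moves = []
    for replica_name, _ in misplaced_replicas(
        cluster_status, collection, dedicated_nodes
    ):
        # Ties are broken by node name so the plan is deterministic.
        target = min(sorted(counts), key=lambda node: counts[node])
        counts[target] += 1
        moves.append((replica_name, target))
    return moves


def move_replica_url(base_url, collection, replica_name, target_node):
    """Build the Collections API URL for a single MOVEREPLICA call."""
    params = {
        "action": "MOVEREPLICA",
        "collection": collection,
        "replica": replica_name,
        "targetNode": target_node,
    }
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)
```

A periodic job would fetch CLUSTERSTATUS as JSON, pass the parsed response
to plan_moves() together with the list of reserved nodes, and then request
each URL from move_replica_url() in turn. Checking that a target node is
live and has enough disk before moving is left out here.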

On Fri, May 14, 2021 at 9:55 AM Jan Høydahl <[email protected]> wrote:

> Hi,
>
> Autoscaling could perhaps be used, since you could label your nodes and
> make some rules.
>
> But beware that Autoscaling is deprecated and is replaced with a new
> framework / replica placement plugin from 9.0
> See
> http://www.cominvent.com/pub/solr-9-docs/core/org/apache/solr/cluster/placement/PlacementPlugin.html
> for documentation of this interface.
> Also see draft ref-guide
> https://nightlies.apache.org/solr/draft-guides/solr-reference-guide-main/replica-placement-plugins.html
>
> What you could then do once you start using 9.0 is write your own
> placement plugin which, depending on env.variables / sys.props set on each
> node, can decide how to place collections there.
> You could e.g. have a rule that would reserve nodes tagged as
> "data-intensive" for collections with a collection-property
> "require-data-intensive", and place other replicas across the remaining
> nodes...
>
> Another option until you get there would be to write your own client code
> which periodically inspects the clusterstate from the outside, and then
> issues MOVEREPLICA commands until it is happy. It would not prevent the
> large collection ending up on the wrong nodes after a RESTORE, but it would
> make sure they are moved immediately.
>
> Hope this gives some food for thought :)
>
> Jan
>
> > On 10 May 2021 at 15:44, Edward Turner <[email protected]> wrote:
> >
> > Hi all,
> >
> > Question in brief: in Solrcloud, how can we assign specific nodes to serve
> > a collection, given that our cloud is deployed using the backup/restore
> > feature?
> >
> > We are using Solrcloud 8.5.2 with 5 nodes, serving 13 collections. The
> > majority of these collections are not large, but one of them is large and
> > is the most important one in our application. This important collection
> > has 5 shards about 60 GB in size (more than double the largest other
> > collection) with about 330 M documents. Our nodes have 32 GB RAM and 8
> > CPUs. We use NFS, but when we go into production, we will have dedicated
> > SSDs available to us.
> >
> > Since our most important collection is being used by users much more than
> > the other collections, we think it makes sense to serve its data from
> > specific nodes, which are not used to serve any other collection. So far,
> > we have not needed to do this, but we do now as we're seeing some
> > performance issues ...
> >
> > The caveat is that we utilise the backup/restore feature of Solrcloud to
> > deploy our application on different data-centres.
> >
> > We've read briefly about the Autoscaling features of Solrcloud, but have
> > not yet made use of them. However, I can't see whether Autoscaling allows
> > us to dedicate specific nodes to a collection.
> >
> > Does anyone with any experience with this have any advice for us?
> >
> > Kind regards,
> >
> > Edd
> > --------------------
> > Edward Turner
>
>

-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)
