Also, if you go with custom code using MOVEREPLICA, use the docs from 8.6, which are much more complete and accurate (I tripped on some features that were not well described, got irritated, and took out my frustrations on the documentation :) ). There is probably nothing in 8.6 that differs from 8.5 for this feature aside from better docs.
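For what it's worth, here is a rough sketch of the kind of external client code Jan describes below: inspect the cluster state, then work out which MOVEREPLICA calls would pull one collection's replicas onto a set of dedicated nodes. It is only an illustration under assumptions. The function names, the collection name and the node names are all made up, the input is assumed to have the shape of a CLUSTERSTATUS JSON response, and the sketch only builds the request URLs rather than sending them. It also does not evict other collections from the dedicated nodes, which you would need as well.

```python
from urllib.parse import urlencode


def plan_moves(cluster_status, collection, dedicated_nodes):
    """Return (replica, target_node) pairs for replicas of `collection`
    that currently live outside `dedicated_nodes`.

    `cluster_status` is assumed to look like a parsed CLUSTERSTATUS response.
    """
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    # Count how many replicas of this collection each dedicated node already holds.
    load = {node: 0 for node in dedicated_nodes}
    misplaced = []
    for shard in shards.values():
        for replica_name, replica in shard["replicas"].items():
            node = replica["node_name"]
            if node in load:
                load[node] += 1
            else:
                misplaced.append(replica_name)
    moves = []
    for replica_name in sorted(misplaced):
        # Pick the least-loaded dedicated node (ties broken by node name).
        target = min(sorted(load), key=load.get)
        load[target] += 1
        moves.append((replica_name, target))
    return moves


def movereplica_url(base_url, collection, replica, target_node):
    """Build a Collections API MOVEREPLICA request URL (not sent here)."""
    params = {
        "action": "MOVEREPLICA",
        "collection": collection,
        "replica": replica,
        "targetNode": target_node,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical cluster state: one replica is already on a dedicated node.
    status = {"cluster": {"collections": {"big": {"shards": {
        "shard1": {"replicas": {"core_node1": {"node_name": "node1:8983_solr"}}},
        "shard2": {"replicas": {"core_node2": {"node_name": "node3:8983_solr"}}},
    }}}}}
    dedicated = ["node1:8983_solr", "node2:8983_solr"]
    for replica, target in plan_moves(status, "big", dedicated):
        print(movereplica_url("http://localhost:8983/solr", "big", replica, target))
```

A periodic job could fetch CLUSTERSTATUS, run something like plan_moves, and issue the resulting requests one at a time, re-checking the cluster state between moves.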

On Fri, May 14, 2021 at 9:55 AM Jan Høydahl <[email protected]> wrote:

> Hi,
>
> Autoscaling could perhaps be used, since you could label your nodes and
> make some rules.
>
> But beware that Autoscaling is deprecated and is replaced with a new
> framework / replica placement plugin from 9.0.
> See
> http://www.cominvent.com/pub/solr-9-docs/core/org/apache/solr/cluster/placement/PlacementPlugin.html
> for documentation of this interface.
> Also see the draft ref-guide:
> https://nightlies.apache.org/solr/draft-guides/solr-reference-guide-main/replica-placement-plugins.html
>
> What you could then do once you start using 9.0 is write your own
> placement plugin which, depending on env.variables / sys.props set on each
> node, can decide how to place collections there.
> You could e.g. have a rule that would reserve nodes tagged as
> "data-intensive" for collections with a collection-property
> "require-data-intensive", and place other replicas across the remaining
> nodes...
>
> Another option until you get there would be to write your own client code
> which periodically inspects the clusterstate from the outside, and then
> issues MOVEREPLICA commands until it is happy. It would not prevent the
> large collection ending up on the wrong nodes after a RESTORE, but it would
> make sure they are moved immediately.
>
> Hope this gives some food for thought :)
>
> Jan
>
> > On 10 May 2021, at 15:44, Edward Turner <[email protected]> wrote:
> >
> > Hi all,
> >
> > Question in brief: in Solrcloud, how can we assign specific nodes to serve
> > a collection, given that our cloud is deployed using the backup/restore
> > feature?
> >
> > We are using Solrcloud 8.5.2 with 5 nodes, serving 13 collections. The
> > majority of these collections are not large, but one of them is large and
> > is the most important one in our application. This important collection has
> > 5 shards about 60 GB in size (more than double the largest other
> > collection) with about 330 M documents. Our nodes have 32 GB RAM and 8
> > CPUs. We use NFS, but when we go into production, we will have dedicated
> > SSDs available to us.
> >
> > Since our most important collection is being used by users much more than
> > the other collections, we think it makes sense to serve its data from
> > specific nodes, which are not used to serve any other collection. So far,
> > we have not needed to do this, but we do now as we're seeing some
> > performance issues ...
> >
> > The caveat is that we utilise the backup/restore feature of Solrcloud to
> > deploy our application on different data-centres.
> >
> > We've read briefly about the Autoscaling features of Solrcloud, but have
> > not yet made use of them. However, I can't see whether Autoscaling allows
> > us to dedicate specific nodes to a collection.
> >
> > Does anyone with any experience with this have any advice for us?
> >
> > Kind regards,
> >
> > Edd
> > --------------------
> > Edward Turner
> >

--
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)
