[
https://issues.apache.org/jira/browse/SOLR-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384130#comment-15384130
]
Noble Paul commented on SOLR-9241:
----------------------------------
After going through the patch and discussing it with others, I would recommend
the following for incorporating these features.
h3. AllocationStrategy
This will have to be replaced with the replica placement strategy. If any
features are missing there, we can add them.
h3. Redistribute
This will be a new collection admin action, REDISTRIBUTE. Optionally, the
command should accept node names.
h3. Scale Up
This should be merged into the ADDREPLICA command. That command should accept
a call with a collection name (the shard name can be optional) and the number
of replicas to be added. The system should automatically identify the nodes and
create the replicas.
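As a rough illustration of the automatic node selection described above, the
sketch below ranks nodes by how many cores they already host and assigns new
replicas to the least-loaded nodes first. The function name `pick_nodes`, the
`cores_per_node` mapping, and the least-cores rule are assumptions made for
illustration; they are not taken from the patch or from Solr's replica
placement code.

```python
def pick_nodes(cores_per_node, num_replicas):
    """Return one target node per new replica, least-loaded nodes first.

    cores_per_node: hypothetical mapping of node name -> current core count.
    """
    if not cores_per_node:
        raise ValueError("no live nodes available")
    # Rank by (core count, node name); the name makes ties deterministic.
    ranked = sorted(cores_per_node, key=lambda node: (cores_per_node[node], node))
    # Cycle through the ranking so replicas land on distinct nodes
    # until every node has received one.
    return [ranked[i % len(ranked)] for i in range(num_replicas)]
```

With three nodes hosting 3, 1, and 2 cores, a request for two replicas would
pick the nodes with 1 and 2 cores. A real placement strategy would also have
to respect rules such as not co-locating replicas of the same shard.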
h4. Scale down
Fold this into the DELETEREPLICA command. The command should accept the number
of replicas to remove.
h4. Remove dead nodes
Add a new command, DELETENODE, to clean up all replicas on a given node.
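To make the clean-up step concrete, here is a minimal sketch of how the
replicas hosted on one node could be collected before they are deleted. It
assumes a plain dictionary shaped loosely like SolrCloud's cluster state
(collection, then `shards`, then `replicas`, each with a `node_name`); the
function name `replicas_on_node` is hypothetical and nothing here is the
proposed command's actual implementation.

```python
def replicas_on_node(cluster_state, node_name):
    """List (collection, shard, replica) triples hosted on node_name.

    cluster_state: assumed nested dict,
        {collection: {"shards": {shard: {"replicas": {replica: {"node_name": ...}}}}}}
    """
    found = []
    for collection, coll_info in cluster_state.items():
        for shard, shard_info in coll_info.get("shards", {}).items():
            for replica, replica_info in shard_info.get("replicas", {}).items():
                if replica_info.get("node_name") == node_name:
                    found.append((collection, shard, replica))
    # Sort so callers get a stable order regardless of dict ordering.
    return sorted(found)
```

Each returned triple identifies one replica that a DELETENODE-style command
would then have to remove.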
h4. Replace
A new command called REPLACENODE should be created.
h4. Auto Shard
We need to revisit the whole strategy. The current model is neither scalable nor
useful. It is not possible to merge all shards into one and split them later;
that cannot work for a seriously sized cluster. We need a new mechanism to
merge shards intelligently. We then need to identify new hash ranges in such
a way that the number of splits and merges is minimal. The command would be
called RESHARD, and we should pass it the number of new shards to be created.
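One way to reason about "minimal splits and merges" is to compare the range
boundaries of the old and new layouts: a boundary that survives needs no work,
a new boundary implies roughly one split, and a dropped boundary implies
roughly one merge. The sketch below assumes a 32-bit hash space divided into
equally sized ranges, which is only one possible layout; the names
`boundaries` and `reshard_cost` are hypothetical and this is not the algorithm
from the patch.

```python
HASH_SPACE = 2 ** 32  # assumed size of the hash space

def boundaries(num_shards):
    """Interior cut points when the hash space is divided evenly."""
    return {i * HASH_SPACE // num_shards for i in range(1, num_shards)}

def reshard_cost(old_shards, new_shards):
    """Count boundaries kept, added (splits), and removed (merges)."""
    old, new = boundaries(old_shards), boundaries(new_shards)
    return {
        "kept": len(old & new),
        "splits": len(new - old),
        "merges": len(old - new),
    }
```

Under these assumptions, going from 4 to 8 shards keeps all 3 existing
boundaries and needs 4 splits and no merges, while going from 4 to 6 keeps only
1 boundary and needs 4 splits and 2 merges. Choosing uneven ranges for the new
layout could keep more of the old boundaries.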
> Rebalance API for SolrCloud
> ---------------------------
>
> Key: SOLR-9241
> URL: https://issues.apache.org/jira/browse/SOLR-9241
> Project: Solr
> Issue Type: New Feature
> Components: SolrCloud
> Affects Versions: 6.1
> Environment: Ubuntu, Mac OsX
> Reporter: Nitin Sharma
> Labels: Cluster, SolrCloud
> Fix For: 6.1
>
> Attachments: Redistribute_After.jpeg, Redistribute_Before.jpeg,
> Redistribute_call.jpeg, Replace_After.jpeg, Replace_Before.jpeg,
> Replace_Call.jpeg, SOLR-9241-4.6.patch, SOLR-9241-6.1.patch
>
> Original Estimate: 2,016h
> Remaining Estimate: 2,016h
>
> This is v1 of the patch for the SolrCloud Rebalance API (as described in
> http://engineering.bloomreach.com/solrcloud-rebalance-api/), built at
> Bloomreach by Nitin Sharma and Suruchi Shah. The goal of the API is to
> provide a zero-downtime mechanism for data manipulation and efficient
> core allocation in SolrCloud. This API was envisioned as the base layer
> that enables SolrCloud to be an auto-scaling platform (and to work in unison
> with other complementary monitoring and scaling features).
> Patch Status:
> ===============
> The patch is a work in progress and incremental. We have done a few rounds of
> code cleanup. We wanted to get the patch out first to get initial
> feedback. We will continue to work on making it more open-source friendly and
> easily testable.
> Deployment Status:
> ====================
> The platform is deployed in production at Bloomreach and has been battle
> tested under large-scale load (millions of documents and hundreds of
> collections).
> Internals:
> =============
> The internals of the API and its performance:
> http://engineering.bloomreach.com/solrcloud-rebalance-api/
> It is built on top of the admin collections API as an action (with various
> flavors). At a high level, the Rebalance API provides 2 constructs:
> Scaling Strategy: Decides how to move the data. Every flavor has multiple
> options which can be reviewed in the api spec.
> Re-distribute - Move around data in the cluster based on capacity/allocation.
> Auto Shard - Dynamically shard a collection to any size.
> Smart Merge - Distributed Mode - Helps merge data from a larger shard setup
> into a smaller one (the source shard count should be divisible by the
> destination shard count).
> Scale up - Add replicas on the fly
> Scale Down - Remove replicas on the fly
> Allocation Strategy: Decides where to put the data (nodes with the fewest
> cores, nodes that do not have this collection, etc.). Custom implementations
> can be built on top as well. One other example is availability-zone
> awareness: distribute data such that every replica is placed in a different
> availability zone to support HA.
> Detailed API Spec:
> ====================
> https://github.com/bloomreach/solrcloud-rebalance-api
> Contributors:
> =====================
> Nitin Sharma
> Suruchi Shah
> Questions/Comments:
> =====================
> You can reach me at [email protected]