[
https://issues.apache.org/jira/browse/SOLR-11285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrzej Bialecki updated SOLR-11285:
-------------------------------------
Attachment: SOLR-11285.patch
This patch extends the scope of the {{ClusterDataProvider}} interface, which
was already used in the policy framework, to include ZK-like and Solr-like
operations. These can then be delegated to real ZK / Solr or to their mocks.
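
To illustrate the delegation idea described above, here is a minimal sketch. It is not the code from the patch: {{ClusterStore}}, {{InMemoryClusterStore}}, {{TriggerStateTracker}} and the "/triggers/" path are hypothetical names invented for this example, and the real {{ClusterDataProvider}} interface has a different and larger surface. The point is only that a component written against a narrow interface can run against an in-memory stand-in instead of real ZK.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified abstraction over ZK-like reads and writes.
// This is NOT the ClusterDataProvider interface from the patch.
interface ClusterStore {
    Optional<String> getData(String path);
    void setData(String path, String data);
}

// Simulated backend: an in-memory map stands in for ZooKeeper.
final class InMemoryClusterStore implements ClusterStore {
    private final Map<String, String> nodes = new ConcurrentHashMap<>();

    @Override
    public Optional<String> getData(String path) {
        return Optional.ofNullable(nodes.get(path));
    }

    @Override
    public void setData(String path, String data) {
        nodes.put(path, data);
    }
}

// Hypothetical component that depends only on the interface, so the same
// code could run against a real backend or against the simulation.
final class TriggerStateTracker {
    private final ClusterStore store;

    TriggerStateTracker(ClusterStore store) {
        this.store = store;
    }

    void recordEvent(String trigger, String event) {
        store.setData("/triggers/" + trigger, event);
    }

    String lastEvent(String trigger) {
        return store.getData("/triggers/" + trigger).orElse("none");
    }
}
```

A real-backend implementation of the same interface would wrap the actual ZK client. A simulation can create many such in-memory stores cheaply, which is what makes large-scale experiments feasible without a cluster.
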
Changes in this patch allow (almost...) running {{OverseerTriggerThread}} with
simulated ZK / Solr. One open issue is that {{Assign}} uses
{{ReplicaAssigner}}, which uses snitches and CoreContainer. I punted on
changing this in this patch because it's too entangled, but it could probably
be changed to use the same approach as {{SolrClientDataProvider.getNodeValues}}.
Another open issue is the refactoring of {{DistributedQueue}} and the widening
of "throws" clauses, which are no longer ZK-specific. This probably needs to be
partially reverted, or a set of specialized exception classes needs to be
introduced to replace the ZK-specific ones.
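
One possible shape for the "specialized exception classes" option is sketched below. All names here ({{ClusterDataException}}, {{RawQueue}}, {{FakeBackendFailure}}, {{TranslatingQueue}}) are hypothetical and are not part of the patch or of Solr. The sketch only shows the general technique of translating a backend-specific checked exception into a backend-neutral one at the boundary, so that callers do not need widened "throws" clauses.

```java
// Hypothetical backend-neutral exception that callers would declare
// instead of backend-specific ones.
class ClusterDataException extends Exception {
    ClusterDataException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Stand-in for a backend-specific checked exception (for example, what a
// ZK-backed implementation might throw). Invented for this sketch.
class FakeBackendFailure extends Exception {
    FakeBackendFailure(String message) {
        super(message);
    }
}

// Hypothetical low-level queue that leaks the backend-specific exception.
interface RawQueue {
    byte[] peek() throws FakeBackendFailure;
}

// Adapter that exposes only the backend-neutral exception to callers,
// keeping the original failure available as the cause.
final class TranslatingQueue {
    private final RawQueue raw;

    TranslatingQueue(RawQueue raw) {
        this.raw = raw;
    }

    byte[] peek() throws ClusterDataException {
        try {
            return raw.peek();
        } catch (FakeBackendFailure e) {
            throw new ClusterDataException("peek failed", e);
        }
    }
}
```

With this arrangement a simulated queue and a ZK-backed queue could share one signature, and code that needs the underlying failure can still inspect it via {{getCause()}}.
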
The patch came out quite large, but most of it is pretty rote substitutions /
renames to use the {{ClusterDataProvider}} interface instead of
{{ZkStateReader}}, {{SolrZkClient}} etc. It's probably best to review the
changes using branch {{jira/solr-11285}}.
> Support simulations at scale in the autoscaling framework
> ---------------------------------------------------------
>
> Key: SOLR-11285
> URL: https://issues.apache.org/jira/browse/SOLR-11285
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: AutoScaling
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Attachments: SOLR-11285.patch
>
>
> This is a spike to investigate how difficult it would be to modify the
> autoscaling framework so that it's possible to run simulated large-scale
> experiments and test its dynamic behavior without actually spinning up a
> large cluster.
> Currently many components rely heavily on actual Solr, ZK and the behavior
> of ZK watches, or insist on making actual HTTP calls. A notable exception is
> the core Policy framework, where most of the ZK / Solr details are abstracted.
> As the autoscaling algorithms that we implement become more and more
> complex, the ability to effectively run multiple large simulations will be
> crucial - it's very easy to unknowingly introduce catastrophic instabilities
> that don't manifest themselves in regular unit tests.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)