[ 
https://issues.apache.org/jira/browse/SOLR-11285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  updated SOLR-11285:
-------------------------------------
    Attachment: SOLR-11285.patch

This patch extends the scope of the {{ClusterDataProvider}} interface, which 
was already used in the policy framework, to include ZK-like and Solr-like 
operations, which can then be delegated to real ZK / Solr or to their mocks.
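
A minimal sketch of this delegation idea follows. It is not the interface from 
the patch: {{ClusterStateStore}}, {{InMemoryStateStore}}, 
{{TriggerStateReader}} and the {{/example/triggers}} path are hypothetical 
names chosen for illustration. The point is only that a component written 
against the interface can run unchanged on a real backend or on a simulated one.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified interface; NOT the actual ClusterDataProvider API.
interface ClusterStateStore {
    byte[] getData(String path);
    void setData(String path, byte[] data);
    boolean exists(String path);
}

// Simulated backend: keeps "ZK-like" nodes in memory, no real ZK involved.
final class InMemoryStateStore implements ClusterStateStore {
    private final Map<String, byte[]> nodes = new ConcurrentHashMap<>();

    @Override
    public byte[] getData(String path) {
        byte[] data = nodes.get(path);
        if (data == null) {
            throw new NoSuchElementException("No node at " + path);
        }
        return data.clone(); // defensive copy so callers cannot mutate stored state
    }

    @Override
    public void setData(String path, byte[] data) {
        nodes.put(path, data.clone());
    }

    @Override
    public boolean exists(String path) {
        return nodes.containsKey(path);
    }
}

// Hypothetical component that depends only on the interface, so the same
// code can be pointed at a real backend or at the simulation above.
final class TriggerStateReader {
    private final ClusterStateStore store;

    TriggerStateReader(ClusterStateStore store) {
        this.store = store;
    }

    String readTriggerState(String triggerName) {
        String path = "/example/triggers/" + triggerName; // illustrative path only
        if (!store.exists(path)) {
            return "";
        }
        return new String(store.getData(path), StandardCharsets.UTF_8);
    }
}
```

A production implementation of the same interface would forward each call to 
the real client, while tests and simulations would pass in the in-memory one.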

Changes in this patch allow (almost...) running {{OverseerTriggerThread}} with 
simulated ZK / Solr. One open issue is that {{Assign}} uses 
{{ReplicaAssigner}}, which uses snitches and CoreContainer. In this patch I 
punted on changing this because it's too entangled, but it could probably be 
changed to use the same approach as {{SolrClientDataProvider.getNodeValues}}.

Another open issue is the refactoring of {{DistributedQueue}} and the widening 
of "throws" clauses, which are no longer ZK-specific. This probably needs to 
be partially reverted, or a set of specialized exception classes needs to be 
introduced instead of the ZK-specific ones.
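
One way the second option could look is sketched below, under stated 
assumptions. All names ({{StateStoreException}}, {{NoSuchNodeException}}, 
{{FakeBackendNoNodeException}}, {{ExceptionTranslator}}) are hypothetical and 
do not come from the patch, and the backend exception is a stand-in rather 
than a real ZK class.

```java
import java.util.concurrent.Callable;

// Hypothetical backend-neutral base exception for state-store operations.
class StateStoreException extends Exception {
    StateStoreException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical specialized exception for a missing node.
class NoSuchNodeException extends StateStoreException {
    NoSuchNodeException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Stand-in for a backend-specific exception; NOT a real ZK class.
class FakeBackendNoNodeException extends Exception {
    FakeBackendNoNodeException(String path) {
        super("no node: " + path);
    }
}

// Translates backend-specific failures into the neutral hierarchy, so that
// callers declare only StateStoreException in their "throws" clauses.
final class ExceptionTranslator {
    private ExceptionTranslator() {}

    static <T> T call(Callable<T> operation) throws StateStoreException {
        try {
            return operation.call();
        } catch (FakeBackendNoNodeException e) {
            throw new NoSuchNodeException(e.getMessage(), e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new StateStoreException("interrupted", e);
        } catch (Exception e) {
            throw new StateStoreException("backend operation failed", e);
        }
    }
}
```

With this shape the original cause stays reachable through {{getCause()}}, 
and a simulated backend can throw the neutral exceptions directly.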

The patch came out quite large, but most of it is pretty rote substitutions / 
renames to use the {{ClusterDataProvider}} interface instead of 
{{ZkStateReader}}, {{SolrZkClient}} etc. It's probably best to review the 
changes using branch {{jira/solr-11285}}.

> Support simulations at scale in the autoscaling framework
> ---------------------------------------------------------
>
>                 Key: SOLR-11285
>                 URL: https://issues.apache.org/jira/browse/SOLR-11285
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: AutoScaling
>            Reporter: Andrzej Bialecki 
>            Assignee: Andrzej Bialecki 
>         Attachments: SOLR-11285.patch
>
>
> This is a spike to investigate how difficult it would be to modify the 
> autoscaling framework so that it's possible to run simulated large-scale 
> experiments and test its dynamic behavior without actually spinning up a 
> large cluster.
> Currently many components rely heavily on actual Solr, ZK and the behavior 
> of ZK watches, or insist on making actual HTTP calls. A notable exception is 
> the core Policy framework, where most of the ZK / Solr details are abstracted.
> As the algorithms for autoscaling that we implement become more and more 
> complex, the ability to effectively run multiple large simulations will be 
> crucial - it's very easy to unknowingly introduce catastrophic instabilities 
> that don't manifest themselves in regular unit tests.


