[ 
https://issues.apache.org/jira/browse/SOLR-14306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17052526#comment-17052526
 ] 

Jan Høydahl commented on SOLR-14306:
------------------------------------

{quote}It seems that refactoring coordination code into a separate module is a 
great first step for whichever direction we go in the future.
{quote}
+1.

The single biggest obstacle I sense when helping customers with SolrCloud is 
ZooKeeper: how do we install it, how many nodes do we need, how do we secure 
it, can ZK run on the same nodes as Solr, can we use embedded ZK in our test 
environment, etc. I think ZK will become an even bigger topic as more people 
start deploying in k8s. So if we manage to isolate coordination and cluster 
state at a higher level, then offering etcd or Ratis plugins in the future 
will be within reach.
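
As a rough illustration of what isolating coordination at a higher level 
could look like, here is a minimal Java sketch. The CoordinationManager name 
comes from the SIP draft quoted below, but the method signatures and the 
InMemoryCoordinationManager class are assumptions invented for this example; 
they are not existing Solr or Curator APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical interface: only the name is taken from the SIP draft.
// The methods are illustrative assumptions, not a proposed final API.
interface CoordinationManager {
    /** Joins the election at electionPath; returns true if participantId is the leader. */
    boolean joinElection(String electionPath, String participantId);

    /** Leaves the election; leadership passes to the next participant in join order. */
    void leaveElection(String electionPath, String participantId);

    /** Returns the current leader for electionPath, or empty if nobody has joined. */
    Optional<String> currentLeader(String electionPath);
}

// Hypothetical in-memory stand-in for tests: the earliest remaining joiner leads.
// A ZooKeeper-, etcd- or Ratis-backed class could implement the same interface.
final class InMemoryCoordinationManager implements CoordinationManager {
    // Election path -> participants in the order they joined.
    private final Map<String, List<String>> participants = new HashMap<>();

    @Override
    public synchronized boolean joinElection(String electionPath, String participantId) {
        List<String> queue = participants.computeIfAbsent(electionPath, k -> new ArrayList<>());
        if (!queue.contains(participantId)) {
            queue.add(participantId);
        }
        return queue.get(0).equals(participantId);
    }

    @Override
    public synchronized void leaveElection(String electionPath, String participantId) {
        List<String> queue = participants.get(electionPath);
        if (queue != null) {
            queue.remove(participantId);
            if (queue.isEmpty()) {
                participants.remove(electionPath);
            }
        }
    }

    @Override
    public synchronized Optional<String> currentLeader(String electionPath) {
        List<String> queue = participants.get(electionPath);
        if (queue == null || queue.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(queue.get(0));
    }
}
```

Callers would depend only on the interface, so a test could pass in the 
in-memory version while production wiring picks a ZooKeeper-backed one. 
Real leader election also has to deal with session expiry and watches, 
which this sketch deliberately leaves out.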

> Refactor coordination code into separate module and evaluate using Curator
> --------------------------------------------------------------------------
>
>                 Key: SOLR-14306
>                 URL: https://issues.apache.org/jira/browse/SOLR-14306
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>            Reporter: Tomas Eduardo Fernandez Lobbe
>            Priority: Major
>
> This Jira issue is to discuss two changes that unfortunately are difficult to 
> address separately:
>  # Separate all ZooKeeper coordination logic into its own module that can 
> be tested in isolation.
>  # Evaluate using Apache Curator for coordination instead of our own logic.
> I drafted a 
> [SIP|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=148640472],
>  but this is very much WIP; I’d like to hear opinions before I spend too much 
> time on something people hate.
> From the initial draft of the SIP:
> {quote}The main goal of this change is to allow better testing of the 
> different ZooKeeper interactions related to coordination (leader election, 
> queues, etc.).
> There are already some abstractions in place for lower-level 
> operations (set-data, get-data, etc.; see DistribStateManager), so the idea is 
> to have a new, related abstraction named CoordinationManager, where we could 
> have some higher-level coordination-related classes, like LeaderRunner 
> (Overseer), LeaderLatch (for shard leaders), etc.
> Curator comes into play 
> because, in order to refactor the existing code into these new abstractions, 
> we’d have to rework much of it, so we could instead consider using Curator, a 
> library that has been mentioned many times in the past. While I don’t think 
> this is required, it would make this transition and our code simpler (from 
> what I could see; however, input from people with more Curator experience 
> would be greatly appreciated).
> While it would be out of the scope of this change, if the 
> abstractions/interfaces are correctly designed, this could make it possible 
> in the future to use something other than ZooKeeper for coordination, 
> either etcd or maybe even some in-memory replacement for tests.
> {quote}
> There are still many open questions, and many questions I don’t yet know 
> we’ll have, but please let me know if you have any early feedback, especially 
> if you’ve worked with Curator in the past.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
