Re: SolrCloud "master mode" planned?

David Smiley Wed, 26 Apr 2017 14:07:05 -0700

> On Apr 26, 2017, at 4:35 PM, Upayavira <[email protected]> wrote:
> 
> I have done a *lot* of automating this. Redoing it recently it was quite 
> embarrassing to realise how much complexity there is involved in it - it is 
> crazy hard to get a basic, production ready SolrCloud setup running.


Would you mind listing the sorts of issues you ran into deploying 
ZooKeeper in a production config?  A quick draft list is enough, just to get a 
sense of what you generally had to contend with.  I recently did 
it in a Docker/Kontena infrastructure.  I did not find it to be hard; maybe 
medium :-).  I got the nodes working out of the box with minimal effort but had 
to make changes to harden it:
* I found the existing official Docker image for ZooKeeper lacking in that I 
couldn't easily specify the "auto purge" settings.  These default to no purging, 
which is unacceptable.
* I set "-XX:+CrashOnOutOfMemoryError" so that the process ends when an 
OOM occurs and Kontena (the Docker orchestrator) notices it's down and 
restarts it (a rare event, obviously).  Users not running in a container 
environment might not care about this, I guess.  This was merely a configuration 
setting; no Docker image hack needed.
* I also ensured I used the latest ZK 3.4.6 release.  I recall 3.4.4 (or 
maybe even 3.4.5?) cached DNS entries without looking them up again on failure, 
which is particularly problematic in a container environment where it's common 
for services to get a new IP when they are restarted.  Thankfully I did not 
learn of that issue the hard way; I recall a blog post by Shalin or 
Martijn Koster warning about it.  No action from me here other than ensuring I 
used an appropriately recent version.  Originally, out of laziness, I used 
Confluent's Docker image, but I knew I would have to switch because of this issue.
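
To make the first two bullets concrete, here is a minimal sketch in Python that renders the settings involved. The property names autopurge.snapRetainCount and autopurge.purgeInterval and the JVMFLAGS variable read by zkServer.sh are real ZooKeeper settings. The helper names, the example values (3 snapshots, purge every 24 hours), and the idea of generating them from a script are assumptions for illustration only, not the setup described above.

```python
def render_autopurge(snap_retain_count: int = 3, purge_interval_hours: int = 24) -> str:
    """Return zoo.cfg lines that enable ZooKeeper's automatic purge task.

    Hypothetical helper; the default values are illustrative assumptions.
    """
    # The documented minimum for autopurge.snapRetainCount is 3.
    if snap_retain_count < 3:
        raise ValueError("snap_retain_count must be >= 3")
    # A purge interval of 0 (the ZooKeeper default) disables purging entirely.
    if purge_interval_hours < 1:
        raise ValueError("purge_interval_hours must be >= 1 to enable purging")
    return (
        f"autopurge.snapRetainCount={snap_retain_count}\n"
        f"autopurge.purgeInterval={purge_interval_hours}\n"
    )


def render_jvm_flags(extra_flags=()) -> str:
    """Return a JVMFLAGS value that makes the JVM crash on OutOfMemoryError.

    Hypothetical helper; the flag itself needs a HotSpot JVM from 8u92 onward.
    """
    flags = ["-XX:+CrashOnOutOfMemoryError", *extra_flags]
    return " ".join(flags)


if __name__ == "__main__":
    # Lines to append to zoo.cfg, then the environment variable for zkServer.sh.
    print(render_autopurge(), end="")
    print(f'JVMFLAGS="{render_jvm_flags()}"')
```

With a crash-on-OOM flag in place, the orchestrator sees the container exit and can restart it, rather than leaving a wedged JVM running.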

> One thing that is hard is getting a ZooKeeper ensemble going - using 
> Exhibitor makes it much easier.
> 
> Something that has often occurred to me is, why do we require people to go 
> download a separate ZooKeeper, and work out how to install and configure it, 
> when we have it embedded already? Why can't we just have a 'bin/solr zk 
> start' command which starts an "embedded" zookeeper, but without Solr. To 
> really make it neat, we offer some way (a la Exhibitor) for multiple 
> concurrently started ZK nodes to autodiscover each other, then getting our 
> three ZK nodes up won't be quite so treacherous.

I've often thought the same -- why not just embed it?  People say it's not a 
"production config", but that is only because we all keep telling each other so 
in an echo chamber and we believe ourselves :-P

~ David

> 
> On Wed, 26 Apr 2017, at 03:58 PM, Mike Drob wrote:
>> Could the zk role also be guaranteed to run the Overseer (and no 
>> collections)? If we already have that separated out, it would make sense to 
>> put it with the embedded zk. I think you can already configure and place 
>> things manually this way, but it would be a huge win to package it all up 
>> nicely for users and set it to turnkey operation.
>> 
>> I think it was a great improvement for deployment when we dropped tomcat, 
>> this is the next logical step.
>> 
>> Mike
>> 
>> On Wed, Apr 26, 2017, 4:22 AM Jan Høydahl <[email protected]> wrote:
>> There have been suggestions to add a “node controller” process which again 
>> could start Solr and perhaps ZK on a node.
>> 
>> But adding a new “zk” role which would let that node start (embedded) ZK I 
>> cannot recall. It would of course make a deploy simpler if ZK was hidden as 
>> a solr role/feature and perhaps assigned to N nodes, moved if needed etc. If 
>> I’m not mistaken ZK 3.5 would make such more dynamic setups easier but is 
>> currently in beta.
>> 
>> Also, in these days of containers, I kind of like the concept of spinning up 
>> N ZK containers that the Solr containers connect to and let Kubernetes or 
>> whatever you use take care of placement, versions etc. So perhaps the need 
>> for a production-ready solr-managed zk is not as big as it used to be, or 
>> maybe even undesirable? For production Windows installs I could still 
>> clearly see a need though.
>> 
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> 
>>> On 25 Apr 2017, at 23:30, Ishan Chattopadhyaya <[email protected]> wrote:
>>> 
>>> Hi Otis,
>>> I've been working on, and shall be working on, a few issues along the lines 
>>> of "hide ZK".
>>> 
>>> SOLR-6736: Uploading configsets can now be done through Solr nodes instead 
>>> of uploading them to ZK.
>>> SOLR-10272: Use a _default configset, with the intention of not needing the 
>>> user to bother about the concept of configsets unless he needs to
>>> SOLR-10446 (SOLR-9057): User can use CloudSolrClient without access to ZK
>>> SOLR-8440: Enabling BasicAuth security through bin/solr script
>>> Ability to edit security.json through the bin/solr script
>>> With all this in place, and perhaps some more that I may be missing, 
>>> the user should hopefully not need to know much about ZK.
>>> 
>>> 1. Do you have suggestions on what more needs to be done for "hiding ZK"?
>>> 2. Do you have suggestions on how to track this overall theme of "hiding 
>>> ZK"? Some of these issues I mentioned are associated with other epics, so I 
>>> don't know if creating a "hiding ZK" epic and having these (and other 
>>> issues) as sub-tasks is a good idea (maybe it is). Alternatively, how about 
>>> tracking these (and other issues) using some label?
>>> Regards,
>>> Ishan
>>> 
>>> 
>>> 
>>> On Wed, Apr 26, 2017 at 2:39 AM, Otis Gospodnetić <[email protected]> wrote:
>>> Hi,
>>> 
>>> This thread about the Solr master-slave vs. SolrCloud deployment poll seems 
>>> to show that people find SolrCloud (the ZK part of it) complex to deploy:
>>> 
>>> http://search-lucene.com/m/Solr/eHNlfm4WpJPVR92?subj=Re+Poll+Master+Slave+or+SolrCloud+
>>> 
>>> It could be just how information is presented...
>>> ... or how ZK is exposed as something external, which it is...
>>> 
>>> Are there plans to "hide ZK"?  Or maybe have the notion of master-only (not 
>>> as in master-slave, but as in running ZK only, not hosting data) mode for 
>>> SolrCloud nodes (a la ES)?  
>>> 
>>> I peeked at JIRA, but couldn't find anything about that, although I seem to 
>>> recall some mention of embedding ZK to make things easier for SolrCloud 
>>> users.  I think I saw that at some Lucene Revolution talk?
>>> 
>>> Thanks,
>>> Otis
>>> --
>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>> 
>
