> On Apr 26, 2017, at 4:35 PM, Upayavira <[email protected]> wrote:
>
> I have done a *lot* of automating this. Redoing it recently it was quite
> embarrassing to realise how much complexity there is involved in it - it is
> crazy hard to get a basic, production ready SolrCloud setup running.

Would you mind enumerating a list of what sort of issues you ran into deploying ZooKeeper in a production config? A quick draft list of sorts, just to get a sense of what sort of stuff you generally had to contend with.

I recently did it in a Docker/Kontena infrastructure. I did not find it to be hard; maybe medium :-). I got the nodes working out of the box with minimal effort but had to make changes to harden it.

* I found the existing official Docker image for ZooKeeper lacking in that I couldn't easily specify the "auto purge" settings, which default to no purging, which is unacceptable.
* I set "-XX:+CrashOnOutOfMemoryError" so that the process would end when an OOM occurs, so that Kontena (Docker orchestrator) would notice it's down and could restart it (a rare event obviously). Users not using a container environment might not care about this, I guess. This was merely a configuration setting; no Docker image hack needed.
* I also ensured I used the latest ZK 3.4.6 release.... I recall 3.4.4 (or maybe even 3.4.5?) cached DNS entries without re-looking them up if a lookup failed, which is particularly problematic in a container environment where it's common for services to get a new IP when they are restarted. Thankfully I did not learn of that issue the hard way; I recall a blog post by Shalin or Martijn Koster warning of it. No action from me here other than ensuring I used an appropriately new version. Originally, out of laziness, I used Confluent's Docker image, but I knew I would have to switch because of this issue.

> One thing that is hard is getting a ZooKeeper ensemble going - using
> Exhibitor makes it much easier.
>
> Something that has often occurred to me is, why do we require people to go
> download a separate ZooKeeper, and work out how to install and configure it,
> when we have it embedded already? Why can't we just have a 'bin/solr zk
> start' command which starts an "embedded" zookeeper, but without Solr.
> To really make it neat, we offer some way (a la Exhibitor) for multiple
> concurrently started ZK nodes to autodiscover each other, then getting our
> three ZK nodes up won't be quite so treacherous.

I've often thought the same -- why not just embed it. People say it's not a "production config" but this is only because we all keep telling ourselves this in an echo chamber and we believe ourselves :-P

~ David

>
> On Wed, 26 Apr 2017, at 03:58 PM, Mike Drob wrote:
>> Could the zk role also be guaranteed to run the Overseer (and no
>> collections)? If we already have that separated out, it would make sense to
>> put it with the embedded zk. I think you can already configure and place
>> things manually this way, but it would be a huge win to package it all up
>> nicely for users and set it to turnkey operation.
>>
>> I think it was a great improvement for deployment when we dropped tomcat,
>> this is the next logical step.
>>
>> Mike
>>
>> On Wed, Apr 26, 2017, 4:22 AM Jan Høydahl <[email protected]> wrote:
>> There have been suggestions to add a “node controller” process which again
>> could start Solr and perhaps ZK on a node.
>>
>> But adding a new “zk” role which would let that node start (embedded) ZK I
>> cannot recall. It would of course make a deploy simpler if ZK was hidden as
>> a solr role/feature and perhaps assigned to N nodes, moved if needed etc. If
>> I’m not mistaken ZK 3.5 would make such more dynamic setups easier but is
>> currently in beta.
>>
>> Also, in these days of containers, I kind of like the concept of spinning up
>> N ZK containers that the Solr containers connect to and let Kubernetes or
>> whatever you use take care of placement, versions etc. So perhaps the need
>> for a production-ready solr-managed zk is not as big as it used to be, or
>> maybe even undesirable? For production Windows installs I could still
>> clearly see a need though.
>>
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>>
>>> On 25 Apr 2017, at 23:30, Ishan Chattopadhyaya <[email protected]> wrote:
>>>
>>> Hi Otis,
>>> I've been working on, and shall be working on, a few issues on the lines of
>>> "hide ZK".
>>>
>>> SOLR-6736: Uploading configsets can now be done through Solr nodes instead
>>> of uploading them to ZK.
>>> SOLR-10272: Use a _default configset, with the intention of not needing the
>>> user to bother about the concept of configsets unless he needs to
>>> SOLR-10446 (SOLR-9057): User can use CloudSolrClient without access to ZK
>>> SOLR-8440: Enabling BasicAuth security through bin/solr script
>>> Ability to edit security.json through the bin/solr script
>>> Having all this in place, and perhaps some more that I may be missing,
>>> should hopefully not need the user to know much about ZK.
>>>
>>> 1. Do you have suggestions on what more needs to be done for "hiding ZK"?
>>> 2. Do you have suggestions on how to track this overall theme of "hiding
>>> ZK"? Some of these issues I mentioned are associated with other epics, so I
>>> don't know if creating a "hiding ZK" epic and having these (and other
>>> issues) as sub-tasks is a good idea (maybe it is). Alternatively, how about
>>> tracking these (and other issues) using some label?
>>> Regards,
>>> Ishan
>>>
>>> On Wed, Apr 26, 2017 at 2:39 AM, Otis Gospodnetić <[email protected]> wrote:
>>> Hi,
>>>
>>> This thread about Solr master-slave vs. SolrCloud deployment poll seems to
>>> point out people find SolrCloud (the ZK part of it) deployment complex:
>>>
>>> http://search-lucene.com/m/Solr/eHNlfm4WpJPVR92?subj=Re+Poll+Master+Slave+or+SolrCloud+
>>>
>>> It could be just how information is presented...
>>> ...
>>> or how ZK is exposed as something external, which it is...
>>>
>>> Are there plans to "hide ZK"? Or maybe have the notion of master-only (not
>>> as in master-slave, but as in running ZK only, not hosting data) mode for
>>> SolrCloud nodes (a la ES)?
>>>
>>> I peeked at JIRA, but couldn't find anything about that, although I seem to
>>> recall some mention of embedding ZK to make things easier for SolrCloud
>>> users. I think I saw that at some Lucene Revolution talk?
>>>
>>> Thanks,
>>> Otis
>>> --
>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
