On 7/28/2016 9:38 AM, Andy C wrote:
> Would it make sense to use the embedded Zookeeper instance in this
> situation? I have seen warning that the embedded Zookeeper should not
> be used in production deployments, but the reason generally given is
> that if Solr goes down Zookeeper will also go down, which doesn't seem
> relevant here. Are there other reasons not to use the embedded Zookeeper?

The embedded zookeeper uses code copied from a fairly old version of
zookeeper and slightly modified.  This was needed at the time SolrCloud
was created because that version of zookeeper would fail to start if the
"myid" file was missing or didn't contain a valid server ID.  In order
for Solr to be able to control the embedded ZK sufficiently, it
wasn't possible to include the myid file with Solr, so the hack was needed.

Because SolrCloud uses copied code to parse the zoo.cfg file and start
the embedded zookeeper, it will not support ZK features added after 3.2,
like snapshot auto-purge.
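
For reference, in standalone ZK the auto-purge feature is controlled by
the autopurge.snapRetainCount and autopurge.purgeInterval settings in
zoo.cfg.  Here is a quick Python sketch for spotting those lines in a
config you plan to hand to the embedded ZK, where they would not take
effect.  The helper name and the sample config are made up for
illustration; this is not Solr's or ZooKeeper's own parsing code.

```python
# Settings that control snapshot auto-purge in a standalone zoo.cfg.
AUTOPURGE_KEYS = ("autopurge.snapRetainCount", "autopurge.purgeInterval")


def autopurge_settings(zoo_cfg_text):
    """Return the auto-purge settings found in the config text.

    Hypothetical helper for illustration only.
    """
    found = {}
    for raw in zoo_cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() in AUTOPURGE_KEYS:
            found[key.strip()] = value.strip()
    return found


# Sample config (made-up values).
sample_cfg = """\
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
"""

settings = autopurge_settings(sample_cfg)
```

If the result is non-empty, those settings are a sign the config was
written with a standalone ZK in mind.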

Recently, Zookeeper was changed so it will work without a myid file if
there are no "server" lines in the config, so the code hack in SolrCloud
is no longer required.  It will take some time for Solr's code to be
changed to take advantage of this.
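
The rule itself is simple enough to sketch in Python: a myid file is
only needed when zoo.cfg contains "server" lines.  The function name and
the sample configs below are made up for illustration; this mirrors the
behavior described above and is not the actual ZooKeeper or Solr code.

```python
def needs_myid(zoo_cfg_text):
    """Return True if the config defines any "server.N" entries.

    Hypothetical helper for illustration only.
    """
    for raw in zoo_cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key = line.split("=", 1)[0].strip()
        if key.startswith("server."):
            return True
    return False


# Sample configs (made-up hosts and paths).
standalone_cfg = "tickTime=2000\ndataDir=/var/lib/zookeeper\nclientPort=2181\n"
ensemble_cfg = standalone_cfg + "server.1=zk1:2888:3888\nserver.2=zk2:2888:3888\n"
```

With these samples, needs_myid(standalone_cfg) is False and
needs_myid(ensemble_cfg) is True.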

As far as functionality goes, the embedded zookeeper will do fine for non-HA
deployments, but it does mean there will be differences between your
production and non-HA environments in *doing* the deployment, and in how
Solr is configured/started.  If that's acceptable to you, and you do not
need advanced ZK features, then the embedded ZK would be good enough for
non-HA environments.

I personally would still use standalone ZK even for a dev environment,
just to reduce the number of things that are different from production.

Thanks,
Shawn
