Understandable.  Right now we have a large setup of Solr 5.x servers that
has been doing great for years.  But the time to upgrade has come; there
are some things we want that are not available in the 5.x branch.  I
really like legacy (master/slave) replication, for the reasons you stated,
but also because the cloud setup seems perfect only if you have a handful
of cheap machines around.  Our production setup has one indexer with a
slave that polls every 5 minutes, and on releases we have three search
servers that poll manually (the slave side of that is sketched below).
Thing is, these machines each have over 32 cores, over 200 GB of RAM, and
2 TB SSDs; they were not cheap and are pretty fast with standalone Solr.
Also, the complexity of adding another three or more machines just to do
nothing but ZooKeeper was getting out of hand.  If it's not broken, I'm
not about to fix it.
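
For reference, the slave side of that 5 minute polling is just a few lines
of solrconfig.xml, roughly like this (the master host and core name here
are placeholders, not our real ones):

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <!-- placeholder master host and core name -->
      <str name="masterUrl">http://indexer01:8983/solr/mycore/replication</str>
      <!-- poll the master every 5 minutes (HH:MM:SS) -->
      <str name="pollInterval">00:05:00</str>
    </lst>
  </requestHandler>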

In any case, I'm glad to hear legacy replication will stay.
Thanks,
-Dave

On Fri, Dec 15, 2017 at 1:15 PM, Walter Underwood <wun...@wunderwood.org>
wrote:

> I love legacy replication. It is simple and bulletproof. Loose coupling
> for the win! We only run Solr Cloud when we need sharding or NRT search.
> Loose coupling is a very, very good thing in distributed systems.
>
> Adding a replica (new slave) is trivial. Clone an existing one. This makes
> horizontal scaling so easy. We still haven’t written the procedure and
> scripts for scaling our Solr Cloud cluster. Last time, it was 100% manual
> through the admin UI.
>
> Setting up a Zookeeper ensemble isn’t as easy as it should be. We tried to
> set up a five node ensemble with ZK 3.4.6 and finally gave up after two
> weeks because it was blocking the release. We are using the three node
> 3.4.5 ensemble that had been set up for something else a couple of years
> earlier. I’ve had root on Unix since 1981 and have been running TCP/IP
> since 1983, so I should have been able to figure this out.
>
> We’ve had some serious prod problems with the Solr Cloud cluster, like
> cores stuck in a permanent recovery loop. I finally manually deleted that
> core and created a new one. Ugly.
>
> Even starting Solr Cloud processes is confusing. It took a while to figure
> out they were all joining as the same host (no, I don’t know why), so now
> we start them as: solr start -cloud -h `hostname`
>
> Keeping configs under source control and deploying them isn’t easy. I’m
> not going to install Solr on the Jenkins executor just so it can deploy;
> that is weird and kind of a chicken and egg thing. I ended up writing a
> Python program to get the ZK address from the cluster, use kazoo to load
> directly to ZK, then tell the cluster to reload. Both with that and with
> the provided ZK tools I ran into so much undocumented stuff. What is
> linking? How do the file config directories map to the ZK config
> directories? And so on.
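>
> The core of it is just kazoo plus a collection RELOAD, roughly like this
> (the hosts, names, and paths are placeholders, and this sketch hard-codes
> the ZK address instead of asking the cluster for it):
>
>     # upload_config.py -- rough sketch, not production code
>     import os
>     import requests
>     from kazoo.client import KazooClient
>
>     ZK_HOSTS = "zk1:2181,zk2:2181,zk3:2181"  # placeholder ensemble
>     CONF_NAME = "myconf"                     # config set name under /configs
>     LOCAL_DIR = "solr/conf"                  # local config directory
>     COLLECTION = "mycollection"              # collection to reload
>     SOLR_URL = "http://solr1:8983/solr"      # any node in the cluster
>
>     zk = KazooClient(hosts=ZK_HOSTS)
>     zk.start()
>     try:
>         for root, _dirs, files in os.walk(LOCAL_DIR):
>             for fname in files:
>                 local_path = os.path.join(root, fname)
>                 rel = os.path.relpath(local_path, LOCAL_DIR).replace(os.sep, "/")
>                 znode = "/configs/%s/%s" % (CONF_NAME, rel)
>                 with open(local_path, "rb") as f:
>                     data = f.read()
>                 # create the znode if it is new, otherwise overwrite it
>                 if zk.exists(znode):
>                     zk.set(znode, data)
>                 else:
>                     zk.create(znode, data, makepath=True)
>     finally:
>         zk.stop()
>
>     # tell the cluster to pick up the new config
>     requests.get(SOLR_URL + "/admin/collections",
>                  params={"action": "RELOAD", "name": COLLECTION}).raise_for_status()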
>
> The lack of a thread pool for requests is a very serious problem. If our
> 6.5.1 cluster gets overloaded, it creates 4000 threads, runs out of memory
> and fails. That is just wrong. With earlier versions of Solr, it would get
> slower and slower, but recover gracefully.
>
> Converting a slave into a master is easy. We use this in the config file:
>
>    <lst name="master">
>       <str name="enable">${enable.master:false}</str>
>   …
>   <lst name="slave">
>      <str name="enable">${textbooks.enable.slave:false}</str>
>
> And this at startup (slave config shown): -Denable.master=false
> -Denable.slave=true
>
> Change the properties and restart.
>
> Our 6.5.1 cluster is faster than the non-sharded 4.10.4 master/slave
> cluster, but I’m not happy with the stability in prod. We’ve had more
> search outages in the past six months than we had in the previous four
> years. I’ve had Solr in prod since version 1.2, and this is the first time
> it has really embarrassed me.
>
> There are good things. Search is faster, and we’re handling double the
> query volume with 3X the docs.
>
> Sorry for the rant, but it has not been a good fall semester for our
> students (customers).
>
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Dec 15, 2017, at 9:46 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> > There's pretty much zero chance that it'll go away; too much current
> > and ongoing functionality depends on it.
> >
> > 1> old-style replication has always been used for "full sync" in
> > SolrCloud when peer sync can't be done.
> >
> > 2> The new TLOG and PULL replica types are a marriage of old-style
> > master/slave and SolrCloud. In particular a PULL replica is
> > essentially an old-style slave. A TLOG replica is an old-style slave
> > that also maintains a transaction log so it can take over leadership
> > if necessary.
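> >
> > In 7.x you can ask for these replica types when you create the
> > collection, something like this (the name and counts are just
> > placeholders):
> >
> >     curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=test&numShards=1&tlogReplicas=1&pullReplicas=2"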
> >
> > Best,
> > Erick
> >
> > On Fri, Dec 15, 2017 at 8:56 AM, David Hastings
> > <hastings.recurs...@gmail.com> wrote:
> >> So I don't step on the other thread: I want to be assured whether or not
> >> legacy master/slave/repeater replication will continue to be supported in
> >> future Solr versions.  Our infrastructure is set up for this and all the
> >> HA redundancies that SolrCloud provides; we have already spent a lot of
> >> time and resources on very expensive servers to handle Solr in standalone
> >> mode.
> >>
> >> thanks.
> >> -David
>
>
