Re: SolrCloud upgrade concern

Jan Høydahl Thu, 28 May 2020 06:02:56 -0700

I had a client who asked a lot about CDCR a few years ago, but I kept
recommending against it and recommended that they go for Erick’s alternative (2),
since they needed to replicate their Oracle DBs in each DC anyway. It is a much
cleaner design to let each cluster have a local data source and always stay in
sync with the local DB than to replicate both the DB and the index.
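
A minimal sketch of the sync check Erick describes under alternative (2) in the
quoted mail below, done client-side rather than with streaming expressions. It
assumes each DC can hand back its document ids as unique strings sorted the same
way in both DCs; the function name and the labels are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple


def diff_sorted_ids(
    dc1_ids: Iterable[str], dc2_ids: Iterable[str]
) -> Iterator[Tuple[str, str]]:
    """Yield (doc_id, where) for ids present in only one of two sorted streams.

    Both inputs must be sorted ascending with no duplicates. The walk is a
    single merge pass, so neither id list has to fit in memory.
    """
    it1, it2 = iter(dc1_ids), iter(dc2_ids)
    a: Optional[str] = next(it1, None)
    b: Optional[str] = next(it2, None)
    while a is not None or b is not None:
        if b is None or (a is not None and a < b):
            # id exists in DC1 but DC2 has already moved past it (or ended)
            yield a, "only_in_dc1"
            a = next(it1, None)
        elif a is None or b < a:
            # id exists in DC2 but DC1 has already moved past it (or ended)
            yield b, "only_in_dc2"
            b = next(it2, None)
        else:
            # same id on both sides: in sync, advance both
            a, b = next(it1, None), next(it2, None)


if __name__ == "__main__":
    for doc_id, where in diff_sorted_ids(["a", "b", "d"], ["b", "c", "d", "e"]):
        print(doc_id, where)
```

An empty result means both DCs hold the same set of ids. The ids could, for
instance, be read from each cluster with a query sorted on the id field; that
part is left out here because it depends on the schema and Solr version.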


There are of course use cases where you want to sync a read-only copy of indices
to multiple DCs. I hope we’ll see a 3rd party tool for that some day, something
that can sit outside your Solr clusters, monitor ZK of each cluster, and do some
magic :)

Jan

> 28. mai 2020 kl. 01:17 skrev Erick Erickson <erickerick...@gmail.com>:
> 
> The biggest issue with CDCR is that it’s rather fragile and requires
> monitoring; it’s not a “fire and forget” type of functionality. For instance,
> the use of the tlogs as a queueing mechanism means that if, for any reason,
> communication between DCs is broken, the tlogs will keep growing until the
> connection is re-established. Plus the other issues Jason pointed out.
> 
> So yes, some companies do use CDCR to communicate between separate
> DCs. But they also put in some “roll your own” type of monitoring to ensure
> things don’t go haywire.
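
A minimal sketch of what such "roll your own" monitoring might look like for the
tlog growth problem. It assumes the CDCR request handler is reachable at
`<base_url>/<collection>/cdcr` and that its `QUEUES` action reports a
`tlogTotalSize` value in bytes; verify both against the reference guide for your
Solr version. The function names and the threshold are hypothetical.

```python
import json
import urllib.request
from typing import Optional


def fetch_cdcr_queues(base_url: str, collection: str, timeout: float = 10.0) -> dict:
    """Fetch the CDCR QUEUES report for one collection (URL layout is an assumption)."""
    url = f"{base_url}/{collection}/cdcr?action=QUEUES&wt=json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def tlog_alert(queues_response: dict, max_tlog_bytes: int) -> Optional[str]:
    """Return a warning message if the tlogs look too large, else None."""
    size = queues_response.get("tlogTotalSize")
    if size is None:
        # Field missing: treat as a problem rather than silently passing.
        return "no tlogTotalSize in response; check the CDCR handler"
    if int(size) > max_tlog_bytes:
        return f"tlogs at {int(size)} bytes exceed limit of {max_tlog_bytes}"
    return None
```

A cron job or similar could call `fetch_cdcr_queues` on the source cluster and
raise an alert whenever `tlog_alert` returns a message, which would catch a
broken link between DCs before the disk fills up.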
> 
> Alternatives:
> 1> use something that’s built from the ground up to provide reliable 
>     messaging between DCs. Kafka or similar has been mentioned. Write
>     your updates to the Kafka queue and consume them in both DCs.
>     These kinds of solutions have a lot more robustness.
> 
> 2> reproduce your system-of-record rather than Solr in the DCs and
>    treat the DCs as separate installations. If you adopt this approach,
>    some of the streaming capabilities can be used to monitor that they stay
>    in sync. For instance, have a background or periodic task (it’ll take a
>    while for a complete run) that wraps two “search” streams in a “unique”
>    decorator; anything except an empty result identifies docs not on both DCs.
> 
> 3> Oh Dear. This one is “interesting”. Wrap a “topic” stream on DC1 in
>    an update decorator for DC2 and wrap both of those in a daemon decorator.
>    That’s gobbledygook, and you’ll have to dig through the docs a bit for
>    that to make sense. Essentially the topic stream is one of the very few
>    streams that does not (IIRC) require all values in the fl list be docValues.
>    It fires the first time and establishes a checkpoint, finding all docs up
>    to that point. Thereafter, it’ll get docs that have changed since the last
>    time it ran. It uses a tiny collection for record keeping. Each time the
>    topic stream finds new docs, it passes them to the update stream, which
>    sends them to another DC. Wrapping the whole thing in a daemon decorator
>    means it periodically runs in the background. The one shortcoming is that
>    this approach doesn’t propagate deletes. That’s enough of that until you
>    tell us whether it sounds worth pursuing ;)
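
To make alternative 3 a little less gobbledygook, here is a minimal sketch that
assembles the nested daemon/update/topic streaming expression as a string. The
collection names, ids, field list and intervals are hypothetical, and the
`zkHost` parameter on `update` (used to point at the DC2 cluster) is an
assumption to verify against the streaming expression docs for your Solr version.

```python
def build_dc_sync_daemon(
    source_collection: str,
    dest_collection: str,
    dest_zk_host: str,
    checkpoint_collection: str,
    fl: str,
    run_interval_ms: int = 60000,
    batch_size: int = 500,
) -> str:
    """Build a daemon(update(topic(...))) expression that copies new docs DC1 -> DC2."""
    # topic: finds docs added or changed since the last checkpoint and records
    # its progress in the small checkpoint collection.
    topic = (
        f'topic({checkpoint_collection}, {source_collection}, q="*:*", '
        f'fl="{fl}", id="dc1_to_dc2_topic", initialCheckpoint=0)'
    )
    # update: sends the tuples emitted by the topic to the destination collection.
    update = (
        f'update({dest_collection}, zkHost="{dest_zk_host}", '
        f'batchSize={batch_size}, {topic})'
    )
    # daemon: re-runs the wrapped expression in the background every interval.
    return f'daemon(id="dc1_to_dc2_daemon", runInterval="{run_interval_ms}", {update})'


if __name__ == "__main__":
    print(
        build_dc_sync_daemon(
            source_collection="products",
            dest_collection="products",
            dest_zk_host="zk-dc2:2181",
            checkpoint_collection="checkpoints",
            fl="id,title",
        )
    )
```

The resulting string would be sent as the `expr` parameter to the `/stream`
handler of the source collection in DC1. As noted above, deletes are not
propagated by this approach.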
> 
> So overall, you _can_ use CDCR to connect remote DCs, but it takes time and
> energy to make it robust. Its advantage is that it’s entirely contained within
> Solr. But it’s not getting much attention lately, meaning nobody has decided
> the functionality is important enough to them to donate the time/resources to
> make it more robust. Were someone to take an active interest in it, likely it
> could be kept around as a plugin that core Solr is not responsible for.
> 
> Best,
> Erick
> 
>> On May 27, 2020, at 4:43 PM, gnandre <arnoldbron...@gmail.com> wrote:
>> 
>> Thanks, Jason. This is very helpful.
>> 
>> I should clarify though that I am not using CDCR currently with my
>> existing master-slave architecture. What I meant to say earlier was that we
>> will be relying heavily on the CDCR feature if we migrate from the Solr
>> master-slave architecture to the SolrCloud architecture. Are there any
>> alternatives to CDCR? AFAIK, if you want to replicate between different
>> data centers then CDCR is the only option. Also, when you say a lot of
>> customers are using SolrCloud successfully, how are they working around the
>> CDCR situation? Do they not have any data center use cases? Is there some
>> list maintained somewhere where one can find which companies are using
>> SolrCloud successfully?
>> 
>> 
>> 
>> On Wed, May 27, 2020 at 9:27 AM Jason Gerlowski <gerlowsk...@gmail.com>
>> wrote:
>> 
>>> Hi Arnold,
>>> 
>>> From what I saw in the community, CDCR saw an initial burst of
>>> development around when it was contributed, but hasn't seen much
>>> attention or improvement since.  So while it's been around for a few
>>> years, I'm not sure it's improved much in terms of stability or
>>> compatibility with other Solr features.
>>> 
>>> Some of the bigger ticket issues still open around CDCR:
>>> - SOLR-11959 no support for basic-auth
>>> - SOLR-12842 infinite retry of failed update-requests (leads to
>>> sync/recovery problems)
>>> - SOLR-12057 no real support for NRT/TLOG/PULL replicas
>>> - SOLR-10679 no support for collection aliases
>>> 
>>> These are in addition to other more architectural issues: CDCR can be
>>> a bottleneck on clusters with high ingestion rates, CDCR uses
>>> full-index-replication more than traditional indexing setups, which
>>> can cause issues with modern index sizes, etc.
>>> 
>>> So, unfortunately, no real good news in terms of CDCR maturing much in
>>> recent releases.  Joel Bernstein filed a JIRA recently suggesting its
>>> removal entirely actually.  Though I don't think it's gone anywhere.
>>> 
>>> That said, I gather from what you said that you're already using CDCR
>>> successfully with Master-Slave.  If none of these pitfalls are biting
>>> you in your current Master-Slave setup, you might not be bothered by
>>> them any more in SolrCloud.  Most of the problems with CDCR are
>>> applicable in master-slave as well as SolrCloud.  I wouldn't recommend
>>> CDCR if you were starting from scratch, and I still recommend you
>>> consider other options.  But since you're already using it with some
>>> success, it might be an orthogonal concern to your potential migration
>>> to SolrCloud.
>>> 
>>> Best of luck deciding!
>>> 
>>> Jason
>>> 
>>> On Fri, May 22, 2020 at 7:06 PM gnandre <arnoldbron...@gmail.com> wrote:
>>>> 
>>>> Thanks for this reply, Jason.
>>>> 
>>>> I am mostly worried about the CDCR feature. I am relying heavily on it.
>>>> I am planning to use Solr 8.3. It has been a long time since CDCR
>>>> was first introduced. I wonder what the state of CDCR is in 8.3. Is it
>>>> stable now?
>>>> 
>>>> On Wed, Jan 22, 2020, 8:01 AM Jason Gerlowski <gerlowsk...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Hi Arnold,
>>>>> 
>>>>> The stability and complexity issues Mark highlighted in his post
>>>>> aren't just imagined - there are real, sometimes serious, bugs in
>>>>> SolrCloud features.  But at the same time there are many many stable
>>>>> deployments out there where SolrCloud is a real success story for
>>>>> users.  Small example, I work at a company (Lucidworks) where our main
>>>>> product (Fusion) is built heavily on top of SolrCloud and we see it
>>>>> deployed successfully every day.
>>>>> 
>>>>> In no way am I trying to minimize Mark's concerns (or David's).  There
>>>>> are stability bugs.  But the extent to which those need affect you
>>>>> depends a lot on what your deployment looks like.  How many nodes?
>>>>> How many collections?  How tightly are you trying to squeeze your
>>>>> hardware?  Is your network flaky?  Are you looking to use any of
>>>>> SolrCloud's newer, less stable features like CDCR, etc.?
>>>>> 
>>>>> Is SolrCloud better for you than Master/Slave?  It depends on what
>>>>> you're hoping to gain by a move to SolrCloud, and on your answers to
>>>>> some of the questions above.  I would be leery of following any
>>>>> recommendations that are made without regard for your reason for
>>>>> switching or your deployment details.  Those things are always the
>>>>> biggest driver in terms of success.
>>>>> 
>>>>> Good luck making your decision!
>>>>> 
>>>>> Best,
>>>>> 
>>>>> Jason
>>>>> 
>>> 
>
