Thanks for all this information. It clears up a lot of confusion surrounding the CDCR feature. That said, if CDCR is this fragile in SolrCloud and not worth pursuing much, would it make sense to add a warning about its possible shortcomings to the documentation?
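For anyone reading this thread later, Erick's alternative (3) quoted below can be made a little more concrete. What follows is only a rough sketch, not a tested setup: it uses Python purely to assemble the nested streaming expression (a topic stream wrapped in an update decorator, wrapped in a daemon decorator). The collection names, the daemon and topic ids, the ZooKeeper address, and the helper function names are all hypothetical, and passing zkHost to update to reach the other DC is an assumption that should be checked against the Solr docs for your version.

```python
# Sketch: build the daemon(update(topic(...))) streaming expression as a string.
# All names below (collections, ids, zk address) are made up for illustration.

def topic(checkpoint_coll, source_coll, topic_id, fields, query="*:*"):
    """Topic stream: returns docs changed since the last stored checkpoint."""
    fl = ",".join(fields)
    return (f'topic({checkpoint_coll}, {source_coll}, q="{query}", '
            f'fl="{fl}", id="{topic_id}", initialCheckpoint=0)')

def update(dest_coll, inner, batch_size=500, zk_host=None):
    """Update decorator: indexes the tuples from `inner` into dest_coll.
    zkHost is assumed here as the way to point at the remote cluster."""
    zk = f'zkHost="{zk_host}", ' if zk_host else ""
    return f'update({dest_coll}, batchSize={batch_size}, {zk}{inner})'

def daemon(daemon_id, inner, run_interval_ms=60000):
    """Daemon decorator: reruns `inner` in the background at an interval."""
    return f'daemon(id="{daemon_id}", runInterval="{run_interval_ms}", {inner})'

expr = daemon(
    "dc1-to-dc2",
    update(
        "products_dc2",
        topic("checkpoints", "products", "dc1-topic", ["id", "title"]),
        zk_host="zk-dc2:2181",
    ),
)
print(expr)
```

The printed expression would then be sent as the expr parameter to the /stream handler of a collection in DC1. As Erick notes below, this approach does not propagate deletes, so it would cover adds and updates only.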

On Thu, May 28, 2020 at 9:02 AM Jan Høydahl <jan....@cominvent.com> wrote:

> I had a client who asked a lot about CDCR a few years ago, but I kept
> recommending against it and recommended them to go for Erick’s
> alternative (2), since they anyway needed to replicate their Oracle DBs
> in each DC as well. Much cleaner design to let each cluster have a local
> datasource and always stay in sync with local DB than to replicate both
> DB and index.
>
> There are of course use cases where you want to sync a read-only copy of
> indices to multiple DCs. I hope we’ll see a 3rd party tool for that some
> day, something that can sit outside your Solr clusters, monitor ZK of
> each cluster, and do some magic :)
>
> Jan
>
> > On 28 May 2020 at 01:17, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> > The biggest issue with CDCR is it’s rather fragile and requires
> > monitoring, it’s not a “fire and forget” type of functionality. For
> > instance, the use of the tlogs as a queueing mechanism means that if,
> > for any reason, communication between DCs is broken, the tlogs will
> > grow forever until the connection is re-established. Plus the other
> > issues Jason pointed out.
> >
> > So yes, some companies do use CDCR to communicate between separate
> > DCs. But they also put in some “roll your own” type of monitoring to
> > ensure things don’t go haywire.
> >
> > Alternatives:
> > 1> use something that’s built from the ground up to provide reliable
> > messaging between DCs. Kafka or similar has been mentioned. Write
> > your updates to the Kafka queue and consume them in both DCs.
> > These kinds of solutions have a lot more robustness.
> >
> > 2> reproduce your system-of-record rather than Solr in the DCs and
> > treat the DCs as separate installations. If you adopt this approach,
> > some of the streaming capabilities can be used to monitor that they
> > stay in sync. For instance, have a background or periodic task (one
> > that’ll take a while for a complete run) wrap two “search” streams in
> > a “unique” decorator; anything except an empty result identifies docs
> > not on both DCs.
> >
> > 3> Oh Dear. This one is “interesting”. Wrap a “topic” stream on DC1 in
> > an update decorator for DC2 and wrap both of those in a daemon
> > decorator. That’s gobbledygook, and you’ll have to dig through the
> > docs a bit for that to make sense. Essentially the topic stream is one
> > of the very few streams that does not (IIRC) require all values in the
> > fl list be docValues. It fires the first time and establishes a
> > checkpoint, finding all docs up to that point. Thereafter, it’ll get
> > docs that have changed since the last time it ran. It uses a tiny
> > collection for record keeping. Each time the topic stream finds new
> > docs, it passes them to the update stream which sends them to another
> > DC. Wrapping the whole thing in a daemon decorator means it
> > periodically runs in the background. The one shortcoming is that this
> > approach doesn’t propagate deletes. That’s enough of that until you
> > tell us whether it sounds worth pursuing ;)
> >
> > So overall, you _can_ use CDCR to connect remote DCs, but it takes
> > time and energy to make it robust. Its advantage is that it’s entirely
> > contained within Solr. But it’s not getting much attention lately,
> > meaning nobody has decided the functionality is important enough to
> > them to donate the time/resources to make it more robust. Were someone
> > to take an active interest in it, likely it could be kept around as a
> > plugin that core Solr is not responsible for.
> >
> > Best,
> > Erick
> >
> >> On May 27, 2020, at 4:43 PM, gnandre <arnoldbron...@gmail.com> wrote:
> >>
> >> Thanks, Jason. This is very helpful.
> >>
> >> I should clarify though that I am not using CDCR currently with my
> >> existing master-slave architecture. What I meant to say earlier was
> >> that we will be relying heavily on the CDCR feature if we migrate
> >> from solr master-slave architecture to solrcloud architecture. Are
> >> there any alternatives to CDCR? AFAIK, if you want to replicate
> >> between different data centers then CDCR is the only option. Also,
> >> when you say a lot of customers are using SolrCloud successfully, how
> >> are they working around the CDCR situation? Do they not have any data
> >> center use cases? Is there some list maintained somewhere where one
> >> can find which companies are using SolrCloud successfully?
> >>
> >> On Wed, May 27, 2020 at 9:27 AM Jason Gerlowski <gerlowsk...@gmail.com>
> >> wrote:
> >>
> >>> Hi Arnold,
> >>>
> >>> From what I saw in the community, CDCR saw an initial burst of
> >>> development around when it was contributed, but hasn't seen much
> >>> attention or improvement since. So while it's been around for a few
> >>> years, I'm not sure it's improved much in terms of stability or
> >>> compatibility with other Solr features.
> >>>
> >>> Some of the bigger ticket issues still open around CDCR:
> >>> - SOLR-11959 no support for basic-auth
> >>> - SOLR-12842 infinite retry of failed update-requests (leads to
> >>> sync/recovery problems)
> >>> - SOLR-12057 no real support for NRT/TLOG/PULL replicas
> >>> - SOLR-10679 no support for collection aliases
> >>>
> >>> These are in addition to other more architectural issues: CDCR can be
> >>> a bottleneck on clusters with high ingestion rates, CDCR uses
> >>> full-index-replication more than traditional indexing setups, which
> >>> can cause issues with modern index sizes, etc.
> >>>
> >>> So, unfortunately, no real good news in terms of CDCR maturing much in
> >>> recent releases. Joel Bernstein filed a JIRA recently suggesting its
> >>> removal entirely actually. Though I don't think it's gone anywhere.
> >>>
> >>> That said, I gather from what you said that you're already using CDCR
> >>> successfully with Master-Slave. If none of these pitfalls are biting
> >>> you in your current Master-Slave setup, you might not be bothered by
> >>> them any more in SolrCloud. Most of the problems with CDCR are
> >>> applicable in master-slave as well as SolrCloud. I wouldn't recommend
> >>> CDCR if you were starting from scratch, and I still recommend you
> >>> consider other options. But since you're already using it with some
> >>> success, it might be an orthogonal concern to your potential migration
> >>> to SolrCloud.
> >>>
> >>> Best of luck deciding!
> >>>
> >>> Jason
> >>>
> >>> On Fri, May 22, 2020 at 7:06 PM gnandre <arnoldbron...@gmail.com>
> >>> wrote:
> >>>>
> >>>> Thanks for this reply, Jason.
> >>>>
> >>>> I am mostly worried about the CDCR feature. I am relying heavily on
> >>>> it. Although, I am planning to use Solr 8.3. It has been a long time
> >>>> since CDCR was first introduced. I wonder what the state of CDCR is
> >>>> in 8.3. Is it stable now?
> >>>>
> >>>> On Wed, Jan 22, 2020, 8:01 AM Jason Gerlowski <gerlowsk...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi Arnold,
> >>>>>
> >>>>> The stability and complexity issues Mark highlighted in his post
> >>>>> aren't just imagined - there are real, sometimes serious, bugs in
> >>>>> SolrCloud features. But at the same time there are many many stable
> >>>>> deployments out there where SolrCloud is a real success story for
> >>>>> users. Small example, I work at a company (Lucidworks) where our
> >>>>> main product (Fusion) is built heavily on top of SolrCloud and we
> >>>>> see it deployed successfully every day.
> >>>>>
> >>>>> In no way am I trying to minimize Mark's concerns (or David's).
> >>>>> There are stability bugs. But the extent to which those need affect
> >>>>> you depends a lot on what your deployment looks like. How many
> >>>>> nodes? How many collections? How tightly are you trying to squeeze
> >>>>> your hardware? Is your network flaky? Are you looking to use any of
> >>>>> SolrCloud's newer, less stable features like CDCR, etc.?
> >>>>>
> >>>>> Is SolrCloud better for you than Master/Slave? It depends on what
> >>>>> you're hoping to gain by a move to SolrCloud, and on your answers to
> >>>>> some of the questions above. I would be leery of following any
> >>>>> recommendations that are made without regard for your reason for
> >>>>> switching or your deployment details. Those things are always the
> >>>>> biggest driver in terms of success.
> >>>>>
> >>>>> Good luck making your decision!
> >>>>>
> >>>>> Best,
> >>>>>
> >>>>> Jason