Anshum: I’ve been recommending something like this to clients for a while. Do you think a call to the community for people who’ve already put something in the middle might net us some good info on the lurking gremlins? Mind you, “recommend” hasn’t actually involved me _doing_ it, so I don’t have any actual experience there…
But yeah, absolutely +1 for something making this easier for clients...

Erick

> On Dec 5, 2020, at 11:43 AM, Ilan Ginzburg <[email protected]> wrote:
>
> That's an interesting initiative Anshum!
>
> I can see at least two different approaches here, your mention of SolrJ seems
> to hint at the first one:
> 1. Get the data as it comes from the client and fork it to local and remote
> data centers,
> 2. Create (an asynchronous) stream replicating local data center data to
> remote.
>
> Option 1 is strongly consistent but adds latency and potentially blocking on
> the critical path.
> Option 2 could look like remote PULL replicas, might have lower impact on the
> local data center but has to deal with the remote data center always being
> somewhat behind. If the client application can handle that, the performance
> and efficiency gain (as well as simpler implementation? It doesn't require
> another persistence layer) might be worth it...
>
> Ilan
>
> On Fri, Dec 4, 2020 at 5:24 PM Anshum Gupta <[email protected]> wrote:
> Hi everyone,
>
> Large scale Solr installations often require cross data-center replication in
> order to achieve data replication for both, access latency reasons as well as
> disaster recovery. In the past users have either designed their own solutions
> to deal with this or have tried to rely on the now-deprecated CDCR.
>
> It would be really good to have support for cross data-center replication
> within Solr, that is offered and supported by the community. This would allow
> the effort around this shared problem to converge.
>
> I’d like to propose a new solution based on my experiences at my day job. The
> key points about this approach:
> • Uses an external, configurable, messaging system in the middle for
> actual replication/mirroring.
> • We offer an abstraction and some default implementations based on
> what we can support and what users really want. An example here would be
> Kafka.
> • This would be a separate repository allowing it to have its own
> release cadence. We shouldn’t have to release this with every Solr release as
> the overlap is just limited to SolrJ interactions.
>
> I’ll share a more detailed and evolving document soon with the design for
> everyone else to contribute to but wanted to share this as I’m starting to
> work on this and wanted to avoid parallel efforts towards the same end-goal.
>
> --
> Anshum Gupta
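
The "messaging system in the middle" idea from the proposal above can be sketched roughly as follows. This is only a minimal, hypothetical illustration in Python, not the design the thread refers to: `MirrorQueue`, `InMemoryQueue`, `FakeCluster`, `index_and_mirror` and `drain` are made-up names rather than Solr, SolrJ or Kafka APIs, and the in-memory queue merely stands in for a real backend such as Kafka.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Deque, Dict, Optional


class MirrorQueue(ABC):
    """Hypothetical abstraction over the messaging system in the middle."""

    @abstractmethod
    def publish(self, update: dict) -> None:
        """Hand an update to the messaging system."""

    @abstractmethod
    def poll(self) -> Optional[dict]:
        """Return the next pending update, or None if there is none."""


class InMemoryQueue(MirrorQueue):
    """Assumed default implementation: a single-process FIFO queue."""

    def __init__(self) -> None:
        self._items: Deque[dict] = deque()

    def publish(self, update: dict) -> None:
        self._items.append(update)

    def poll(self) -> Optional[dict]:
        return self._items.popleft() if self._items else None


class FakeCluster:
    """Toy stand-in for a Solr collection: maps document id to document."""

    def __init__(self) -> None:
        self.docs: Dict[str, dict] = {}

    def add(self, doc: dict) -> None:
        self.docs[doc["id"]] = doc


def index_and_mirror(local: FakeCluster, queue: MirrorQueue, doc: dict) -> None:
    """Index into the local data center, then queue the update for mirroring."""
    local.add(doc)
    queue.publish(doc)


def drain(queue: MirrorQueue, remote: FakeCluster) -> int:
    """Apply all pending updates to the remote data center; return the count."""
    applied = 0
    while (update := queue.poll()) is not None:
        remote.add(update)
        applied += 1
    return applied
```

In this sketch the remote side stays behind the local one until `drain` runs, which is the lag Ilan describes for the asynchronous option. Swapping `InMemoryQueue` for another `MirrorQueue` implementation would not change the calling code.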
