Hi Erick, what are your recommendations for a SolrCloud DR strategy?

Thanks,
Raji

On Sun, Mar 29, 2020 at 6:25 PM Erick Erickson <erickerick...@gmail.com> wrote:

> I don’t recommend CDCR at this point, I think there are better approaches.
>
> The root problem is that CDCR uses tlog files as a queueing mechanism.
> If the connection between the DCs is broken for any reason, the tlogs grow
> without limit. This could probably be fixed, but a better alternative is to
> use something designed to ensure messages (updates) are delivered to
> separate DCs rather than try to have CDCR re-invent that wheel.
>
> Best,
> Erick
>
> > On Mar 29, 2020, at 6:47 PM, S G <sg.online.em...@gmail.com> wrote:
> >
> > Is CDCR even recommended to be used in production?
> > Or was it abandoned before it could become production ready?
> >
> > Thanks
> > SG
> >
> > On Sun, Mar 29, 2020 at 5:18 AM Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> What that error usually means is that there are a zillion threads running.
> >>
> >> Try taking a thread dump. It’s _probable_ that it’s CDCR, but
> >> take a look at the thread dump to see if you have lots of
> >> threads that are running. And by “lots” here, I mean 100s of threads
> >> that reference the same component, in this case that have cdcr in
> >> the stack trace.
> >>
> >> CDCR is not getting active work at this point, you might want to
> >> consider another replication strategy if you’re not willing to fix
> >> the code.
> >>
> >> Best,
> >> Erick
> >>
> >>> On Mar 29, 2020, at 4:17 AM, Raji N <rajis...@gmail.com> wrote:
> >>>
> >>> Hi All,
> >>>
> >>> We are running SolrCloud 7.6 (with the patch
> >>> https://issues.apache.org/jira/secure/attachment/12969150/SOLR-11724.patch)
> >>> on production on 7 hosts in containers. The container memory is 48GB, heap
> >>> is 24GB.
> >>>
> >>> ulimit -v
> >>> unlimited
> >>>
> >>> ulimit -m
> >>> unlimited
> >>>
> >>> We don't have any custom code in Solr. We have set up bidirectional CDCR
> >>> between the primary and secondary datacenters. Our secondary DC is very
> >>> unstable and many times many instances are down.
> >>>
> >>> We get the exception below quite often. Is this because the CDCR
> >>> connection is broken?
> >>>
> >>> WARN (cdcr-update-log-synchronizer-80-thread-1) [ ] o.a.s.h.CdcrUpdateLogSynchronizer Caught unexpected exception
> >>> java.lang.OutOfMemoryError: unable to create new native thread
> >>>   at java.lang.Thread.start0(Native Method) ~[?:1.8.0_211]
> >>>   at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_211]
> >>>   at org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96) ~[httpclient-4.5.3.jar:4.5.3]
> >>>   at org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219) ~[httpclient-4.5.3.jar:4.5.3]
> >>>   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >>>   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >>>   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:268) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >>>   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:255) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >>>   at org.apache.solr.client.solrj.impl.HttpSolrClient.<init>(HttpSolrClient.java:200) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >>>   at org.apache.solr.client.solrj.impl.HttpSolrClient$Builder.build(HttpSolrClient.java:957) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >>>   at org.apache.solr.handler.CdcrUpdateLogSynchronizer$UpdateLogSynchronisation.run(CdcrUpdateLogSynchronizer.java:139) [solr-core-7.6.0.jar:7.6.0-SNAPSHOT 34d82ed033cccd8120431b73e93554b85b24a278 - i843100 - 2019-09-30 14:02:46]
> >>>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_211]
> >>>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_211]
> >>>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_211]
> >>>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_211]
> >>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_211]
> >>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_211]
> >>>   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]
> >>>
> >>> Thanks,
> >>> Raji
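
The thread-dump check suggested in the thread (count how many threads have cdcr in their stack) could be sketched roughly as below. This is only an illustration, not anything shipped with Solr: it assumes the dump is jstack-style text in which each thread block starts with a quoted thread name at the beginning of a line, and the function names `split_threads` and `summarize` are made up for this example.

```python
import re
from collections import Counter

# Assumption: a thread block in a jstack-style dump starts with a quoted
# thread name at column 0, e.g. "cdcr-update-log-synchronizer-80-thread-1" #101 ...
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')


def split_threads(dump_text):
    """Yield (thread_name, block_text) for each thread block in the dump."""
    name, lines = None, []
    for line in dump_text.splitlines():
        match = THREAD_HEADER.match(line)
        if match:
            if name is not None:
                yield name, "\n".join(lines)
            name, lines = match.group("name"), [line]
        elif name is not None:
            lines.append(line)
    if name is not None:
        yield name, "\n".join(lines)


def summarize(dump_text, needle="cdcr"):
    """Return (total thread count, Counter of threads whose block mentions needle)."""
    total = 0
    matching = Counter()
    for name, block in split_threads(dump_text):
        total += 1
        if needle.lower() in block.lower():
            # Replace digit runs with "N" so threads from the same pool group together.
            matching[re.sub(r"\d+", "N", name)] += 1
    return total, matching
```

To use it, one would capture a dump with something like `jstack <solr-pid> > dump.txt`, read the file into a string, and pass it to `summarize`. Hundreds of entries under a single cdcr-related key would match the pattern Erick describes; a handful would point elsewhere.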