Replying to an old thread. I am currently trying to test client-side failover programmatically. Are there any working examples or best practices for this?
The idea is to fail over to a secondary cluster if the primary fails. I will try to share what I have been able to do so far, but if anyone has a working example, that would be great.

With best regards,
Ashish

On Tue, Apr 16, 2019, 9:37 PM Anthony Baker <aba...@pivotal.io> wrote:

> The pattern I’ve seen used looks like this:
>
> User application (e.g. browser) >> Global load balancer >> Service
> instances (e.g. tomcat) >> Geode cluster
>
> If you have the Geode clusters connected via WAN, you can redirect traffic
> to different data centers by tweaking the LB config.
>
> Anthony
>
> On Apr 16, 2019, at 2:58 AM, aashish choudhary <
> aashish.choudha...@gmail.com> wrote:
>
> So with WAN we will be active/active all time. I agree hardest is to
> figure out when data centers are actually down. We are evaluating multiple
> approaches as of now.
>
> On that note would you recommend(possibility any since connection is
> mostly tcp/ip) using some load balancer NGINX or something to handle data
> center failure.
>
> With best regards,
> Ashish
>
> On Tue, Apr 16, 2019, 12:54 AM Michael Stolz <mst...@pivotal.io> wrote:
>
>> We have come across these kinds of use-cases.
>> The hardest part is figuring out that one of the data centers is ACTUALLY
>> down.
>>
>> If you can work out a way to be active/active at all times and guard
>> against update collisions by using data structures that protect themselves
>> (e.g. CRDTs) that would make the whole thing a lot easier.
>> --
>> Mike Stolz
>> Principal Engineer, GemFire Product Lead
>> Mobile: +1-631-835-4771
>>
>> On Mon, Apr 15, 2019 at 1:19 PM aashish choudhary <
>> aashish.choudha...@gmail.com> wrote:
>>
>>> Thanks Mike. Our use case is heavily reliant on Geode(no fallback to
>>> database) and business expectation is that there will be no downtime to
>>> consumer application because of complete failure on one data center. Which
>>>
>>> Have you came across such cases with Geode/Gemfire?
>>>
>>> Regarding catching those exceptions and making a switch I agree with you
>>> that it would be tricky to make switch as you explained.
>>>
>>> Even with rolling restart there will be a downtime and some manual steps
>>> will be required to accomplish that.
>>>
>>> With best regards,
>>> Ashish
>>>
>>> On Thu, Apr 11, 2019, 10:18 PM Michael Stolz <mst...@pivotal.io> wrote:
>>>
>>>> Yes you can catch the exceptions for no locators available and no
>>>> servers available.
>>>> You will probably want to wait for a period of time after first seeing
>>>> this, because the cluster might be restarting and will be back in just a
>>>> minute or so.
>>>>
>>>> The switch-over can be tricky without just restarting your client.
>>>>
>>>> All saved references to everything having to do with ClientCache,
>>>> Cache, Region, or anything else that communicates need to be forgotten and
>>>> re-established.
>>>> This can be particularly challenging if you are using a framework that
>>>> might remember some of this stuff on your behalf.
>>>> I have usually recommended rolling restart of the clients with the new
>>>> locator addresses because it is sure to work and not have any hidden issues
>>>> with calls in progress or subscriptions or anything like that.
>>>>
>>>> --
>>>> Mike Stolz
>>>> Principal Engineer, GemFire Product Lead
>>>> Mobile: +1-631-835-4771
>>>>
>>>> On Thu, Apr 11, 2019 at 9:31 AM aashish choudhary <
>>>> aashish.choudha...@gmail.com> wrote:
>>>>
>>>>> Thanks Mike.
>>>>>
>>>>> Yes we are using wan replication. We want the switch to be an
>>>>> automatic step. As soon as prod cluster fails we need to switch to cob
>>>>> without any restart of the client application.
>>>>>
>>>>> One way we are thinking of is probably catching those locator not
>>>>> available sort of exception and then make a switch. Any thoughts?
>>>>>
>>>>> With best regards,
>>>>> Ashish
>>>>>
>>>>> On Thu, Apr 11, 2019, 1:42 AM Michael Stolz <mst...@pivotal.io> wrote:
>>>>>
>>>>>> If the data centers are far apart you will want to use the
>>>>>> bi-directional GemFire WAN Gateway to replicate between clusters.
>>>>>>
>>>>>> The trickiest part is figuring out when to switch. If you already
>>>>>> have a mechanism for that then that's great.
>>>>>>
>>>>>> Once you know for sure you want to switch, the easiest way is to
>>>>>> install a gemfire.properties file on the client machines that points to
>>>>>> the locators in the other data center and restart the clients.
>>>>>>
>>>>>> There is a programmatic way to do it but is a lot more code and work
>>>>>> than this way.
>>>>>>
>>>>>> Feel free to ask any additional questions here.
>>>>>>
>>>>>> --
>>>>>> Mike Stolz
>>>>>> Principal Engineer, GemFire Product Lead
>>>>>> Mobile: +1-631-835-4771
>>>>>>
>>>>>> On Wed, Apr 10, 2019, 2:01 PM aashish choudhary <
>>>>>> aashish.choudha...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> We have a scenario where in we need to switch over to a different
>>>>>>> data center automatically when any failure occurs in the existing
>>>>>>> cluster.
>>>>>>>
>>>>>>> Any recommendations?
>>>>>>>
>>>>>>> With best regards,
>>>>>>> Ashish
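
For reference, the programmatic switch-over described in the quoted thread (catch the "no locators/servers available" errors, wait a while in case the cluster is only restarting, then forget every reference and reconnect) could be sketched roughly as below. This is only a sketch under assumptions, not a tested Geode setup: `ClusterConnection`, `ClusterUnavailableException`, and `FailoverClient` are hypothetical names introduced here so the logic runs without a cluster. In a real client the connection would presumably wrap a `ClientCache` and its regions, and the exception would stand in for whatever Geode throws when no locators or servers are reachable.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical stand-in for the "no locators/servers available" errors. */
class ClusterUnavailableException extends RuntimeException {
    ClusterUnavailableException(String message) { super(message); }
}

/** Hypothetical handle for everything tied to one cluster (cache, pool, regions). */
interface ClusterConnection extends AutoCloseable {
    String get(String key);
    @Override void close();
}

/** Moves to the secondary cluster once the primary has been failing for a grace period. */
class FailoverClient {
    private final Supplier<ClusterConnection> secondaryFactory;
    private final long graceMillis;
    private final LongSupplier clock;
    private ClusterConnection current;
    private boolean onSecondary = false;
    private long firstFailureAt = -1; // -1 means no outage is being tracked

    FailoverClient(Supplier<ClusterConnection> primaryFactory,
                   Supplier<ClusterConnection> secondaryFactory,
                   long graceMillis,
                   LongSupplier clock) {
        this.secondaryFactory = secondaryFactory;
        this.graceMillis = graceMillis;
        this.clock = clock;
        this.current = primaryFactory.get();
    }

    synchronized String get(String key) {
        try {
            String value = current.get(key);
            firstFailureAt = -1; // cluster answered, so clear any tracked outage
            return value;
        } catch (ClusterUnavailableException e) {
            if (onSecondary) {
                throw e; // already failed over; nothing else to try
            }
            long now = clock.getAsLong();
            if (firstFailureAt < 0) {
                firstFailureAt = now;
            }
            if (now - firstFailureAt < graceMillis) {
                throw e; // cluster may only be restarting; do not switch yet
            }
            // Grace period is over: drop the old connection and rebuild from scratch.
            try {
                current.close();
            } catch (RuntimeException ignored) {
                // the old cluster is unreachable anyway
            }
            current = secondaryFactory.get();
            onSecondary = true;
            firstFailureAt = -1;
            return current.get(key);
        }
    }

    synchronized boolean isOnSecondary() { return onSecondary; }
}
```

The important design point is that callers only ever hold the `FailoverClient`, never a cache or region directly, so there is a single place where stale references get dropped. It does not address calls in progress or subscriptions, which is where the hidden issues mentioned in the thread are likely to appear.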