Register a ServiceCacheListener and, whenever cacheChanged() is called, write the current state to disk. Writing your own cache is not trivial.

-Jordan
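A minimal sketch of what this could look like, assuming a started CuratorFramework, a base path of "/services", a service named "app-node", and a snapshot file location; all of these are placeholders to adapt:

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.state.ConnectionState;
import org.apache.curator.x.discovery.ServiceCache;
import org.apache.curator.x.discovery.ServiceDiscovery;
import org.apache.curator.x.discovery.ServiceDiscoveryBuilder;
import org.apache.curator.x.discovery.ServiceInstance;
import org.apache.curator.x.discovery.details.ServiceCacheListener;

public class PersistentServiceView {

    // Hypothetical snapshot location; use whatever your web tier reads on startup.
    private static final Path SNAPSHOT = Paths.get("/var/cache/app-nodes.txt");

    public static ServiceCache<Void> startCache(CuratorFramework curator) throws Exception {
        ServiceDiscovery<Void> discovery = ServiceDiscoveryBuilder.builder(Void.class)
                .client(curator)
                .basePath("/services")      // assumed base path
                .build();
        discovery.start();

        final ServiceCache<Void> cache = discovery.serviceCacheBuilder()
                .name("app-node")           // assumed service name
                .build();

        // ServiceCacheListener extends ConnectionStateListener, so it gets both callbacks.
        cache.addListener(new ServiceCacheListener() {
            @Override
            public void cacheChanged() {
                writeSnapshot(cache.getInstances());
            }

            @Override
            public void stateChanged(CuratorFramework client, ConnectionState newState) {
                // Nothing to do here: the snapshot on disk is what survives an outage.
            }
        });
        cache.start();
        writeSnapshot(cache.getInstances());   // initial snapshot
        return cache;
    }

    private static void writeSnapshot(List<ServiceInstance<Void>> instances) {
        String view = instances.stream()
                .map(i -> i.getAddress() + ":" + i.getPort())
                .collect(Collectors.joining(System.lineSeparator()));
        try {
            Files.write(SNAPSHOT, view.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            e.printStackTrace();   // keep the previous snapshot rather than fail the listener
        }
    }
}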
> On May 21, 2016, at 3:12 AM, Moshiko Kasirer <[email protected]> wrote:
>
> We know, but ours is in a file, so the next time the app is started and can't
> connect to ZK it has a cluster view taken from that file... Your cache is an
> in-memory cache, AFAIK.
>
> On May 21, 2016, at 05:58, "Jordan Zimmerman" <[email protected]> wrote:
> You don’t need to maintain your own cache. Service Discovery already handles
> that.
>
> -Jordan
>
>> On May 20, 2016, at 5:36 PM, Moshiko Kasirer <[email protected]> wrote:
>>
>> We are using nginx as our web tier, which delegates requests to one of the
>> registered app nodes using consistent hashing. Since we have many web and app
>> nodes, we have to make sure all available app nodes are known to the web tier
>> and that at any given time they all see the same picture of the app nodes. So
>> we built an app on top of your service discovery: when an app node comes up it
>> registers itself, and the web tier listens to that cluster and updates its view
>> of the available app nodes. In addition, we handle situations where there is no
>> connection to ZK by using a cache file with the latest available view until the
>> connection is restored. For some reason, sometimes, although ZK is up and
>> running, the Curator connection listener we use to decide whether we should
>> re-register isn't invoked, meaning the state stays LOST...
>>
>> On May 21, 2016, at 01:23, "Jordan Zimmerman" <[email protected]> wrote:
>> Retry policy is only used for individual operations. Any client-server
>> system needs to have retries to avoid temporary network events. The entire
>> curator-client and curator-framework modules are written to handle ZooKeeper
>> client connection maintenance. So, there isn’t one thing I can point to.
>>
>> Internally, the ServiceDiscovery code uses a PathChildrenCache instance. If
>> all you are using is Service Discovery, there is almost no need for you to
>> monitor the connection state. What are you trying to accomplish?
>>
>> -Jordan
>>
>>> On May 20, 2016, at 5:19 PM, Moshiko Kasirer <[email protected]> wrote:
>>>
>>> The thing is, we have many negative tests in which we stop and start the ZK
>>> quorum, and the issue I raised only happens from time to time... so it's hard
>>> to reproduce. But you just wrote that when the quorum is up the connection
>>> should be reconnected... How? Who does that, the ZK client or Curator? Is that
>>> not related to the retry policy?
>>>
>>> On May 21, 2016, at 01:12, "Jordan Zimmerman" <[email protected]> wrote:
>>> If the ZK cluster’s quorum is restored, then the connection state should
>>> change to RECONNECTED. There are copious tests in Curator itself that show
>>> this. If you’re seeing that Curator does not restore a broken connection
>>> then there is a deeper bug. Can you create a test that shows the problem?
>>>
>>> -Jordan
>>>
>>>> On May 20, 2016, at 5:07 PM, Moshiko Kasirer <[email protected]> wrote:
>>>>
>>>> I mean that while the ZK cluster is up, the Curator connection state stays
>>>> LOST, which in our case means the app node on which it happens doesn't
>>>> register itself as available... I just don't seem to understand when Curator
>>>> gives up on trying to connect to ZK and when it doesn't.
>>>> Thanks for the help!
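A minimal sketch of the connection-state handling Moshiko describes (re-registering once the state returns to RECONNECTED); the Registrar hook is hypothetical, and per Jordan's note above, plain Service Discovery usage largely makes this unnecessary:

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.state.ConnectionState;
import org.apache.curator.framework.state.ConnectionStateListener;

public class ReRegisterOnReconnect implements ConnectionStateListener {

    /** Hypothetical hook that re-creates this node's service registration. */
    public interface Registrar {
        void reRegister();
    }

    private final Registrar registrar;

    public ReRegisterOnReconnect(Registrar registrar) {
        this.registrar = registrar;
    }

    @Override
    public void stateChanged(CuratorFramework client, ConnectionState newState) {
        switch (newState) {
            case RECONNECTED:
                // The connection (and possibly a new session) is back; make sure the
                // ephemeral registration node exists again.
                registrar.reRegister();
                break;
            case SUSPENDED:
            case LOST:
                // Nothing to do in this sketch; consumers fall back to the on-disk
                // snapshot until RECONNECTED arrives.
                break;
            default:
                break;
        }
    }
}

// Wiring, given an already-built CuratorFramework named client:
//   client.getConnectionStateListenable().addListener(new ReRegisterOnReconnect(registrar));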
>>>>
>>>> On May 21, 2016, at 00:58, "Jordan Zimmerman" <[email protected]> wrote:
>>>> You must have a retry policy so that you don’t overwhelm your network and
>>>> ZooKeeper cluster. The example code shows how to create a reasonable one.
>>>>
>>>>> sometimes although zk cluster is up the curator service discovery
>>>>> connection isn't
>>>>
>>>> Service Discovery’s internal instances might be waiting based on the retry
>>>> policy. But what do you mean by “the Curator service discovery connection
>>>> isn’t”? There isn’t such a thing as a service discovery connection.
>>>>
>>>> -Jordan
>>>>
>>>>> On May 20, 2016, at 4:53 PM, Moshiko Kasirer <[email protected]> wrote:
>>>>>
>>>>> We are using your service discovery. So you are saying I should not care
>>>>> about the retry policy...? Then the only thing left to explain is how come,
>>>>> sometimes, although the ZK cluster is up, the Curator service discovery
>>>>> connection isn't...
>>>>>
>>>>> On May 21, 2016, at 00:43, "Jordan Zimmerman" <[email protected]> wrote:
>>>>> If you are using Curator’s Service Discovery code, it will be
>>>>> continuously re-trying the connections. This is not because of the retry
>>>>> policy; it’s because the Service Discovery code manages connection
>>>>> interruptions internally.
>>>>>
>>>>> -Jordan
>>>>>
>>>>>> On May 20, 2016, at 4:40 PM, Moshiko Kasirer <[email protected]> wrote:
>>>>>>
>>>>>> Thanks for the reply, I will send those logs ASAP.
>>>>>> It's difficult to understand the connection mechanism of ZK...
>>>>>> We are using Curator 2.10 for our service discovery, so we have to make
>>>>>> sure that when ZK is alive we connect and announce that our server is up.
>>>>>> We do that by listening to the Curator connection listener, which I think
>>>>>> also has to do with the retry policy... But what I can't understand is why
>>>>>> we sometimes see in the log that Curator gave up (LOST) and yet a second
>>>>>> later the Curator connection is restored. How? Is it because the ZK session
>>>>>> heartbeat restored the connection? Does that cause Curator to change its
>>>>>> connection state? And on the other hand, we sometimes get to a point where
>>>>>> ZK is up but the Curator connection stays LOST...
>>>>>> That is why I thought of using the new always-retry policy you added; do
>>>>>> you think it can help? That way, I hope, there would be no way for ZK to be
>>>>>> up while the Curator status is LOST, since once it retries it will
>>>>>> reconnect to ZK... Is that correct?
>>>>>>
>>>>>> On May 21, 2016, at 00:10, "Jordan Zimmerman" <[email protected]> wrote:
>>>>>> Curator’s retry policies are used within each CuratorFramework
>>>>>> operation. For example, when you call client.setData().forPath(p, b), the
>>>>>> retry policy will be invoked if there is a retry-able exception during
>>>>>> the operation. In addition to the retryPolicy, there are connection
>>>>>> timeouts. The behavior of how this is handled changed between Curator
>>>>>> 2.x and Curator 3.x. In Curator 2.x, for every iteration of the retry,
>>>>>> the operation will wait until the connection timeout when there’s no
>>>>>> connection. In Curator 3.x, the connection timeout wait only occurs once
>>>>>> (if the default ConnectionHandlingPolicy is used).
>>>>>>
>>>>>> In any event, ZooKeeper itself tries to maintain the connection. Also,
>>>>>> Curator will re-create the internally managed connection depending on
>>>>>> various network interruptions, etc. I’d need to see the logs to give you
>>>>>> more input.
>>>>>>
>>>>>> -Jordan
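To make the scope of the retry policy concrete: it governs retries of individual operations such as setData(), not how the ZooKeeper session is recovered. A minimal sketch, with a placeholder connect string and znode path:

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class RetryPolicyScope {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.builder()
                .connectString("zk1:2181,zk2:2181,zk3:2181")        // assumed ensemble
                .sessionTimeoutMs(60_000)
                .connectionTimeoutMs(15_000)
                .retryPolicy(new ExponentialBackoffRetry(1000, 3))  // at most 3 retries
                .build();
        client.start();

        // Create the node once so setData() has something to update.
        if (client.checkExists().forPath("/example/node") == null) {
            client.create().creatingParentsIfNeeded().forPath("/example/node", new byte[0]);
        }

        // If a retry-able exception (e.g. connection loss) happens during this call,
        // Curator retries it according to the policy above. The policy does not
        // control how the underlying ZooKeeper session is re-established.
        client.setData().forPath("/example/node", "payload".getBytes());

        client.close();
    }
}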
>>>>>>
>>>>>>> On May 19, 2016, at 10:12 AM, Moshiko Kasirer <[email protected]> wrote:
>>>>>>>
>>>>>>> First, I would like to thank you for Curator; we are using it as part of
>>>>>>> our service discovery solution and it helps a lot!!
>>>>>>>
>>>>>>> I have a question I hope you will be able to help me with. It's regarding
>>>>>>> the Curator retry policy: it seems we don't really understand when this
>>>>>>> policy is invoked. Although I configured it with max retry 1, in our logs
>>>>>>> I see many ZK reconnection attempts (and many "Curator gave up" messages,
>>>>>>> yet later I see a reconnected status...). Is it possible that this policy
>>>>>>> is only relevant to manually invoked operations against the ZK cluster
>>>>>>> done via Curator, and that the reconnections I see in the logs are caused
>>>>>>> by the fact that ZK was available during startup, so sessions were
>>>>>>> created, and then when ZK went down the ZK clients (not Curator) kept
>>>>>>> sending heartbeats as part of the ZK architecture? That is the part I am
>>>>>>> failing to understand, and I hope you can help me with it.
>>>>>>>
>>>>>>> You have recently added the RetryForever policy and I wanted to know if
>>>>>>> it is safe to use. The thing is, we always want to retry reconnecting to
>>>>>>> ZK when it is available, but that is something the ZK client does as long
>>>>>>> as it has open sessions, right? I am not sure it has to do with the retry
>>>>>>> policy...
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> moshiko
>>>>>>>
>>>>>>> --
>>>>>>> Moshiko Kasirer
>>>>>>> Software Engineer
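For reference, the always-retry policy in question is RetryForever, which retries each failed operation indefinitely with a fixed sleep between attempts. A minimal sketch, with a placeholder connect string; as discussed above, it still only applies to operation retries, not to session re-establishment:

import org.apache.curator.RetryPolicy;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryForever;

public class RetryForeverExample {
    public static void main(String[] args) {
        // Retry every second, forever, for each individual Curator operation.
        RetryPolicy retryPolicy = new RetryForever(1000);
        CuratorFramework client = CuratorFrameworkFactory.newClient("zk1:2181", retryPolicy);
        client.start();
        // ... use the client; connection/session maintenance is still handled by the
        // ZooKeeper client and Curator's connection management, not by this policy.
    }
}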
