@Yang Bo, we have already done this, but some complicated scenarios remain. As a first step, I addressed one of them in this PR: https://github.com/apache/incubator-servicecomb-java-chassis/pull/704
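The client-side fallback discussed in this thread (keep the last known instance list, and ping before dropping instances when a pull returns an empty list) can be sketched roughly as follows. This is only an illustration and not the code in the PR above: `InstanceCache`, `refresh`, and the `ping` predicate are hypothetical names, endpoints are plain strings for brevity, and a `null` pull result is assumed to mean that the service center could not be reached.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical client-side instance cache; not the java-chassis API. */
final class InstanceCache {
    // Last instance list the client considers usable.
    private List<String> lastKnown = Collections.emptyList();
    // Assumed health probe: returns true if the endpoint still answers.
    private final Predicate<String> ping;

    InstanceCache(Predicate<String> ping) {
        this.ping = ping;
    }

    /**
     * Applies the result of one periodic pull from the service center.
     * A null argument is assumed to mean the service center was unreachable.
     */
    synchronized List<String> refresh(List<String> pulled) {
        if (pulled == null) {
            // Service center is down: keep serving from the local copy.
            return lastKnown;
        }
        if (pulled.isEmpty() && !lastKnown.isEmpty()) {
            // Suspicious empty list (e.g. after a service center restart):
            // keep only the cached instances that still respond to a ping.
            List<String> alive = new ArrayList<>();
            for (String endpoint : lastKnown) {
                if (ping.test(endpoint)) {
                    alive.add(endpoint);
                }
            }
            lastKnown = Collections.unmodifiableList(alive);
            return lastKnown;
        }
        // Normal case: trust the service center and replace the cache.
        lastKnown = Collections.unmodifiableList(new ArrayList<>(pulled));
        return lastKnown;
    }
}
```

Note that a sketch like this still has the drawbacks listed under "Known consequences" below: a bare ping cannot tell whether a reused ip/port still belongs to the same service, and it ignores the service center white/black rule.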
------------------ ???????? ------------------ ??????: "Yang Bo"<oaky...@gmail.com>; ????????: 2018??5??16??(??????) ????11:36 ??????: "dev"<dev@servicecomb.apache.org>; ????: Re: [Discussion]About service instances discovery reliable problems We may do something like this: Keep a copy of the instance/metadata information in clientside, and when the SC is down, the client can still use the local information to visit services. On Wed, May 16, 2018 at 11:15 AM, Willem Jiang <willem.ji...@gmail.com> wrote: > If we treat the service center as an online service, it should provide 7*24 > services. > But if we use the standalone service center, it could be challenge for the > service center provide 7*24 service. > > How can we setup the instance refresh strategy? > We may need to provide different solution for different user case. > > > Willem Jiang > > Blog: http://willemjiang.blogspot.com (English) > http://jnn.iteye.com (Chinese) > Twitter: willemjiang > Weibo: ????willem > > On Mon, May 14, 2018 at 4:52 PM, bismy <bi...@qq.com> wrote: > > > Supporting gray release need a lot of facilities to make it work and > > service center upgrading can not apply gray release sometimes. > > And other scenarios like standalone application(not cloud services) > > restart is quite common. And base services restart can't influence user's > > service communication. > > > > > > ------------------ ???????? ------------------ > > ??????: "wjm wjm"<zzz...@gmail.com>; > > ????????: 2018??5??14??(??????) ????4:23 > > ??????: "dev@servicecomb.apache.org"<dev@servicecomb.apache.org>; > > > > ????: Re: [Discussion]About service instances discovery reliable problems > > > > > > > > it's a problem, but why business use gray release, but we reject to the > > solution? > > > > 2018??5??14??????????bismy <bi...@qq.com> ?????? > > > > > When service center all instances stoped and then started. This is > normal > > > when we are doing maintenance. e.g. 
upgrading > > > > > > > > > > > > > > > ------------------ ???????? ------------------ > > > ??????: "wjm wjm"<zzz...@gmail.com>; > > > ????????: 2018??5??14??(??????) ????12:36 > > > ??????: "dev"<dev@servicecomb.apache.org>; > > > > > > ????: Re: [Discussion]About service instances discovery reliable problems > > > > > > > > > > > > " When service center restarted" > > > > > > that means one instance of SC cluster, or whole SC cluster? > > > even one instance restart will clear all information? > > > > > > 2018-05-14 12:03 GMT+08:00 bismy <bi...@qq.com>: > > > > > > > Hi All, > > > > > > > > > > > > Now we meet a reliable problem. When service center restarted, It > will > > > > clear all service instances information. > > > > And when SDK(java-chassis) queries instance list periodically, it > will > > > get > > > > an empty list and invocation will fail. > > > > > > > > > > > > In order to resolve this problem, two solutions is suggested: > > > > 1. service center provide instances persistence mechanism. When > service > > > > center restarted, it will restore instance information, > > > > and re-calculate the timeout information(e.g. reset instance last > > active > > > > time to startup time). If he gets the heartbeat from instance, the > > > instance > > > > will not be removed, and after timeout, > > > > it can removed instances, like the normal way. > > > > 2. SDK need to take special care with instances remove. SDK don't > > > > actually remove instances when he gets empty list from service center > > and > > > > it will ping the instances. If ping return > > > > OK, the instance will not removed. > > > > > > > > > > > > Known consequencies: > > > > Solution 2: > > > > a. Conflicts with service center white/black rule. > > > > b. In docker or some instances changed frequently scenario, the > > ip/port > > > > is reused by many services when service start/stop, and service > health > > > URL > > > > may also be the same. 
So it will give a lot of 400 like error when > > > > instances is not updated. > > > > > > > > > > > > Any suggestions? > > > -- Best Regards, Yang.