This is an automated email from the ASF dual-hosted git repository.

linghengqian pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/shardingsphere-elasticjob.git


The following commit(s) were added to refs/heads/master by this push:
     new caab1ea5b Update FAQ document about service fake death (#2262)
caab1ea5b is described below

commit caab1ea5be934b854b476ddec5ca6ed867d6e3e3
Author: zjx990 <[email protected]>
AuthorDate: Thu Sep 14 00:34:55 2023 +0800

    Update FAQ document about service fake death (#2262)
    
    * Update FAQ document about service fake death
    
    ---------
    
    Co-authored-by: zhaojiaxu3 <[email protected]>
---
 docs/content/faq/_index.cn.md | 19 +++++++++++++++++++
 docs/content/faq/_index.en.md | 17 +++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/docs/content/faq/_index.cn.md b/docs/content/faq/_index.cn.md
index 8d5489261..18cd5dbe0 100644
--- a/docs/content/faq/_index.cn.md
+++ b/docs/content/faq/_index.cn.md
@@ -129,3 +129,22 @@ Mesos 相关请参考 [Apache Mesos](https://mesos.apache.org/)。
 1. 指定网卡 eno1:`-Delasticjob.preferred.network.interface=eno1`。
 1. 指定IP地址 192.168.0.100:`-Delasticjob.preferred.network.ip=192.168.0.100`。
 1. 泛指IP地址(正则表达式) 192.168.*:`-Delasticjob.preferred.network.ip=192.168.*`。
+
+## 15. zk授权升级,在滚动部署过程中出现实例假死,回退到历史版本也依然存在假死。
+
+回答:
+
+在滚动部署过程中,会触发竞争选举leader,有密码的实例会给zk目录加密导致无密码的实例不可访问,最终导致整体选举阻塞。
+
+例如:
+
+通过日志可以发现会抛出-102异常:
+
+```bash
+xxxx-07-27 22:33:55.224 [DEBUG] [localhost-startStop-1-EventThread] [] [] [] - o.a.c.f.r.c.TreeCache : processResult: CuratorEventImpl{type=GET_DATA, resultCode=-102, path='/xxx/leader/election/latch/_c_bccccdcc-1134-4e0a-bb52-59a13836434a-latch-0000000047', name='null', children=null, context=null, stat=null, data=null, watchedEvent=null, aclList=null}
+```
+
+解决方案:
+
+1.如果您在升级的过程中出现回退历史版本也依然假死的问题,建议删除zk上所有作业目录,之后再重启历史版本。
+2.计算出合理的作业执行间隙,比如晚上21:00-21:30作业不会触发,在此期间先将实例全部停止,然后将带密码的版本全部部署上线。
\ No newline at end of file
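
The `resultCode=-102` in the log excerpt above corresponds to ZooKeeper's `NOAUTH` result code (`KeeperException.Code.NOAUTH`), which fits the explanation that instances without the password can no longer read the election nodes. As a rough aid for confirming the diagnosis, the following Python sketch scans log lines for such events and collects the affected paths. It assumes Curator DEBUG lines in the same format as the excerpt; the helper name `find_noauth_paths` is hypothetical and not part of ElasticJob or Curator.

```python
import re

# ZooKeeper's KeeperException.Code.NOAUTH, the value reported in the log excerpt.
NOAUTH = -102

# Matches the resultCode and path fields of a CuratorEventImpl debug line.
# Assumption: the two fields appear adjacent, as in the excerpt.
EVENT_RE = re.compile(r"resultCode=(-?\d+),\s*path='([^']*)'")


def find_noauth_paths(lines):
    """Return the znode paths of all events whose resultCode is NOAUTH."""
    paths = []
    for line in lines:
        match = EVENT_RE.search(line)
        if match and int(match.group(1)) == NOAUTH:
            paths.append(match.group(2))
    return paths


# The log line from the FAQ entry, joined onto one line.
sample = (
    "xxxx-07-27 22:33:55.224 [DEBUG] [localhost-startStop-1-EventThread] [] [] [] - "
    "o.a.c.f.r.c.TreeCache : processResult: CuratorEventImpl{type=GET_DATA, "
    "resultCode=-102, "
    "path='/xxx/leader/election/latch/"
    "_c_bccccdcc-1134-4e0a-bb52-59a13836434a-latch-0000000047', "
    "name='null', children=null, context=null, stat=null, data=null, "
    "watchedEvent=null, aclList=null}"
)

for path in find_noauth_paths([sample]):
    print(path)
```

If the reported paths all sit under a job's `leader/election` node, that is consistent with the blocked election described in the answer.
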
diff --git a/docs/content/faq/_index.en.md b/docs/content/faq/_index.en.md
index 20d32efe9..976d2b886 100644
--- a/docs/content/faq/_index.en.md
+++ b/docs/content/faq/_index.en.md
@@ -128,3 +128,20 @@ For example
 1. specify the interface eno1: `-Delasticjob.preferred.network.interface=eno1`.
 1. specify network addresses, 192.168.0.100: `-Delasticjob.preferred.network.ip=192.168.0.100`.
 1. specify network addresses for regular expressions, 192.168.*: `-Delasticjob.preferred.network.ip=192.168.*`.
+
+## 15. During a zk authorization upgrade, instances suffer fake death during rolling deployment, and the fake death persists even after rolling back to the historical version.
+
+Answer:
+
+During a rolling deployment, instances compete in a leader election. Instances configured with a password protect the zk directory with it, so instances without the password can no longer access that directory, which ultimately blocks the election as a whole.
+
+For example
+
+The logs show events with result code -102 (ZooKeeper's `NOAUTH`):
+
+```bash
+xxxx-07-27 22:33:55.224 [DEBUG] [localhost-startStop-1-EventThread] [] [] [] - o.a.c.f.r.c.TreeCache : processResult: CuratorEventImpl{type=GET_DATA, resultCode=-102, path='/xxx/leader/election/latch/_c_bccccdcc-1134-4e0a-bb52-59a13836434a-latch-0000000047', name='null', children=null, context=null, stat=null, data=null, watchedEvent=null, aclList=null}
+```
+
+1. If instances still suffer fake death after you roll back to the historical version during the upgrade, delete all job directories on zk and then restart the historical version.
+2. Work out a window in which no job is triggered, for example 21:00 to 21:30 in the evening. During that window, stop all instances first, and then deploy the password-enabled version to all of them.
\ No newline at end of file
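
The second suggestion above depends on finding a window in which no job fires. A minimal Python sketch of that calculation follows, assuming the daily trigger times of all jobs are already known as a list. The function name `largest_idle_gap` and the sample schedule are hypothetical, and the sketch ignores how long each job runs, so in practice the usable window starts only once the last triggered job has finished.

```python
from datetime import time

MINUTES_PER_DAY = 24 * 60


def to_minutes(t):
    """Convert a time of day to minutes since midnight."""
    return t.hour * 60 + t.minute


def largest_idle_gap(trigger_times):
    """Return (start, end, length_in_minutes) of the longest gap between
    consecutive daily trigger times, wrapping around midnight."""
    if not trigger_times:
        raise ValueError("at least one trigger time is required")
    marks = sorted({to_minutes(t) for t in trigger_times})
    best = None
    for i, start in enumerate(marks):
        end = marks[(i + 1) % len(marks)]
        # A single trigger per day leaves a gap of the whole day.
        length = (end - start) % MINUTES_PER_DAY or MINUTES_PER_DAY
        if best is None or length > best[2]:
            best = (start, end, length)
    start, end, length = best
    return (time(start // 60, start % 60), time(end // 60, end % 60), length)


# Hypothetical schedule: a trigger every 10 minutes, except 21:10 and 21:20.
schedule = [
    time(h, m)
    for h in range(24)
    for m in range(0, 60, 10)
    if not (h == 21 and m in (10, 20))
]

start, end, length = largest_idle_gap(schedule)
print(f"longest idle window: {start:%H:%M} to {end:%H:%M} ({length} minutes)")
```

With this sample schedule the longest window runs from 21:00 to 21:30, matching the example in the FAQ entry.
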
