[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752066#comment-16752066
 ] 

yeshuangshuang commented on ZOOKEEPER-3211:
-------------------------------------------

maoling
 I checked with the jstack tool and found no deadlock. The stack trace in the 
NIOfactory showed up very frequently in our test project, especially in 
failure scenarios (disk loss, array resync, etc.).
I don't know how to locate the root cause of the problem.
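
One way to track the symptom over time is to count sockets stuck in CLOSE_WAIT 
on the server port. The following is only a minimal sketch, not part of the 
original report: it assumes a Linux host, reads /proc/net/tcp (IPv4 only; a 
JVM bound to IPv6 would show up in /proc/net/tcp6 instead), and assumes the 
default client port 2181, which the report does not state. The function name 
count_close_wait is made up for illustration.

```python
# Hex state code for CLOSE_WAIT in the "st" column of /proc/net/tcp.
CLOSE_WAIT = "08"


def count_close_wait(proc_net_tcp_text, port):
    """Count sockets in CLOSE_WAIT whose local port equals `port`."""
    count = 0
    # The first line of /proc/net/tcp is a column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields[1] is "HEXADDR:HEXPORT" for the local end of the socket.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port and fields[3] == CLOSE_WAIT:
            count += 1
    return count


if __name__ == "__main__":
    # 2181 is an assumed client port; change it to match clientPort in zoo.cfg.
    with open("/proc/net/tcp") as f:
        print(count_close_wait(f.read(), 2181))
```

Running this periodically (for example from cron) and logging the result 
next to the server log could help line up the growth in CLOSE_WAIT sockets 
with the disk or array events mentioned above.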


> zookeeper standalone mode,found a high level bug in kernel of centos7.0 
> ,zookeeper Server's  tcp/ip socket connections(default 60 ) are CLOSE_WAIT 
> ,this lead to zk can't work for client any more
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3211
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3211
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.5
>         Environment: 1.zoo.cfg
> server.1=127.0.0.1:2902:2903
> 2.kernel
> kernel:Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Tue Feb 12 
> 19:44:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
> JDK:
> java version "1.7.0_181"
> OpenJDK Runtime Environment (rhel-2.6.14.5.el7-x86_64 u181-b00)
> OpenJDK 64-Bit Server VM (build 24.181-b00, mixed mode)
> zk: 3.4.5
>            Reporter: yeshuangshuang
>            Priority: Blocker
>             Fix For: 3.4.5
>
>         Attachments: 1.log, 2018-12-09_124131.png, 2018-12-09_124210.png, 
> 2018-12-09_132854.png, 2018-12-09_133017.png, 2018-12-09_133049.png, 
> 2018-12-09_133111.png, 2018-12-09_133131.png, 2018-12-09_133150.png, 
> 2018-12-09_133210.png, 2018-12-09_133229.png, 2018-12-09_133248.png, 
> 2018-12-09_133320.png
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> 1.config--zoo.cfg
> server.1=127.0.0.1:2902:2903
> 2.kernel version
> version:Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Tue Feb 12 
> 19:44:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
> JDK:
> java version "1.7.0_181"
> OpenJDK Runtime Environment (rhel-2.6.14.5.el7-x86_64 u181-b00)
> OpenJDK 64-Bit Server VM (build 24.181-b00, mixed mode)
> zk: 3.4.5
> 3.bug details:
> The problem is intermittent, but the recurrence probability is extremely 
> high. At first, reads and writes time out (taking about 6s), and after a few 
> minutes all connections (including long-lived ones) are in the CLOSE_WAIT 
> state.
> 4.Workaround: when all connections are found to be in CLOSE_WAIT, restart 
> the zookeeper server manually.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)