[ https://issues.apache.org/jira/browse/KAFKA-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16019542#comment-16019542 ]
dhiraj prajapati commented on KAFKA-5153:
-----------------------------------------
Hi all,
We have a 3-node cluster in our production environment. We recently upgraded
Kafka from 0.9.0.1 to 0.10.1.0 and are now seeing a similar issue of
intermittent disconnections. We never had this issue on 0.9.0.1.
Below is the exception stack trace:
[2017-05-15 09:33:55,398] WARN [ReplicaFetcherThread-0-2], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest@7213d6d (kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to 2 was disconnected before the response was read
    at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:115)
    at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:112)
    at scala.Option.foreach(Option.scala:257)
    at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:112)
    at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:108)
    at kafka.utils.NetworkClientBlockingOps$.recursivePoll$1(NetworkClientBlockingOps.scala:137)
    at kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollContinuously$extension(NetworkClientBlockingOps.scala:143)
    at kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:108)
    at kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:253)
    at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:238)
    at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
    at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
    at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
Is there a fix for this issue in any of the Kafka 0.10.x releases?
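
For anyone trying to quantify how often these disconnections occur, here is a minimal sketch (not part of the original report) that counts the "Connection to N was disconnected" messages per broker id in a broker log. The function name and the way the log is read are assumptions for illustration; the only thing taken from the report is the wording of the IOException message.

```python
import re
from collections import Counter

# Matches the IOException message from the stack trace in this comment
# and captures the id of the broker the fetcher was connected to.
DISCONNECT_RE = re.compile(r"Connection to (\d+) was disconnected")


def count_disconnects(lines):
    """Return a Counter mapping broker id -> number of disconnect messages."""
    counts = Counter()
    for line in lines:
        match = DISCONNECT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

The function accepts any iterable of lines, so an open log file (for example a broker's server.log) can be passed to it directly. Comparing the counts across the three brokers may show whether one node is dropped more often than the others.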
> KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
> ---------------------------------------------------------------------------
>
> Key: KAFKA-5153
> URL: https://issues.apache.org/jira/browse/KAFKA-5153
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.10.2.0
> Environment: RHEL 6
> Java Version 1.8.0_91-b14
> Reporter: Arpan
> Priority: Critical
> Attachments: server_1_72server.log, server_2_73_server.log,
> server_3_74Server.log, server.properties, ThreadDump_1493564142.dump,
> ThreadDump_1493564177.dump, ThreadDump_1493564249.dump
>
>
> Hi Team,
> I was earlier referring to issue KAFKA-4477 because the problem I am facing
> is similar. I searched the release docs for a reference to it as well, but
> did not find anything for 0.10.1.1 or 0.10.2.0. I am currently using
> 2.11_0.10.2.0.
> I have a 3-node Kafka cluster and a ZooKeeper cluster running on the same set
> of servers. Around 240 GB of data is transferred through Kafka every day.
> What we are observing is that a server disconnects from the cluster, the ISR
> shrinks, and service starts to be impacted.
> I have also observed the file descriptor count increasing a bit: under normal
> circumstances we have not seen the FD count exceed 500, but when the issue
> started it was in the range of 650-700 on all 3 servers.
> Attaching thread dumps of all 3 servers from when we started facing the issue
> recently.
> The issue goes away once the nodes are bounced, but the setup does not run
> for more than 5 days without hitting it again. Attaching server logs as well.
> Kindly let me know if you need any additional information. Attaching
> server.properties as well for one of the servers (it is similar on all 3
> servers).