[ https://issues.apache.org/jira/browse/KAFKA-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Reddy updated KAFKA-4096:
---------------------------------
    Summary: Kafka Backup and Recovery  (was: Kafka Recovery)

> Kafka Backup and Recovery
> -------------------------
>
>                 Key: KAFKA-4096
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4096
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer, zkclient
>    Affects Versions: 0.8.2.0
>         Environment: RHEL 7.2, AWS EC2 compute instance
>            Reporter: Karthik Reddy
>            Assignee: Neha Narkhede
>            Priority: Critical
>
> Hi Team,
> We have seen the messages below in the Kafka logs, indicating a ZK timeout.
> Could you please advise us on how to tune or better optimize Kafka-ZK
> communication?
> Kafka and ZK are on separate servers. Currently, we have the ZK timeout set to
> 6000 ms.
> The Kafka servers use EBS volumes for disk.
> We had to restart our consumers and ZK to resolve this issue.
> [2016-03-10 02:29:25,858] INFO Unable to read additional data from server 
> sessionid 0x5531d0003f30030, likely server has closed socket, closing socket 
> connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2016-03-10 02:29:25,958] INFO zookeeper state changed (Disconnected) 
> (org.I0Itec.zkclient.ZkClient)
> [2016-03-10 02:29:26,381] INFO Opening socket connection to server 
> 10.200.77.74/10.200.77.74:8164. Will not attempt to authenticate using SASL 
> (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2016-03-10 02:29:26,382] INFO Socket connection established to 
> 10.200.77.74/10.200.77.74:8164, initiating session 
> (org.apache.zookeeper.ClientCnxn)
> [2016-03-10 02:29:26,385] INFO Session establishment complete on server 
> 10.200.77.74/10.200.77.74:8164, sessionid = 0x5531d0003f30030, negotiated 
> timeout = 6000 (org.apache.zookeeper.ClientCnxn)
> [2016-03-10 02:29:26,385] INFO zookeeper state changed (SyncConnected) 
> (org.I0Itec.zkclient.ZkClient)
> [2016-03-10 02:29:30,961] INFO conflict in /controller data: 
> {"version":1,"brokerid":3,"timestamp":"1457594970952"} stored data: 
> {"version":1,"brokerid":5,"timestamp":"1457594970043"} (kafka.utils.ZkUtils$)
> [2016-03-10 02:29:30,969] INFO New leader is 5 
> (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
> [2016-03-10 02:29:31,620] INFO [ReplicaFetcherManager on broker 3] Removed 
> fetcher for partitions 
> [__consumer_offsets,0],[fulfillment.payments.autopay.mongooperation.response,1],[__consumer_offsets,20],[__consumer_offsets,40]
>  (kafka.server.ReplicaFetcherManager)
> [2016-03-10 02:29:31,621] INFO [ReplicaFetcherManager on broker 3] Removed 
> fetcher for partitions 
> [efit.framework.notification.error,1],[__consumer_offsets,15],[fulfillment.payments.autopay.processexception.notification,1],[__consumer_offsets,35]
>  (kafka.server.ReplicaFetcherManager)
> [2016-03-10 02:29:31,621] INFO Truncating log 
> efit.framework.notification.error-1 to offset 637. (kafka.log.Log)
> [2016-03-10 02:29:31,621] INFO Truncating log __consumer_offsets-15 to offset 
> 0. (kafka.log.Log)
> [2016-03-10 02:29:31,622] INFO Truncating log 
> fulfillment.payments.autopay.processexception.notification-1 to offset 0. 
> (kafka.log.Log)
> [2016-03-10 02:29:31,622] INFO Truncating log __consumer_offsets-35 to offset 
> 0. (kafka.log.Log)
> [2016-03-10 02:29:31,623] INFO Loading offsets from [__consumer_offsets,0] 
> (kafka.server.OffsetManager)
> [2016-03-10 02:29:31,624] INFO Loading offsets from [__consumer_offsets,20] 
> (kafka.server.OffsetManager)
> [2016-03-10 02:29:31,624] INFO Finished loading offsets from 
> [__consumer_offsets,0] in 1 milliseconds. (kafka.server.OffsetManager)
> [2016-03-10 02:29:31,625] INFO Loading offsets from [__consumer_offsets,40] 
> (kafka.server.OffsetManager)
> [2016-03-10 02:29:31,625] INFO Finished loading offsets from 
> [__consumer_offsets,20] in 1 milliseconds. (kafka.server.OffsetManager)
> [2016-03-10 02:29:31,625] INFO Finished loading offsets from 
> [__consumer_offsets,40] in 0 milliseconds. (kafka.server.OffsetManager)
> [2016-03-10 02:29:31,627] INFO [ReplicaFetcherManager on broker 3] Added 
> fetcher for partitions List([[efit.framework.notification.error,1], 
> initOffset 637 to broker id:1,host:10.200.77.78,port:8165] , 
> [[__consumer_offsets,15], initOffset 0 to broker 
> id:1,host:10.200.77.78,port:8165] , 
> [[fulfillment.payments.autopay.processexception.notification,1], initOffset 0 
> to broker id:5,host:10.200.75.150,port:8165] , [[__consumer_offsets,35], 
> initOffset 0 to broker id:1,host:10.200.77.78,port:8165] ) 
> (kafka.server.ReplicaFetcherManager)
> [2016-03-10 02:29:31,627] INFO [ReplicaFetcherThread-0-2], Shutting down 
> (kafka.server.ReplicaFetcherThread)
>
> Thanks,
> Karthik
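
For anyone triaging this, a rough sketch of how the disconnect window in the quoted log could be measured against the 6000 ms timeout. It only parses the two "zookeeper state changed" lines copied from the description (re-joined onto single lines); the helper names state_changes and disconnect_gaps_ms are made up for illustration and are not part of Kafka or ZkClient. Note that the log also shows the same sessionid (0x5531d0003f30030) before and after the reconnect, which suggests the session was resumed rather than expired, though that is an inference from the log and not something the script checks.

```python
import re
from datetime import datetime, timedelta

# Two lines copied from the quoted log, re-joined onto single lines.
LOG = """\
[2016-03-10 02:29:25,958] INFO zookeeper state changed (Disconnected) (org.I0Itec.zkclient.ZkClient)
[2016-03-10 02:29:26,385] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
"""

# Matches the timestamp and the new state in a ZkClient state-change line.
STATE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] INFO zookeeper state changed \((?P<state>\w+)\)"
)

# Negotiated timeout reported in the log ("negotiated timeout = 6000").
SESSION_TIMEOUT_MS = 6000


def state_changes(text):
    """Return (timestamp, state) pairs for every state-change line (hypothetical helper)."""
    changes = []
    for line in text.splitlines():
        m = STATE_RE.match(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            changes.append((ts, m.group("state")))
    return changes


def disconnect_gaps_ms(changes):
    """Whole milliseconds between each Disconnected and the next SyncConnected."""
    gaps = []
    down_since = None
    for ts, state in changes:
        if state == "Disconnected":
            down_since = ts
        elif state == "SyncConnected" and down_since is not None:
            gaps.append((ts - down_since) // timedelta(milliseconds=1))
            down_since = None
    return gaps


gaps = disconnect_gaps_ms(state_changes(LOG))
for gap in gaps:
    status = "within" if gap < SESSION_TIMEOUT_MS else "beyond"
    print(f"disconnected for {gap} ms ({status} the {SESSION_TIMEOUT_MS} ms session timeout)")
```

Pointing LOG at a full server.log instead of the two sample lines would list every disconnect window, which may help decide whether the timeout itself is too low or whether something else (GC pauses, EBS latency, network) is worth looking at first.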



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
