[jira] [Resolved] (KAFKA-15106) AbstractStickyAssignor may get stuck in 3.5
[ https://issues.apache.org/jira/browse/KAFKA-15106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

li xiangyuan resolved KAFKA-15106.
----------------------------------
    Resolution: Fixed

> AbstractStickyAssignor may get stuck in 3.5
> -------------------------------------------
>
> Key: KAFKA-15106
> URL: https://issues.apache.org/jira/browse/KAFKA-15106
> Project: Kafka
> Issue Type: Bug
> Components: clients
> Affects Versions: 3.5.0
> Reporter: li xiangyuan
> Assignee: li xiangyuan
> Priority: Major
> Fix For: 3.6.0
>
> This is easy to reproduce in a unit test: in
> org.apache.kafka.clients.consumer.internals.AbstractStickyAssignorTest#testLargeAssignmentAndGroupWithNonEqualSubscription,
> set partitionCount=200 and consumerCount=20, and you can see that isBalanced will return false forever.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
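For context, a rough standalone harness that drives the assignor at a similar scale through the public ConsumerPartitionAssignor API. This is not the referenced unit test: the topic layout, names, and counts are only illustrative assumptions, so it may or may not hit the reported hang, but on an affected 3.5.0 client a call of this shape can fail to return.
{code:java}
import java.util.*;
import org.apache.kafka.clients.consumer.ConsumerPartitionAssignor.GroupSubscription;
import org.apache.kafka.clients.consumer.ConsumerPartitionAssignor.Subscription;
import org.apache.kafka.clients.consumer.CooperativeStickyAssignor;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.Node;
import org.apache.kafka.common.PartitionInfo;

public class StickyAssignorHangHarness {
    public static void main(String[] args) {
        int topicCount = 20;            // 20 topics x 10 partitions = 200 partitions (illustrative layout)
        int partitionsPerTopic = 10;
        int consumerCount = 20;

        Node node = new Node(0, "localhost", 9092);
        List<PartitionInfo> partitions = new ArrayList<>();
        List<String> topics = new ArrayList<>();
        for (int t = 0; t < topicCount; t++) {
            String topic = "topic-" + t;
            topics.add(topic);
            for (int p = 0; p < partitionsPerTopic; p++)
                partitions.add(new PartitionInfo(topic, p, node, new Node[]{node}, new Node[]{node}));
        }
        Cluster cluster = new Cluster("test", Collections.singletonList(node), partitions,
                Collections.emptySet(), Collections.emptySet());

        // Non-equal subscriptions: consumer i subscribes to topics 0..i, so the
        // general (non-constrained) assignment path with its isBalanced() check is used.
        Map<String, Subscription> subscriptions = new HashMap<>();
        for (int c = 0; c < consumerCount; c++)
            subscriptions.put("consumer-" + c, new Subscription(new ArrayList<>(topics.subList(0, c + 1))));

        // On an affected 3.5.0 client this call may never return.
        new CooperativeStickyAssignor().assign(cluster, new GroupSubscription(subscriptions))
                .groupAssignment()
                .forEach((consumer, assignment) ->
                        System.out.println(consumer + " -> " + assignment.partitions().size()));
    }
}
{code}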
[jira] [Created] (KAFKA-15170) CooperativeStickyAssignor cannot adjust assignment correctly
li xiangyuan created KAFKA-15170:
------------------------------------

Summary: CooperativeStickyAssignor cannot adjust assignment correctly
Key: KAFKA-15170
URL: https://issues.apache.org/jira/browse/KAFKA-15170
Project: Kafka
Issue Type: Bug
Components: consumer
Affects Versions: 3.5.0
Reporter: li xiangyuan
Assignee: li xiangyuan

AbstractStickyAssignor uses ConstrainedAssignmentBuilder to build the assignment when all consumers in the group subscribe to the same topic list, but it fails to add every partition that moves to another consumer to ``partitionsWithMultiplePreviousOwners``. The reason is that assignOwnedPartitions does not add partitions whose rack mismatches their previous owner to allRevokedPartitions, and only partitions in that list are later added to partitionsWithMultiplePreviousOwners.

In a cooperative rebalance, partitions that have changed owner must be removed from the final assignment, otherwise consumption will behave incorrectly. I have already raised a PR, please take a look, thanks: https://github.com/apache/kafka/pull/13965

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
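A minimal sketch of the rule described above, using illustrative names (withholdMovedPartitions, previousOwner) rather than the real ConstrainedAssignmentBuilder internals: in a cooperative rebalance, any partition whose owner changes must be withheld from this round's assignment so the previous owner revokes it first.
{code:java}
import java.util.*;
import org.apache.kafka.common.TopicPartition;

public class CooperativeAdjustSketch {
    /**
     * Remove every partition that is moving to a different consumer from the freshly
     * computed assignment. The previous owner must revoke it in this rebalance, and the
     * partition is only handed to the new owner in a follow-up rebalance. This mirrors the
     * invariant partitionsWithMultiplePreviousOwners is meant to enforce; the real assignor
     * code is more involved.
     */
    static Map<String, List<TopicPartition>> withholdMovedPartitions(
            Map<String, List<TopicPartition>> previousAssignment,   // consumerId -> partitions owned before the rebalance
            Map<String, List<TopicPartition>> newAssignment) {      // consumerId -> proposed partitions

        // Reverse index: partition -> previous owner.
        Map<TopicPartition, String> previousOwner = new HashMap<>();
        previousAssignment.forEach((consumer, partitions) ->
                partitions.forEach(tp -> previousOwner.put(tp, consumer)));

        Map<String, List<TopicPartition>> adjusted = new HashMap<>();
        newAssignment.forEach((consumer, partitions) -> {
            List<TopicPartition> kept = new ArrayList<>();
            for (TopicPartition tp : partitions) {
                String prev = previousOwner.get(tp);
                // Keep the partition only if it is new or stays with the same consumer;
                // otherwise withhold it so the previous owner revokes it first.
                if (prev == null || prev.equals(consumer)) {
                    kept.add(tp);
                }
            }
            adjusted.put(consumer, kept);
        });
        return adjusted;
    }
}
{code}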
[jira] [Created] (KAFKA-15106) AbstractStickyAssignor may get stuck in 3.5
li xiangyuan created KAFKA-15106:
------------------------------------

Summary: AbstractStickyAssignor may get stuck in 3.5
Key: KAFKA-15106
URL: https://issues.apache.org/jira/browse/KAFKA-15106
Project: Kafka
Issue Type: Bug
Components: clients
Affects Versions: 3.5.0
Reporter: li xiangyuan

This is easy to reproduce in a unit test: in org.apache.kafka.clients.consumer.internals.AbstractStickyAssignorTest#testLargeAssignmentAndGroupWithNonEqualSubscription, set partitionCount=200 and consumerCount=20, and you can see that isBalanced will return false forever.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Created] (KAFKA-14914) binarySearch in AbstractIndex may execute in an infinite loop
li xiangyuan created KAFKA-14914:
------------------------------------

Summary: binarySearch in AbstractIndex may execute in an infinite loop
Key: KAFKA-14914
URL: https://issues.apache.org/jira/browse/KAFKA-14914
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 2.4.0
Reporter: li xiangyuan
Attachments: stack.1.txt, stack.2.txt, stack.3.txt

Recently our production brokers have suddenly stopped handling requests (3 times in fewer than 10 days so far). Please check the uploaded stack files: they show one ioThread (data-plane-kafka-request-handler-11) holding the read lock of a Partition's leaderIsrUpdateLock and running the binarySearch function indefinitely. Once another thread (kafka-scheduler-2) requests the write lock, every request that needs the read lock for this partition blocks behind it, all ioThreads are used up, and the broker can no longer handle any request. The 3 stack files were captured about 6 minutes apart. From my standpoint, the binarySearch function is clearly spinning and never returning, and I suspect the index entries in the offset index (at least as seen through the mmap) are not sorted.

Details:
* The problem appeared on 2 brokers.
* Broker version: 2.4.0
* JVM: OpenJDK 11
* Hardware: AWS c7g.4xlarge (arm64). We recently upgraded from c6g.4xlarge; we never hit this problem on c6g, and we don't know whether the arm64 architecture or the c7g instance type is involved.
* Once we restart the broker it recovers, so the offset index file itself is probably not corrupted and something may be wrong with the mmap.

Please give any suggestion to solve this problem, thanks.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
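A self-contained demonstration of the starvation pattern visible in the stack files. This is not Kafka code; the class and thread names are illustrative. One reader that never releases a ReentrantReadWriteLock read lock (like the ioThread stuck in binarySearch), plus one queued writer (like kafka-scheduler updating the ISR), is enough to block every later reader.
{code:java}
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadLockStarvationDemo {
    public static void main(String[] args) throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(); // analogous to leaderIsrUpdateLock

        // Thread 1: holds the read lock and never returns, like a request handler
        // stuck inside the index binary search.
        Thread stuckReader = new Thread(() -> {
            lock.readLock().lock();
            while (true) { /* spin forever, read lock never released */ }
        }, "stuck-reader");
        stuckReader.setDaemon(true);
        stuckReader.start();
        TimeUnit.MILLISECONDS.sleep(100);

        // Thread 2: a writer queues up behind the stuck reader and waits forever.
        Thread writer = new Thread(() -> lock.writeLock().lock(), "pending-writer");
        writer.setDaemon(true);
        writer.start();
        TimeUnit.MILLISECONDS.sleep(100);

        // Thread 3..n: new readers (the other request-handler threads) now also block,
        // because a writer is already queued; the whole handler pool drains.
        Thread lateReader = new Thread(() -> {
            boolean acquired = false;
            try {
                acquired = lock.readLock().tryLock(1, TimeUnit.SECONDS);
            } catch (InterruptedException ignored) { }
            System.out.println("late reader acquired read lock: " + acquired); // prints false
        }, "late-reader");
        lateReader.start();
        lateReader.join();
    }
}
{code}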
[jira] [Created] (KAFKA-9648) kafka server should set a larger backlog when creating the ServerSocket
li xiangyuan created KAFKA-9648:
-----------------------------------

Summary: kafka server should set a larger backlog when creating the ServerSocket
Key: KAFKA-9648
URL: https://issues.apache.org/jira/browse/KAFKA-9648
Project: Kafka
Issue Type: Improvement
Components: core
Affects Versions: 0.10.0.1
Reporter: li xiangyuan

I previously described a mysterious problem (https://issues.apache.org/jira/browse/KAFKA-9211) in which the Kafka server triggers TCP congestion control under certain conditions. We have finally found the root cause.

When a Kafka server restarts for any reason and a preferred replica leader election is then executed, many partition leaderships are handed back to it, which triggers a cluster metadata update. All clients then establish connections to this server, so at that moment many TCP connection requests are waiting in the TCP SYN queue and then in the accept queue. Kafka creates the server socket in SocketServer.scala:
{code:java}
serverChannel.socket.bind(socketAddress);{code}
This method has a second parameter, "backlog"; min(backlog, tcp_max_syn_backlog) decides the queue length. Because Kafka doesn't set it, the default value of 50 is used. If this queue is full and tcp_syncookies = 0, new connection requests are rejected. If tcp_syncookies = 1, the TCP syncookie mechanism is triggered. This mechanism lets Linux handle more TCP SYN requests, but it loses several TCP options, including "wscale", the one that allows a TCP connection to send many more bytes per packet. Because syncookies are triggered and wscale is lost, such a connection stays very slow forever, until it is closed and another connection is established.

So after a preferred replica election is executed, lots of new TCP connections are established without wscale, and much of the network traffic to this server becomes very slow. I'm not sure whether newer Linux versions have resolved this, but Kafka should still set the backlog to a larger value. We have now changed it to 512, and everything seems to be fine.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
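A hedged sketch of the suggested fix, shown here in plain Java NIO rather than as the actual SocketServer.scala change: ServerSocket.bind takes an optional second argument that sets the accept backlog. The port and backlog value below are illustrative.
{code:java}
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

public class BacklogBindExample {
    public static void main(String[] args) throws Exception {
        ServerSocketChannel serverChannel = ServerSocketChannel.open();
        serverChannel.configureBlocking(false);

        InetSocketAddress socketAddress = new InetSocketAddress("0.0.0.0", 9092);

        // Two-argument bind: the second parameter is the requested backlog.
        // Without it the JDK default of 50 is used; the effective queue length
        // is still capped by kernel limits (net.core.somaxconn / tcp_max_syn_backlog).
        serverChannel.socket().bind(socketAddress, 512);
    }
}
{code}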
[jira] [Created] (KAFKA-9646) kafka consumer causes high cpu usage
li xiangyuan created KAFKA-9646:
-----------------------------------

Summary: kafka consumer causes high cpu usage
Key: KAFKA-9646
URL: https://issues.apache.org/jira/browse/KAFKA-9646
Project: Kafka
Issue Type: Improvement
Components: clients
Affects Versions: 2.3.0
Environment: centos-7 3.10.0-957.21.3.el7.x86_64
Reporter: li xiangyuan
Attachments: 0.10.0.1.svg, 2.4.0.svg, cpu_use

Recently we upgraded our Kafka servers from 0.10.0.1 to 2.3.0 successfully, and because Kafka supports fetching records from the closest broker since 2.4.0, we decided to upgrade our clients from 0.10.0.1 to 2.4.0 directly. After the upgrade, we found that some applications use much more CPU than before. The worst one went up from 45% to 70%, so we had to roll that application back.

We profiled this application in a test environment (each run executes for 6 minutes) and captured CPU flame graphs for the two kafka-clients versions; I have uploaded these files. We found that after upgrading to 2.4.0, Selector.selectNow causes the highest CPU usage. This application subscribes to 20 topics, each with 6 consumer threads, and 19 of the topics have a very low produce rate (less than 1 message per minute). We set fetch.max.wait.ms to 5000; CPU usage dropped a little but stayed high.

I then wrote a test application that subscribes to 1 topic with 120 consumer threads. With the 2.4.0 client, CPU usage was about 40%; with 0.10.0.1, it was less than 10%. Then I took 2.4.0 and modified org.apache.kafka.common.network.Selector#select. Old code:
{code:java}
if (timeoutMs == 0L)
    return this.nioSelector.selectNow();
else
    return this.nioSelector.select(timeoutMs);{code}
changed to:
{code:java}
if (timeoutMs == 0) {
    timeoutMs = 1;
}
return this.nioSelector.select(timeoutMs);
{code}
After this change, CPU usage dropped to about 20%; I have uploaded the CPU usage picture. I'm wondering why selectNow causes such high CPU usage. Maybe the 2.4.0 client performs too many useless select calls, or Linux has a performance issue when many threads call selectNow concurrently?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
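A standalone illustration, not the Kafka Selector code, of why a selectNow-based loop burns CPU while a short blocking select does not: with nothing ready, selectNow() returns immediately and the loop spins on a full core, whereas select(1) parks the thread for up to 1 ms per iteration.
{code:java}
import java.nio.channels.Selector;

public class SelectNowCpuDemo {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        boolean busyPoll = args.length > 0 && args[0].equals("busy");

        // Run with "busy" to watch one core go to 100%; run without arguments
        // to see the blocking variant stay near 0% CPU.
        while (true) {
            if (busyPoll) {
                selector.selectNow();   // returns 0 immediately -> tight spin
            } else {
                selector.select(1);     // blocks up to 1 ms -> negligible CPU
            }
        }
    }
}
{code}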
[jira] [Created] (KAFKA-9211) kafka upgrade to 2.3.0 causes produce speed decrease
li xiangyuan created KAFKA-9211:
-----------------------------------

Summary: kafka upgrade to 2.3.0 causes produce speed decrease
Key: KAFKA-9211
URL: https://issues.apache.org/jira/browse/KAFKA-9211
Project: Kafka
Issue Type: Bug
Components: controller, producer
Affects Versions: 2.3.0
Reporter: li xiangyuan
Attachments: broker-jstack.txt, producer-jstack.txt

Recently we have been trying to upgrade Kafka from 0.10.0.1 to 2.3.0. We have 15 clusters in the production environment, each with 3~6 brokers. We know the Kafka upgrade procedure should be:

1. replace the code with the 2.3.0 jar and restart all brokers one by one
2. unset inter.broker.protocol.version=0.10.0.1 and restart all brokers one by one
3. unset log.message.format.version=0.10.0.1 and restart all brokers one by one

So far we have done steps 1 & 2 in 12 clusters, but when we tried to take the remaining clusters (step 1 already done) through step 2, we found that the produce speed of some topics dropped badly. We have researched this issue for a long time; since we can't experiment in the production environment and couldn't reproduce it in the test environment, we couldn't find the root cause. We can only describe the situation in as much detail as we know and hope someone can help us.

1. Because of bug KAFKA-8653, I added the code below to the handleJoinGroupRequest function in KafkaApis.scala:
{code:java}
if (rebalanceTimeoutMs <= 0) {
  rebalanceTimeoutMs = joinGroupRequest.data.sessionTimeoutMs
}{code}
2. One cluster that failed to upgrade has 6 brokers (8C16G) and about 200 topics with 2 replicas; every broker keeps 3000+ partitions and 1500+ leader partitions, but most topics have a very low produce rate (less than 50 messages/sec). Only one topic, with 300 partitions, produces more than 2500 messages/sec and has more than 20 consumer groups consuming from it. So the whole cluster produces 4K messages/sec, 11 MB in/sec and 240 MB out/sec, and more than 90% of the traffic comes from that 2500 messages/sec topic. When we unset inter.broker.protocol.version=0.10.0.1 on 5 or 6 servers and restarted, this topic's produce rate dropped to about 200 messages/sec. I don't know whether the way we use it could trigger any problem.
3. We use Kafka wrapped by spring-kafka and set KafkaTemplate's autoFlush=true, so each producer.send execution also executes producer.flush immediately (a short sketch of this pattern follows below). I know the flush method decreases produce performance dramatically, but at least nothing seemed wrong before upgrade step 2; I doubt whether it has become a problem after the upgrade.
4. I noticed that when the produce speed decreased, consumer groups with a large message lag still consumed messages without any change or decrease in consume speed, so I guess only the ProduceRequest rate drops, not the FetchRequest rate.
5. We haven't set any throttle configuration, and all producers use acks=1 (so it's not slow broker replica fetching). When this problem is triggered, both server and producer CPU usage goes down, and the servers' ioutil stays below 30%, so it shouldn't be a hardware problem.
6. This event is triggered almost every time once most brokers have completed upgrade step 2 and an automatic preferred leader election has executed; then we observe the produce speed drop, and we have to downgrade the brokers (set inter.broker.protocol.version=0.10.0.1) and restart them one by one before it returns to normal. Some clusters have to downgrade all brokers, but some can leave 1 or 2 brokers without downgrading; I noticed that the broker that doesn't need a downgrade is the controller.
7. I have captured jstack output for the producer and the servers. Although not from the same cluster, we can see that their threads really seem to be idle.
8. Both the 0.10.0.1 and 2.3.0 kafka-clients trigger this problem.
9. While the largest topic always drops produce speed, other topics drop randomly: topicA may drop in the first upgrade attempt but not the next, and topicB may not drop in the first attempt but drop in another.
10. In fact, the largest cluster, which has the same topic & group usage scenario mentioned above but whose largest topic produces about 12,000 messages/sec, already fails at upgrade step 1 (just using the 2.3.0 jar).

Any help would be appreciated, thanks; I'm very sad now...

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
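A minimal standalone sketch, not the reporter's actual spring-kafka configuration, of the send-then-flush pattern described in point 3; the broker address and topic name are illustrative. flush() blocks until every buffered record has completed, so calling it after each send turns every record into its own ProduceRequest and defeats batching.
{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SendThenFlushSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // illustrative broker address
        props.put("acks", "1");                              // matches point 5: producers use acks=1
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 100; i++) {
                producer.send(new ProducerRecord<>("demo-topic", Integer.toString(i), "payload-" + i));
                // What autoFlush=true effectively does: wait for all in-flight records
                // before the next send, so batching is lost and throughput drops.
                producer.flush();
            }
        }
    }
}
{code}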