David Arthur created KAFKA-18921:
------------------------------------

             Summary: Low CPU utilization during streams integration tests
                 Key: KAFKA-18921
                 URL: https://issues.apache.org/jira/browse/KAFKA-18921
             Project: Kafka
          Issue Type: Improvement
          Components: build, streams
            Reporter: David Arthur
         Attachments: image-2025-03-04-17-05-48-768.png

While analyzing some build performance, I noticed that our CPU is not well 
utilized during the streams integration tests. We should investigate to make 
sure we're not needlessly waiting during the tests.

 

Here is an example from this build scan 
https://develocity.apache.org/s/rqe4e2vh5jr72/timeline?page=1&sort=longest

!image-2025-03-04-17-05-48-768.png!

Notice that the ":streams:integration-tests:test" worker is running longer than 
the other works and we can see during that time that the CPU is not very 
utilized. This suggests that the tests are in some idle/waiting state.

 

I tried to reproduce this locally and found at 40 second pause in a random test.

 

[2025-03-04 16:56:11,606] INFO [Consumer 
clientId=consumer-f2eaa005-b1e1-406f-8c52-e50c71b4d807-58, 
groupId=f2eaa005-b1e1-406f-8c52-e50c71b4d807] Resetting offset for partition 
outputTopic_3-0 to position FetchPosition\{offset=0, 
offsetEpoch=Optional.empty, 
currentLeader=LeaderAndEpoch{leader=Optional[localhost:59515 (id: 0 rack: null 
isFenced: false)], epoch=0}}. 
(org.apache.kafka.clients.consumer.internals.SubscriptionState:447)

    [2025-03-04 16:56:55,214] INFO [Consumer 
clientId=testAutoOffsetId-a426e1fb-61ad-4ce8-92ca-f0fe337c2565-StreamThread-1-consumer,
 groupId=testAutoOffsetId] Successfully joined group with generation 
Generation\{generationId=4, 
memberId='testAutoOffsetId-a426e1fb-61ad-4ce8-92ca-f0fe337c2565-StreamThread-1-consumer-2f798603-d538-4041-a90d-2e1b922dd5af',
 protocol='stream'} 
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:666)

 

I grabbed a thread dump:

 

at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.readRecords(IntegrationTestUtils.java:1177)
        at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.readKeyValues(IntegrationTestUtils.java:1136)
        at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$4(IntegrationTestUtils.java:710)
        at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils$$Lambda$3338/0x00000070018f4fb0.call(Unknown
 Source)
        at 
org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:483)
        at 
org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:451)
        at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:708)
        at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:681)
        at 
org.apache.kafka.streams.integration.FineGrainedAutoResetIntegrationTest.shouldResetByDuration(FineGrainedAutoResetIntegrationTest.java:334)

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to