Hi Ross, Do your servers have plenty of free memory that the kernel can use for the filesystem cache? Ideally, all data that might be requested from Kafka will be in the cache, otherwise accessing the data will cause a page fault to disk.
I'm not sure on your use-case, but for my project, we've done some work to make our usage of Kafka a bit more sympathetic to the system. We are in control of both the producers and the consumers, so our consumers publish their presence into ZooKeeper so that the producers for corresponding topics can find them. The producers then use their consumers' current offsets (position within the Kafka topic (queue)) to determine if they should slow down or pause and wait for the consumers to catch up. This back pressure prevents widening the gap between producers/consumers, which would require more filesystem cache space in order to keep requested data in memory. Hope that helps, Meg On Wednesday, December 5, 2018 at 7:28:11 PM UTC-5, Ross Black wrote: > > Hi, > > When running Kafka servers on Linux, the zookeeper sessions have been > expiring frequently. > After lots of investigation, I discovered some threads (not the zookeeper > threads) > taking a very long time to get to the safepoint (frequently > 2 seconds) > > The output from a pause looks like: > > > [2018-12-05T22:54:58.034+0000][safepoint] Thread: 0x00007f23840d1800 [ > 0x17e4] State: _running _has_called_back 0 _at_poll_safepoint 0 > [2018-12-05T22:54:58.034+0000][safepoint] Thread: 0x00007f23840d1800 [ > 0x17e4] State: _running _has_called_back 0 _at_poll_safepoint 0 > [2018-12-05T22:54:58.034+0000][safepoint] Thread: 0x00007f23840d1800 [ > 0x17e4] State: _running _has_called_back 0 _at_poll_safepoint 0 > ... (repeated continually for approx 2 seconds) > > [2018-12-05T22:55:00.548+0000][safepoint] Thread: 0x00007f23840d1800 [ > 0x17e4] State: _running _has_called_back 0 _at_poll_safepoint 0 > [2018-12-05T22:55:00.549+0000][safepoint] Thread: 0x00007f23840d1800 [ > 0x17e4] State: _running _has_called_back 0 _at_poll_safepoint 0 > [2018-12-05T22:55:00.550+0000][safepoint] Thread: 0x00007f23840d1800 [ > 0x17e4] State: _running _has_called_back 0 _at_poll_safepoint 0 > [2018-12-05T22:55:00.551+0000][safepoint] ... found polling page return > exception at pc = 0x00007f3397c7deff, stub =0x00007f338fb4c900 > [2018-12-05T22:55:00.551+0000][safepoint] Thread: 0x00007f23840d1800 [ > 0x17e4] State: _call_back _has_called_back 0 _at_poll_safepoint 0 > [2018-12-05T22:55:00.551+0000][safepoint] Waiting for 3 thread(s) to block > [2018-12-05T22:55:00.551+0000][safepoint] Entering safepoint region: > G1IncCollectionPause > [2018-12-05T22:55:00.566+0000][safepoint] Leaving safepoint region > [2018-12-05T22:55:00.566+0000][safepoint] Total time for which > application threads were stopped: 2.5331635 seconds, Stopping threads took > : 2.5180973 seconds > => BOOT @ 2018-12-05T03:12:26+00:00, epoch 1543979546 > => 2018-12-05T22:55:00+00:00: epoch 1544050500, boot offset 70954 > > thread 0x17e4 = 6116 > > Running 'perf' to sample threads: > sudo perf record -F 99 --proc-map-timeout 5000 --sample-cpu > --switch-events > > sudo perf script --tid 6116 > thread-6116.txt > > ReplicaFetcherT 6116 [007] 70951.936265: 47425865 cycles:ppp: > 7f338fc9ad8a register_finalizer Runtime1 stub (/tmp/perf-3239.map) > ReplicaFetcherT 6116 [007] 70951.987993: 61736469 cycles:ppp: > 7f33977d3165 Ljava/util/ArrayList;::add (/tmp/perf-3239.map) > *** big time gap here* > ReplicaFetcherT 6116 [007] 70954.862422: 70423359 cycles:ppp: > 7f3397b26fe0 Ljava/util/LinkedHashMap;::afterNodeInsertion (/tmp/perf- > 3239.map) > ReplicaFetcherT 6116 [007] 70954.955571: 58601064 cycles:ppp: > 7f3397782118 Ljava/lang/String;::charAt (/tmp/perf-3239.map) > ReplicaFetcherT 6116 [007] 70955.010857: 56837522 cycles:ppp: > 7f33977347b6 Ljava/util/concurrent/ConcurrentHashMap;::tabAt (/tmp/perf- > 3239.map) > > > > sudo perf script --cpu 7 > cpu-7.txt > > ReplicaFetcherT 6116 [007] 70951.864747: 52073629 cycles:ppp: > 7f3397f784f4 Lorg/apache/kafka/common/TopicPartition;::equals (/tmp/perf- > 3239.map) > swapper 0 [007] 70951.889555: 52939697 cycles:ppp: > ffffffff9f39cca9 cpu_idle_poll ([kernel.kallsyms]) > swapper 0 [007] 70951.914775: 45114901 cycles:ppp: > ffffffff9f39ccab cpu_idle_poll ([kernel.kallsyms]) > ReplicaFetcherT 6116 [007] 70951.936265: 47425865 cycles:ppp: > 7f338fc9ad8a register_finalizer Runtime1 stub (/tmp/perf-3239.map) > swapper 0 [007] 70951.958858: 61162448 cycles:ppp: > ffffffff9f39ccab cpu_idle_poll ([kernel.kallsyms]) > ReplicaFetcherT 6116 [007] 70951.987993: 61736469 cycles:ppp: > 7f33977d3165 Ljava/util/ArrayList;::add (/tmp/perf-3239.map) > swapper 0 [007] 70952.017399: 66816847 cycles:ppp: > ffffffff9f39ccb4 cpu_idle_poll ([kernel.kallsyms]) > swapper 0 [007] 70952.049231: 63438867 cycles:ppp: > ffffffff9f39cca9 cpu_idle_poll ([kernel.kallsyms]) > *... (repeated continually for approx 2 seconds)* > > swapper 0 [007] 70954.228216: 201536401 cycles:ppp: > ffffffff9f39cca9 cpu_idle_poll ([kernel.kallsyms]) > swapper 0 [007] 70954.324191: 216277840 cycles:ppp: > ffffffff9f39ccab cpu_idle_poll ([kernel.kallsyms]) > swapper 0 [007] 70954.427185: 227564188 cycles:ppp: > ffffffff9f39cca9 cpu_idle_poll ([kernel.kallsyms]) > swapper 0 [007] 70954.535554: 236205951 cycles:ppp: > ffffffff9f39cca9 cpu_idle_poll ([kernel.kallsyms]) > swapper 0 [007] 70954.648051: 137191740 cycles:ppp: > ffffffff9f39ccab cpu_idle_poll ([kernel.kallsyms]) > swapper 0 [007] 70954.713384: 159810740 cycles:ppp: > ffffffff9f39cca9 cpu_idle_poll ([kernel.kallsyms]) > swapper 0 [007] 70954.789490: 153089075 cycles:ppp: > ffffffff9f39ccab cpu_idle_poll ([kernel.kallsyms]) > ReplicaFetcherT 6116 [007] 70954.862422: 70423359 cycles:ppp: > 7f3397b26fe0 Ljava/util/LinkedHashMap;::afterNodeInsertion (/tmp/perf- > 3239.map) > swapper 0 [007] 70954.895969: 65525177 cycles:ppp: > ffffffff9f39cca9 cpu_idle_poll ([kernel.kallsyms]) > swapper 0 [007] 70954.927183: 59590550 cycles:ppp: > ffffffff9f39ccab cpu_idle_poll ([kernel.kallsyms]) > ReplicaFetcherT 6116 [007] 70954.955571: 58601064 cycles:ppp: > 7f3397782118 Ljava/lang/String;::charAt (/tmp/perf-3239.map) > swapper 0 [007] 70954.983485: 57455478 cycles:ppp: > ffffffff9f39ccab cpu_idle_poll ([kernel.kallsyms]) > ReplicaFetcherT 6116 [007] 70955.010857: 56837522 cycles:ppp: > 7f33977347b6 Ljava/util/concurrent/ConcurrentHashMap;::tabAt (/tmp/perf- > 3239.map) > > > > > It looks like the thread (and CPU?) has been descheduled for approx 2 > seconds! > > I am struggling to find what is causing this behaviour. > It is occurring on 3 Kafka servers running a moderate load (all the same > configuration). > No other applications are running on the machine (except for > fail2ban-server, dnsmasq, node_exporter) > > I have made numerous changes to the JDK, OS, and BIOS settings, but none > of them have solved the problem. > The problem occurred on multiple versions of JDK 8 - I updated to JDK 11 > just to get better safepoint diagnostics. > I have been hunting this problem for weeks now, and was hoping somebody > could provide some help > (suggestions, tools, diagnostic params, tracing tips, anything!) > > Thanks, > Ross > > --- > The current state of the system is: > > Running Ubuntu JDK 11 + debug symbols: > openjdk version "10.0.2" 2018-07-17 > OpenJDK Runtime Environment (build 10.0.2+13-Ubuntu-1ubuntu0.18.04.3) > OpenJDK 64-Bit Server VM (build 10.0.2+13-Ubuntu-1ubuntu0.18.04.3, mixed > mode) > > /usr/lib/jvm/java-11-openjdk-amd64/bin/java -Xmx4G -Xms4G \ > -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 \ > -XX:InitiatingHeapOccupancyPercent=35 \ > -XX:+ExplicitGCInvokesConcurrent \ > -Djava.awt.headless=true \ > -XX:+PreserveFramePointer \ > -XX:+UnlockDiagnosticVMOptions \ > -XX:+DebugNonSafepoints \ > -Xlog:safepoint=trace:safepoint.log:utctime,tags:filecount=20,filesize= > 102400000 \ > -XX:+UseCountedLoopSafepoints \ > -XX:LoopStripMiningIter=10 \ > -XX:-Inline \ > ... > > > Description: Ubuntu 18.04.1 LTS > Release: 18.04 > > BOOT_IMAGE=/vmlinuz-4.15.0-36-generic intel_idle.max_cstate=0 > processor.max_cstate=0 idle=poll nomodeset net.ifnames=0 > > Motherboard: SuperMicro X10DRU-i+_R1.02b > Backplane: SuperMicro BPN-SAS-815TQ > Processor: 2.1GHz Intel Xeon-Broadwell (E5-2620-V4) > Firmware Version 3.1 6-8-2018 > > I have disabled everything power related that I could find in BIOS > (following > https://www.thomas-krenn.com/en/wiki/Disable_CPU_Power_Saving_Management_in_BIOS > ) > > $ cat /sys/module/intel_idle/parameters/max_cstate > 0 > $ cat /sys/devices/system/cpu/cpuidle/current_driver > none > $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor > performance > > > -- You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/d/optout.
