Hi Ross,

Do your servers have plenty of free memory that the kernel can use for the 
filesystem cache? Ideally, all data that might be requested from Kafka will 
be in the cache, otherwise accessing the data will cause a page fault to 
disk.

I'm not sure on your use-case, but for my project, we've done some work to 
make our usage of Kafka a bit more sympathetic to the system. We are in 
control of both the producers and the consumers, so our consumers publish 
their presence into ZooKeeper so that the producers for corresponding 
topics can find them. The producers then use their consumers' current 
offsets (position within the Kafka topic (queue)) to determine if they 
should slow down or pause and wait for the consumers to catch up. This back 
pressure prevents widening the gap between producers/consumers, which would 
require more filesystem cache space in order to keep requested data in 
memory.

Hope that helps,
Meg


On Wednesday, December 5, 2018 at 7:28:11 PM UTC-5, Ross Black wrote:
>
> Hi,
>
> When running Kafka servers on Linux, the zookeeper sessions have been 
> expiring frequently.
> After lots of investigation, I discovered some threads (not the zookeeper 
> threads)
> taking a very long time to get to the safepoint  (frequently > 2 seconds)
>
> The output from a pause looks like:
>
>
> [2018-12-05T22:54:58.034+0000][safepoint] Thread: 0x00007f23840d1800  [
> 0x17e4] State: _running _has_called_back 0 _at_poll_safepoint 0
> [2018-12-05T22:54:58.034+0000][safepoint] Thread: 0x00007f23840d1800  [
> 0x17e4] State: _running _has_called_back 0 _at_poll_safepoint 0
> [2018-12-05T22:54:58.034+0000][safepoint] Thread: 0x00007f23840d1800  [
> 0x17e4] State: _running _has_called_back 0 _at_poll_safepoint 0
> ... (repeated continually for approx 2 seconds)
>
> [2018-12-05T22:55:00.548+0000][safepoint] Thread: 0x00007f23840d1800  [
> 0x17e4] State: _running _has_called_back 0 _at_poll_safepoint 0
> [2018-12-05T22:55:00.549+0000][safepoint] Thread: 0x00007f23840d1800  [
> 0x17e4] State: _running _has_called_back 0 _at_poll_safepoint 0
> [2018-12-05T22:55:00.550+0000][safepoint] Thread: 0x00007f23840d1800  [
> 0x17e4] State: _running _has_called_back 0 _at_poll_safepoint 0
> [2018-12-05T22:55:00.551+0000][safepoint] ... found polling page return 
> exception at pc = 0x00007f3397c7deff, stub =0x00007f338fb4c900
> [2018-12-05T22:55:00.551+0000][safepoint] Thread: 0x00007f23840d1800  [
> 0x17e4] State: _call_back _has_called_back 0 _at_poll_safepoint 0
> [2018-12-05T22:55:00.551+0000][safepoint] Waiting for 3 thread(s) to block
> [2018-12-05T22:55:00.551+0000][safepoint] Entering safepoint region: 
> G1IncCollectionPause
> [2018-12-05T22:55:00.566+0000][safepoint] Leaving safepoint region
> [2018-12-05T22:55:00.566+0000][safepoint] Total time for which 
> application threads were stopped: 2.5331635 seconds, Stopping threads took
> : 2.5180973 seconds
> => BOOT @ 2018-12-05T03:12:26+00:00, epoch 1543979546
> => 2018-12-05T22:55:00+00:00: epoch 1544050500, boot offset 70954
>
> thread 0x17e4 = 6116
>
> Running 'perf' to sample threads:
>   sudo perf record -F 99 --proc-map-timeout 5000 --sample-cpu 
> --switch-events
>
>   sudo perf script --tid 6116 > thread-6116.txt
>
>  ReplicaFetcherT  6116 [007] 70951.936265:   47425865 cycles:ppp:      
> 7f338fc9ad8a register_finalizer Runtime1 stub (/tmp/perf-3239.map)
>  ReplicaFetcherT  6116 [007] 70951.987993:   61736469 cycles:ppp:      
> 7f33977d3165 Ljava/util/ArrayList;::add (/tmp/perf-3239.map)
> *** big time gap here*
>  ReplicaFetcherT  6116 [007] 70954.862422:   70423359 cycles:ppp:      
> 7f3397b26fe0 Ljava/util/LinkedHashMap;::afterNodeInsertion (/tmp/perf-
> 3239.map)
>  ReplicaFetcherT  6116 [007] 70954.955571:   58601064 cycles:ppp:      
> 7f3397782118 Ljava/lang/String;::charAt (/tmp/perf-3239.map)
>  ReplicaFetcherT  6116 [007] 70955.010857:   56837522 cycles:ppp:      
> 7f33977347b6 Ljava/util/concurrent/ConcurrentHashMap;::tabAt (/tmp/perf-
> 3239.map)
>
>
>
>   sudo perf script --cpu 7 > cpu-7.txt
>
>  ReplicaFetcherT  6116 [007] 70951.864747:   52073629 cycles:ppp:      
> 7f3397f784f4 Lorg/apache/kafka/common/TopicPartition;::equals (/tmp/perf-
> 3239.map)
>          swapper     0 [007] 70951.889555:   52939697 cycles:ppp: 
>  ffffffff9f39cca9 cpu_idle_poll ([kernel.kallsyms])
>          swapper     0 [007] 70951.914775:   45114901 cycles:ppp: 
>  ffffffff9f39ccab cpu_idle_poll ([kernel.kallsyms])
>  ReplicaFetcherT  6116 [007] 70951.936265:   47425865 cycles:ppp:      
> 7f338fc9ad8a register_finalizer Runtime1 stub (/tmp/perf-3239.map)
>          swapper     0 [007] 70951.958858:   61162448 cycles:ppp: 
>  ffffffff9f39ccab cpu_idle_poll ([kernel.kallsyms])
>  ReplicaFetcherT  6116 [007] 70951.987993:   61736469 cycles:ppp:      
> 7f33977d3165 Ljava/util/ArrayList;::add (/tmp/perf-3239.map)
>          swapper     0 [007] 70952.017399:   66816847 cycles:ppp: 
>  ffffffff9f39ccb4 cpu_idle_poll ([kernel.kallsyms])
>          swapper     0 [007] 70952.049231:   63438867 cycles:ppp: 
>  ffffffff9f39cca9 cpu_idle_poll ([kernel.kallsyms])
> *... (repeated continually for approx 2 seconds)*
>
>          swapper     0 [007] 70954.228216:  201536401 cycles:ppp: 
>  ffffffff9f39cca9 cpu_idle_poll ([kernel.kallsyms])
>          swapper     0 [007] 70954.324191:  216277840 cycles:ppp: 
>  ffffffff9f39ccab cpu_idle_poll ([kernel.kallsyms])
>          swapper     0 [007] 70954.427185:  227564188 cycles:ppp: 
>  ffffffff9f39cca9 cpu_idle_poll ([kernel.kallsyms])
>          swapper     0 [007] 70954.535554:  236205951 cycles:ppp: 
>  ffffffff9f39cca9 cpu_idle_poll ([kernel.kallsyms])
>          swapper     0 [007] 70954.648051:  137191740 cycles:ppp: 
>  ffffffff9f39ccab cpu_idle_poll ([kernel.kallsyms])
>          swapper     0 [007] 70954.713384:  159810740 cycles:ppp: 
>  ffffffff9f39cca9 cpu_idle_poll ([kernel.kallsyms])
>          swapper     0 [007] 70954.789490:  153089075 cycles:ppp: 
>  ffffffff9f39ccab cpu_idle_poll ([kernel.kallsyms])
>  ReplicaFetcherT  6116 [007] 70954.862422:   70423359 cycles:ppp:      
> 7f3397b26fe0 Ljava/util/LinkedHashMap;::afterNodeInsertion (/tmp/perf-
> 3239.map)
>          swapper     0 [007] 70954.895969:   65525177 cycles:ppp: 
>  ffffffff9f39cca9 cpu_idle_poll ([kernel.kallsyms])
>          swapper     0 [007] 70954.927183:   59590550 cycles:ppp: 
>  ffffffff9f39ccab cpu_idle_poll ([kernel.kallsyms])
>  ReplicaFetcherT  6116 [007] 70954.955571:   58601064 cycles:ppp:      
> 7f3397782118 Ljava/lang/String;::charAt (/tmp/perf-3239.map)
>          swapper     0 [007] 70954.983485:   57455478 cycles:ppp: 
>  ffffffff9f39ccab cpu_idle_poll ([kernel.kallsyms])
>  ReplicaFetcherT  6116 [007] 70955.010857:   56837522 cycles:ppp:      
> 7f33977347b6 Ljava/util/concurrent/ConcurrentHashMap;::tabAt (/tmp/perf-
> 3239.map)
>
>
>
>
> It looks like the thread (and CPU?) has been descheduled for approx 2 
> seconds!
>
> I am struggling to find what is causing this behaviour.
> It is occurring on 3 Kafka servers running a moderate load (all the same 
> configuration).
> No other applications are running on the machine (except for 
> fail2ban-server, dnsmasq, node_exporter)
>
> I have made numerous changes to the JDK, OS, and BIOS settings, but none 
> of them have solved the problem.
> The problem occurred on multiple versions of JDK 8  - I updated to JDK 11 
> just to get better safepoint diagnostics.
> I have been hunting this problem for weeks now, and was hoping somebody 
> could provide some help
> (suggestions, tools, diagnostic params, tracing tips, anything!)
>
> Thanks,
> Ross
>
> ---
> The current state of the system is:
>
> Running Ubuntu JDK 11 + debug symbols:
>   openjdk version "10.0.2" 2018-07-17
>   OpenJDK Runtime Environment (build 10.0.2+13-Ubuntu-1ubuntu0.18.04.3)
>   OpenJDK 64-Bit Server VM (build 10.0.2+13-Ubuntu-1ubuntu0.18.04.3, mixed 
> mode)
>
> /usr/lib/jvm/java-11-openjdk-amd64/bin/java -Xmx4G -Xms4G \
>   -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 \
>   -XX:InitiatingHeapOccupancyPercent=35 \
>   -XX:+ExplicitGCInvokesConcurrent \
>   -Djava.awt.headless=true \
>   -XX:+PreserveFramePointer \
>   -XX:+UnlockDiagnosticVMOptions \
>   -XX:+DebugNonSafepoints \
>   -Xlog:safepoint=trace:safepoint.log:utctime,tags:filecount=20,filesize=
> 102400000 \
>   -XX:+UseCountedLoopSafepoints \
>   -XX:LoopStripMiningIter=10 \
>   -XX:-Inline \
>   ...
>
>
> Description:    Ubuntu 18.04.1 LTS
> Release:    18.04
>
> BOOT_IMAGE=/vmlinuz-4.15.0-36-generic intel_idle.max_cstate=0 
> processor.max_cstate=0 idle=poll nomodeset net.ifnames=0
>
> Motherboard: SuperMicro X10DRU-i+_R1.02b
> Backplane: SuperMicro BPN-SAS-815TQ
> Processor: 2.1GHz Intel Xeon-Broadwell (E5-2620-V4)
> Firmware Version 3.1 6-8-2018
>
> I have disabled everything power related that I could find in BIOS
> (following 
> https://www.thomas-krenn.com/en/wiki/Disable_CPU_Power_Saving_Management_in_BIOS
> )
>
> $ cat /sys/module/intel_idle/parameters/max_cstate
> 0
> $ cat /sys/devices/system/cpu/cpuidle/current_driver
> none
> $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> performance
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to