On 12/6/18 2:28 AM, Ross Black wrote:
Hi,
When running Kafka servers on Linux, the zookeeper sessions have been
expiring frequently.
After lots of investigation, I discovered some threads (not the
zookeeper threads)
taking a very long time to get to the safepoint (frequently > 2 seconds)
A quick google shows that Kafka is using writeable mmaps. By itself this
is already terrible if you have a large amount of data to write, but in
combination with the JVM it is horrendous. An innocent memory access can
cause the thread to be blocked on I/O, and since the dirtying rate can
be much higher that cleaning rate, you can end up with the thread
blocked for large amounts of time. The JVM doesn't know anything about
it, so as far as it knows the thread is at a non-safe point.
If Kafka has a setting that disables mmap, I recommend trying it. The
thread will still block, but at least it won't prevent Java from
reaching a safepoint.
The output from a pause looks like:
|
[2018-12-05T22:54:58.034+0000][safepoint]Thread:0x00007f23840d1800[0x17e4]State:_running
_has_called_back 0_at_poll_safepoint 0
[2018-12-05T22:54:58.034+0000][safepoint]Thread:0x00007f23840d1800[0x17e4]State:_running
_has_called_back 0_at_poll_safepoint 0
[2018-12-05T22:54:58.034+0000][safepoint]Thread:0x00007f23840d1800[0x17e4]State:_running
_has_called_back 0_at_poll_safepoint 0
...(repeated continually forapprox 2seconds)
[2018-12-05T22:55:00.548+0000][safepoint]Thread:0x00007f23840d1800[0x17e4]State:_running
_has_called_back 0_at_poll_safepoint 0
[2018-12-05T22:55:00.549+0000][safepoint]Thread:0x00007f23840d1800[0x17e4]State:_running
_has_called_back 0_at_poll_safepoint 0
[2018-12-05T22:55:00.550+0000][safepoint]Thread:0x00007f23840d1800[0x17e4]State:_running
_has_called_back 0_at_poll_safepoint 0
[2018-12-05T22:55:00.551+0000][safepoint]...found polling page
returnexception at pc =0x00007f3397c7deff,stub =0x00007f338fb4c900
[2018-12-05T22:55:00.551+0000][safepoint]Thread:0x00007f23840d1800[0x17e4]State:_call_back
_has_called_back 0_at_poll_safepoint 0
[2018-12-05T22:55:00.551+0000][safepoint]Waitingfor3thread(s)to block
[2018-12-05T22:55:00.551+0000][safepoint]Enteringsafepoint
region:G1IncCollectionPause
[2018-12-05T22:55:00.566+0000][safepoint]Leavingsafepoint region
[2018-12-05T22:55:00.566+0000][safepoint]Totaltime forwhich
application threads were stopped:2.5331635seconds,Stoppingthreads
took:2.5180973seconds
|
=> BOOT @ 2018-12-05T03:12:26+00:00, epoch 1543979546
=> 2018-12-05T22:55:00+00:00: epoch 1544050500, boot offset 70954
thread 0x17e4 = 6116
Running 'perf' to sample threads:
sudo perf record -F 99 --proc-map-timeout 5000 --sample-cpu
--switch-events
sudo perf script --tid 6116 > thread-6116.txt
|
ReplicaFetcherT6116[007]70951.936265:47425865cycles:ppp:7f338fc9ad8aregister_finalizer
Runtime1stub (/tmp/perf-3239.map)
ReplicaFetcherT6116[007]70951.987993:61736469cycles:ppp:7f33977d3165Ljava/util/ArrayList;::add
(/tmp/perf-3239.map)
/**big time gap here/
ReplicaFetcherT6116[007]70954.862422:70423359cycles:ppp:7f3397b26fe0Ljava/util/LinkedHashMap;::afterNodeInsertion
(/tmp/perf-3239.map)
ReplicaFetcherT6116[007]70954.955571:58601064cycles:ppp:7f3397782118Ljava/lang/String;::charAt
(/tmp/perf-3239.map)
ReplicaFetcherT6116[007]70955.010857:56837522cycles:ppp:7f33977347b6Ljava/util/concurrent/ConcurrentHashMap;::tabAt
(/tmp/perf-3239.map)
|
sudo perf script --cpu 7 > cpu-7.txt
|
ReplicaFetcherT6116[007]70951.864747:52073629cycles:ppp:7f3397f784f4Lorg/apache/kafka/common/TopicPartition;::equals
(/tmp/perf-3239.map)
swapper
0[007]70951.889555:52939697cycles:ppp: ffffffff9f39cca9 cpu_idle_poll
([kernel.kallsyms])
swapper
0[007]70951.914775:45114901cycles:ppp: ffffffff9f39ccab cpu_idle_poll
([kernel.kallsyms])
ReplicaFetcherT6116[007]70951.936265:47425865cycles:ppp:7f338fc9ad8aregister_finalizer
Runtime1stub (/tmp/perf-3239.map)
swapper
0[007]70951.958858:61162448cycles:ppp: ffffffff9f39ccab cpu_idle_poll
([kernel.kallsyms])
ReplicaFetcherT6116[007]70951.987993:61736469cycles:ppp:7f33977d3165Ljava/util/ArrayList;::add
(/tmp/perf-3239.map)
swapper
0[007]70952.017399:66816847cycles:ppp: ffffffff9f39ccb4 cpu_idle_poll
([kernel.kallsyms])
swapper
0[007]70952.049231:63438867cycles:ppp: ffffffff9f39cca9 cpu_idle_poll
([kernel.kallsyms])
/...(repeated continually forapprox 2seconds)/
swapper
0[007]70954.228216:201536401cycles:ppp: ffffffff9f39cca9 cpu_idle_poll
([kernel.kallsyms])
swapper
0[007]70954.324191:216277840cycles:ppp: ffffffff9f39ccab cpu_idle_poll
([kernel.kallsyms])
swapper
0[007]70954.427185:227564188cycles:ppp: ffffffff9f39cca9 cpu_idle_poll
([kernel.kallsyms])
swapper
0[007]70954.535554:236205951cycles:ppp: ffffffff9f39cca9 cpu_idle_poll
([kernel.kallsyms])
swapper
0[007]70954.648051:137191740cycles:ppp: ffffffff9f39ccab cpu_idle_poll
([kernel.kallsyms])
swapper
0[007]70954.713384:159810740cycles:ppp: ffffffff9f39cca9 cpu_idle_poll
([kernel.kallsyms])
swapper
0[007]70954.789490:153089075cycles:ppp: ffffffff9f39ccab cpu_idle_poll
([kernel.kallsyms])
ReplicaFetcherT6116[007]70954.862422:70423359cycles:ppp:7f3397b26fe0Ljava/util/LinkedHashMap;::afterNodeInsertion
(/tmp/perf-3239.map)
swapper
0[007]70954.895969:65525177cycles:ppp: ffffffff9f39cca9 cpu_idle_poll
([kernel.kallsyms])
swapper
0[007]70954.927183:59590550cycles:ppp: ffffffff9f39ccab cpu_idle_poll
([kernel.kallsyms])
ReplicaFetcherT6116[007]70954.955571:58601064cycles:ppp:7f3397782118Ljava/lang/String;::charAt
(/tmp/perf-3239.map)
swapper
0[007]70954.983485:57455478cycles:ppp: ffffffff9f39ccab cpu_idle_poll
([kernel.kallsyms])
ReplicaFetcherT6116[007]70955.010857:56837522cycles:ppp:7f33977347b6Ljava/util/concurrent/ConcurrentHashMap;::tabAt
(/tmp/perf-3239.map)
|
It looks like the thread (and CPU?) has been descheduled for approx 2
seconds!
I am struggling to find what is causing this behaviour.
It is occurring on 3 Kafka servers running a moderate load (all the
same configuration).
No other applications are running on the machine (except for
fail2ban-server, dnsmasq, node_exporter)
I have made numerous changes to the JDK, OS, and BIOS settings, but
none of them have solved the problem.
The problem occurred on multiple versions of JDK 8 - I updated to JDK
11 just to get better safepoint diagnostics.
I have been hunting this problem for weeks now, and was hoping
somebody could provide some help
(suggestions, tools, diagnostic params, tracing tips, anything!)
Thanks,
Ross
---
The current state of the system is:
Running Ubuntu JDK 11 + debug symbols:
openjdk version "10.0.2" 2018-07-17
OpenJDK Runtime Environment (build 10.0.2+13-Ubuntu-1ubuntu0.18.04.3)
OpenJDK 64-Bit Server VM (build 10.0.2+13-Ubuntu-1ubuntu0.18.04.3,
mixed mode)
|
/usr/lib/jvm/java-11-openjdk-amd64/bin/java -Xmx4G-Xms4G\
-server -XX:+UseG1GC-XX:MaxGCPauseMillis=20\
-XX:InitiatingHeapOccupancyPercent=35\
-XX:+ExplicitGCInvokesConcurrent\
-Djava.awt.headless=true\
-XX:+PreserveFramePointer\
-XX:+UnlockDiagnosticVMOptions\
-XX:+DebugNonSafepoints\
-Xlog:safepoint=trace:safepoint.log:utctime,tags:filecount=20,filesize=102400000\
-XX:+UseCountedLoopSafepoints\
-XX:LoopStripMiningIter=10\
-XX:-Inline\
...
|
Description: Ubuntu 18.04.1 LTS
Release: 18.04
BOOT_IMAGE=/vmlinuz-4.15.0-36-generic intel_idle.max_cstate=0
processor.max_cstate=0 idle=poll nomodeset net.ifnames=0
Motherboard: SuperMicro X10DRU-i+_R1.02b
Backplane: SuperMicro BPN-SAS-815TQ
Processor: 2.1GHz Intel Xeon-Broadwell (E5-2620-V4)
Firmware Version 3.1 6-8-2018
I have disabled everything power related that I could find in BIOS
(following
https://www.thomas-krenn.com/en/wiki/Disable_CPU_Power_Saving_Management_in_BIOS)
$ cat /sys/module/intel_idle/parameters/max_cstate
0
$ cat /sys/devices/system/cpu/cpuidle/current_driver
none
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance
--
You received this message because you are subscribed to the Google
Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to [email protected]
<mailto:[email protected]>.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.