Takashi Sakai created KAFKA-15413:
-------------------------------------
Summary: kafka-server-stop fails with COLUMNS environment variable
on Ubuntu
Key: KAFKA-15413
URL: https://issues.apache.org/jira/browse/KAFKA-15413
Project: Kafka
Issue Type: Bug
Components: tools
Environment: kafka: 3.5.1
Java: openjdk version "20.0.1" 2023-04-18
OS: Ubuntu 22.04.3 LTS on WSL2/Windows 11
Reporter: Takashi Sakai
The {{kafka-server-stop}} script does not work if the environment variable {{COLUMNS}}
is set on Ubuntu.
*Steps to reproduce*:
kafka/zookeeper.properties
{noformat}
dataDir=/tmp/kafka-test-20230828-15217-1lop1tk/zookeeper
clientPort=34461
maxClientCnxns=0
admin.enableServer=false
{noformat}
kafka/server.properties
{noformat}
broker.id=0
listeners=PLAINTEXT://:46161
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-test-20230828-15217-1lop1tk/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:34461
zookeeper.connection.timeout.ms=18000
group.initial.rebalance.delay.ms=0
{noformat}
{noformat}
$ zookeeper-server-start kafka/zookeeper.properties >/dev/null 2>&1 &
[1] 18593
$ kafka-server-start kafka/server.properties >/dev/null 2>&1 &
[2] 18982
$ COLUMNS=10 kafka-server-stop # This is unexpected
No kafka server to stop
$ kafka-server-stop
$ zookeeper-server-stop
[2]+ Exit 143 kafka-server-start kafka/server.properties
$
[1]+ Exit 143 zookeeper-server-start kafka/zookeeper.properties
{noformat}
In the third command, I set the {{COLUMNS}} environment variable. This caused
the {{kafka-server-stop}} script to fail to find the Kafka process.
*Cause*
The {{kafka-server-stop}} script uses {{ps ax}} to find the Kafka process.
{noformat}
OSNAME=$(uname -s)
if [[ "$OSNAME" == "OS/390" ]]; then
(snip)
elif [[ "$OSNAME" == "OS400" ]]; then
(snip)
else
PIDS=$(ps ax | grep ' kafka\.Kafka ' | grep java | grep -v grep | awk '{print $1}')
fi
{noformat}
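On Ubuntu, {{ps ax}} truncates its output if the environment variable {{COLUMNS}}
is set.
(The [source code of the ps command|https://gitlab.com/procps-ng/procps/-/blob/675246119df143a5f8ced6e3313edac6ccc3e222/src/ps/global.c#L226-L230]
shows that the {{COLUMNS}} environment variable takes precedence over the result of {{isatty}}.)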
{noformat}
$ ps ax | cat
19912 pts/0 Sl 0:03
/home/linuxbrew/.linuxbrew/opt/openjdk/libexec/bin/java -Xmx1G -Xms1G -server
-XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35
-XX:+ExplicitGCInvokesConcurrent -XX:MaxInlineLevel=15 -Djava.awt.headless=true
-Xlog:gc*:file=/home/linuxbrew/.linuxbrew/Cellar/kafka/3.5.1/libexec/bin/../logs/kafkaServer-gc.log:time,tags:filecount=10,filesize=100M
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Dkafka.logs.dir=/home/linuxbrew/.linuxbrew/Cellar/kafka/3.5.1/libexec/bin/../logs
-Dlog4j.configuration=file:/home/linuxbrew/.linuxbrew/Cellar/kafka/3.5.1/libexec/bin/../config/log4j.properties
-cp
/home/linuxbrew/.linuxbrew/Cellar/kafka/3.5.1/libexec/bin/../libs/activation-1.1.1.jar:(snip):/home/linuxbrew/.linuxbrew/Cellar/kafka/3.5.1/libexec/bin/../libs/zstd-jni-1.5.5-1.jar
kafka.Kafka kafka/server.properties
$ COLUMNS=10 ps ax | cat
19912 pts/0 Sl 0:05 /home/linux
{noformat}
I tested this on WSL2 on Windows with OpenJDK installed via Homebrew, but it
should occur in any environment with {{procps-ng}}.
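One possible workaround, sketched here only for illustration, is to remove {{COLUMNS}} from the environment of {{ps}} before searching its output. The helper names {{without_columns}} and {{find_kafka_pids}} are hypothetical and are not part of the Kafka scripts. The sketch also assumes that {{env}} supports the {{-u}} option, which is the case for GNU coreutils but is not required by POSIX.

```shell
# Hypothetical helper (not part of Kafka): run a command with COLUMNS
# removed from its environment. Assumes `env -u` is available.
without_columns() {
  env -u COLUMNS "$@"
}

# Hypothetical variant of the PID lookup quoted above. The only change is
# that ps no longer sees COLUMNS; the grep/awk pipeline is unchanged.
find_kafka_pids() {
  without_columns ps ax | grep ' kafka\.Kafka ' | grep java | grep -v grep | awk '{print $1}'
}
```

With these definitions, {{COLUMNS=10 find_kafka_pids}} would run {{ps ax}} without the variable, so the output should match what {{ps ax | cat}} prints in the example above.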
*Problem*
This caused a CI failure in the Homebrew project
(GitHub/Homebrew/homebrew-core#133887).
Homebrew's behavior of passing the {{COLUMNS}} environment variable looks like a bug.
However, the {{server-stop}} script should not be affected by such an
environment variable, so I consider this a bug in the script as well.
*Related issues*
This problem, KAFKA-4931, and KAFKA-4110 could all be fixed by introducing a
process ID file. However, the three problems have different causes and can be
considered separately.
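As a rough illustration of the process ID file idea, the start side could record the PID of the launched process and the stop side could read it back instead of parsing {{ps}} output. Everything in this sketch is an assumption: the function names, the {{PIDFILE}} variable, and its default path are hypothetical and do not exist in the Kafka scripts.

```shell
# Hypothetical PID-file helpers; the names and the default path are
# illustrative only and are not taken from Kafka.
PIDFILE="${PIDFILE:-/tmp/kafka-example.pid}"

# Start the given command in the background and record its PID.
start_with_pidfile() {
  "$@" &
  echo "$!" > "$PIDFILE"
}

# Send SIGTERM to the recorded PID, without relying on ps output.
stop_with_pidfile() {
  if [ ! -s "$PIDFILE" ]; then
    echo "No kafka server to stop"
    return 1
  fi
  pid=$(cat "$PIDFILE")
  # A stale PID makes kill fail; the file is removed either way.
  kill -s TERM "$pid" 2>/dev/null
  rm -f "$PIDFILE"
}
```

Because the stop side never calls {{ps}}, its result would not depend on {{COLUMNS}} or on how long the command line is. A real implementation would also need to handle stale PID files and PID reuse, which this sketch does not attempt.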