I've run into a strange problem on bash 4.4.18 and 4.4.23. In an embedded
environment, scripts like the ones below intermittently misbehave. The
process tree looks like this:
11817 ?  S  0:00 /bin/bash ./grand_father.sh
13939 ?  S  0:00  \_ /bin/bash ./father.sh
22168 ?  S  1:14      \_ /bin/bash ./son_1.sh
27973 ?  S  0:00      |   \_ sleep 60
22549 ?  S  0:00      \_ /bin/bash ./son_2.sh
 8378 ?  S  0:00      |   \_ sleep 1200
22559 ?  S  2:21      \_ /bin/bash ./son_3.sh
The problem lies in a piece of code in son_3.sh (the file names and code
have been anonymized, so the file and variable names will look odd):
some_check ()
{
    local check_file="$1"
    cd $WORK_PATH
    local list_tmp=""
    rm -r $WORK_PATH/fail_exist.flg
    while read list_tmp || [ ! -z $list_tmp ]
    do
        local dir=$(echo $list_tmp | cut -d ' ' -f 1)
        local absolutepath=$WORK_PATH"/"$dir
        cd $absolutepath
        local some_num=$(cat /root/xx/yy/zz.ini | wc -l)
        local k
        for((k=1;k<=$some_num;k++))
        do
            local sn=$(cat /root/xx/yy/zz.ini | sed -n ''$k'p' |
                awk -F "SN:" '{print $2}' | awk -F ";" '{print $1}')
            result_file="${sn}_result.txt"
            if [ ! -f "${result_file}" ];then
                result_file="result.txt"
            fi
            item_file="${sn}_item.txt"
            if [ ! -f "$item_file" ];then
                item_file="item.txt"
            fi
            local RESULT=$(cat ${result_file})
            if [ "$RESULT" = "pass" ]; then
                if [ -s ${item_file} ]; then
                    cat ${item_file} | grep ^"\[" | grep "\]" |
                        awk -F "] " '{print $2}' | awk -F " " '{print $1}' |
                        tr "[A-Z]" "[a-z]" | grep -qwv "pass"
                    if [ $? -eq 0 ];then
                        echo "fail" >${result_file}
                    fi
                fi
            fi
            cat ${result_file} | grep -i "fail" >/dev/null 2>&1
            if [[ $? -eq 0 ]]; then
                echo "${sn} fail" >> $WORK_PATH/fail_exist.flg
            fi
        done
        cd $WORK_PATH
    done <$check_file
    if [ -s $WORK_PATH/fail_exist.flg ] ;then
        cat $WORK_PATH/fail_exist.flg | sort | uniq > temp_fail.txt
        cp -f temp_fail.txt $WORK_PATH/fail_exist.flg
    fi
    return 0
}

while true ; do
    some_check
done
On repeated runs, this fragment misbehaves intermittently:

    cat ${item_file} | grep ^"\[" | grep "\]" | awk -F "] " '{print $2}' |
        awk -F " " '{print $1}' | tr "[A-Z]" "[a-z]" | grep -qwv "pass"
    if [ $? -eq 0 ];then
        echo "fail" >${result_file}
    fi

The command's real exit code is 1, yet execution enters the
[ $? -eq 0 ] branch, which means $? has been unexpectedly changed to 0.
I verified the exit code of 1 by tracing the processes with strace.
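
As a complement to strace, the statuses could also be recorded from inside
the script. The following is only a sketch under assumptions: check_items
is a hypothetical helper that is not part of son_3.sh, and the pipeline is
a lightly cleaned-up copy of the one above. It saves $? and PIPESTATUS in
the same assignment-only command right after the pipeline, so the value
bash puts in $? can be compared with what the final grep really returned.

```shell
#!/bin/bash
# Hypothetical diagnostic helper, not part of son_3.sh.
check_items() {
    local item_file="$1" rc stages
    cat "$item_file" | grep '^\[' | grep ']' | awk -F '] ' '{print $2}' |
        awk '{print $1}' | tr 'A-Z' 'a-z' | grep -qwv 'pass'
    # Save both values in one assignment-only command, before any other
    # command runs and overwrites them.
    rc=$? stages=("${PIPESTATUS[@]}")
    echo "rc=$rc stages=${stages[*]}"
    # The pipeline has 7 stages, so index 6 is the final grep.
    if [ "$rc" -ne "${stages[6]}" ]; then
        echo "MISMATCH: \$? was $rc but grep exited ${stages[6]}" >&2
    fi
    return "${stages[6]}"
}
```

If a MISMATCH line ever appears on an affected version, that would point at
the shell's status bookkeeping rather than at the commands in the pipeline.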
Is this a known bug in bash 4.4? The problem reproduces intermittently
with bash 4.4.18 and 4.4.23, but after switching to bash 5.0.0 or 5.3.9
it no longer occurs.
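
To compare versions outside the full script tree, a stand-alone loop along
these lines might be enough. This is a sketch, not a confirmed reproducer:
the function name, the sample item format ("[name] PASS") and the round
count are all assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical stress loop, not part of son_3.sh. The item-file format
# and the helper name are assumptions.
count_unexpected_zero() {
    local item_file="$1" rounds="$2"
    local i unexpected=0
    for ((i = 0; i < rounds; i++)); do
        cat "$item_file" | grep '^\[' | grep ']' | awk -F '] ' '{print $2}' |
            awk '{print $1}' | tr 'A-Z' 'a-z' | grep -qwv 'pass'
        # Same test as in son_3.sh: with an all-pass file, grep -v selects
        # nothing and exits 1, so this branch should never be taken.
        if [ $? -eq 0 ]; then
            unexpected=$((unexpected + 1))
        fi
    done
    echo "$unexpected"
}

# Usage: build an all-pass sample file and run the pipeline 200 times.
sample=$(mktemp)
printf '[item1] PASS\n[item2] pass\n' > "$sample"
count_unexpected_zero "$sample" 200
rm -f "$sample"
```

A shell without the problem should report no unexpected zero statuses for
an all-pass file; any other count would be worth attaching to the report.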
On bash 4.4.18 and 4.4.23, the problem also stops occurring if the code
is changed to the following form:

    local xx
    xx=$(cat ${item_file} | grep ^"\[" | grep "\]" | awk -F "] " '{print $2}' |
        awk -F " " '{print $1}' | tr "[A-Z]" "[a-z]" | grep -qwv "pass")
    if [ $? -eq 0 ];then
        echo "fail" >${result_file}
    fi

Why is that? What is the difference between the two forms?
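
One difference that can be shown in isolation, and that does not depend on
this bug: when the assignment is written on the same line as `local`, the
$? seen afterwards is the status of `local` itself rather than of the
command substitution. Declaring first and assigning on a separate line, as
above, keeps the real status. The function names below are hypothetical
and exist only for this demonstration.

```shell
#!/bin/bash
# Hypothetical demo functions, not taken from son_3.sh.
status_combined() {
    local out=$(false)   # $? below is the status of `local`, which succeeds
    echo $?
}

status_split() {
    local out
    out=$(false)         # plain assignment: $? is the status of `false`
    echo $?
}

status_combined   # prints 0
status_split      # prints 1
```

So the two spellings are not interchangeable when the next line tests $?.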
王伟 <[email protected]> wrote on Mon, Mar 2, 2026 at 09:56:
> I've encountered a strange problem. On versions of bash4.4.18 and 4.4.23.
> Executing code similar to the following in an embedded environment may
> cause problems.
>
> 11817 ?  S  0:00 /bin/bash ./grand_father.sh
> 13939 ?  S  0:00  \_ /bin/bash ./father.sh
> 22168 ?  S  1:14      \_ /bin/bash ./son_1.sh
> 27973 ?  S  0:00      |   \_ sleep 60
> 22549 ?  S  0:00      \_ /bin/bash ./son_2.sh
>  8378 ?  S  0:00      |   \_ sleep 1200
> 22559 ?  S  2:21      \_ /bin/bash ./son_3.sh
>
> The problem lies in a piece of code in son_3.sh (the file name and code
> have been desensitized, so the file name and variable name will be strange):
>
> some_check ()
> {
>     local check_file="$1"
>     cd $WORK_PATH
>     local list_tmp=""
>     rm -r $WORK_PATH/fail_exist.flg
>     while read list_tmp || [ ! -z $list_tmp ]
>     do
>         local dir=$(echo $list_tmp | cut -d ' ' -f 1)
>         local absolutepath=$WORK_PATH"/"$dir
>
>         cd $absolutepath
>         local some_num=$(cat /root/xx/yy/zz.ini | wc -l)
>
>         local k
>         for((k=1;k<=$some_num;k++))
>         do
>             local sn=$(cat /root/xx/yy/zz.ini | sed -n ''$k'p' |
>                 awk -F "SN:" '{print $2}' | awk -F ";" '{print $1}')
>             result_file="${sn}_result.txt"
>             if [ ! -f "${result_file}" ];then
>                 result_file="result.txt"
>             fi
>
>             item_file="${sn}_item.txt"
>             if [ ! -f "$item_file" ];then
>                 item_file="item.txt"
>             fi
>
>             local RESULT=$(cat ${result_file})
>             if [ "$RESULT" = "pass" ]; then
>                 if [ -s ${item_file} ]; then
>                     cat ${item_file} | grep ^"\[" | grep "\]" |
>                         awk -F "] " '{print $2}' | awk -F " " '{print $1}' |
>                         tr "[A-Z]" "[a-z]" | grep -qwv "pass"
>                     if [ $? -eq 0 ];then
>                         echo "fail" >${result_file}
>                     fi
>                 fi
>             fi
>             cat ${result_file} | grep -i "fail" >/dev/null 2>&1
>             if [[ $? -eq 0 ]]; then
>                 echo "${sn} fail" >> $WORK_PATH/fail_exist.flg
>             fi
>         done
>         cd $WORK_PATH
>
>     done <$check_file
>     if [ -s $WORK_PATH/fail_exist.flg ] ;then
>         cat $WORK_PATH/fail_exist.flg | sort | uniq > temp_fail.txt
>         cp -f temp_fail.txt $WORK_PATH/fail_exist.flg
>     fi
>     return 0
> }
>
> while true ; do
>     some_check
> done
>
> On repeated runs, the code here
>
>     cat ${item_file} | grep ^"\[" | grep "\]" | awk -F "] " '{print $2}' |
>         awk -F " " '{print $1}' | tr "[A-Z]" "[a-z]" | grep -qwv "pass"
>     if [ $? -eq 0 ];then
>         echo "fail" >${result_file}
>     fi
>
> There will be probabilistic problems. The original exit code of the
> command is 1, but it does enter the branch of [ $? -eq 0 ], which means *$?
> is abnormally changed to 0*.
>
> The exit code of the command is 1, which has been strictly traced using
> the strace tool.
>
> Is this a known BUG of bash4.4? Because the problem will reappear with
> probability when using bash4.4.18 or 4.4.23. But if you switch to bash5.0.0
> or bash5.3.9, the problem will no longer recur.
>
> Then on bash version 4.4.18 or 4.4.23, if the code is changed to the
> following form, the problem will not recur.
>
>     local xxxx=$(cat ${item_file} | grep ^"\[" | grep "\]" | awk -F "] " '{print $2}' |
>         awk -F " " '{print $1}' | tr "[A-Z]" "[a-z]" | grep -qwv "pass")
>     if [ $? -eq 0 ];then
>         echo "fail" >${result_file}
>     fi
>
> Why is this? Is there any difference?
>