I've encountered a strange problem with bash 4.4.18 and 4.4.23. In an
embedded environment, scripts running in the following process tree
intermittently misbehave.

11817 ?        S          0:00 /bin/bash ./grand_father.sh
13939 ?        S          0:00  \_ /bin/bash ./father.sh
22168 ?        S          1:14      \_ /bin/bash ./son_1.sh
27973 ?        S          0:00      |   \_ sleep 60
22549 ?        S          0:00      \_ /bin/bash ./son_2.sh
 8378 ?        S          0:00      |   \_ sleep 1200
22559 ?        S          2:21      \_ /bin/bash ./son_3.sh

The problem lies in a piece of code in son_3.sh (the file names and code
have been anonymized, so file and variable names may look strange):

some_check ()
{
    local check_file="$1"
    cd $WORK_PATH
    local list_tmp=""
    rm -r $WORK_PATH/fail_exist.flg
    while read list_tmp || [ ! -z $list_tmp ]
    do
        local dir=$(echo $list_tmp | cut -d ' ' -f 1)
        local absolutepath=$WORK_PATH"/"$dir

        cd $absolutepath
        local some_num=$(cat /root/xx/yy/zz.ini | wc -l)

        local k
        for((k=1;k<=$some_num;k++))
        do
            local sn=$(cat /root/xx/yy/zz.ini | sed -n ''$k'p' |
                awk -F "SN:" '{print $2}' | awk -F ";" '{print $1}')
            result_file="${sn}_result.txt"
            if [ ! -f "${result_file}" ];then
                result_file="result.txt"
            fi

            item_file="${sn}_item.txt"
            if [ ! -f "$item_file" ];then
                item_file="item.txt"
            fi

            local RESULT=$(cat ${result_file})
            if [ "$RESULT" = "pass" ]; then
                if [ -s ${item_file} ]; then
                    cat ${item_file} | grep ^"\[" | grep "\]" |
                        awk -F "] " '{print $2}' | awk -F " " '{print $1}' |
                        tr "[A-Z]" "[a-z]" | grep -qwv "pass"
                    if [ $? -eq 0 ];then
                        echo "fail" >${result_file}
                    fi
                fi
            fi
            cat ${result_file} | grep -i "fail" >/dev/null 2>&1
            if [[ $? -eq 0 ]]; then
                echo "${sn} fail" >> $WORK_PATH/fail_exist.flg
            fi
        done
        cd $WORK_PATH

    done <$check_file
    if [ -s $WORK_PATH/fail_exist.flg ] ;then
        cat $WORK_PATH/fail_exist.flg | sort | uniq > temp_fail.txt
        cp -f temp_fail.txt $WORK_PATH/fail_exist.flg
    fi
    return 0
}

while true ; do
    some_check
done

On repeated runs, the problem shows up in this code:

cat ${item_file} | grep ^"\[" | grep "\]" | awk -F "] " '{print $2}' |
    awk -F " " '{print $1}' | tr "[A-Z]" "[a-z]" | grep -qwv "pass"
if [ $? -eq 0 ];then
    echo "fail" >${result_file}
fi

The failure is intermittent: the pipeline's actual exit status is 1, yet
execution enters the [ $? -eq 0 ] branch, which means $? has somehow been
changed to 0.
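
One way to narrow this down is to save the status right after the pipeline,
together with PIPESTATUS, before any other command runs. The following is
only a minimal sketch: the helper name check_items and the log line on stderr
are assumptions and are not part of the original script.

```bash
#!/bin/bash
# Hypothetical helper (not in the original script): runs the same pipeline,
# records the exit status of every stage, and returns the last stage's status.
check_items ()
{
    local item_file="$1"
    local -a stages
    cat "${item_file}" | grep ^"\[" | grep "\]" |
        awk -F "] " '{print $2}' | awk -F " " '{print $1}' |
        tr "[A-Z]" "[a-z]" | grep -qwv "pass"
    stages=("${PIPESTATUS[@]}")   # must be the very next command after the pipeline
    local rc=${stages[${#stages[@]}-1]}
    echo "pipeline statuses: ${stages[*]} -> rc=${rc}" >&2
    return "${rc}"
}
```

If the logged rc is 1 while a later [ $? -eq 0 ] still succeeds, the value is
being lost between the pipeline and the test rather than inside the pipeline.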

I have confirmed by tracing with strace that the exit code of the command
really is 1.

Is this a known bug in bash 4.4? The problem reproduces intermittently with
bash 4.4.18 and 4.4.23, but after switching to bash 5.0.0 or bash 5.3.9 it no
longer occurs.
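
Since the behaviour depends on the bash version, it may help to log the
version of the shell that is actually running son_3.sh on the device. A small
sketch using BASH_VERSINFO; the function names are hypothetical and not part
of the original script.

```bash
#!/bin/bash
# Hypothetical helper: prints the running shell's version as major.minor.patch.
bash_version_string ()
{
    printf '%s.%s.%s\n' "${BASH_VERSINFO[0]}" "${BASH_VERSINFO[1]}" "${BASH_VERSINFO[2]}"
}

# Hypothetical helper: succeeds only on the 4.4 series, where the problem was seen.
is_bash_4_4 ()
{
    [ "${BASH_VERSINFO[0]}" -eq 4 ] && [ "${BASH_VERSINFO[1]}" -eq 4 ]
}
```

Calling bash_version_string once at the start of the loop makes it clear in
the logs which interpreter produced a given failure.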

On bash 4.4.18 or 4.4.23, the problem also disappears if the code is changed
to the following form:

local xx
xx=$(cat ${item_file} | grep ^"\[" | grep "\]" | awk -F "] " '{print $2}' |
awk -F " " '{print $1}' | tr "[A-Z]" "[a-z]" | grep -qwv "pass")
if [ $? -eq 0 ];then
    echo "fail" >${result_file}
fi
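
Another form that does not read $? in a separate command is to test the
pipeline directly in the if statement. This is only a sketch: the function
name mark_fail_if_needed and its two parameters are assumptions, and whether
this form also avoids the problem on bash 4.4 is untested here.

```bash
#!/bin/bash
# Hypothetical helper (not in the original script): writes "fail" to the
# result file when any item line has a status word other than "pass".
mark_fail_if_needed ()
{
    local item_file="$1" result_file="$2"
    # The if statement uses the pipeline's exit status directly,
    # so no intermediate $? is involved.
    if cat "${item_file}" | grep ^"\[" | grep "\]" |
        awk -F "] " '{print $2}' | awk -F " " '{print $1}' |
        tr "[A-Z]" "[a-z]" | grep -qwv "pass"
    then
        echo "fail" > "${result_file}"
    fi
}
```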

Why does this form behave differently? What is the difference between the two?

王伟 <[email protected]> wrote on Mon, Mar 2, 2026 at 09:56:

