Re: [ndctl PATCH v2] test/monitor.sh: replace sleep with polling after sync

Marc Herbert Fri, 16 May 2025 15:35:12 -0700

On 2025-05-15 23:08, Dan Williams wrote:
> alison.schofield@ wrote:

>>  
>> +wait_for_logfile_update()
>> +{
>> +    local file="$1"
>> +    local prev_size="$2"
>> +    local timeout=30
>> +    local i=0
>> +
>> +    # prev_size is always zero because start_monitor truncates it.
>> +    # Set and check against it anyway to future proof.
>> +    while [ $i -lt $timeout ]; do
>> +            local new_size=$(stat -c%s "$file" 2>/dev/null || echo 0)
>> +            if [ "$new_size" -gt "$prev_size" ]; then
>> +                    return 0
>> +            fi
>> +            sleep 0.1
>> +            i=$((i+1))
>> +    done
>> +
>> +    echo "logfile not updated within 3 seconds"
>> +    err "$LINENO"
> 
> Hmm... not a fan of this open coded "wait for file to change" bash
> function. This feels like something that a tool can do... (searches)
> 
> Does inotifywait fit the bill here?
> 
> https://linux.die.net/man/1/inotifywait


If inotify works, go for it. Blocking is always better than polling.  It
might be tricky because the file does not exist yet. Create an empty
file yourself first, would that work? Probably not if ndctl monitor
creates a brand new file.

If inotify does not work, consider adding to test/common this generic
polling function that lets you poll in bash literally anything:

https://github.com/pmem/run_qemu/pull/177/files

It would require making `prev_size` global which does not look like an
issue to me.

Before adding it run_qemu, that polling function has been used for years
and thousands of runs in
https://github.com/thesofproject/sof-test/blob/main/case-lib/lib.sh
I mean it's been extremely well tested.

Even if you don't need polling here, it's unfortunately fairly common to
have to poll from shell scripts. Why I'm suggesting a test/common
location.

>> +}
>> +
>>  start_monitor()
>>  {
>>      logfile=$(mktemp)
>>      $NDCTL monitor -c "$monitor_conf" -l "$logfile" $1 &
>>      monitor_pid=$!
>> -    sync; sleep 3
>> +    sync
>> +    for i in {1..30}; do
>> +            if ps -p "$monitor_pid" > /dev/null; then
>> +                    sleep 0.1
>> +                    break
>> +            fi
>> +            sleep 0.1
>> +    done
>> +    if ! ps -p "$monitor_pid" > /dev/null; then
>> +            echo "monitor not ready within 3 seconds"
>> +            err "$LINENO"
>> +    fi
> 
> This does not make sense to. The shell got the pid from the launching
> the executable. This is effectively testing that bash command execution
> works. About the only use I can imagine for this is checking that the
> monitor did not die early, but that should be determined by other parts
> of the test.

Agreed: I'm afraid the only thing this code does is sleeping 0.1s only
once instead of 3s. Because not sleeping at all worked for you, no
surprise a single sleep 0.1 works too.

I suspect the only case where the "for" loop actually iterates is when the
"monitor" process crashes extremely fast, faster than the
"sync". Basically racing with its parent to crash before the latter
notices. That race does not look like a "feature" to me.

I agree this should be replaced by observing side-effects from the
monitor. Dunno what. grep something in the ndctl monitor -v output?


By the way, UNTESTED:

--- a/test/monitor.sh
+++ b/test/monitor.sh
@@ -67,7 +67,19 @@ check_result()
 
 stop_monitor()
 {
-       kill $monitor_pid
+       kill $monitor_pid || die "monitor $monitor_pid was already dead"
+
+       local ret=0
+       timeout 3 wait $monitor_pid || ret=$?
+       case "$ret") in
+       124) # timeout
+               die "monitor $monitor_pid ignored SIGTERM" ;;
+       0|127) # either success or killed fast
+               : ;;
+       *)
+               die "unexpected monitor exit status:$ret" ;;
+       esac
+
        rm "$logfile"
 }


Something like that...

This is all assuming the monitor does not have any kind of "remote
control"; that would be much better.

I don't know the "monitor", but if it has neither obvious
side-effects nor any kind of "remote control", then it does a "great"
job... getting in the way of tests! :-( Maybe the monitor is what should
be fixed rather than adding shell script "creativity"?

Re: [ndctl PATCH v2] test/monitor.sh: replace sleep with polling after sync

Reply via email to