On 2025-05-15 23:08, Dan Williams wrote: > alison.schofield@ wrote: >> >> +wait_for_logfile_update() >> +{ >> + local file="$1" >> + local prev_size="$2" >> + local timeout=30 >> + local i=0 >> + >> + # prev_size is always zero because start_monitor truncates it. >> + # Set and check against it anyway to future proof. >> + while [ $i -lt $timeout ]; do >> + local new_size=$(stat -c%s "$file" 2>/dev/null || echo 0) >> + if [ "$new_size" -gt "$prev_size" ]; then >> + return 0 >> + fi >> + sleep 0.1 >> + i=$((i+1)) >> + done >> + >> + echo "logfile not updated within 3 seconds" >> + err "$LINENO" > > Hmm... not a fan of this open coded "wait for file to change" bash > function. This feels like something that a tool can do... (searches) > > Does inotifywait fit the bill here? > > https://linux.die.net/man/1/inotifywait
If inotify works, go for it. Blocking is always better than polling. It might be tricky because the file does not exist yet. Create an empty file yourself first, would that work? Probably not if ndctl monitor creates a brand new file. If inotify does not work, consider adding to test/common this generic polling function that lets you poll in bash literally anything: https://github.com/pmem/run_qemu/pull/177/files It would require making `prev_size` global which does not look like an issue to me. Before adding it run_qemu, that polling function has been used for years and thousands of runs in https://github.com/thesofproject/sof-test/blob/main/case-lib/lib.sh I mean it's been extremely well tested. Even if you don't need polling here, it's unfortunately fairly common to have to poll from shell scripts. Why I'm suggesting a test/common location. >> +} >> + >> start_monitor() >> { >> logfile=$(mktemp) >> $NDCTL monitor -c "$monitor_conf" -l "$logfile" $1 & >> monitor_pid=$! >> - sync; sleep 3 >> + sync >> + for i in {1..30}; do >> + if ps -p "$monitor_pid" > /dev/null; then >> + sleep 0.1 >> + break >> + fi >> + sleep 0.1 >> + done >> + if ! ps -p "$monitor_pid" > /dev/null; then >> + echo "monitor not ready within 3 seconds" >> + err "$LINENO" >> + fi > > This does not make sense to. The shell got the pid from the launching > the executable. This is effectively testing that bash command execution > works. About the only use I can imagine for this is checking that the > monitor did not die early, but that should be determined by other parts > of the test. Agreed: I'm afraid the only thing this code does is sleeping 0.1s only once instead of 3s. Because not sleeping at all worked for you, no surprise a single sleep 0.1 works too. I suspect the only case where the "for" loop actually iterates is when the "monitor" process crashes extremely fast, faster than the "sync". Basically racing with its parent to crash before the latter notices. That race does not look like a "feature" to me. I agree this should be replaced by observing side-effects from the monitor. Dunno what. grep something in the ndctl monitor -v output? By the way, UNTESTED: --- a/test/monitor.sh +++ b/test/monitor.sh @@ -67,7 +67,19 @@ check_result() stop_monitor() { - kill $monitor_pid + kill $monitor_pid || die "monitor $monitor_pid was already dead" + + local ret=0 + timeout 3 wait $monitor_pid || ret=$? + case "$ret") in + 124) # timeout + die "monitor $monitor_pid ignored SIGTERM" ;; + 0|127) # either success or killed fast + : ;; + *) + die "unexpected monitor exit status:$ret" ;; + esac + rm "$logfile" } Something like that... This is all assuming the monitor does not have any kind of "remote control"; that would be much better. I don't know the "monitor", but if it has neither obvious side-effects nor any kind of "remote control", then it does a "great" job... getting in the way of tests! :-( Maybe the monitor is what should be fixed rather than adding shell script "creativity"?