On 2025/01/10 21:41, Fujii Masao wrote:


On 2025/01/10 16:09, Andy Fan wrote:
Andy Fan <zhihuifan1...@163.com> writes:

Hi:

I ran into the {subject} issue with the setup below.

cat foo.sql

\setshell txn_mode echo ${TXN_MODE}
\setshell speed echo ${SPEED}
\setshell sleep_ms echo ${SLEEP_MS}
\setshell subtxn_mode echo ${SUBTXN_MODE}

select 1;

$ TXN_MODE=-1 SPEED=1 SLEEP_MS=0 SUBTXN_MODE=-1 pgbench -n -ffoo.sql postgres -T5 -c4 --exit-on-abort

I *randomly* (7/8) get errors like:

pgbench (18devel)
pgbench: error: client 2 aborted in command 0 (setshell) of script 0; execution of meta-command failed
pgbench: error: Run was aborted due to an error in thread 0

Interestingly, my git bisect pointed to the following commit
as the cause of this issue, even though it seems unrelated to
the pgbench problem at all. It’s possible my git bisect result
is incorrect, but when I reverted this commit on HEAD,
the pgbench issue didn’t occur during my tests.

----------------------
06843df4abc5a0c24e4bd154a8a1327e074fa3ae is the first bad commit
commit 06843df4abc5a0c24e4bd154a8a1327e074fa3ae
Author: Tom Lane <t...@sss.pgh.pa.us>
Date:   Fri Sep 29 14:07:30 2023 -0400

     Suppress macOS warnings about duplicate libraries in link commands.
----------------------

Before this commit, pgbench used pqsignal() from port/pqsignal.c
to set the signal handler for SIGALRM. This version of pqsignal()
sets SA_RESTART for frontend code, so fgets() in runShellCommand()
wouldn't return NULL even if SIGALRM arrived during fgets(),
preventing the reported error.

Currently, however, pgbench seems to use pqsignal()
from legacy-pqsignal.c, which doesn't set SA_RESTART for SIGALRM.
As a result, SIGALRM can interrupt fgets() in runShellCommand()
and make it return NULL, leading to the reported error.
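
The difference can be reproduced outside pgbench. The following is a minimal
sketch, not pgbench's actual code: read_with_alarm() and on_alarm() are
hypothetical names, the shell command and the 200 ms timer are arbitrary, and
the behavior assumed is that of a typical libc (such as glibc) whose stdio
does not retry a read() that fails with EINTR. It installs a SIGALRM handler
with or without SA_RESTART, then calls fgets() on a popen() stream while the
timer fires.

```c
#define _XOPEN_SOURCE 700

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

/* Empty handler: we only care that SIGALRM is delivered, not what it does. */
static void
on_alarm(int signo)
{
    (void) signo;
}

/*
 * Hypothetical helper for illustration.
 * Returns 1 if fgets() read a line, 0 if fgets() returned NULL,
 * and -1 if the setup itself failed.
 */
static int
read_with_alarm(int use_sa_restart)
{
    struct sigaction act;
    struct itimerval timer;
    char        buf[64];
    FILE       *fp;
    int         got_line;

    /* Install the SIGALRM handler, with or without SA_RESTART. */
    memset(&act, 0, sizeof(act));
    act.sa_handler = on_alarm;
    sigemptyset(&act.sa_mask);
    act.sa_flags = use_sa_restart ? SA_RESTART : 0;
    if (sigaction(SIGALRM, &act, NULL) != 0)
        return -1;

    /* The command prints nothing for one second, so fgets() will block. */
    fp = popen("sleep 1; echo 42", "r");
    if (fp == NULL)
        return -1;

    /* One-shot timer: deliver SIGALRM after 200 ms, while fgets() blocks. */
    memset(&timer, 0, sizeof(timer));
    timer.it_value.tv_usec = 200000;
    if (setitimer(ITIMER_REAL, &timer, NULL) != 0)
    {
        pclose(fp);
        return -1;
    }

    /*
     * Without SA_RESTART the underlying read() fails with EINTR and
     * fgets() returns NULL; with SA_RESTART the read() is restarted
     * and fgets() eventually returns the line.
     */
    got_line = (fgets(buf, sizeof(buf), fp) != NULL);
    pclose(fp);
    return got_line;
}
```

Under those assumptions, read_with_alarm(0) should report that fgets()
returned NULL, while read_with_alarm(1) should report that a line was read,
which matches the two pqsignal() behaviors described above.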

I'm not sure if this change was an intentional result of that commit...

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


