> Hi Tatsuo,
> Attached is a proposed patch that deals with a couple of pgbench issues.
> The change at line 490 updates doCustom's local variable "commands"
> after selecting a new file (command sequence).  I think that the
> existing coding will cause the thing to use the first command of the
> old sequence in the remainder of the routine, which would be a bug.
> I have not tried to set up a test case to prove it, though.

Thanks for pointing it out. It's definitely a bug. To reproduce the
problem, you have two SQL files, say,



then run,

pgbench -f a.sql -f b.sql

and see the SQL trace by enabling log_statement=all,

I see:
LOG:  statement: SELECT 300;
LOG:  statement: SELECT 400;
LOG:  statement: SELECT 300;
LOG:  statement: SELECT 2;
LOG:  statement: SELECT 1;
LOG:  statement: SELECT 400;
LOG:  statement: SELECT 300;
LOG:  statement: SELECT 400;
LOG:  statement: SELECT 300;
LOG:  statement: SELECT 2;
LOG:  statement: SELECT 1;

#1 and #2 are ok, but #3(SELECT 300) and #4(SELECT 2) combo is
supposed to be impossible if pgbench works correctly.

> The other two changes cause doCustom to loop after processing a
> meta-command.  This might be a bit controversial, but as the code
> is currently written, each meta-command "costs" one cycle of the
> outer select() loop.  Thus, for example, with the default TPC-B script,
> once a backend returns "COMMIT" it will not receive a new command
> until four cycles of issuing commands to other backends have elapsed.
> (You can see this very easily by strace'ing pgbench under load.)
> ISTM it is bad to have backends sitting idle waiting for pgbench to
> give them a new command.  On the other hand you could argue that
> finishing out several consecutive metacommands will delay issuance of
> new commands to other backends.  In the test case I was running,
> making this change made for a small but noticeable improvement in
> total CPU usage, but I'm not entirely convinced it'd always be a win.


> I get the impression that pgbench itself is a bit of a bottleneck when
> running on multi-CPU machines.  I can't get the total CPU usage to reach
> 90% with ten client threads on a dual HT-enabled Xeon, and the only
> explanation I can see is that there are too many backends waiting for
> pgbench to give them new commands instead of doing useful work.  strace
> shows that each select() iteration usually finds *all ten* sockets
> read-ready, which is definitely bad.  (I tried nice'ing pgbench to
> negative priority, but it did not improve matters.  It could easily be
> that this is a Heisenproblem, though, with strace itself slowing pgbench
> enough to cause the behavior.)  I wonder if it would help for pgbench to
> fork off multiple sub-processes and have each sub-process tend just one
> backend.

I'm not sure multiple sub-processes version of pgbench shows superior
performance than current implementation because of process context
switching overhead. Maybe threading is better? Mr. Yasuo Ohgaki
implemented pthead version of pgbench.


You might want to try.

> Anyway, since I'm not sure of either of these changes, I'm passing them
> to you for review.

Your patches seem fine.
Tatsuo Ishii
SRA OSS, Inc. Japan

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not

Reply via email to