Hello,

While running some tests, I encountered a situation where pgbench gets
stuck in an infinite loop, consuming 100% CPU. The setup was:

- Start postgres server from the master branch
- Initialise pgbench
- Run pgbench -c 10 -T 100
- Stop postgres with -m immediate

Now it seems that pgbench gets stuck and its state machine does not
advance. Attaching a debugger to it, I saw that one of the clients remains
stuck in this loop forever:

                if (command->type == SQL_COMMAND)
                {
                    if (!sendCommand(st, command))
                    {
                        /*
                         * Failed. Stay in CSTATE_START_COMMAND state, to
                         * retry. ??? What the point or retrying? Should
                         * rather abort?
                         */
                        return;
                    }
                    else
                        st->state = CSTATE_WAIT_RESULT;
                }

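sendCommand() returns false because the underlying connection is bad
and PQsendQuery() returns 0. Since the function returns without changing
the client's state, the next pass retries the same send on the same dead
connection, so the loop never exits. Reading the comment, it seems that
the author thought about this situation but decided to retry instead of
aborting. I am not sure what the rationale for that decision was; maybe
it was to deal with transient failures?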
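
To illustrate the alternative that the code comment hints at, here is a
minimal standalone sketch of a client step that aborts on a failed send
instead of staying in CSTATE_START_COMMAND. It is not a patch against
pgbench.c: the CState struct, the conn_ok flag, and the stubbed
sendCommand() are simplified stand-ins I made up for the example, and
the real fix may well need to do more (error reporting, statistics).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-ins for pgbench's client states (not the real enum). */
typedef enum
{
    CSTATE_START_COMMAND,
    CSTATE_WAIT_RESULT,
    CSTATE_ABORTED
} ConnectionState;

typedef struct
{
    ConnectionState state;
    bool        conn_ok;        /* hypothetical: models a usable connection */
} CState;

/* Stub: in this model the send fails whenever the connection is bad. */
static bool
sendCommand(CState *st)
{
    return st->conn_ok;
}

/* One step of the model state machine for a SQL command. */
static void
doStep(CState *st)
{
    assert(st != NULL);

    if (st->state != CSTATE_START_COMMAND)
        return;

    if (!sendCommand(st))
    {
        /* Give up on this client rather than retrying forever. */
        fprintf(stderr, "client aborted: could not send command\n");
        st->state = CSTATE_ABORTED;
    }
    else
        st->state = CSTATE_WAIT_RESULT;
}
```

With this shape, a client whose connection has gone away leaves
CSTATE_START_COMMAND after a single step, so the caller's loop can
finish once every client is either done or aborted.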

The commit that introduced this code is 12788ae49e1933f463bc. So I am
copying Heikki.

Thanks,
Pavan



-- 
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
