On 07/11/2011 08:26 AM, Robert Klemme wrote:
On Mon, Jul 11, 2011 at 3:13 PM, k...@rice.edu <k...@rice.edu> wrote:
I do not know if this makes sense in PostgreSQL, given that readers
do not block writers and writers do not block readers. Are your
UPDATEs to individual rows, each in a separate transaction, or
do you UPDATE multiple rows in the same transaction? If you
perform multiple updates in a single transaction, you are
synchronizing the changes to that set of rows, and that constraint
causes other readers that need the correct post-transaction
values to wait until the COMMIT completes, which means that
the WAL write must be finished.
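To make the distinction concrete, a minimal sketch (the table t and
column v are made up for illustration):

  -- One transaction per row: each UPDATE commits (and waits for its
  -- own WAL flush) independently.
  UPDATE t SET v = v + 1 WHERE id = 1;
  UPDATE t SET v = v + 1 WHERE id = 2;

  -- One transaction for the whole batch: none of the changes are
  -- visible to other sessions until the single COMMIT (and its WAL
  -- write) completes.
  BEGIN;
  UPDATE t SET v = v + 1 WHERE id BETWEEN 1 AND 10000;
  COMMIT;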
Which readers would those be?  The docs explicitly state that readers
are never blocked by writers:
http://www.postgresql.org/docs/9.0/interactive/mvcc-intro.html
http://www.postgresql.org/docs/9.0/interactive/mvcc.html
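A quick two-session demonstration of that, using the same hypothetical
table t:

  -- Session 1:
  BEGIN;
  UPDATE t SET v = v + 1 WHERE id <= 10000;  -- not yet committed

  -- Session 2, concurrently:
  SELECT count(*) FROM t WHERE id <= 10000;  -- returns immediately with
                                             -- the old row versions;
                                             -- no blocking (MVCC snapshot)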

From what I understand of this issue, the observed effect must be
caused by the implementation and not by a conceptual issue with
transactions.

Have you tried disabling synchronous_commit? If this scenario
holds, you should be able to reduce the slowdown by un-batching
your UPDATEs, as counter-intuitive as that is. This seems to
be similar to a problem that I have been looking at with using
PostgreSQL as the backend to a Bayesian engine. I am following
this thread with interest.
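For reference, the setting can be changed per session or per
transaction (standard PostgreSQL GUC syntax; table t is hypothetical):

  SET synchronous_commit TO off;  -- session-wide: COMMIT returns before
                                  -- the WAL flush; a crash can lose the
                                  -- last few transactions, but cannot
                                  -- corrupt committed data

  BEGIN;
  SET LOCAL synchronous_commit TO off;  -- or just for one transaction
  UPDATE t SET v = v + 1 WHERE id <= 10000;
  COMMIT;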
I don't think this will help (see above).  Also, I would be very
cautious about doing this: although the client might get its
acknowledgement faster, the DB still has to do the same work as
with synchronous_commit enabled (i.e. WAL, checkpointing etc.),
and it has to process significantly more transactions than in the
batched version.

Typically there is an optimum batch size: if the batch size is too small
(say, one row) the ratio of TX overhead to "work" is too high.  If the
batch size is too large (say, millions of rows) you hit resource
limitations (memory) which inevitably force the RDBMS to do additional
disk IO.
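One way to probe for that optimum is a psql script with one transaction
per batch (again assuming the hypothetical table t), timing the whole
run with \timing for several batch sizes:

  -- One COMMIT per 1000-row batch:
  BEGIN;
  UPDATE t SET v = v + 1 WHERE id >    0 AND id <= 1000;
  COMMIT;
  BEGIN;
  UPDATE t SET v = v + 1 WHERE id > 1000 AND id <= 2000;
  COMMIT;
  -- ... repeat for the remaining batches; compare total runtimes for
  -- batch sizes of 1, 100, 1000, 100000, ... to find the sweet spot.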

Kind regards

robert

Thanks Ken and Robert,

What I am observing is definitely not readers blocked by writers by means of row-level locking.

This seems to be some implementation detail in Postgres concerning how dirty pages (or dead versions of tuples) are flushed to disk (see the other parts of this thread) when they are accessed by a SELECT query.
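One way to observe this, assuming a populated table t like the one
sketched below, is EXPLAIN's BUFFERS option (available since 9.0; the
exact counters shown vary by version):

  UPDATE t SET v = v + 1;  -- rewrite every row, then COMMIT
  EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM t;
  -- The first SELECT after the UPDATE dirties the pages it reads while
  -- setting hint bits; running the same EXPLAIN a second time shows the
  -- now-clean pages being read without any writing.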

The batch size in this case is one SELECT statement accessing 10000 rows via an aggregate (such as COUNT) and
one UPDATE statement modifying 10000 rows.
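A minimal reproduction along those lines (the schema is an assumption,
not the actual one):

  CREATE TABLE t (id integer PRIMARY KEY, v integer);
  INSERT INTO t SELECT g, 0 FROM generate_series(1, 10000) AS g;

  UPDATE t SET v = v + 1 WHERE id <= 10000;   -- the 10000-row UPDATE
  SELECT count(*) FROM t WHERE id <= 10000;   -- the aggregate SELECT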

I am not trying to optimize this particular use case, but rather to understand what Postgres is doing, and why SELECT queries are negatively affected (sometimes severely) by concurrent (or even preceding) UPDATEs at all when the
database resides completely in the cache.

-- Lars

