Re: [HACKERS] Proposal for UPDATE: do not insert new tuple on heap if update does not change data

2016-01-20 Thread Kevin Grittner
On Wed, Jan 20, 2016 at 3:55 AM, Gasper Zejn  wrote:

> I was wondering if PostgreSQL adds new tuple if data is not changed
> when using UPDATE. It turns out it does add them and I think it might
> be beneficial not to add a new tuple in this case, since it causes a
> great deal of maintenance: updating indexes, vacuuming table and
> index, also heap fragmentation.

If you have one or more tables on which you routinely updated rows
to the values they already have, you might want to attach an update
trigger using the suppress_redundant_updates_trigger() function.

http://www.postgresql.org/docs/current/interactive/functions-trigger.html

A better solution, where possible, is to use the WHERE clause to
avoid the update attempt where the new values are not distinct from
the old ones.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for UPDATE: do not insert new tuple on heap if update does not change data

2016-01-20 Thread Tom Lane
Gasper Zejn  writes:
> I was wondering if PostgreSQL adds new tuple if data is not changed
> when using UPDATE. It turns out it does add them and I think it might
> be beneficial not to add a new tuple in this case, since it causes a
> great deal of maintenance: updating indexes, vacuuming table and
> index, also heap fragmentation.

This has been discussed in the past, and the conclusion was that expending
cycles on every UPDATE to check for this case would be a net loss.  How
many real applications do no-op updates often enough that it's worth
optimizing for?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for UPDATE: do not insert new tuple on heap if update does not change data

2016-01-20 Thread Konstantin Knizhnik

Hi,

To eliminate creation of new tuple version in this case it is necessary 
to check that update actually doesn't change the record.
It is not a cheapest test and it seems to be not so good idea to perform 
it always.
But if you fill that in your case there are many "identical" updates, 
you can always explicitly rewrite query by adding extra check:


UPDATE foo SET val = 'second' where pk = 2 and val <> 'second';




On 20.01.2016 12:55, Gasper Zejn wrote:

Hi,

I was wondering if PostgreSQL adds new tuple if data is not changed
when using UPDATE. It turns out it does add them and I think it might
be beneficial not to add a new tuple in this case, since it causes a
great deal of maintenance: updating indexes, vacuuming table and
index, also heap fragmentation.

How to check:

CREATE TABLE foo (pk serial primary key, val text);
-- Starting point: two rows.
INSERT INTO foo VALUES (1, 'first');
INSERT INTO foo VALUES (2, 'second');
CHECKPOINT;

-- Updating row with same value.
UPDATE foo SET val = 'second' where pk = 2;
CHECKPOINT;

-- "Upsert" is the same.
INSERT INTO foo VALUES (2, 'second') ON CONFLICT (pk) DO UPDATE SET
val = 'second';
CHECKPOINT;

If after any checkpoint you look at page data, you can see multiple
versions of same row with "second".

Unfortunately, I don't believe I can come up with a patch on my own,
but will happily offer any further help with testing and ideas.


Attached is a script with minimal test case.

Kind regards,
Gasper Zejn




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



[HACKERS] Proposal for UPDATE: do not insert new tuple on heap if update does not change data

2016-01-20 Thread Gasper Zejn
Hi,

I was wondering if PostgreSQL adds new tuple if data is not changed
when using UPDATE. It turns out it does add them and I think it might
be beneficial not to add a new tuple in this case, since it causes a
great deal of maintenance: updating indexes, vacuuming table and
index, also heap fragmentation.

How to check:

CREATE TABLE foo (pk serial primary key, val text);
-- Starting point: two rows.
INSERT INTO foo VALUES (1, 'first');
INSERT INTO foo VALUES (2, 'second');
CHECKPOINT;

-- Updating row with same value.
UPDATE foo SET val = 'second' where pk = 2;
CHECKPOINT;

-- "Upsert" is the same.
INSERT INTO foo VALUES (2, 'second') ON CONFLICT (pk) DO UPDATE SET
val = 'second';
CHECKPOINT;

If after any checkpoint you look at page data, you can see multiple
versions of same row with "second".

Unfortunately, I don't believe I can come up with a patch on my own,
but will happily offer any further help with testing and ideas.


Attached is a script with minimal test case.

Kind regards,
Gasper Zejn


pg-duplicate-tuple.sh
Description: Bourne shell script

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers