Re: [HACKERS] Add database to PGXACT / per database vacuuming

2013-09-03 Thread Robert Haas
On Fri, Aug 30, 2013 at 2:29 PM, Andres Freund and...@2ndquadrant.com wrote:
 I don't know how big an impact adding the database oid would have, on the
 case that the PGPROC/PGXACT split was done in the first place. In the worst
 case it will make taking a snapshot 1/3 slower under contention. That needs
 to be tested.

 Yes, definitely. I am basically wondering whether somebody has/sees
 fundamental problems with it that would make it pointless to investigate.

I expect there will be a measurable performance degradation, though
I'm willing to be proven wrong.  I think the question is whether we
get enough bang for the buck out of it to eat that.  It seems quite
likely that users with many databases will come out ahead, as such
systems seem likely to be shared hosting environments where the
machine is lightly loaded most of the time anyway, but where
cross-database interactions cause headaches.  But many users have One
Big Database, and AFAICS this is just overhead for them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Add database to PGXACT / per database vacuuming

2013-08-30 Thread Andres Freund
Hi,

For the logical decoding patch I added support for pegging
RecentGlobalXmin (and GetOldestXmin) to a lower value. To avoid causing
undue bloat & CPU overhead (hot pruning is friggin expensive) I split
RecentGlobalXmin into RecentGlobalXmin and RecentGlobalDataXmin where
the latter is the xmin horizon used for non-shared, non-catalog
tables. That removed almost all overhead I could measure.

During that I was tinkering with the idea of reusing that split to
vacuum/prune user tables in a per db fashion. In a very quick and hacky
test that sped up the aggregate performance of concurrent pgbenches in
different databases by about 30%. So, somewhat worthwile ;).

The problem with that is that GetSnapshotData, which computes
RecentGlobalXmin, only looks at the PGXACT structures and not PGPROC
which contains the database oid. This is a recently added optimization
which made GetSnapshotData() quite a bit faster & more scalable, which is
important given the frequency it's called at.

What about moving/copying the database oid from PGPROC to PGXACT?
Currently a single PGXACT is 12 bytes which means we a) have several
entries in a single cacheline b) have ugly sharing because we will have
PGXACTs split over more than one cacheline.
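To make the sizes concrete, here is a minimal self-contained sketch; the field names follow the general shape of PGXACT in src/include/storage/proc.h, but this is an illustration, not the actual definition, and PGXACTWithDb is a hypothetical name for the proposed variant:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint32_t Oid;

/* Simplified sketch of the dense per-backend array entry. */
typedef struct PGXACT
{
    TransactionId xid;      /* current top-level xid, or invalid */
    TransactionId xmin;     /* oldest xid regarded as running at snapshot time */
    uint8_t vacuumFlags;
    uint8_t overflowed;
    uint8_t delayChkpt;
    uint8_t nxids;
} PGXACT;                   /* 12 bytes: ~5 entries per 64-byte cache line */

/* The proposal: also carry the database oid, growing the entry to
 * 16 bytes, i.e. exactly 4 entries per 64-byte cache line. */
typedef struct PGXACTWithDb
{
    TransactionId xid;
    TransactionId xmin;
    Oid dboid;              /* copied/moved from PGPROC */
    uint8_t vacuumFlags;
    uint8_t overflowed;
    uint8_t delayChkpt;
    uint8_t nxids;
} PGXACTWithDb;
```

At 16 bytes an entry can never straddle a 64-byte line, which addresses point b) at the cost of fewer entries per line.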

Comments?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Add database to PGXACT / per database vacuuming

2013-08-30 Thread Heikki Linnakangas

On 30.08.2013 19:01, Andres Freund wrote:

For the logical decoding patch I added support for pegging
RecentGlobalXmin (and GetOldestXmin) to a lower value. To avoid causing
undue bloat & CPU overhead (hot pruning is friggin expensive) I split
RecentGlobalXmin into RecentGlobalXmin and RecentGlobalDataXmin where
the latter is the xmin horizon used for non-shared, non-catalog
tables. That removed almost all overhead I could measure.

During that I was tinkering with the idea of reusing that split to
vacuum/prune user tables in a per db fashion. In a very quick and hacky
test that sped up the aggregate performance of concurrent pgbenches in
different databases by about 30%. So, somewhat worthwhile ;).

The problem with that is that GetSnapshotData, which computes
RecentGlobalXmin, only looks at the PGXACT structures and not PGPROC
which contains the database oid. This is a recently added optimization
which made GetSnapshotData() quite a bit faster & more scalable which is
important given the frequency it's called at.


Hmm, so you're creating a version of GetSnapshotData() that only takes 
into account backends in the same backend?



What about moving/copying the database oid from PGPROC to PGXACT?


Might be worthwhile.


Currently a single PGXACT is 12 bytes which means we a) have several
entries in a single cacheline b) have ugly sharing because we will have
PGXACTs split over more than one cacheline.


I can't get excited about either of these arguments, though. The reason 
for having separate PGXACT structs is that they are as small as 
possible, so that you can fit as many of them as possible in as few 
cache lines as possible. Whether one PGXACT crosses a cache line or not 
is not important, because when taking a snapshot, you scan through all 
of them.


I don't know how big an impact adding the database oid would have, on 
the case that the PGPROC/PGXACT split was done in the first place. In 
the worst case it will make taking a snapshot 1/3 slower under 
contention. That needs to be tested.


One idea is to have a separate PGXACT array for each database? Well, 
that might be difficult, but something similar, like group all PGXACTs 
for one database together, and keep a separate lookup array for where 
the entries for each database begins.
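For illustration, a minimal sketch of that grouped layout; the names DbRange, GroupedXactArray, and entries_for_db are hypothetical, not PostgreSQL code, and a real version would need to handle backends switching databases:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint32_t Oid;

typedef struct
{
    TransactionId xmin;
    Oid dboid;
} PgXactEntry;

/* Lookup entry: where a database's contiguous run of entries begins. */
typedef struct
{
    Oid dboid;
    int start;              /* index of first entry for this database */
    int count;              /* number of entries for this database */
} DbRange;

typedef struct
{
    PgXactEntry *entries;   /* grouped by database */
    DbRange *ranges;        /* one per active database */
    int ndbs;
} GroupedXactArray;

/* A same-database-only scan can then touch just one contiguous range. */
static const PgXactEntry *
entries_for_db(const GroupedXactArray *a, Oid dboid, int *count)
{
    for (int i = 0; i < a->ndbs; i++)
    {
        if (a->ranges[i].dboid == dboid)
        {
            *count = a->ranges[i].count;
            return a->entries + a->ranges[i].start;
        }
    }
    *count = 0;
    return NULL;
}
```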


- Heikki



Re: [HACKERS] Add database to PGXACT / per database vacuuming

2013-08-30 Thread Heikki Linnakangas

On 30.08.2013 21:07, Heikki Linnakangas wrote:

On 30.08.2013 19:01, Andres Freund wrote:

For the logical decoding patch I added support for pegging
RecentGlobalXmin (and GetOldestXmin) to a lower value. To avoid causing
undue bloat & CPU overhead (hot pruning is friggin expensive) I split
RecentGlobalXmin into RecentGlobalXmin and RecentGlobalDataXmin where
the latter is the xmin horizon used for non-shared, non-catalog
tables. That removed almost all overhead I could measure.

During that I was tinkering with the idea of reusing that split to
vacuum/prune user tables in a per db fashion. In a very quick and hacky
test that sped up the aggregate performance of concurrent pgbenches in
different databases by about 30%. So, somewhat worthwhile ;).

The problem with that is that GetSnapshotData, which computes
RecentGlobalXmin, only looks at the PGXACT structures and not PGPROC
which contains the database oid. This is a recently added optimization
which made GetSnapshotData() quite a bit faster & more scalable which is
important given the frequency it's called at.


Hmm, so you're creating a version of GetSnapshotData() that only takes
into account backends in the same backend?


I mean, it only takes into account backends in the same database?

- Heikki




Re: [HACKERS] Add database to PGXACT / per database vacuuming

2013-08-30 Thread Andres Freund
On 2013-08-30 21:07:04 +0300, Heikki Linnakangas wrote:
 On 30.08.2013 19:01, Andres Freund wrote:
 For the logical decoding patch I added support for pegging
 RecentGlobalXmin (and GetOldestXmin) to a lower value. To avoid causing
 undue bloat & CPU overhead (hot pruning is friggin expensive) I split
 RecentGlobalXmin into RecentGlobalXmin and RecentGlobalDataXmin where
 the latter is the xmin horizon used for non-shared, non-catalog
 tables. That removed almost all overhead I could measure.
 
 During that I was tinkering with the idea of reusing that split to
 vacuum/prune user tables in a per db fashion. In a very quick and hacky
 test that sped up the aggregate performance of concurrent pgbenches in
 different databases by about 30%. So, somewhat worthwhile ;).
 
 The problem with that is that GetSnapshotData, which computes
 RecentGlobalXmin, only looks at the PGXACT structures and not PGPROC
 which contains the database oid. This is a recently added optimization
 which made GetSnapshotData() quite a bit faster & more scalable which is
 important given the frequency it's called at.
 
 Hmm, so you're creating a version of GetSnapshotData() that only takes into
 account backends in the same backend?

You can see what I did for logical decoding in 
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blobdiff;f=src/backend/storage/ipc/procarray.c;h=11aa1f5a71196a61e31b711e0a044b2a5927a6cc;hp=9bf0989c9206b5e07053587f517d5e9a2322a628;hb=edcf0939072ebe68969560a7d54a26c123b279b4;hpb=ff4fa81665798642719c11c779d0518ef6611373

So, basically I compute the normal RecentGlobalXmin, and then just
subtract the logical xmin which is computed elsewhere to get the
catalog xmin.

What I'd done with the prototype of $topic (lost it, but I am going to
hack it together again) was just to compute RecentGlobalDataXmin (for
non-catalog, non-shared tables) at the same time as RecentGlobalXmin
(for everything else) by just not lowering RecentGlobalDataXmin if
pgxact->dboid != MyDatabaseId.
So, the snapshot itself was the same, but because RecentGlobalDataXmin
is independent from the other databases, vacuum & pruning can clean up
way more, leading to a smaller database and higher performance.

 Currently a single PGXACT is 12 bytes which means we a) have several
 entries in a single cacheline b) have ugly sharing because we will have
 PGXACTs split over more than one cacheline.
 
 I can't get excited about either of these arguments, though. The reason for
 having separate PGXACT structs is that they are as small as possible, so
 that you can fit as many of them as possible in as few cache lines as
 possible. Whether one PGXACT crosses a cache line or not is not important,
 because when taking a snapshot, you scan through all of them.

The problem with that is that we actually write to PGXACT pretty
frequently (at least ->xid, ->xmin, ->nxids, ->delayChkpt). As soon as
you factor that in, sharing cachelines between backends can hurt. Even
a plain GetSnapshotData() will write to MyPgXact->xmin...

 I don't know how big an impact adding the database oid would have, on the
 case that the PGPROC/PGXACT split was done in the first place. In the worst
 case it will make taking a snapshot 1/3 slower under contention. That needs
 to be tested.

Yes, definitely. I am basically wondering whether somebody has/sees
fundamental problems with it that would make it pointless to investigate.

 One idea is to have a separate PGXACT array for each database? Well, that
 might be difficult, but something similar, like group all PGXACTs for one
 database together, and keep a separate lookup array for where the entries
 for each database begins.

Given that we will have to search all PGXACT entries anyway because of
shared relations for the foreseeable future, I can't see that being
really beneficial.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

