Re: [HACKERS] pg_basebackup --xlog compatibility break
On tis, 2012-05-29 at 22:31 +0200, Magnus Hagander wrote: Yeah, good arguments all around, I agree too :-) Next question is - suggestions for naming of said parameter? --xlog-method=something? And/or -Xsomething, which would automatically enable -x?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] How to avoid base backup in automated failover
Hi all,

Recently I was writing an application to implement automated failover with this environment: two Windows 2008 R2 servers, network-attached storage, asynchronous replication, and WAL archiving enabled on the primary. Is there any way to avoid always starting the standby server from a base backup in automated failover? The database is growing huge; I can't keep doing a base backup every day. Please suggest a solution.

Regards,
Reddy
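For context, the standby side of a 9.1-era setup like this is driven by a recovery.conf written at standby creation time; a minimal sketch is below. The hostname, archive path, and trigger file are hypothetical placeholders for this Windows/NAS environment, and this fragment controls only how the standby replays WAL — it does not by itself remove the need for a base backup after a failover on these releases.

```
# Hypothetical recovery.conf for the standby (PostgreSQL 9.1-era, Windows)
standby_mode = 'on'
primary_conninfo = 'host=primary.example.com port=5432 user=replicator'
restore_command = 'copy "\\\\nas\\wal_archive\\%f" "%p"'
trigger_file = 'C:\\pgsql\\failover.trigger'
```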
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Fri, Jun 1, 2012 at 9:55 PM, Ants Aasma a...@cybertec.at wrote:
> On Sat, Jun 2, 2012 at 1:48 AM, Merlin Moncure mmonc...@gmail.com wrote:
>> Buffer pins aren't a cache: with a cache you are trying to mask a slow
>> operation (like a disk i/o) with a faster one such that the number of
>> slow operations is minimized. Buffer pins however are very different in
>> that we only care about contention on the reference count (the buffer
>> itself is not locked!), which makes me suspicious that caching-type
>> algorithms are the wrong place to be looking. I think it comes down to
>> picking between your relatively complex but general lock-displacement
>> approach or a specific strategy based on known bottlenecks.
>
> I agree that pins aren't like a cache. I mentioned the caching
> algorithms because they work based on access frequency, and highly
> contended locks are likely to be accessed frequently even from a single
> backend. However, this only makes sense for the delayed-unpinning
> method, and I have also come to the conclusion that it's not likely to
> work well. Besides delaying cleanup, the overhead for the common case of
> uncontended access is just too much.
>
> It seems to me that even the nailing approach will need a replacement
> algorithm. The local pins still need to be published globally, and
> because shared memory size is fixed, the maximum number of locally
> pinned nailed buffers needs to be limited as well.
>
> But anyway, I managed to completely misread the profile that Sergey
> gave. Somehow I missed that the time went into the retry TAS in slock
> instead of the inlined TAS. This shows that the issue isn't just
> cacheline ping-pong but cacheline stealing. This could be somewhat
> mitigated by making pinning lock-free. The Nb-GCLOCK paper that Robert
> posted earlier in another thread describes an approach for this. I have
> a WIP patch (attached) that makes the clock sweep lock-free in the
> common case.
>
> This patch gave a 40% performance increase for an extremely
> allocation-heavy load running with 64 clients on a 4 core, 1 socket
> system, and lesser gains across the board. Pinning has a shorter lock
> duration (and a different lock type), so the gain might be less, or it
> might be a larger problem and show a higher gain. Either way, I think
> the nailing approach should be explored further; cacheline ping-pong
> could still be a problem with a higher number of processors, and losing
> the spinlock also loses the ability to detect contention.

Not sure about the rest of this patch, but this part is definitely bogus:

+#if !defined(pg_atomic_fetch_and_set)
+#define pg_atomic_fetch_and_set(dst, src, value) \
+	do { S_LOCK(&dummy_spinlock); \
+	dst = src; \
+	src = value; \
+	S_UNLOCK(&dummy_spinlock); } while (0)
+#endif

Locking a dummy backend-local spinlock doesn't provide atomicity across
multiple processes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Re: [GENERAL] pg_upgrade from 9.0.7 to 9.1.3: duplicate key pg_authid_oid_index
On Sat, Jun 02, 2012 at 05:10:03PM -0400, Tom Lane wrote:
> Bruce Momjian br...@momjian.us writes:
> > On Fri, Jun 01, 2012 at 09:52:59AM -0400, Tom Lane wrote:
> > > It seems that pg_upgrade needs a check to make sure that the
> > > bootstrap superuser is named the same in old and new clusters.
> >
> > The attached patch adds checks to verify that the old/new servers have
> > the same install-user oid.
>
> That may or may not be a useful check to make, but it's got
> approximately nothing to do with what I was complaining about. In
> particular, supposing that the user has given you a username that isn't
> the bootstrap superuser in the new cluster, this patch is not going to
> stop the update script from failing, because the script is then going
> to try to replace the bootstrap superuser, and that is certainly going
> to give an error.
>
> I see the point of worrying about the install user as well as the
> bootstrap superuser, but wouldn't it be best to insist they be the
> same? Particularly in the new cluster, where if they aren't the same it
> means the user has manually created at least one role in the new
> cluster, which is likely to lead to OID conflicts or worse.
>
> Furthermore, if the bootstrap superusers aren't named the same, your
> patch fails to handle the original complaint. In the case the OP
> mentioned, the old cluster had:
>
>     OID 10: ubuntu
>     some user-defined OID: postgres
>
> and the new cluster had:
>
>     OID 10: postgres
>
> If the user tells pg_upgrade to use username postgres, your check will
> not fail AFAICS, but nonetheless things are going to be messed up after
> the upgrade, because some objects and privileges that used to belong to
> the bootstrap superuser will now belong to a non-default superuser,
> whereas what used to belong to the non-default superuser will now
> belong to the bootstrap superuser. That cannot be thought desirable.
> For one thing, in the old installation the postgres role could have
> been dropped (possibly after dropping a few non-builtin objects),
> whereas the ubuntu role was pinned. In the new installation, postgres
> is pinned and ubuntu won't be.
>
> I think the checks that are actually needed here are (1) bootstrap
> superusers are named the same, and (2) there are no roles other than
> the bootstrap superuser in the new cluster.

You are right that it is more complex than I stated, but given the
limited feedback I got on the pg_upgrade/plpython issue, I figured
people didn't want to hear the details. Here they are. There are three
failure modes for pg_upgrade:

1. check failure
2. schema restore failure
3. silent failure/corruption

Of course, the later items are worse than the earlier ones. The reporter
got a schema restore failure while still following the pg_upgrade
instructions. My initial patch changed that #2 error to a #1 error.

Tom is right that creating users in the new cluster (against
instructions) can still generate a #2 error if a new/old pg_authid.oid
match and they are not the install user, but since that is something
done against the instructions, I was going to leave it as a #2. However,
since Tom feels we should check that and make it a #1 failure, I have
added that test to the attached patch.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com

  + It's impossible for everything to be true. +

diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index d226f00..9f3dcda
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*** check_new_cluster(void)
*** 138,144 ****
  	 * We don't restore our own user, so both clusters must have
  	 * matching install-user oids.
  	 */
! 	if (old_cluster.install_user_oid != new_cluster.install_user_oid)
  		pg_log(PG_FATAL,
  			   "Old and new cluster install users have different values for pg_authid.oid.\n");
--- 138,144 ----
  	 * We don't restore our own user, so both clusters must have
  	 * matching install-user oids.
  	 */
! 	if (old_cluster.install_role_oid != new_cluster.install_role_oid)
  		pg_log(PG_FATAL,
  			   "Old and new cluster install users have different values for pg_authid.oid.\n");
*** check_new_cluster(void)
*** 147,153 ****
  	 * defined users might match users defined in the old cluster and
  	 * generate an error during pg_dump restore.
  	 */
! 	if (new_cluster.user_count != 1)
  		pg_log(PG_FATAL,
  			   "Only the install user can be defined in the new cluster.\n");
  	check_for_prepared_transactions(&new_cluster);
--- 147,153 ----
  	 * defined users might match users defined in the old cluster and
  	 * generate an error during pg_dump restore.
  	 */
! 	if (new_cluster.role_count != 1)
  		pg_log(PG_FATAL,
  			   "Only the install user can be defined in the new cluster.\n");
  	check_for_prepared_transactions(&new_cluster);
*** check_is_super_user(ClusterInfo *cluster
*** 618,624 ****
  		pg_log(PG_FATAL, "database user
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Mon, Jun 4, 2012 at 5:12 PM, Robert Haas robertmh...@gmail.com wrote:
> Not sure about the rest of this patch, but this part is definitely bogus:
>
> +#if !defined(pg_atomic_fetch_and_set)
> +#define pg_atomic_fetch_and_set(dst, src, value) \
> +	do { S_LOCK(&dummy_spinlock); \
> +	dst = src; \
> +	src = value; \
> +	S_UNLOCK(&dummy_spinlock); } while (0)
> +#endif
>
> Locking a dummy backend-local spinlock doesn't provide atomicity across
> multiple processes.

Right, of course. I don't know why I assumed that dummy_spinlock would
be global. In any case, this is very WIP and doesn't even aspire to be
portable yet. The main point was to see if there's any significant
performance to be gained by this method.

Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
Re: [HACKERS] Updated version of pg_receivexlog
On Thu, Sep 29, 2011 at 11:30 PM, Magnus Hagander mag...@hagander.net wrote:
>> it doesn't say that it is not possible to use this for a standby
>> server... probably that's why i get the error, i put a recovery.conf
>> after pg_basebackup finished... maybe we can say that more loudly?
>
> The idea is, if you use it with -x (or --xlog), it's for taking a
> backup/clone, *not* for replication. If you use it without -x, then you
> can use it as the start of a replica, by adding a recovery.conf. But
> you can't do both at once, that will confuse it.

I stumbled upon this again today. There's nothing in the docs that would
even hint that using -x shouldn't work to create a replica. Why does it
get confused, and can we (easily) make it not get confused? At the very
least it needs a big fat warning in the documentation for the -x option
that the resulting backup might not be usable as a standby.

Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
Re: [HACKERS] [RFC] Interface of Row Level Security
On Sat, Jun 2, 2012 at 12:58 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
> 2012/6/1 Robert Haas robertmh...@gmail.com:
>> On Thu, May 31, 2012 at 3:52 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
>>> It may be an option to separate the case into two: a situation where
>>> the given query is executed immediately, just after optimization, and
>>> never reused, and all others.
>>
>> Yes. Simon suggested exactly this a while back, and I agree with him
>> that we ought to have it.
>
> It is good for me, also. Then, if so, we will be able to push the stuff
> corresponding to RLSBYPASS into the query optimization, and it works
> transparently for users.
>
>> You're still going to need a way to make sure that the cluster can be
>> dumped properly. RLSBYPASS accomplishes that; your scheme doesn't.
>
> If something like "... OR has_superuser_privilege()" is automatically
> appended to the user-given clauses, it makes sure the whole database
> cluster is dumped. That also means any permission checks are delayed to
> the executor stage (except for the simple-query-protocol case I
> mentioned above), which simplifies the conditions for invalidating
> prepared statements.
>
> One problem I noticed last night with the RLSBYPASS approach is that it
> can cause a large number of invalidations whenever the current user-id
> is switched: not only SET AUTHORIZATION, but also invocation of
> SECURITY DEFINER functions. I'm not sure whether this invalidation
> storm is at a reasonable level or not.
>
> Is it possible to handle this type of implicit superuser check with the
> existing EXECUTE statement? I tried to run EXPLAIN on a similar case.
>
> postgres=# PREPARE p1(int, bool) AS SELECT * FROM tbl WHERE y > $1 OR $2;
> PREPARE
> postgres=# EXPLAIN EXECUTE p1(10, current_user in ('kaigai','rhaas'));
>                        QUERY PLAN
> ---------------------------------------------------------
>  Seq Scan on tbl  (cost=0.00..21.60 rows=1160 width=40)
> (1 row)
>
> However,
>
> postgres=# PREPARE p2(int) AS SELECT * FROM tbl
>            WHERE y > $1 OR current_user in ('kaigai','rhaas');
> PREPARE
> postgres=# EXPLAIN EXECUTE p2(10);
>                                 QUERY PLAN
> ----------------------------------------------------------------------------
>  Seq Scan on tbl  (cost=0.00..30.30 rows=394 width=40)
>    Filter: ((y > 10) OR (current_user() = ANY ('{kaigai,rhaas}'::name[])))
> (2 rows)
>
> Please assume the second condition is something like a superuser check
> in addition to the main security policy. It implies an idea: replace
> the portion of the clause that consists of only constant values and
> stable / immutable functions with a shadow parameter to be calculated
> at executor stage; that makes it possible to wipe out the RLS policy
> for superusers. In addition, it requires no new invalidation mechanism
> for prepared statements.

I'm not sure how best to handle the invalidation issue... but I still
think that relying on the query planner to use theorem-proving to
simplify RLS conditions is going to be problematic. You can certainly
try it and see how it comes out...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Mon, Jun 4, 2012 at 9:20 AM, Ants Aasma a...@cybertec.at wrote:
> On Mon, Jun 4, 2012 at 5:12 PM, Robert Haas robertmh...@gmail.com wrote:
>> Not sure about the rest of this patch, but this part is definitely bogus:
>>
>> +#if !defined(pg_atomic_fetch_and_set)
>> +#define pg_atomic_fetch_and_set(dst, src, value) \
>> +	do { S_LOCK(&dummy_spinlock); \
>> +	dst = src; \
>> +	src = value; \
>> +	S_UNLOCK(&dummy_spinlock); } while (0)
>> +#endif
>>
>> Locking a dummy backend-local spinlock doesn't provide atomicity
>> across multiple processes.
>
> Right, of course. I don't know why I assumed that dummy_spinlock would
> be global. In any case, this is very WIP and doesn't even aspire to be
> portable yet. The main point was to see if there's any significant
> performance to be gained by this method.

yeah -- those are fallback routines in case the compiler primitives
don't exist. I think I understand what Ants is doing here: he's reducing
the coverage of the free list lock to only cover actually popping a
buffer off the free list; it no longer covers the clock sweep. That's a
massive win if it works. In order to get away with that he had to
decompose all manipulations from the clock sweep to the strategy into
thread-safe atomic operations.

What happens (in the very unlikely, but possible, case) if another
backend races to the buffer you've pointed at with 'victim'? It looks
like multiple backends share the clock sweep now, but don't you need an
extra test to ensure it's still a candidate victim buffer?

merlin
Re: [HACKERS] Updated version of pg_receivexlog
On Mon, Jun 4, 2012 at 11:25 PM, Ants Aasma a...@cybertec.at wrote:
> On Thu, Sep 29, 2011 at 11:30 PM, Magnus Hagander mag...@hagander.net wrote:
>>> it doesn't say that it is not possible to use this for a standby
>>> server... probably that's why i get the error, i put a recovery.conf
>>> after pg_basebackup finished... maybe we can say that more loudly?
>>
>> The idea is, if you use it with -x (or --xlog), it's for taking a
>> backup/clone, *not* for replication. If you use it without -x, then
>> you can use it as the start of a replica, by adding a recovery.conf.
>> But you can't do both at once, that will confuse it.
>
> I stumbled upon this again today. There's nothing in the docs that
> would even hint that using -x shouldn't work to create a replica. Why
> does it get confused, and can we (easily) make it not get confused? At
> the very least it needs a big fat warning in the documentation for the
> -x option that the resulting backup might not be usable as a standby.

Unless I'm missing something, you can use pg_basebackup -x for the
standby. If lots of WAL files are generated in the master after
pg_basebackup -x ends and before you start the standby instance, you may
get the following error. In this case, you need to consult archived WAL
files even though you specified the -x option in pg_basebackup.

FATAL: could not receive data from WAL stream: FATAL: requested WAL
segment 0001005C has already been removed

Though we have the above problem, pg_basebackup -x is usable for the
standby, I think.

Regards,

-- 
Fujii Masao
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Mon, Jun 4, 2012 at 10:17 AM, Merlin Moncure mmonc...@gmail.com wrote:
> What happens (in the very unlikely, but possible, case) if another
> backend races to the buffer you've pointed at with 'victim'? It looks
> like multiple backends share the clock sweep now, but don't you need an
> extra test to ensure it's still a candidate victim buffer?

Actually, I don't think you do: the existing check on refcount is
probably good enough.

Hm, why did you get rid of BufferStrategyControl.lastFreeBuffer?

merlin
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/6/4 Robert Haas robertmh...@gmail.com:
> On Sat, Jun 2, 2012 at 12:58 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
>> 2012/6/1 Robert Haas robertmh...@gmail.com:
>>> On Thu, May 31, 2012 at 3:52 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
>>>> It may be an option to separate the case into two: a situation where
>>>> the given query is executed immediately, just after optimization, and
>>>> never reused, and all others.
>>>
>>> Yes. Simon suggested exactly this a while back, and I agree with him
>>> that we ought to have it.
>>
>> It is good for me, also. Then, if so, we will be able to push the
>> stuff corresponding to RLSBYPASS into the query optimization, and it
>> works transparently for users.
>>
>>> You're still going to need a way to make sure that the cluster can be
>>> dumped properly. RLSBYPASS accomplishes that; your scheme doesn't.
>>
>> If something like "... OR has_superuser_privilege()" is automatically
>> appended to the user-given clauses, it makes sure the whole database
>> cluster is dumped. That also means any permission checks are delayed
>> to the executor stage (except for the simple-query-protocol case I
>> mentioned above), which simplifies the conditions for invalidating
>> prepared statements.
>>
>> One problem I noticed last night with the RLSBYPASS approach is that
>> it can cause a large number of invalidations whenever the current
>> user-id is switched: not only SET AUTHORIZATION, but also invocation
>> of SECURITY DEFINER functions. I'm not sure whether this invalidation
>> storm is at a reasonable level or not.
>>
>> Is it possible to handle this type of implicit superuser check with
>> the existing EXECUTE statement? I tried to run EXPLAIN on a similar
>> case.
>>
>> postgres=# PREPARE p1(int, bool) AS SELECT * FROM tbl WHERE y > $1 OR $2;
>> PREPARE
>> postgres=# EXPLAIN EXECUTE p1(10, current_user in ('kaigai','rhaas'));
>>                        QUERY PLAN
>> ---------------------------------------------------------
>>  Seq Scan on tbl  (cost=0.00..21.60 rows=1160 width=40)
>> (1 row)
>>
>> However,
>>
>> postgres=# PREPARE p2(int) AS SELECT * FROM tbl
>>            WHERE y > $1 OR current_user in ('kaigai','rhaas');
>> PREPARE
>> postgres=# EXPLAIN EXECUTE p2(10);
>>                                 QUERY PLAN
>> ----------------------------------------------------------------------------
>>  Seq Scan on tbl  (cost=0.00..30.30 rows=394 width=40)
>>    Filter: ((y > 10) OR (current_user() = ANY ('{kaigai,rhaas}'::name[])))
>> (2 rows)
>>
>> Please assume the second condition is something like a superuser check
>> in addition to the main security policy. It implies an idea: replace
>> the portion of the clause that consists of only constant values and
>> stable / immutable functions with a shadow parameter to be calculated
>> at executor stage; that makes it possible to wipe out the RLS policy
>> for superusers. In addition, it requires no new invalidation mechanism
>> for prepared statements.
>
> I'm not sure how best to handle the invalidation issue... but I still
> think that relying on the query planner to use theorem-proving to
> simplify RLS conditions is going to be problematic. You can certainly
> try it and see how it comes out...

I think the planner enhancement can be handled independently from the
RLS policy stuff, and we should avoid packing too much complex stuff
into one feature. At least, the idea to append an implicit condition
(OR has_superuser_privilege()) will work correctly without any
invalidation mechanism and without the planner enhancement, although its
performance penalty is not negligible. How about your opinion?

I'm worried about future maintenance issues once we have an RLSBYPASS
permission or something user-visible...

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Mon, Jun 4, 2012 at 6:38 PM, Merlin Moncure mmonc...@gmail.com wrote:
> On Mon, Jun 4, 2012 at 10:17 AM, Merlin Moncure mmonc...@gmail.com wrote:
>> What happens (in the very unlikely, but possible, case) if another
>> backend races to the buffer you've pointed at with 'victim'? It looks
>> like multiple backends share the clock sweep now, but don't you need
>> an extra test to ensure it's still a candidate victim buffer?
>
> Actually, I don't think you do: the existing check on refcount is
> probably good enough.
>
> Hm, why did you get rid of BufferStrategyControl.lastFreeBuffer?

It was dead code as far as I could tell. That change isn't actually
relevant for this patch because free-list management is still protected
by a lock (except for the initial unlocked test, which is double-checked
under the lock) and so doesn't need any adjustment.

Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
Re: [HACKERS] Updated version of pg_receivexlog
On Mon, Jun 4, 2012 at 6:20 PM, Fujii Masao masao.fu...@gmail.com wrote:
> On Mon, Jun 4, 2012 at 11:25 PM, Ants Aasma a...@cybertec.at wrote:
>> I stumbled upon this again today. There's nothing in the docs that
>> would even hint that using -x shouldn't work to create a replica. Why
>> does it get confused, and can we (easily) make it not get confused? At
>> the very least it needs a big fat warning in the documentation for the
>> -x option that the resulting backup might not be usable as a standby.
>
> Unless I'm missing something, you can use pg_basebackup -x for the
> standby. If lots of WAL files are generated in the master after
> pg_basebackup -x ends and before you start the standby instance, you
> may get the following error. In this case, you need to consult archived
> WAL files even though you specified the -x option in pg_basebackup.
>
> FATAL: could not receive data from WAL stream: FATAL: requested WAL
> segment 0001005C has already been removed
>
> Though we have the above problem, pg_basebackup -x is usable for the
> standby, I think.

I assumed from Magnus's comment that this is a known problem. I wonder
what went wrong, if it should have worked. In the case where this turned
up, the missing file was an xlog file with the new timeline ID but one
segment before the timeline switch. I'll have to see if I can create a
reproducible case for this.

Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
Re: [HACKERS] Updated version of pg_receivexlog
On Mon, Jun 4, 2012 at 5:48 PM, Ants Aasma a...@cybertec.at wrote:
> On Mon, Jun 4, 2012 at 6:20 PM, Fujii Masao masao.fu...@gmail.com wrote:
>> Unless I'm missing something, you can use pg_basebackup -x for the
>> standby. If lots of WAL files are generated in the master after
>> pg_basebackup -x ends and before you start the standby instance, you
>> may get the following error. In this case, you need to consult
>> archived WAL files even though you specified the -x option in
>> pg_basebackup.
>>
>> FATAL: could not receive data from WAL stream: FATAL: requested WAL
>> segment 0001005C has already been removed
>>
>> Though we have the above problem, pg_basebackup -x is usable for the
>> standby, I think.
>
> I assumed from Magnus's comment that this is a known problem. I wonder
> what went wrong, if it should have worked. In the case where this
> turned up, the missing file was an xlog file with the new timeline ID
> but one segment before the timeline switch. I'll have to see if I can
> create a reproducible case for this.

No, it's more a "there's no reason to do that". I don't think it should
necessarily be an actual problem.

In your case the missing piece of information is: why was there a
timeline switch? pg_basebackup shouldn't cause a timeline switch whether
you use it in -x mode or not.

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] [RFC] Interface of Row Level Security
On Jun 4, 2012, at 17:38 , Kohei KaiGai wrote:
> I'm worried about future maintenance issues once we have an RLSBYPASS
> permission or something user-visible…

I fear that without a generic way to disable RLS regardless of which RLS
policy function is in effect, we're creating a huge maintenance issue
for DBAs. In a lot of shops, the DBA is responsible for a large number
of databases, each potentially using a completely different approach to
RLS and hence a completely different policy function. Without something
like RLSBYPASS, the DBA needs to have intimate knowledge of the
different RLS policies to e.g. guarantee that his backups aren't missing
crucial information, or that the replication system indeed replicates
all rows. With RLSBYPASS, all he needs to do is grant one privilege to
his replication or backup user. The rest can be left to the development
or support team for a specific application.

best regards,
Florian Pflug
Re: [HACKERS] Updated version of pg_receivexlog
On Mon, Jun 4, 2012 at 6:53 PM, Magnus Hagander mag...@hagander.net wrote:
> No, it's more a "there's no reason to do that". I don't think it should
> necessarily be an actual problem.

Ok, good to know.

> In your case the missing piece of information is: why was there a
> timeline switch? pg_basebackup shouldn't cause a timeline switch
> whether you use it in -x mode or not.

No mystery there. The timeline switch was because I had just promoted
the master from standby mode. There's a chance I might have accidentally
done something horribly wrong somewhere, because I can't immediately
reproduce this. I'll let you know if I find out how I managed to create
this error.

Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
[HACKERS] Btree or not btree? That is the question
We have an 8.3.18 system (yes, the same one from the previous thread,
finally upgraded!) that gave us this error yesterday:

ERROR: index pg_class_oid_index is not a btree

It seems like this happened from just a normal SELECT query, and it did
not occur again (and this is a very busy system). The pg_class table
does get vacuumed and reindexed often. All appeared fine when I did some
manual investigation. Since this only happened this one time, can I
chalk it up to some random rare race condition having to do with a
reindex? Or is there anything else I should be looking for, or can
proactively do? No other system catalog problems have been seen before
it or since.

-- 
Greg Sabino Mullane g...@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/6/4 Florian Pflug f...@phlo.org:
> On Jun 4, 2012, at 17:38 , Kohei KaiGai wrote:
>> I'm worried about future maintenance issues once we have an RLSBYPASS
>> permission or something user-visible…
>
> I fear that without a generic way to disable RLS regardless of which
> RLS policy function is in effect, we're creating a huge maintenance
> issue for DBAs. In a lot of shops, the DBA is responsible for a large
> number of databases, each potentially using a completely different
> approach to RLS and hence a completely different policy function.

Here are two problems with RLSBYPASS. The first is that, right now, we
have no way to handle invalidation of prepared statements when the
current user is switched. The second is that we can describe the same
RLS policy in another way, without an original enhancement to
PostgreSQL's permission mechanism...

> Without something like RLSBYPASS, the DBA needs to have intimate
> knowledge of the different RLS policies to e.g. guarantee that his
> backups aren't missing crucial information, or that the replication
> system indeed replicates all rows. With RLSBYPASS, all he needs to do
> is grant one privilege to his replication or backup user. The rest can
> be left to the development or support team for a specific application.

It seems to me you can define a function which implements a
site-specific security requirement (e.g. backups should not be prevented
by the RLS policy), then include it as part of the RLS policy (or have
it implicitly added by extensions, as sepgsql tries to do). These are
the reasons why I hesitate to go ahead with the RLSBYPASS permission.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Mon, Jun 4, 2012 at 10:42 AM, Ants Aasma a...@cybertec.at wrote: On Mon, Jun 4, 2012 at 6:38 PM, Merlin Moncure mmonc...@gmail.com wrote: On Mon, Jun 4, 2012 at 10:17 AM, Merlin Moncure mmonc...@gmail.com wrote: What happens (in the very unlikely, but possible case?) if another backend races to the buffer you've pointed at with 'victim'? It looks like multiple backends share the clock sweep now, but don't you need an extra test to ensure it's still a candidate victim buffer? Actually, I don't think you do: the existing check on refcount is probably good enough. Hm, why did you get rid of BufferStrategyControl.lastFreeBuffer? It was dead code as far as I could tell. That change isn't actually relevant for this patch because free-list management is still protected by a lock (except the initial unlocked test that is double-checked under lock) and so doesn't need any adjustment. I have to admit -- this is pretty cool. The changes look pretty clean, but I think it's going to need some benchmarking to prove it's actually faster under all workloads, and some more review too. If there's heavy contention on getting victim buffers you're going to be making a lot of atomic operations -- each one of which will lock down the cache line. A yielding lwlock is obviously going to have very different behavior under contention, so there needs to be a performance test that really burns up the freelist. If it's Well And Truly faster, you have to immediately start wondering if the freelist lock can't be completely scrapped: buffer invalidation only occurs in fairly rare events -- but when it does, ISTM to be fairly counterproductive. In trying to avoid the sweep you end up pushing everyone towards a lock-protected free list. I don't know about you, but I'd prefer to have more consistent behavior (as in, not: drop big table, concurrency drops 50%). Noted, this doesn't really address the OP's problem :-). I bet it'd supplement the nailing strategy nicely, however. 
merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
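Ants's point about making pinning lock-free can be illustrated with a compare-and-swap retry loop on the reference count. This is a toy model with invented names (ToyBufferDesc, pin_buffer), not PostgreSQL's actual BufferDesc API, but it shows why an uncontended pin stays cheap: no spinlock is taken, only the cache line itself is contended.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy model of a buffer header whose pin count is maintained with
 * atomic compare-and-swap instead of a per-buffer spinlock. */
typedef struct
{
    atomic_uint refcount;   /* shared pin count */
} ToyBufferDesc;

/* Pin: CAS-increment, retrying on contention.  A failed CAS reloads
 * 'old' automatically, so each retry works with a fresh value. */
static void
pin_buffer(ToyBufferDesc *buf)
{
    unsigned old = atomic_load(&buf->refcount);
    while (!atomic_compare_exchange_weak(&buf->refcount, &old, old + 1))
        ;
}

static void
unpin_buffer(ToyBufferDesc *buf)
{
    atomic_fetch_sub(&buf->refcount, 1);
}

/* The clock sweep can only evict a buffer nobody has pinned. */
static bool
is_candidate_victim(ToyBufferDesc *buf)
{
    return atomic_load(&buf->refcount) == 0;
}
```

Note that under heavy contention every failed CAS still bounces the cache line between CPUs, which is exactly the ping-pong/stealing behavior seen in the profile: going lock-free removes the lock, not the coherence traffic.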
Re: [HACKERS] [RFC] Interface of Row Level Security
Kohei KaiGai kai...@kaigai.gr.jp writes: Here are two problems with RLSBYPASS. The first is that, right now, we have no way to handle invalidation of prepared statements when the current user is switched. How is that specifically the fault of RLSBYPASS? *Any* of the schemes you're proposing for inlined RLS checks will have problems with userID switching. My guess is we'd have to treat the effective userID as part of the plancache lookup key to make it safe to inline anything related to RLS. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
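Tom's plancache suggestion can be sketched as follows. The struct and function below are hypothetical illustrations (not the real plancache.c interface): the effective user ID becomes part of the lookup key, so the same query string planned under different users yields distinct cache entries, and inlined RLS quals can never leak across a user switch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef unsigned int Oid;

/* Hypothetical cache key: query text plus the user the plan was
 * built for.  Two sessions running the same SQL as different users
 * get distinct cached plans, so user-specific inlined checks stay
 * valid without explicit invalidation on SET ROLE. */
typedef struct
{
    const char *query_string;
    Oid         planned_for_userid;
} PlanCacheKey;

static bool
plan_cache_key_equal(const PlanCacheKey *a, const PlanCacheKey *b)
{
    return a->planned_for_userid == b->planned_for_userid &&
           strcmp(a->query_string, b->query_string) == 0;
}
```

The trade-off, of course, is a larger cache: one entry per (query, user) pair rather than per query.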
Re: [HACKERS] Btree or not btree? That is the question
Greg Sabino Mullane g...@endpoint.com writes: We have a 8.3.18 system (yes, the same one from the previous thread, finally upgraded!) that gave us this error yesterday: ERROR: index pg_class_oid_index is not a btree That means you got bogus data while reading the metapage. I'm beginning to wonder about the hardware on this server ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
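For context, the error Tom mentions comes from a sanity check on the index metapage. The sketch below is a toy model, not the real BTMetaPageData layout; the point is just that a single corrupted read of the magic field is enough to trigger "is not a btree", consistent with the bogus-data diagnosis.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value modeled on the backend's BTREE_MAGIC constant. */
#define TOY_BTREE_MAGIC 0x053162

/* Toy metapage header: the real one also carries the root block
 * number, fast-root pointers, and more. */
typedef struct
{
    uint32_t magic;
    uint32_t version;
} ToyBTMetaPage;

/* If the metapage read from disk (or from a corrupted shared
 * buffer) doesn't carry the magic number, the index is rejected. */
static bool
metapage_is_btree(const ToyBTMetaPage *meta)
{
    return meta->magic == TOY_BTREE_MAGIC;
}
```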
Re: [HACKERS] [RFC] Interface of Row Level Security
2012/6/4 Tom Lane t...@sss.pgh.pa.us: Kohei KaiGai kai...@kaigai.gr.jp writes: Here are two problems with RLSBYPASS. The first is that, right now, we have no way to handle invalidation of prepared statements when the current user is switched. How is that specifically the fault of RLSBYPASS? *Any* of the schemes you're proposing for inlined RLS checks will have problems with userID switching. Really? I can't find a scenario that causes a problem with user-id switching in the case where the RLS policy is *unconditionally* appended and then evaluated at the executor stage. I'd like to see such a scenario. My guess is we'd have to treat the effective userID as part of the plancache lookup key to make it safe to inline anything related to RLS. It might be a solution, if we append the individual RLS policy at the planner stage, depending on user-id. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers.
On Sat, Jun 2, 2012 at 12:01 PM, Michael Nolan htf...@gmail.com wrote: On 6/2/12, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On the other hand, if we simply say "PostgreSQL computes the replication delay by subtracting the time at which the WAL was generated, as recorded on the master, from the time at which it is replayed by the slave" then, hey, we still have a wart, but it's pretty clear what the wart is and how to fix it, and we can easily document that. Again, if we could get rid of the failure modes and make this really water-tight, I think I'd be in favor of that, but it seems to me that we are in the process of expending a lot of energy and an even larger amount of calendar time to create a system that will misbehave in numerous subtle ways instead of one straightforward one. I don't see that as a good trade. Well, okay, but let's document "if you use this feature, it's incumbent on you to make sure the master and slave clocks are synced. We recommend running NTP." or words to that effect. What if the two servers are in different time zones? NTP shouldn't have any problem; it uses UTC underneath. As does PostgreSQL, underneath. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?" -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
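To make the time-zone point concrete: as long as both servers record WAL timestamps in UTC, replication delay is a plain subtraction and local zones never enter into it. A minimal sketch using POSIX time_t in place of PostgreSQL's TimestampTz:

```c
#include <assert.h>
#include <time.h>

/* Delay in seconds between when a WAL record was generated on the
 * master and when the standby replayed it.  Both inputs are UTC
 * epoch seconds, so the servers' local time zones are irrelevant;
 * only clock skew between the machines (hence the NTP advice)
 * distorts the result. */
static double
replication_delay_secs(time_t generated_utc, time_t replayed_utc)
{
    return difftime(replayed_utc, generated_utc);
}
```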
Re: [HACKERS] VIP: new format for psql - shell - simple using psql in shell
On 5/26/12 10:16 PM, Pavel Stehule wrote: My idea is safe with respect to the separator - because the separator is just space and newline, and those symbols are escaped. ISTM it'd be a really good idea to support something other than space, since presumably that'd be trivial. I'm not a fan of supporting the array construction. If you get to that level of complexity in bash it's a really good sign that you need to use a real language. If we do add array support then it needs to be keyed to the actual shell in use, because it's inconsistent between them. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
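A sketch of the escaping Pavel describes: space and newline act as field/record separators, so they (and the escape character itself) are backslash-escaped inside values. This is an illustrative implementation, not psql's actual code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy 'value' into 'out', backslash-escaping the characters that
 * serve as separators (space and newline) plus backslash itself, so
 * a shell `read` loop can split fields unambiguously.  Returns the
 * number of bytes written (excluding the terminator). */
static size_t
escape_field(const char *value, char *out, size_t outlen)
{
    size_t n = 0;
    for (const char *p = value; *p && n + 2 < outlen; p++)
    {
        if (*p == ' ' || *p == '\n' || *p == '\\')
            out[n++] = '\\';
        out[n++] = *p;
    }
    out[n] = '\0';
    return n;
}
```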
Re: [HACKERS] No, pg_size_pretty(numeric) was not such a hot idea
On 5/27/12 2:54 PM, Euler Taveira wrote: On 27-05-2012 10:45, Fujii Masao wrote: OK, let me propose another approach: add pg_size_pretty(int). If we do this, all usability and performance problems will be solved. I wouldn't like to add another function but if it solves both problems... +1. FWIW, I would argue that the case of pg_size_pretty(8*1024*1024) is pretty contrived... when would you actually do something like that? ISTM that any time you're using pg_size_pretty you'd be coming off a real datatype. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
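For reference, the kind of formatting under discussion boils down to dividing by 1024 until the value fits a unit. This sketch is deliberately simplified - the backend's real pg_size_pretty uses different thresholds and rounding rules - but it shows the shape of the computation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Render a byte count in the style of a pg_size_pretty-like
 * function: scale by 1024 per step through bytes/kB/MB/GB/TB.
 * (Simplified: the actual backend function promotes to the next
 * unit at different cutoffs and rounds differently.) */
static void
pretty_size(int64_t bytes, char *out, size_t outlen)
{
    static const char *units[] = { "bytes", "kB", "MB", "GB", "TB" };
    int u = 0;
    double v = (double) bytes;

    while (v >= 1024.0 && u < 4)
    {
        v /= 1024.0;
        u++;
    }
    snprintf(out, outlen, "%.0f %s", v, units[u]);
}
```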
Re: [HACKERS] Bug in new buffering GiST build code
On Mon, May 28, 2012 at 1:46 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Sat, May 26, 2012 at 12:33 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Alexander, do you still have the test environments and data lying around that you used for GiST buffering testing last summer? Could you rerun some of those tests with this patch? I think I can restore the test environment and data. Will rerun tests soon. I reran some of the tests. Here are the index build times in seconds for the old and new way of refinding the parent:

                 old   new
usnoa2          2385  2452
usnoa2_shuffled 8131  8055
uniform         8327  8359

I think the difference can be explained by random variation. The indexes seem to be exactly the same. That's expected, because changing the algorithm for refinding the parent shouldn't change the result. -- With best regards, Alexander Korotkov.
Re: [HACKERS] No, pg_size_pretty(numeric) was not such a hot idea
Jim Nasby j...@nasby.net writes: On 5/27/12 2:54 PM, Euler Taveira wrote: On 27-05-2012 10:45, Fujii Masao wrote: OK, let me propose another approach: add pg_size_pretty(int). I wouldn't like to add another function but if it solves both problems... +1. FWIW, I would argue that the case of pg_size_pretty(8*1024*1024) is pretty contrived... Yeah, possibly. In any case, I don't think we're making either of these changes in 9.2, because the time for forcing initdbs is past. It would only be realistic to think about changing pg_size_pretty() if we come across some other, much more compelling reason to force a system catalog contents change. Assuming that's how 9.2 ships, we might as well wait to see if there are any real complaints from the field before we decide whether any changing is needed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
On Jun4, 2012, at 18:38 , Kohei KaiGai wrote: 2012/6/4 Florian Pflug f...@phlo.org: Without something like RLSBYPASS, the DBA needs to have intimate knowledge about the different RLS policies to e.g. guarantee that his backups aren't missing crucial information, or that the replication system indeed replicates all rows. With RLSBYPASS, all he needs to do is grant one privilege to his replication or backup user. The rest can be left to the development or support team for a specific application. It seems to me you can define a function which implements a site-specific security requirement (e.g. backup should not be prevented by RLS policy), then include it as part of the RLS policy (or have it implicitly added by extensions, as sepgsql tries to do). Sure. But that requires each and every application which uses RLS to provide support for special backup (or replication, or whatever) privileges. And it requires the DBA to know how to assign these privileges to certain roles for each and every application in question. For shops which use a lot of different applications, all with their own RLS policies, that can quickly get out of hand. Plus, a bug in any one of these RLS policies has the potential to render backups incomplete. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Interface of Row Level Security
Florian Pflug f...@phlo.org writes: On Jun4, 2012, at 18:38 , Kohei KaiGai wrote: 2012/6/4 Florian Pflug f...@phlo.org: Without something like RLSBYPASS, the DBA needs to have intimate knowledge about the different RLS policies to e.g. guarantee that his backups aren't missing crucial information, or that the replication system indeed replicates all rows. It seems to me you can define a function which implements a site-specific security requirement (e.g. backup should not be prevented by RLS policy), then include it as part of the RLS policy (or have it implicitly added by extensions, as sepgsql tries to do). Sure. But that requires each and every application which uses RLS to provide support for special backup (or replication, or whatever) privileges. And it requires the DBA to know how to assign these privileges to certain roles for each and every application in question. For shops which use a lot of different applications, all with their own RLS policies, that can quickly get out of hand. Plus, a bug in any one of these RLS policies has the potential to render backups incomplete. I agree with Florian here: if there is no way to disable RLS for backups, the database will be un-administratable. RLSBYPASS is not necessarily the only or best way to provide such an override, but we have to have something that is simple, foolproof, and *not* dependent on the details of any one RLS policy. I suspect that KaiGai-san's objection basically comes down to not wanting to have what amounts to a backdoor in RLS policies. However, what Florian is saying is that you have to have a backdoor anyway, unless you'd like to be at risk of losing data because it wasn't backed up. You can either have one well-understood, well-documented, well-tested backdoor, or you can have an ad-hoc backdoor in each RLS policy. Nobody can think that the latter approach is preferable. 
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
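The single well-tested backdoor Tom argues for can be modeled as one check ahead of policy evaluation. Everything below is hypothetical (RLSBYPASS is a proposed privilege, and these structs are invented for illustration): a role holding the bypass bit sees every row no matter what the per-application policy function does, which is exactly what a backup or replication role needs.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;

/* Hypothetical per-role capability bit. */
typedef struct
{
    Oid  roleid;
    bool has_rlsbypass;
} RoleInfo;

/* A table's RLS policy, abstracted as a predicate over (role, row). */
typedef bool (*PolicyFunc)(Oid roleid, int row_owner);

/* A row is visible if the role bypasses RLS outright, or if the
 * policy admits it.  The bypass path never touches the policy, so a
 * buggy policy cannot make a backup incomplete. */
static bool
row_visible(const RoleInfo *role, PolicyFunc policy, int row_owner)
{
    if (role->has_rlsbypass)
        return true;
    return policy(role->roleid, row_owner);
}

/* Example policy: a role sees only its own rows. */
static bool
owner_only_policy(Oid roleid, int row_owner)
{
    return (int) roleid == row_owner;
}
```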
[HACKERS] 9.3: load path to mitigate load penalty for checksums
Introduction:
=============
A lot of progress has been made on the checksums issue, with Simon's excellent work beginning to gain consensus: http://archives.postgresql.org/message-id/CA+u5nmkw_gbs6qq_y8-rjgl1v7mvw2hwbhartb8lojhnpfx...@mail.gmail.com For the purposes of this proposal, I'll assume that's the general direction we'll be taking for CRCs. The major drawback of that proposal is that it increases the amount of work to be done after a large data load by requiring more WAL.

Proposal:
=========
I propose a special user-initiated loading mode at the table granularity. During this time, readers must ignore PD_ALL_VISIBLE, HEAP_XMIN_COMMITTED, and the visibility map entirely. However, writers may set all of those bits before the writing transaction commits, obviating the need to rewrite (and WAL) the data again later. Ideally, there would be no work for VACUUM to do after the data load (unless a transaction aborted). This would also help the non-CRC case of course, but I expect CRCs to make this significantly more important.

Goals:
======
* Table granularity (doesn't affect other tables at all)
* Allows concurrent loaders
* Allows loading into existing tables with existing data
* Online (allow reads to proceed, even if degraded)

Rough user interaction:
=======================
INITIATE LOAD ON foo AS 'job name';
-- run concurrent loading sessions
FINALIZE LOAD 'job name';

High-level design:
==================
By "hints" I mean the VM bit, PD_ALL_VISIBLE, and HEAP_XMIN_COMMITTED. By "ignorehints" I mean a flag in pg_class indicating that readers should ignore hints. By "optimistichints" I mean a flag in pg_class indicating that writers can optimistically set hints. Obviously, readers and writers would need a mechanism to honor those flags, but I haven't dug into the details yet (additional routines in tqual.c?). 
States:
=======
0: normal
   * ignorehints = false
   * optimistichints = false
1: trying to enter data load mode, waiting on existing lockers (who might be scanning) to finish
   * ignorehints = true
   * optimistichints = false
2: data load mode
   * ignorehints = true
   * optimistichints = true
3: trying to leave data load mode, waiting on old snapshots to be released and aborted transactions to be cleaned up
   * ignorehints = true
   * optimistichints = false

INITIATE LOAD would first transition from state 0 to 1 by acquiring a ShareUpdateExclusiveLock on the table (to be sure no concurrent INITIATE or FINALIZE LOAD is going on) and setting ignorehints = true. Then it moves from state 1 to state 2 by waiting for all transactions that hold a lock on that table. Any transactions that don't already have a lock will see the new flag when they try to get it. Now we're sure that all readers will see the ignorehints flag, so we can set the optimistichints flag to indicate that writers can write hints optimistically.

FINALIZE LOAD would first move from state 2 to state 3 by acquiring a ShareUpdateExclusiveLock on the table and setting optimistichints = false. Then, it would move from state 3 to state 0 by first waiting for all transactions that currently hold a lock on the table, to ensure they see the optimistichints = false flag. Then, it would remember the current xid as max_loader_xid, and wait until the global xmin is greater than max_loader_xid. This should ensure that all snapshots regard all loading transactions as complete. Also, it would need to run a lazy VACUUM to remove any tuples from aborted transactions.

Details and optimizations:
==========================
* We probably want a graceful way to handle multiple data loads happening on the same table. Rather than erroring out, we could treat it more like a reference count, and only do the work to move into data load mode if not there already, and only move out of data load mode if we're the last loading job on the table. 
* In general, there are some usability issues to sort out, to make sure a table isn't put into data load mode and left that way. Right now, I'm mostly concerned with getting a working design, but those will be important, too.
* We could optimize away the VACUUM going from 3 -> 0 if we are sure no writing transactions aborted.
* INITIATE and FINALIZE probably need to use PreventTransactionChain() and multiple transactions, to avoid holding the ShareUpdateExclusiveLock for too long. Also, we want to keep people from using them in the same transaction as the loading xact, because they might not realize that they would get a concurrency of 1 that way (because of the ShareUpdateExclusiveLock).

Thoughts?

Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
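Jeff's four states and two pg_class flags form a small state machine. The toy encoding below (all names invented for illustration) compresses the waits into comments, but shows which flag settings accompany each transition:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the proposed per-table load-mode state machine. */
typedef enum { LOAD_NORMAL, LOAD_ENTERING, LOAD_ACTIVE, LOAD_LEAVING } LoadState;

typedef struct
{
    LoadState state;
    bool ignorehints;      /* readers must ignore hint bits */
    bool optimistichints;  /* writers may set hint bits early */
} TableLoadMode;

/* INITIATE LOAD: 0 -> 1 sets ignorehints; once transactions already
 * holding a lock on the table have drained, 1 -> 2 allows writers
 * to set hints optimistically. */
static void
initiate_load(TableLoadMode *t)
{
    t->state = LOAD_ENTERING;
    t->ignorehints = true;
    /* ...wait out existing lockers who might be scanning... */
    t->state = LOAD_ACTIVE;
    t->optimistichints = true;
}

/* FINALIZE LOAD: 2 -> 3 stops optimistic hinting; once the global
 * xmin passes max_loader_xid and a lazy VACUUM has cleaned up any
 * aborted loaders' tuples, 3 -> 0 restores normal reads. */
static void
finalize_load(TableLoadMode *t)
{
    t->state = LOAD_LEAVING;
    t->optimistichints = false;
    /* ...wait for old snapshots, then VACUUM... */
    t->state = LOAD_NORMAL;
    t->ignorehints = false;
}
```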