Re: [HACKERS] should I post the patch as committed?
On Tue, Apr 20, 2010 at 10:41 PM, Alvaro Herrera alvhe...@commandprompt.com wrote:
> I think committing a patch from a non-regular is a special case and attaching the modified patch is reasonable in that case.
>
> My 8.8 Richter ...

Or maybe just mention the commit id for easy lookup in the git log.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] testing HS/SR - 1 vs 2 performance
On Thu, 2010-04-22 at 08:57 +0300, Heikki Linnakangas wrote: I think the assert is a good idea. If there's no real problem here, the assert won't trip. It's just a safety precaution. Right. And assertions also act as documentation, they are a precise and compact way to document invariants we assume to hold. A comment explaining why the cyclic nature of XIDs is not a problem would be nice too, in addition or instead of the assertions. Agreed. I was going to reply just that earlier but have been distracted on other things. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
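The two ideas in this exchange (assertions as compact documentation of an invariant, and the cyclic nature of XIDs) can be illustrated with a small sketch. This is not the patch under discussion. `xid_precedes` and `advance_head` are hypothetical names, and the special XIDs that PostgreSQL reserves are ignored here.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/*
 * Hypothetical helper: true if a logically precedes b on the XID circle.
 * Taking the signed difference keeps the comparison valid across
 * wraparound, as long as the two XIDs are less than 2^31 apart.
 */
static int
xid_precedes(TransactionId a, TransactionId b)
{
    int32_t diff = (int32_t) (a - b);

    return diff < 0;
}

/*
 * Hypothetical caller: the assertion documents the assumed invariant that
 * the head only ever moves forward in circular XID order.
 */
static TransactionId
advance_head(TransactionId head, TransactionId next)
{
    assert(head == next || xid_precedes(head, next));
    return next;
}
```

If the invariant really holds, the assertion never trips. The comment above `xid_precedes` plays the role of the requested explanation of why wraparound is harmless.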
[HACKERS] Hot Standby b-tree delete records review
btree_redo:

    case XLOG_BTREE_DELETE:

        /*
         * Btree delete records can conflict with standby queries. You
         * might think that vacuum records would conflict as well, but
         * we've handled that already. XLOG_HEAP2_CLEANUP_INFO records
         * provide the highest xid cleaned by the vacuum of the heap
         * and so we can resolve any conflicts just once when that
         * arrives. After that any we know that no conflicts exist
         * from individual btree vacuum records on that index.
         */
        {
            TransactionId latestRemovedXid = btree_xlog_delete_get_latestRemovedXid(record);
            xl_btree_delete *xlrec = (xl_btree_delete *) XLogRecGetData(record);

            /*
             * XXX Currently we put everybody on death row, because
             * currently _bt_delitems() supplies InvalidTransactionId.
             * This can be fairly painful, so providing a better value
             * here is worth some thought and possibly some effort to
             * improve.
             */
            ResolveRecoveryConflictWithSnapshot(latestRemovedXid, xlrec->node);
        }
        break;

The XXX comment is out-of-date, the latestRemovedXid value is calculated by btree_xlog_delete_get_latestRemovedXid() nowadays.

If we're re-replaying the WAL record, for example after restarting the standby server, btree_xlog_delete_get_latestRemovedXid() won't find the deleted records and will return InvalidTransactionId. That's OK, because until we reach again the point in WAL where we were before the restart, we don't accept read-only connections so there's no-one to kill anyway, but you do get a useless "Invalid latestRemovedXid reported, using latestCompletedXid instead" message in the log (that shouldn't be capitalized, BTW).

It would be nice to check if there's any potentially conflicting read-only queries before calling btree_xlog_delete_get_latestRemovedXid(), which can be quite expensive.

If the "Invalid latestRemovedXid reported, using latestCompletedXid instead" message is going to happen commonly, I think it should be downgraded to DEBUG1. If it's an unexpected scenario, it should be upgraded to WARNING.

In btree_xlog_delete_get_latestRemovedXid:

    Assert(num_unused == 0);

Can't that happen as well in a re-replay scenario, if a heap item was vacuumed away later on?

    /*
     * Note that if all heap tuples were LP_DEAD then we will be
     * returning InvalidTransactionId here. This seems very unlikely
     * in practice.
     */

If none of the removed heap tuples were present anymore, we currently return InvalidTransactionId, which kills/waits out all read-only queries. But if none of the tuples were present anymore, the read-only queries wouldn't have seen them anyway, so ISTM that we should treat InvalidTransactionId return value as "we don't need to kill anyone".

Why does btree_xlog_delete_get_latestRemovedXid() keep the num_unused/num_dead/num_redirect counts, it doesn't actually do anything with them.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
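The suggestion to check for potentially conflicting read-only queries before doing the expensive lookup might look roughly like the sketch below. It is a standalone simulation, not PostgreSQL code: `lookup_latest_removed_xid`, `replay_btree_delete` and the `expensive_lookups` counter are hypothetical stand-ins, and the count of active standby queries is assumed to be available cheaply.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Counts how often the simulated expensive heap lookup actually runs. */
static int expensive_lookups = 0;

/* Stand-in for the costly visit to the heap pages named in the WAL record. */
static TransactionId
lookup_latest_removed_xid(TransactionId value_from_heap)
{
    expensive_lookups++;
    return value_from_heap;
}

/*
 * Hypothetical replay step: returns the XID to resolve conflicts against,
 * or InvalidTransactionId when the lookup was skipped because no
 * read-only queries are running.
 */
static TransactionId
replay_btree_delete(int active_standby_queries, TransactionId value_from_heap)
{
    assert(active_standby_queries >= 0);

    if (active_standby_queries == 0)
        return InvalidTransactionId;    /* nobody to conflict with */

    return lookup_latest_removed_xid(value_from_heap);
}
```

With this shape, a re-replay after a standby restart (when no read-only connections are accepted yet) never pays for the lookup. It also means InvalidTransactionId is returned far more often, which bears on the log-level question raised above.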
Re: [HACKERS] Assertion failure twophase.c (3) (testing HS/SR)
Can you still reproduce this, or have some of the changes since then fixed it? We never quite figured out the cause.

Erik Rijkers wrote:
> On Thu, March 4, 2010 17:00, Erik Rijkers wrote:
> > in a 9.0devel, primary+standby, cvs from 2010.03.04 01:30
> >
> > With three patches:
> >   new_smart_shutdown_20100201.patch
> >   extend_format_of_recovery_info_funcs_v4.20100303.patch
> >   fix-KnownAssignedXidsRemoveMany-1.patch
> >
> >   pg_dump -d $db8.4.2 | psql -d $db9.0devel-primary
> >
> > FailedAssertion, File: twophase.c, Line: 1201.
>
> For the record, this still happens (FailedAssertion, File: twophase.c, Line: 1201.) (created 2010.03.13 23:49 cvs).
>
> Unfortunately, it does not happen always, or predictably.
>
> patches:
>   new_smart_shutdown_20100201.patch
>   extend_format_of_recovery_info_funcs_v4.20100303.patch
>   (both here: http://archives.postgresql.org/pgsql-hackers/2010-03/msg00446.php )
>   (fix-KnownAssignedXidsRemoveMany-1.patch has been committed, I think?)
>
> I use commandlines like this to copy schemas across from 8.4.2 to 9.0devel:
>
>   pg_dump -c -h /tmp -p 5432 -n myschema --no-owner --no-privileges mydb \
>     | psql -1qtA -h /tmp -p 7575 -d replicas
>
> (the copied schemas were together 175 GB)
>
> As I seem to be the only one who finds this, I started looking what could be unique in this install: and it would be postbio, which we use for its gist-indexing on ranges (http://pgfoundry.org/projects/postbio/). We use postbio's int_interval type as a column type. But keep in mind that sometimes the whole dump+restore+replication completes OK.
>
> Other installed modules are:
>   contrib/btree_gist
>   contrib/seg
>   contrib/adminpack
>
> log_line_prefix = '%t %p %d %u start=%s '
>
> # slave pgsql.sr_hotslave/logfile:
> 2010-03-13 23:54:59 CET 15765 start=2010-03-13 23:54:59 CET LOG: database system was interrupted; last known up at 2010-03-13 23:54:31 CET
> cp: cannot stat `/var/data1/pg_stuff/dump/hotslave/replication_archive/00010001': No such file or directory
> 2010-03-13 23:55:00 CET 15765 start=2010-03-13 23:54:59 CET LOG: entering standby mode
> 2010-03-13 23:55:00 CET 15765 start=2010-03-13 23:54:59 CET LOG: redo starts at 0/120
> 2010-03-13 23:55:00 CET 15765 start=2010-03-13 23:54:59 CET LOG: consistent recovery state reached at 0/200
> 2010-03-13 23:55:00 CET 15763 start=2010-03-13 23:54:59 CET LOG: database system is ready to accept read only connections
> TRAP: FailedAssertion(!(((xid) != ((TransactionId) 0))), File: twophase.c, Line: 1201)
> 2010-03-14 05:28:59 CET 15763 start=2010-03-13 23:54:59 CET LOG: startup process (PID 15765) was terminated by signal 6: Aborted
> 2010-03-14 05:28:59 CET 15763 start=2010-03-13 23:54:59 CET LOG: terminating any other active server processes
>
> Maybe I'll try now to setup a similar instance without postbio, to see if the crash still occurs.
>
> hth,
>
> Erik Rijkers

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot Standby b-tree delete records review
On Thu, 2010-04-22 at 10:24 +0300, Heikki Linnakangas wrote:
> btree_redo:
>
>     case XLOG_BTREE_DELETE:
>
>         /*
>          * Btree delete records can conflict with standby queries. You
>          * might think that vacuum records would conflict as well, but
>          * we've handled that already. XLOG_HEAP2_CLEANUP_INFO records
>          * provide the highest xid cleaned by the vacuum of the heap
>          * and so we can resolve any conflicts just once when that
>          * arrives. After that any we know that no conflicts exist
>          * from individual btree vacuum records on that index.
>          */
>         {
>             TransactionId latestRemovedXid = btree_xlog_delete_get_latestRemovedXid(record);
>             xl_btree_delete *xlrec = (xl_btree_delete *) XLogRecGetData(record);
>
>             /*
>              * XXX Currently we put everybody on death row, because
>              * currently _bt_delitems() supplies InvalidTransactionId.
>              * This can be fairly painful, so providing a better value
>              * here is worth some thought and possibly some effort to
>              * improve.
>              */
>             ResolveRecoveryConflictWithSnapshot(latestRemovedXid, xlrec->node);
>         }
>         break;
>
> The XXX comment is out-of-date, the latestRemovedXid value is calculated by btree_xlog_delete_get_latestRemovedXid() nowadays.

Removed, thanks.

> If we're re-replaying the WAL record, for example after restarting the standby server, btree_xlog_delete_get_latestRemovedXid() won't find the deleted records and will return InvalidTransactionId. That's OK, because until we reach again the point in WAL where we were before the restart, we don't accept read-only connections so there's no-one to kill anyway, but you do get a useless "Invalid latestRemovedXid reported, using latestCompletedXid instead" message in the log (that shouldn't be capitalized, BTW).
>
> It would be nice to check if there's any potentially conflicting read-only queries before calling btree_xlog_delete_get_latestRemovedXid(), which can be quite expensive.

Good idea. You're welcome to add such tuning yourself, if you like.

> If the "Invalid latestRemovedXid reported, using latestCompletedXid instead" message is going to happen commonly, I think it should be downgraded to DEBUG1. If it's an unexpected scenario, it should be upgraded to WARNING.

Set to DEBUG because the above optimisation makes it return invalid much more frequently, which we don't want reported.

> In btree_xlog_delete_get_latestRemovedXid:
>
>     Assert(num_unused == 0);
>
> Can't that happen as well in a re-replay scenario, if a heap item was vacuumed away later on?

OK, will remove. The re-replay gets me every time.

>     /*
>      * Note that if all heap tuples were LP_DEAD then we will be
>      * returning InvalidTransactionId here. This seems very unlikely
>      * in practice.
>      */
>
> If none of the removed heap tuples were present anymore, we currently return InvalidTransactionId, which kills/waits out all read-only queries. But if none of the tuples were present anymore, the read-only queries wouldn't have seen them anyway, so ISTM that we should treat InvalidTransactionId return value as "we don't need to kill anyone".

That's not the point. The tuples were not themselves the sole focus, they indicated the latestRemovedXid of the backend on the primary that had performed the deletion. So even if those tuples are no longer present there may be others with similar xids that would conflict, so we cannot ignore. Comment updated.

> Why does btree_xlog_delete_get_latestRemovedXid() keep the num_unused/num_dead/num_redirect counts, it doesn't actually do anything with them.

Probably a debug tool. Removed.

Changes committed. Thanks for the review.

--
Simon Riggs www.2ndQuadrant.com
Re: [HACKERS] Hot Standby b-tree delete records review
Simon Riggs wrote:
> On Thu, 2010-04-22 at 10:24 +0300, Heikki Linnakangas wrote:
> > btree_redo:
> >     /*
> >      * Note that if all heap tuples were LP_DEAD then we will be
> >      * returning InvalidTransactionId here. This seems very unlikely
> >      * in practice.
> >      */
> >
> > If none of the removed heap tuples were present anymore, we currently return InvalidTransactionId, which kills/waits out all read-only queries. But if none of the tuples were present anymore, the read-only queries wouldn't have seen them anyway, so ISTM that we should treat InvalidTransactionId return value as "we don't need to kill anyone".
>
> That's not the point. The tuples were not themselves the sole focus,

Yes, they were. We're replaying a b-tree deletion record, which removes pointers to some heap tuples, making them unreachable to any read-only queries. If any of them still need to be visible to read-only queries, we have a conflict. But if all of the heap tuples are gone already, removing the index pointers to them can't change the situation for any query. If any of them should've been visible to a query, the damage was done already by whoever pruned the heap tuples leaving just the tombstone LP_DEAD item pointers (in the heap) behind.

Or do we use the latestRemovedXid value for something else as well?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot Standby b-tree delete records review
On Thu, 2010-04-22 at 11:28 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2010-04-22 at 10:24 +0300, Heikki Linnakangas wrote:
> > > btree_redo:
> > >     /*
> > >      * Note that if all heap tuples were LP_DEAD then we will be
> > >      * returning InvalidTransactionId here. This seems very unlikely
> > >      * in practice.
> > >      */
> > >
> > > If none of the removed heap tuples were present anymore, we currently return InvalidTransactionId, which kills/waits out all read-only queries. But if none of the tuples were present anymore, the read-only queries wouldn't have seen them anyway, so ISTM that we should treat InvalidTransactionId return value as "we don't need to kill anyone".
> >
> > That's not the point. The tuples were not themselves the sole focus,
>
> Yes, they were. We're replaying a b-tree deletion record, which removes pointers to some heap tuples, making them unreachable to any read-only queries. If any of them still need to be visible to read-only queries, we have a conflict. But if all of the heap tuples are gone already, removing the index pointers to them can't change the situation for any query. If any of them should've been visible to a query, the damage was done already by whoever pruned the heap tuples leaving just the tombstone LP_DEAD item pointers (in the heap) behind.

You're missing my point. Those tuples are indicators of what may lie elsewhere in the database, completely unreferenced by this WAL record. Just because these referenced tuples are gone doesn't imply that all tuple versions written by the as yet-unknown-xids are also gone. We can't infer anything about the whole database just from one small group of records.

--
Simon Riggs www.2ndQuadrant.com
Re: [HACKERS] Hot Standby b-tree delete records review
Simon Riggs wrote:
> On Thu, 2010-04-22 at 11:28 +0300, Heikki Linnakangas wrote:
> > Simon Riggs wrote:
> > > On Thu, 2010-04-22 at 10:24 +0300, Heikki Linnakangas wrote:
> > > > btree_redo:
> > > >     /*
> > > >      * Note that if all heap tuples were LP_DEAD then we will be
> > > >      * returning InvalidTransactionId here. This seems very unlikely
> > > >      * in practice.
> > > >      */
> > > >
> > > > If none of the removed heap tuples were present anymore, we currently return InvalidTransactionId, which kills/waits out all read-only queries. But if none of the tuples were present anymore, the read-only queries wouldn't have seen them anyway, so ISTM that we should treat InvalidTransactionId return value as "we don't need to kill anyone".
> > >
> > > That's not the point. The tuples were not themselves the sole focus,
> >
> > Yes, they were. We're replaying a b-tree deletion record, which removes pointers to some heap tuples, making them unreachable to any read-only queries. If any of them still need to be visible to read-only queries, we have a conflict. But if all of the heap tuples are gone already, removing the index pointers to them can't change the situation for any query. If any of them should've been visible to a query, the damage was done already by whoever pruned the heap tuples leaving just the tombstone LP_DEAD item pointers (in the heap) behind.
>
> You're missing my point. Those tuples are indicators of what may lie elsewhere in the database, completely unreferenced by this WAL record. Just because these referenced tuples are gone doesn't imply that all tuple versions written by the as yet-unknown-xids are also gone. We can't infer anything about the whole database just from one small group of records.

Have you got an example of that?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot Standby b-tree delete records review
On Thu, 2010-04-22 at 11:56 +0300, Heikki Linnakangas wrote:
> > > > > If none of the removed heap tuples were present anymore, we currently return InvalidTransactionId, which kills/waits out all read-only queries. But if none of the tuples were present anymore, the read-only queries wouldn't have seen them anyway, so ISTM that we should treat InvalidTransactionId return value as "we don't need to kill anyone".
> > > >
> > > > That's not the point. The tuples were not themselves the sole focus,
> > >
> > > Yes, they were. We're replaying a b-tree deletion record, which removes pointers to some heap tuples, making them unreachable to any read-only queries. If any of them still need to be visible to read-only queries, we have a conflict. But if all of the heap tuples are gone already, removing the index pointers to them can't change the situation for any query. If any of them should've been visible to a query, the damage was done already by whoever pruned the heap tuples leaving just the tombstone LP_DEAD item pointers (in the heap) behind.
> >
> > You're missing my point. Those tuples are indicators of what may lie elsewhere in the database, completely unreferenced by this WAL record. Just because these referenced tuples are gone doesn't imply that all tuple versions written by the as yet-unknown-xids are also gone. We can't infer anything about the whole database just from one small group of records.
>
> Have you got an example of that?

I don't need one, I have suggested the safe route. In order to infer anything, and thereby further optimise things, we would need proof that no cases can exist, which I don't have. Perhaps we can add yet, not sure about that either.

--
Simon Riggs www.2ndQuadrant.com
Re: [HACKERS] Hot Standby b-tree delete records review
Simon Riggs wrote:
> On Thu, 2010-04-22 at 11:56 +0300, Heikki Linnakangas wrote:
> > > > > > If none of the removed heap tuples were present anymore, we currently return InvalidTransactionId, which kills/waits out all read-only queries. But if none of the tuples were present anymore, the read-only queries wouldn't have seen them anyway, so ISTM that we should treat InvalidTransactionId return value as "we don't need to kill anyone".
> > > > >
> > > > > That's not the point. The tuples were not themselves the sole focus,
> > > >
> > > > Yes, they were. We're replaying a b-tree deletion record, which removes pointers to some heap tuples, making them unreachable to any read-only queries. If any of them still need to be visible to read-only queries, we have a conflict. But if all of the heap tuples are gone already, removing the index pointers to them can't change the situation for any query. If any of them should've been visible to a query, the damage was done already by whoever pruned the heap tuples leaving just the tombstone LP_DEAD item pointers (in the heap) behind.
> > >
> > > You're missing my point. Those tuples are indicators of what may lie elsewhere in the database, completely unreferenced by this WAL record. Just because these referenced tuples are gone doesn't imply that all tuple versions written by the as yet-unknown-xids are also gone. We can't infer anything about the whole database just from one small group of records.
> >
> > Have you got an example of that?
>
> I don't need one, I have suggested the safe route. In order to infer anything, and thereby further optimise things, we would need proof that no cases can exist, which I don't have. Perhaps we can add yet, not sure about that either.

It's good to be safe rather than sorry, but I'd still like to know because I'm quite surprised by that, and got me worried that I don't understand how hot standby works as well as I thought I did.

I thought the point of stopping replay/killing queries at a b-tree deletion record is precisely that it makes some heap tuples invisible to running read-only queries. If it doesn't make any tuples invisible, why do any queries need to be killed? And why was it OK for them to be running just before replaying the b-tree deletion record?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
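The disagreement in this thread comes down to what an InvalidTransactionId result should mean for running standby queries. The sketch below only contrasts the two positions and does not settle which is correct. `query_conflicts` and the policy names are hypothetical, XID wraparound is ignored for brevity, and the "conservative" branch is simplified to "conflict with every query" (the thread describes it as killing/waiting out all read-only queries).

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

typedef enum
{
    POLICY_CONSERVATIVE,        /* invalid XID: treat every query as conflicting */
    POLICY_SKIP_WHEN_INVALID    /* invalid XID: nothing removed, no conflict */
} InvalidXidPolicy;

/*
 * Hypothetical conflict test for one standby query. snapshot_xmin is the
 * oldest XID that the query's snapshot still treats as running; a query
 * conflicts when rows removed by latest_removed_xid might still be
 * visible to it.
 */
static int
query_conflicts(TransactionId snapshot_xmin,
                TransactionId latest_removed_xid,
                InvalidXidPolicy policy)
{
    assert(snapshot_xmin != InvalidTransactionId);

    if (latest_removed_xid == InvalidTransactionId)
        return policy == POLICY_CONSERVATIVE;

    return snapshot_xmin <= latest_removed_xid;
}
```

Under `POLICY_CONSERVATIVE` a query is cancelled even though the record made nothing newly invisible to it, which is the cost being questioned. Under `POLICY_SKIP_WHEN_INVALID` correctness depends on the unproven claim that no other conflicting tuple versions exist.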
Re: [HACKERS] Hot Standby b-tree delete records review
On Thu, 2010-04-22 at 12:18 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2010-04-22 at 11:56 +0300, Heikki Linnakangas wrote:
> > > > > > > If none of the removed heap tuples were present anymore, we currently return InvalidTransactionId, which kills/waits out all read-only queries. But if none of the tuples were present anymore, the read-only queries wouldn't have seen them anyway, so ISTM that we should treat InvalidTransactionId return value as "we don't need to kill anyone".
> > > > > >
> > > > > > That's not the point. The tuples were not themselves the sole focus,
> > > > >
> > > > > Yes, they were. We're replaying a b-tree deletion record, which removes pointers to some heap tuples, making them unreachable to any read-only queries. If any of them still need to be visible to read-only queries, we have a conflict. But if all of the heap tuples are gone already, removing the index pointers to them can't change the situation for any query. If any of them should've been visible to a query, the damage was done already by whoever pruned the heap tuples leaving just the tombstone LP_DEAD item pointers (in the heap) behind.
> > > >
> > > > You're missing my point. Those tuples are indicators of what may lie elsewhere in the database, completely unreferenced by this WAL record. Just because these referenced tuples are gone doesn't imply that all tuple versions written by the as yet-unknown-xids are also gone. We can't infer anything about the whole database just from one small group of records.
> > >
> > > Have you got an example of that?
> >
> > I don't need one, I have suggested the safe route. In order to infer anything, and thereby further optimise things, we would need proof that no cases can exist, which I don't have. Perhaps we can add yet, not sure about that either.
>
> It's good to be safe rather than sorry, but I'd still like to know because I'm quite surprised by that, and got me worried that I don't understand how hot standby works as well as I thought I did.
>
> I thought the point of stopping replay/killing queries at a b-tree deletion record is precisely that it makes some heap tuples invisible to running read-only queries. If it doesn't make any tuples invisible, why do any queries need to be killed? And why was it OK for them to be running just before replaying the b-tree deletion record?

I'm sorry but I'm too busy to talk further on this today. Since we are discussing a further optimisation rather than a bug, I hope it is OK to come back to this again later.

--
Simon Riggs www.2ndQuadrant.com
Re: [HACKERS] don't allow walsender to consume superuser_reserved_connection slots, or during shutdown
On Wed, Apr 21, 2010 at 10:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
> > Here's the fine patch. The actual code changes are simple and seem to work as expected, but I struggled a bit with the phrasing of the messages. Feel free to suggest improvements.
>
> Stick with the original wording? I don't really see a need to change it.

I don't think that's a good idea. If we just say that the remaining connection slots are for superusers, someone will inevitably ask us why their superuser replication can't connect. I think it's important to phrase things as accurately as possible.

...Robert
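The behaviour named in the subject line (walsender connections must not consume superuser-reserved slots) can be sketched as a small decision function. This is an illustration under stated assumptions, not the patch itself: `check_connection_slot`, its parameters and the enum are hypothetical, and `free_slots` is assumed to count the slots still free before this connection is admitted.

```c
#include <assert.h>

typedef enum
{
    CONN_ACCEPT,
    CONN_REJECT_RESERVED    /* only reserved slots remain and caller may not use them */
} ConnDecision;

/*
 * Hypothetical admission check. Ordinary slots are open to everyone;
 * the last reserved_slots slots are for superuser sessions that are
 * not replication (walsender) connections.
 */
static ConnDecision
check_connection_slot(int free_slots, int reserved_slots,
                      int is_superuser, int is_walsender)
{
    assert(free_slots >= 0 && reserved_slots >= 0);

    /* Ordinary slots are still available: anyone may connect. */
    if (free_slots > reserved_slots)
        return CONN_ACCEPT;

    /* Only reserved slots remain: superusers only, and never walsenders. */
    if (free_slots > 0 && is_superuser && !is_walsender)
        return CONN_ACCEPT;

    return CONN_REJECT_RESERVED;
}
```

The last branch is where the wording concern arises: a superuser replication connection is rejected here, so a message saying the remaining slots are "for superusers" would be misleading.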
Re: [HACKERS] Thread safety and libxml2
On Mon, Apr 19, 2010 at 10:52 AM, Robert Haas robertmh...@gmail.com wrote:
> On Thu, Feb 18, 2010 at 8:41 PM, Bruce Momjian br...@momjian.us wrote:
> > Peter Eisentraut wrote:
> > > On Wed, 2009-12-30 at 12:55 -0500, Greg Smith wrote:
> > > > Basically, configure failed on their OpenBSD system because thread safety is on but the libxml2 wasn't compiled with threaded support: http://xmlsoft.org/threads.html
> > > >
> > > > Disabling either feature (no --with-libxml or --disable-thread-safety) gives a working build.
> > >
> > > This could perhaps be fixed by excluding libxml when running the thread test. The thread result is only used in the client libraries and libxml is only used in the backend, so those two shouldn't meet each other in practice.
> > >
> > > The attached patch removes -lxml2 from the link line of the thread test program. Comments?
> >
> > Can anyone test this fixes the OpenBSD problem?
>
> Can someone take the time to test whether this patch fixes the problem? This is on the list of open items for PG 9.0, but considering that there's been a proposed patch available for almost two months and no responses to this thread, it may be time to conclude that nobody cares very much - in which case we can either remove this item or relocate it to the TODO list.

Since no one has responded to this, I'm moving this to the section of the open items list called "long-term issues: These items are not 9.0-specific. They should be fixed eventually, but not for now." I am inclined to think this isn't worth adding to the main TODO list. If someone complains about it again, we can ask them to test the patch. If not, I don't see much point in investing any more time in it.

...Robert
recovery_connections cannot start (was Re: [HACKERS] master in standby mode croaks)
On Sat, Apr 17, 2010 at 6:52 PM, Robert Haas robertmh...@gmail.com wrote:
> On Sat, Apr 17, 2010 at 6:41 PM, Simon Riggs si...@2ndquadrant.com wrote:
> > On Sat, 2010-04-17 at 17:44 -0400, Robert Haas wrote:
> > > > I will change the error message.
> > >
> > > I gave a good deal of thought to trying to figure out a cleaner solution to this problem than just changing the error message and failed. So let's change the error message. Of course I'm not quite sure what we should change it TO, given that the situation is the result of an interaction between three different GUCs and we have no way to distinguish which one(s) are the problem.
> >
> > "You need all three" covers it.
>
> Actually you need standby_connections and either archive_mode=on or max_wal_senders>0, I think.

One way we could fix this is use 2 bits rather than 1 for XLogStandbyInfoMode. One bit could indicate that either archive_mode=on or max_wal_senders>0, and the second bit could indicate that recovery_connections=on. If the second bit is unset, we could emit the existing complaint:

    recovery connections cannot start because the recovery_connections parameter is disabled on the WAL source server

If the other bit is unset, then we could instead complain:

    recovery connections cannot start because archive_mode=off and max_wal_senders=0 on the WAL source server

If we don't want to use two bits there, it's hard to really describe all the possibilities in a reasonable number of characters. The only thing I can think of is to print a message and a hint:

    recovery_connections cannot start due to incorrect settings on the WAL source server
    HINT: make sure recovery_connections=on and either archive_mode=on or max_wal_senders>0

I haven't checked whether the hint would be displayed in the log on the standby, but presumably we could make that be the case if it's not already.

I think the first way is better because it gives the user more specific information about what they need to fix. Thinking about how each case might happen, since the default for recovery_connections is 'on', it seems that recovery_connections=off will likely only be an issue if the user has explicitly turned it off. The other case, where archive_mode=off and max_wal_senders=0, will likely only occur if someone takes a snapshot of the master without first setting up archiving or SR. Both of these will probably happen relatively rarely, but since we're burning a whole byte for XLogStandbyInfoMode (plus 3 more bytes of padding?), it seems like we might as well snag one more bit for clarity.

Thoughts?

...Robert
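The two-bit proposal can be sketched as follows. The flag names and the function `standby_info_complaint` are hypothetical (the email only proposes "2 bits rather than 1"), and the case where both bits are unset is resolved in favour of the first message purely as an assumption. The message texts are the ones proposed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag names for the two proposed bits. */
#define STANDBY_INFO_WAL_SHIPPING   0x01    /* archive_mode=on or max_wal_senders>0 */
#define STANDBY_INFO_RECOVERY_CONNS 0x02    /* recovery_connections=on */

static const char MSG_RECOVERY_CONNS_OFF[] =
    "recovery connections cannot start because the recovery_connections "
    "parameter is disabled on the WAL source server";

static const char MSG_NO_WAL_SHIPPING[] =
    "recovery connections cannot start because archive_mode=off and "
    "max_wal_senders=0 on the WAL source server";

/*
 * Returns the complaint to log on the standby, or NULL when the WAL
 * source server's settings allow recovery connections.
 */
static const char *
standby_info_complaint(unsigned char flags)
{
    /* Only two bits are defined in this sketch. */
    assert((flags & ~(STANDBY_INFO_WAL_SHIPPING | STANDBY_INFO_RECOVERY_CONNS)) == 0);

    if (!(flags & STANDBY_INFO_RECOVERY_CONNS))
        return MSG_RECOVERY_CONNS_OFF;
    if (!(flags & STANDBY_INFO_WAL_SHIPPING))
        return MSG_NO_WAL_SHIPPING;
    return NULL;
}
```

Compared with the single message plus HINT, each branch names exactly the setting the user has to change, at the cost of one extra bit in the record.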
Re: [HACKERS] BETA
On Wed, Apr 21, 2010 at 9:41 AM, Marc G. Fournier scra...@hub.org wrote:
> On Wed, 21 Apr 2010, Robert Haas wrote:
> > Well, never mind that then. How about a beta next week?
>
> I'm good for that ...

Anyone else want to weigh in for or against this?

...Robert
Re: [HACKERS] BETA
On Thu, 22 Apr 2010, Robert Haas wrote:
> On Wed, Apr 21, 2010 at 9:41 AM, Marc G. Fournier scra...@hub.org wrote:
> > On Wed, 21 Apr 2010, Robert Haas wrote:
> > > Well, never mind that then. How about a beta next week?
> >
> > I'm good for that ...
>
> Anyone else want to weigh in for or against this?

We're discussing scheduling on -core right now, triggered by your email, and will put out a notice shortly ... although we did just do a back branch release, we have a second one that has to be done, so we're trying to balance schedules around doing both, but not simultaneously ...

Marc G. Fournier    Hub.Org Hosting Solutions S.A.
scra...@hub.org    http://www.hub.org
Yahoo: yscrappy    Skype: hub.org    ICQ: 7615664    MSN: scra...@hub.org
Re: [HACKERS] BETA
On Thu, Apr 22, 2010 at 12:18 PM, Marc G. Fournier scra...@hub.org wrote:
> We're discussing scheduling on -core right now, triggered by your email, and will put out a notice shortly ... although we did just do a back branch release, we have a second one that has to be done, so we're trying to balance schedules around doing both, but not simultaneously ...

OK, thanks!

...Robert
Re: [HACKERS] testing HS/SR - 1 vs 2 performance
On Sun, April 18, 2010 13:01, Simon Riggs wrote:

OK, I'll put a spinlock around access to the head of the array. v2 patch attached

knownassigned_sortedarray.v2.diff applied to cvs HEAD (2010.04.21 22:36)

I have done a few smaller tests (scale 500, clients 1, 20):

init:
pgbench -h /tmp -p 6565 -U rijkers -i -s 500 replicas

4x primary, clients 1:
scale: 500 clients: 1 tps = 11496.372655 pgbench -p 6565 -n -S -c 1 -T 900 -j 1
scale: 500 clients: 1 tps = 11580.141685 pgbench -p 6565 -n -S -c 1 -T 900 -j 1
scale: 500 clients: 1 tps = 11478.294747 pgbench -p 6565 -n -S -c 1 -T 900 -j 1
scale: 500 clients: 1 tps = 11741.432016 pgbench -p 6565 -n -S -c 1 -T 900 -j 1

4x standby, clients 1:
scale: 500 clients: 1 tps = 727.217672 pgbench -p 6566 -n -S -c 1 -T 900 -j 1
scale: 500 clients: 1 tps = 785.431011 pgbench -p 6566 -n -S -c 1 -T 900 -j 1
scale: 500 clients: 1 tps = 825.291817 pgbench -p 6566 -n -S -c 1 -T 900 -j 1
scale: 500 clients: 1 tps = 868.107638 pgbench -p 6566 -n -S -c 1 -T 900 -j 1

4x primary, clients 20:
scale: 500 clients: 20 tps = 34963.054102 pgbench -p 6565 -n -S -c 20 -T 900 -j 1
scale: 500 clients: 20 tps = 34818.985407 pgbench -p 6565 -n -S -c 20 -T 900 -j 1
scale: 500 clients: 20 tps = 34964.545013 pgbench -p 6565 -n -S -c 20 -T 900 -j 1
scale: 500 clients: 20 tps = 34959.210687 pgbench -p 6565 -n -S -c 20 -T 900 -j 1

4x standby, clients 20:
scale: 500 clients: 20 tps = 1099.808192 pgbench -p 6566 -n -S -c 20 -T 900 -j 1
scale: 500 clients: 20 tps = 905.926703 pgbench -p 6566 -n -S -c 20 -T 900 -j 1
scale: 500 clients: 20 tps = 943.531989 pgbench -p 6566 -n -S -c 20 -T 900 -j 1
scale: 500 clients: 20 tps = 1082.215913 pgbench -p 6566 -n -S -c 20 -T 900 -j 1

This is the same behaviour (i.e. an extremely slow standby) that I saw earlier (and which caused the original post, btw). In that earlier instance, the extreme slowness disappeared later, after many hours, maybe even days (without bouncing either primary or standby).

I have no idea what could cause this; is no one else seeing this? (If I have time I'll repeat this on other hardware over the weekend.)

Any comment is welcome...

Erik Rijkers

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
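Editorial aside: to put a single number on the gap reported above, the four runs per configuration can be averaged and compared. The following is only a minimal sketch; the helper names mean_tps and slowdown_factor are made up for illustration, and the arrays simply copy the single-client tps figures from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples; returns 0.0 for an empty set. */
static double
mean_tps(const double *samples, size_t n)
{
    double sum = 0.0;
    size_t i;

    if (n == 0)
        return 0.0;
    for (i = 0; i < n; i++)
        sum += samples[i];
    return sum / (double) n;
}

/* The four single-client runs quoted in the message above. */
static const double primary_c1[] = {
    11496.372655, 11580.141685, 11478.294747, 11741.432016
};
static const double standby_c1[] = {
    727.217672, 785.431011, 825.291817, 868.107638
};

/*
 * How many times faster the primary is than the standby, comparing
 * mean tps. Returns 0.0 if the standby mean is not positive.
 */
static double
slowdown_factor(const double *primary, const double *standby, size_t n)
{
    double s = mean_tps(standby, n);

    return (s > 0.0) ? mean_tps(primary, n) / s : 0.0;
}
```

Calling slowdown_factor(primary_c1, standby_c1, 4) gives a factor of about 14 for the single-client runs; the same arithmetic on the 20-client figures gives about 35.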
Re: [HACKERS] libpq connectoin redirect
While these can be handled at a higher level, for example by setting up LDAP or, as Heikki suggested, by tricking DNS, the problem is that I don't have control over how users connect to the server. They may not use LDAP. A solution like pgbouncer has advantages: the user just gets one IP/port and everything else happens automatically.

Thanks,

Subject: Re: [HACKERS] libpq connectoin redirect
From: li...@jwp.name
Date: Wed, 21 Apr 2010 15:52:39 -0700
CC: pgsql-hackers@postgresql.org
To: ft...@hotmail.com

On Apr 20, 2010, at 10:03 PM, feng tian wrote:

Another way to do this is to send the client a redirect message. When a client connects to 127.0.0.10, instead of accepting the connection, it can reply to the client telling it to reconnect to one of the servers on 127.0.0.11-14.

ISTM that this would be better handled at a higher level. That is, given a server (127.0.0.10) that holds 127.0.0.11-14, connect to that server and query for the correct target host.
Re: [HACKERS] libpq connectoin redirect
Hi, John,

The change will be on the libpq client side. I am not saying this is a general solution for distributed transactions or scale-out. However, in many cases it is very useful. For example, in my case, I have about 100 departments, each with its own database. The load-balancing machine can just redirect to the right box according to database/user. The 4 boxes I have may not even get a domain name or a static IP. In another scenario, if I have some kind of replication set up, I can send transaction processing to the master and analytic reporting queries to the slaves.

Thanks,
Feng

feng tian wrote:

Hi, I want to load balance a postgres server on 4 physical machines, say 127.0.0.11-14. I can set up a pgbouncer on 127.0.0.10 and do connection pooling to my four boxes. However, the traffic from/to clients will go through an extra hop. Another way to do this is to send the client a redirect message. When a client connects to 127.0.0.10, instead of accepting the connection, it can reply to the client telling it to reconnect to one of the servers on 127.0.0.11-14. I am planning to write/submit a patch to do that. I wonder if there is a similar effort in extending the libpq protocol, or if you have better ideas on how to achieve this.

How do you plan on maintaining consistency, transactional integrity and atomicity of updates across these 4 machines?
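Editorial aside: the "redirect according to database/user" idea can be pictured as a pure lookup on the balancer. The sketch below is hypothetical throughout: redirect_target, hash_name and the backend table are invented for illustration, no such redirect message exists in the libpq protocol discussed here, and a real deployment would more likely use an explicit department-to-host map than a hash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backend pool: the four boxes from the example above. */
static const char *const backends[] = {
    "127.0.0.11", "127.0.0.12", "127.0.0.13", "127.0.0.14"
};
#define NUM_BACKENDS (sizeof(backends) / sizeof(backends[0]))

/* Simple stable string hash (djb2); any deterministic hash would do. */
static unsigned long
hash_name(const char *s)
{
    unsigned long h = 5381;

    while (*s)
        h = h * 33 + (unsigned char) *s++;
    return h;
}

/*
 * Pick the host a client asking for `dbname` should be told to
 * reconnect to. The same database name always maps to the same box;
 * a missing or empty name falls back to the first backend.
 */
static const char *
redirect_target(const char *dbname)
{
    if (dbname == NULL || *dbname == '\0')
        return backends[0];
    return backends[hash_name(dbname) % NUM_BACKENDS];
}
```

The point of the sketch is only that the balancer needs no per-connection state to answer: the decision depends on the startup parameters alone, which is what makes a one-shot redirect cheaper than proxying every packet.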
Re: [HACKERS] Thoughts on pg_hba.conf rejection
Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:

On Tue, Apr 20, 2010 at 7:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:

(You might want to look back at the archived discussions about how to avoid storing entries for temp tables in these catalogs; that poses many of the same issues.)

Do you happen to know what a good search term might be? I tried searching for things like pg_class temp tables and pg_class temporary tables and didn't come up with much.

I found this thread: http://archives.postgresql.org/pgsql-hackers/2008-07/msg00593.php

I claimed in that message that there were previous discussions but I did not come across them right away. I vaguely remember that there was a discussion about pg_attribute and the extra rows for system rows for all tables, which diverged into a discussion about temp tables and those other extra rows.

-- Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] testing HS/SR - 1 vs 2 performance
Erik Rijkers wrote:

This is the same behaviour (i.e. extreme slow standby) that I saw earlier (and which caused the original post, btw). In that earlier instance, the extreme slowness disappeared later, after many hours maybe even days (without bouncing either primary or standby).

Any possibility the standby is built with assertions turned on? That's often the cause of this type of difference between pgbench results on two systems, and it is easy to introduce when everyone is building from source. You should try this on both systems:

psql -c "show debug_assertions"

just to rule that out.

-- Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com www.2ndQuadrant.us
Re: [HACKERS] testing HS/SR - 1 vs 2 performance
Greg Smith wrote:

Erik Rijkers wrote:

This is the same behaviour (i.e. extreme slow standby) that I saw earlier (and which caused the original post, btw). In that earlier instance, the extreme slowness disappeared later, after many hours maybe even days (without bouncing either primary or standby).

Any possibility the standby is built with assertions turned on? That's often the cause of this type of difference between pgbench results on two systems, and it is easy to introduce when everyone is building from source. You should try this on both systems:

psql -c "show debug_assertions"

Or even:

pg_config --configure

on both systems might be worth checking.

regards
Mark
Re: [HACKERS] testing HS/SR - 1 vs 2 performance
On Thu, April 22, 2010 23:54, Mark Kirkwood wrote:

Greg Smith wrote:

Erik Rijkers wrote:

This is the same behaviour (i.e. extreme slow standby) that I saw earlier (and which caused the original post, btw). In that earlier instance, the extreme slowness disappeared later, after many hours maybe even days (without bouncing either primary or standby).

Any possibility the standby is built with assertions turned on? That's often the cause of this type of difference between pgbench results on two systems, and it is easy to introduce when everyone is building from source. You should try this on both systems:

psql -c "show debug_assertions"

Or even:

pg_config --configure

on both systems might be worth checking.

(these instances are on a single server, btw)

primary:

$ pg_config
BINDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/bin
DOCDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/share/doc
HTMLDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/share/doc
INCLUDEDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/include
PKGINCLUDEDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/include
INCLUDEDIR-SERVER = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/include/server
LIBDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/lib
PKGLIBDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/lib
LOCALEDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/share/locale
MANDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/share/man
SHAREDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/share
SYSCONFDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/etc
PGXS = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--prefix=/var/data1/pg_stuff/pg_installations/pgsql.sr_primary' '--with-pgport=6565' '--enable-depend' '--with-openssl' '--with-perl' '--with-libxml' '--with-libxslt'
CC = gcc
CPPFLAGS = -D_GNU_SOURCE -I/usr/include/libxml2
CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv
CFLAGS_SL = -fpic
LDFLAGS = -Wl,-rpath,'/var/data1/pg_stuff/pg_installations/pgsql.sr_primary/lib'
LDFLAGS_SL =
LIBS = -lpgport -lxslt -lxml2 -lssl -lcrypto -lz -lreadline -ltermcap -lcrypt -ldl -lm
VERSION = PostgreSQL 9.0devel-sr_primary

[data:port:db PGPORT=6565 PGDATA=/var/data1/pg_stuff/pg_installations/pgsql.sr_primary/data PGDATABASE=replicas]
2010.04.22 20:55:28 rijk...@denkraam:~/src/perl/85devel [0]
$ time ./run_test_suite.sh
[data:port:db PGPORT=6565 PGDATA=/var/data1/pg_stuff/pg_installations/pgsql.sr_primary/data PGDATABASE=replicas]
2010.04.22 21:00:26 rijk...@denkraam:~/src/perl/85devel [1]

standby:

$ pg_config
BINDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/bin
DOCDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/share/doc
HTMLDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/share/doc
INCLUDEDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/include
PKGINCLUDEDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/include
INCLUDEDIR-SERVER = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/include/server
LIBDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/lib
PKGLIBDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/lib
LOCALEDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/share/locale
MANDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/share/man
SHAREDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/share
SYSCONFDIR = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/etc
PGXS = /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--prefix=/var/data1/pg_stuff/pg_installations/pgsql.sr_primary' '--with-pgport=6565' '--enable-depend' '--with-openssl' '--with-perl' '--with-libxml' '--with-libxslt'
CC = gcc
CPPFLAGS = -D_GNU_SOURCE -I/usr/include/libxml2
CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv
CFLAGS_SL = -fpic
LDFLAGS = -Wl,-rpath,'/var/data1/pg_stuff/pg_installations/pgsql.sr_primary/lib'
LDFLAGS_SL =
LIBS = -lpgport -lxslt -lxml2 -lssl -lcrypto -lz -lreadline -ltermcap -lcrypt -ldl -lm
VERSION = PostgreSQL 9.0devel-sr_primary

$ grep -Ev '(^[[:space:]]*#)|(^$)' pgsql.sr_*ry/data/postgresql.conf
pgsql.sr_primary/data/postgresql.conf:data_directory = '/var/data1/pg_stuff/pg_installations/pgsql.sr_primary/data'
pgsql.sr_primary/data/postgresql.conf:port = 6565
pgsql.sr_primary/data/postgresql.conf:max_connections = 100
pgsql.sr_primary/data/postgresql.conf:shared_buffers = 256MB
pgsql.sr_primary/data/postgresql.conf:checkpoint_segments = 50
pgsql.sr_primary/data/postgresql.conf:archive_mode = 'on'
pgsql.sr_primary/data/postgresql.conf:archive_command= 'cp %p /var/data1/pg_stuff/dump/replication_archive/%f'
Re: [HACKERS] testing HS/SR - 1 vs 2 performance
On Thu, 2010-04-22 at 20:39 +0200, Erik Rijkers wrote:

On Sun, April 18, 2010 13:01, Simon Riggs wrote:

any comment is welcome...

Please can you re-run with -l and post me the file of times.

Please also rebuild using --enable-profiling so we can see what's happening.

Can you also try the enclosed patch, which implements prefetching during replay of btree delete records. (Need to set effective_io_concurrency)

Thanks for your further help.

-- Simon Riggs www.2ndQuadrant.com

diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index f4c7bf4..9918688 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -578,6 +578,8 @@ btree_xlog_delete_get_latestRemovedXid(XLogRecord *record)
     OffsetNumber hoffnum;
     TransactionId latestRemovedXid = InvalidTransactionId;
     TransactionId htupxid = InvalidTransactionId;
+    TransactionId oldestxmin = GetCurrentOldestXmin(true, true);
+    TransactionId latestCompletedXid;
     int i;
 
     /*
@@ -586,8 +588,12 @@ btree_xlog_delete_get_latestRemovedXid(XLogRecord *record)
      * That returns InvalidTransactionId, and so will conflict with
      * users, but since we just worked out that's zero people, its OK.
      */
-    if (CountDBBackends(InvalidOid) == 0)
-        return latestRemovedXid;
+    if (!TransactionIdIsValid(oldestxmin))
+        return oldestxmin;
+
+    LWLockAcquire(ProcArrayLock, LW_SHARED);
+    latestCompletedXid = ShmemVariableCache->latestCompletedXid;
+    LWLockRelease(ProcArrayLock);
 
     /*
      * Get index page
@@ -603,6 +609,27 @@ btree_xlog_delete_get_latestRemovedXid(XLogRecord *record)
      */
     unused = (OffsetNumber *) ((char *) xlrec + SizeOfBtreeDelete);
 
+    /*
+     * Prefetch the heap buffers.
+     */
+    for (i = 0; i < xlrec->nitems; i++)
+    {
+        /*
+         * Identify the index tuple about to be deleted
+         */
+        iitemid = PageGetItemId(ipage, unused[i]);
+        itup = (IndexTuple) PageGetItem(ipage, iitemid);
+
+        /*
+         * Locate the heap page that the index tuple points at
+         */
+        hblkno = ItemPointerGetBlockNumber(&(itup->t_tid));
+        XLogPrefetchBuffer(xlrec->hnode, MAIN_FORKNUM, hblkno);
+    }
+
+    /*
+     * Read through the heap tids
+     */
     for (i = 0; i < xlrec->nitems; i++)
     {
         /*
@@ -659,6 +686,16 @@ btree_xlog_delete_get_latestRemovedXid(XLogRecord *record)
             latestRemovedXid = htupxid;
 
         htupxid = HeapTupleHeaderGetXmax(htuphdr);
+
+        /*
+         * Stop searching when we've found a recent xid
+         */
+        if (TransactionIdFollowsOrEquals(htupxid,latestCompletedXid))
+        {
+            UnlockReleaseBuffer(hbuffer);
+            break;
+        }
+
         if (TransactionIdFollows(htupxid, latestRemovedXid))
             latestRemovedXid = htupxid;
     }
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 9ee2036..3ea3a40 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -342,6 +342,16 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
     return buffer;
 }
 
+void
+XLogPrefetchBuffer(RelFileNode rnode, ForkNumber forknum,
+                   BlockNumber blkno)
+{
+    Relation reln = CreateFakeRelcacheEntry(rnode);
+
+    reln->rd_istemp = false;
+
+    PrefetchBuffer(reln, forknum, blkno);
+}
 
 /*
  * Struct actually returned by XLogFakeRelcacheEntry, though the declared
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 5a214c8..bb23c16 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -933,6 +933,21 @@ TransactionIdIsActive(TransactionId xid)
 TransactionId
 GetOldestXmin(bool allDbs, bool ignoreVacuum)
 {
+    TransactionId result = GetCurrentOldestXmin(allDbs, ignoreVacuum);
+
+    /*
+     * Compute the cutoff XID, being careful not to generate a permanent XID
+     */
+    result -= vacuum_defer_cleanup_age;
+    if (!TransactionIdIsNormal(result))
+        result = FirstNormalTransactionId;
+
+    return result;
+}
+
+TransactionId
+GetCurrentOldestXmin(bool allDbs, bool ignoreVacuum)
+{
     ProcArrayStruct *arrayP = procArray;
     TransactionId result;
     int index;
@@ -985,13 +1000,6 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
     LWLockRelease(ProcArrayLock);
 
-    /*
-     * Compute the cutoff XID, being careful not to generate a permanent XID
-     */
-    result -= vacuum_defer_cleanup_age;
-    if (!TransactionIdIsNormal(result))
-        result = FirstNormalTransactionId;
-
     return result;
 }
diff --git a/src/include/access/xlogutils.h b/src/include/access/xlogutils.h
index 8477f88..caa8aa3 100644
--- a/src/include/access/xlogutils.h
+++ b/src/include/access/xlogutils.h
@@ -28,6 +28,9 @@ extern void XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum,
 extern Buffer XLogReadBuffer(RelFileNode rnode, BlockNumber blkno, bool init);
 extern Buffer XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
                        BlockNumber blkno, ReadBufferMode mode);
+extern void XLogPrefetchBuffer(RelFileNode rnode, ForkNumber forknum,
+                   BlockNumber blkno);
+
 extern Relation
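Editorial aside: the patch leans on two pieces of XID arithmetic, the wraparound-aware comparison behind TransactionIdFollows() and the vacuum_defer_cleanup_age cutoff that it moves into GetOldestXmin(). The standalone sketch below imitates both outside the backend. It is an approximation, not the PostgreSQL source: xid_follows and apply_defer_cutoff are illustrative names, and the typedef and macros are redefined locally so the block compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the backend definitions (assumed values). */
typedef uint32_t TransactionId;

#define InvalidTransactionId        ((TransactionId) 0)
#define FirstNormalTransactionId    ((TransactionId) 3)
#define TransactionIdIsNormal(xid)  ((xid) >= FirstNormalTransactionId)

/*
 * True if id1 is logically newer than id2. Normal XIDs live on a
 * circle of 2^32 values, so the comparison looks at the sign of the
 * 32-bit difference instead of comparing magnitudes. Special
 * (non-normal) XIDs are compared as plain unsigned numbers.
 * Note: the cast to int32_t relies on two's-complement conversion,
 * which is what common compilers provide.
 */
static bool
xid_follows(TransactionId id1, TransactionId id2)
{
    int32_t diff;

    if (!TransactionIdIsNormal(id1) || !TransactionIdIsNormal(id2))
        return id1 > id2;

    diff = (int32_t) (id1 - id2);
    return diff > 0;
}

/*
 * Back the oldest xmin off by defer_age transactions, as the patch
 * does with vacuum_defer_cleanup_age, without ever handing back one
 * of the reserved (permanent) XIDs below FirstNormalTransactionId.
 */
static TransactionId
apply_defer_cutoff(TransactionId oldest_xmin, uint32_t defer_age)
{
    TransactionId result = oldest_xmin - defer_age;

    if (!TransactionIdIsNormal(result))
        result = FirstNormalTransactionId;
    return result;
}
```

With these definitions, xid_follows(5, 4294967290u) is true even though 5 is numerically smaller, because 5 lies just past the wraparound point. That is the "cyclic nature of XIDs" that the early-exit test on latestCompletedXid in the patch has to respect.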
Re: recovery_connections cannot start (was Re: [HACKERS] master in standby mode croaks)
On Fri, Apr 23, 2010 at 1:04 AM, Robert Haas robertmh...@gmail.com wrote:

One way we could fix this is to use 2 bits rather than 1 for XLogStandbyInfoMode. One bit could indicate that either archive_mode=on or max_wal_senders>0, and the second bit could indicate that recovery_connections=on. If the second bit is unset, we could emit the existing complaint:

recovery connections cannot start because the recovery_connections parameter is disabled on the WAL source server

If the other bit is unset, then we could instead complain:

recovery connections cannot start because archive_mode=off and max_wal_senders=0 on the WAL source server

If we don't want to use two bits there, it's hard to really describe all the possibilities in a reasonable number of characters. The only thing I can think of is to print a message and a hint:

recovery_connections cannot start due to incorrect settings on the WAL source server
HINT: make sure recovery_connections=on and either archive_mode=on or max_wal_senders>0

I haven't checked whether the hint would be displayed in the log on the standby, but presumably we could make that be the case if it's not already.

I think the first way is better because it gives the user more specific information about what they need to fix. Thinking about how each case might happen, since the default for recovery_connections is 'on', it seems that recovery_connections=off will likely only be an issue if the user has explicitly turned it off. The other case, where archive_mode=off and max_wal_senders=0, will likely only occur if someone takes a snapshot of the master without first setting up archiving or SR. Both of these will probably happen relatively rarely, but since we're burning a whole byte for XLogStandbyInfoMode (plus 3 more bytes of padding?), it seems like we might as well snag one more bit for clarity.

Thoughts?

I like the second choice since it's simpler and enough for me. But I have no objection to the first.

When we encounter the error, we would need not only to change those parameter values but also to take a fresh base backup and restart the standby using it. The description of this required procedure needs to be in the documentation or the error message, I think.

Regards,

-- Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
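Editorial aside: the two-bit variant proposed above can be sketched as a pair of flag bits plus a function that picks the complaint. This is only an illustration of the proposal; the flag names and standby_info_complaint are invented here, and only the two message strings are taken from the message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical bit assignments for the standby-info byte. */
#define STANDBY_INFO_WAL_SHIPPING    0x01   /* archive_mode=on or max_wal_senders>0 */
#define STANDBY_INFO_RECOVERY_CONNS  0x02   /* recovery_connections=on */

/*
 * Return the complaint to log on the standby, or NULL if the WAL
 * source server was configured so that recovery connections can start.
 * recovery_connections is checked first, matching the order in which
 * the proposal describes the two cases.
 */
static const char *
standby_info_complaint(unsigned char flags)
{
    if (!(flags & STANDBY_INFO_RECOVERY_CONNS))
        return "recovery connections cannot start because the "
               "recovery_connections parameter is disabled on the "
               "WAL source server";

    if (!(flags & STANDBY_INFO_WAL_SHIPPING))
        return "recovery connections cannot start because "
               "archive_mode=off and max_wal_senders=0 on the "
               "WAL source server";

    return NULL;
}
```

The single-bit alternative cannot tell the two failing cases apart, which is why it needs the more generic message plus a HINT.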
Re: [HACKERS] testing HS/SR - 1 vs 2 performance
Erik Rijkers wrote:

This is the same behaviour (i.e. extreme slow standby) that I saw earlier (and which caused the original post, btw). In that earlier instance, the extreme slowness disappeared later, after many hours maybe even days (without bouncing either primary or standby). I have no idea what could cause this; is no one else seeing this? (if I have time I'll repeat on other hardware in the weekend) any comment is welcome...

I wonder if what you are seeing is perhaps due to the tables on the primary being almost completely cached (from the initial create) and those on the standby being at best partially so. That would explain why the standby performance catches up after a while (when its tables are equivalently cached).

One way to test this is to 'pre-cache' the standby by selecting every row from its tables before running the pgbench test.

regards
Mark
[HACKERS] why do we have rd_istemp?
Given Relation rel, it looks to me like rel->rd_rel->relistemp will always give the same answer as rel->rd_istemp. So why have both?

...Robert
Re: [HACKERS] Assertion failure twophase.c (3) (testing HS/SR)
On Thu, April 22, 2010 09:53, Heikki Linnakangas wrote:

Can you still reproduce this or has some of the changes since then fixed it? We never quite figured out the cause..

I don't know for sure: unfortunately, it does not happen always, or predictably. The only thing that I established after that email was sent is that the error can also occur without the postbio package being installed (this has happened once).

It's a very easy test; I will probably run it a few more times.

Erik Rijkers wrote:

On Thu, March 4, 2010 17:00, Erik Rijkers wrote:

in a 9.0devel, primary+standby, cvs from 2010.03.04 01:30

With three patches:
new_smart_shutdown_20100201.patch
extend_format_of_recovery_info_funcs_v4.20100303.patch
fix-KnownAssignedXidsRemoveMany-1.patch

pg_dump -d $db8.4.2 | psql -d $db9.0devel-primary

FailedAssertion, File: twophase.c, Line: 1201.

For the record, this still happens (FailedAssertion, File: twophase.c, Line: 1201.) (created 2010.03.13 23:49 cvs). Unfortunately, it does not happen always, or predictably.

patches:
new_smart_shutdown_20100201.patch
extend_format_of_recovery_info_funcs_v4.20100303.patch
(both here: http://archives.postgresql.org/pgsql-hackers/2010-03/msg00446.php )
(fix-KnownAssignedXidsRemoveMany-1.patch has been committed, I think?)

I use commandlines like this to copy schemas across from 8.4.2 to 9.0devel:

pg_dump -c -h /tmp -p 5432 -n myschema --no-owner --no-privileges mydb \
  | psql -1qtA -h /tmp -p 7575 -d replicas

(the copied schemas were together 175 GB)

As I seem to be the only one who finds this, I started looking at what could be unique in this install: it would be postbio, which we use for its gist-indexing on ranges (http://pgfoundry.org/projects/postbio/). We use postbio's int_interval type as a column type. But keep in mind that sometimes the whole dump+restore+replication completes OK.

Other installed modules are:
contrib/btree_gist
contrib/seg
contrib/adminpack

log_line_prefix = '%t %p %d %u start=%s '

# slave pgsql.sr_hotslave/logfile:
2010-03-13 23:54:59 CET 15765 start=2010-03-13 23:54:59 CET LOG: database system was interrupted; last known up at 2010-03-13 23:54:31 CET
cp: cannot stat `/var/data1/pg_stuff/dump/hotslave/replication_archive/00010001': No such file or directory
2010-03-13 23:55:00 CET 15765 start=2010-03-13 23:54:59 CET LOG: entering standby mode
2010-03-13 23:55:00 CET 15765 start=2010-03-13 23:54:59 CET LOG: redo starts at 0/120
2010-03-13 23:55:00 CET 15765 start=2010-03-13 23:54:59 CET LOG: consistent recovery state reached at 0/200
2010-03-13 23:55:00 CET 15763 start=2010-03-13 23:54:59 CET LOG: database system is ready to accept read only connections
TRAP: FailedAssertion(!(((xid) != ((TransactionId) 0))), File: twophase.c, Line: 1201)
2010-03-14 05:28:59 CET 15763 start=2010-03-13 23:54:59 CET LOG: startup process (PID 15765) was terminated by signal 6: Aborted
2010-03-14 05:28:59 CET 15763 start=2010-03-13 23:54:59 CET LOG: terminating any other active server processes

Maybe I'll try now to set up a similar instance without postbio, to see if the crash still occurs.

hth,
Erik Rijkers

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] shared_buffers documentation
On Wed, Apr 21, 2010 at 2:54 AM, Greg Smith g...@2ndquadrant.com wrote:

Jim Nasby wrote:

I've also seen large shared buffer settings perform poorly outside of IO issues, presumably due to some kind of internal lock contention. I tried running 8.3 with 24G for a while, but dropped it back down to our default of 8G after noticing some performance problems. Unfortunately I don't remember the exact details, let alone having a repeatable test case.

We got a report from Jignesh at Sun once that he had a benchmark workload where there was a clear performance wall at around 10GB of shared_buffers. At http://blogs.sun.com/jkshah/entry/postgresql_east_2008_talk_best he says:

Shared Bufferpool getting better in 8.2, worth to increase it to 3GB (for 32-bit PostgreSQL) but still not great to increase it more than 10GB (for 64-bit PostgreSQL)

So you running into the same wall around the same amount just fuels the existing idea that there's an underlying scalability issue in there. Nobody with the right hardware has put it under the light of a profiler yet as far as I know.

It might be interesting to see whether increasing NUM_BUFFER_PARTITIONS, LOG2_NUM_LOCK_PARTITIONS, and NUM_LOCK_PARTITIONS alleviates this problem at all.

...Robert
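Editorial aside: the reason more partitions might help is that each lookup only takes the lock guarding its own partition. The sketch below assumes the partition is chosen as a hash code modulo the partition count, which is my understanding of how these constants are used and is not confirmed by the message; partition_for and distinct_partitions are illustrative names, not backend functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Partition index for a hash code, given a power-of-two partition count. */
static unsigned
partition_for(uint32_t hashcode, unsigned num_partitions)
{
    return hashcode % num_partitions;
}

/*
 * Count how many distinct partitions a set of hash codes touches.
 * Lookups that land in different partitions do not contend for the
 * same lock, so a higher count means less contention for this set.
 * Only partition counts up to 64 are handled, since the seen set is
 * kept in a 64-bit mask.
 */
static unsigned
distinct_partitions(const uint32_t *hashes, size_t n, unsigned num_partitions)
{
    uint64_t seen = 0;
    unsigned count = 0;
    size_t i;

    for (i = 0; i < n; i++)
    {
        uint64_t bit = (uint64_t) 1 << partition_for(hashes[i], num_partitions);

        if (!(seen & bit))
        {
            seen |= bit;
            count++;
        }
    }
    return count;
}
```

For example, the hash codes 1, 17, 33 and 49 all fall into partition 1 when there are 16 partitions, but spread over four different partitions when there are 64. Whether real buffer-tag hashes behave this favourably is exactly what a benchmark would have to show.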
Re: [HACKERS] why do we have rd_istemp?
Robert Haas robertmh...@gmail.com writes:

Given Relation rel, it looks to me like rel->rd_rel->relistemp will always give the same answer as rel->rd_istemp. So why have both?

Might be historical --- relistemp is pretty new.

regards, tom lane