Re: [HACKERS] Pg default's verbosity?
On Tue, Jun 19, 2012 at 02:15:43AM -0400, Tom Lane wrote:
> Robert Haas <robertmh...@gmail.com> writes:
> > There might be something to the idea of demoting a few of the things
> > we've traditionally had as NOTICEs, though. IME, the following two
> > messages account for a huge percentage of the chatter:
> >
> >   NOTICE: CREATE TABLE will create implicit sequence "foo_a_seq" for serial column "foo.a"
> >   NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "foo_pkey" for table "foo"
>
> Personally, I'd have no problem with flat-out dropping (not demoting)
> both of those two specific messages. I seem to recall that Bruce has
> lobbied for them heavily in the past, though.

I would like to see them gone or reduced as well. I think I wanted them when we changed the fact that SERIAL doesn't create unique indexes, but that was long ago --- I think everyone knows what happens now.

-- 
Bruce Momjian  <br...@momjian.us>  http://momjian.us
EnterpriseDB  http://enterprisedb.com

+ It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
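[Editor's illustration: a minimal sketch of the kind of statement that emits both notices; the table and column names are taken from the quoted messages.]

```sql
CREATE TABLE foo (
    a serial PRIMARY KEY
);
-- NOTICE:  CREATE TABLE will create implicit sequence "foo_a_seq"
--          for serial column "foo.a"
-- NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
--          "foo_pkey" for table "foo"
```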
Re: [HACKERS] Checkpointer on hot standby runs without looking checkpoint_segments
On 8 June 2012 09:14, Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote:
> Hello, I will make this patch start again for this CF.
>
> The requirement for this patch is as follows.
>
> - What I want to get is similarity of the behaviors between master and (hot-)standby concerning checkpoint progression. Specifically, checkpoints for streaming replication running at the speed governed by checkpoint_segments. The point of this patch is to avoid an unexpectedly large number of WAL segments staying on the standby side. (Plus, increasing the chance to skip the recovery-end checkpoint, via my other patch.)

I think we need to be clearer about this: I reject this patch and am moving it to Rejected in the CF manager.

The increased chance to skip the recovery-end checkpoint is completely gone as a reason to do this (see other thread). Plus the premise that we want more restartpoints is wrong, for reasons explained by Heikki, in detail, months ago.

-- 
Simon Riggs  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] We probably need autovacuum_max_wraparound_workers
On Friday, June 29, 2012, 04:26:42, Tom Lane wrote:
> Josh Berkus <j...@agliodbs.com> writes:
>
> Well, I think it's plausible but wrong under at least some common
> circumstances. In addition to seeking, it ignores FS cache effects (not
> that I have any idea how to account for these mathematically). It also
> makes the assumption that 3 autovacuum workers running at 1/3 speed each
> is better than having one worker running at full speed, which is
> debatable.
>
> Well, no, not really, because the original implementation with only one
> worker was pretty untenable. But maybe we need some concept like only
> one worker working on *big* tables? Or at least, less than max_workers
> of them.

I think it is easier to keep some workers available to work on other tasks instead of having all of them doing the same longest job.

pgfincore has for years allowed snapshotting and restoring the OS cache to work around such issues. Autovacuum could snapshot the next x MB ahead and restore the previous cache state when done.

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Re: [HACKERS] Notify system doesn't recover from No space error
[Resending as the original post didn't get through to the list]

Warming up an old thread here - we ran into the same problem. The database is 9.1.4/x86_64 from Debian/testing. The client application is bucardo hammering the database with NOTIFYs (including some master-master replication conflicts, which might add to the parallel NOTIFY load).

The problem is reproducible with the attached instructions (several ENOSPC cycles might be required). When the filesystem is filled using dd, the bucardo and psql processes die with this error:

  FEHLER:  53100: konnte auf den Status von Transaktion 0 nicht zugreifen
  DETAIL:  Konnte nicht in Datei »pg_notify/« bei Position 180224 schreiben: Auf dem Gerät ist kein Speicherplatz mehr verfügbar.
  ORT:  SlruReportIOError, slru.c:861

(In English: could not access status of transaction 0; could not write to file »pg_notify/« at offset 180224: no space left on device.)

The line number might be different; sometimes it's ENOENT, sometimes even "Success". Even after disk space is available again, subsequent "NOTIFY foobar" calls die, without any other clients connected:

  ERROR:  XX000: could not access status of transaction 0
  DETAIL:  Could not read from file "pg_notify/" at offset 245760: Success.
  ORT:  SlruReportIOError, slru.c:854

Here's a backtrace, caught at slru.c:430:

  430             SlruReportIOError(ctl, pageno, xid);
  (gdb) bt
  #0  SimpleLruReadPage (ctl=ctl@entry=0xb192a0, pageno=30, write_ok=write_ok@entry=1 '\001', xid=xid@entry=0) at /home/martin/debian/psql/9.1/build-area/postgresql-9.1-9.1.4/build/../src/backend/access/transam/slru.c:430
  #1  0x00520d2f in asyncQueueAddEntries (nextNotify=nextNotify@entry=0x29b60c8) at /home/martin/debian/psql/9.1/build-area/postgresql-9.1-9.1.4/build/../src/backend/commands/async.c:1318
  #2  0x0052187f in PreCommit_Notify () at /home/martin/debian/psql/9.1/build-area/postgresql-9.1-9.1.4/build/../src/backend/commands/async.c:869
  #3  0x004973d3 in CommitTransaction () at /home/martin/debian/psql/9.1/build-area/postgresql-9.1-9.1.4/build/../src/backend/access/transam/xact.c:1827
  #4  0x00497a8d in CommitTransactionCommand () at /home/martin/debian/psql/9.1/build-area/postgresql-9.1-9.1.4/build/../src/backend/access/transam/xact.c:2562
  #5  0x00649497 in finish_xact_command () at /home/martin/debian/psql/9.1/build-area/postgresql-9.1-9.1.4/build/../src/backend/tcop/postgres.c:2452
  #6  finish_xact_command () at /home/martin/debian/psql/9.1/build-area/postgresql-9.1-9.1.4/build/../src/backend/tcop/postgres.c:2441
  #7  0x0064c875 in exec_simple_query (query_string=0x2a99d70 "notify foobar;") at /home/martin/debian/psql/9.1/build-area/postgresql-9.1-9.1.4/build/../src/backend/tcop/postgres.c:1037
  #8  PostgresMain (argc=<optimized out>, argv=argv@entry=0x29b1df8, username=<optimized out>) at /home/martin/debian/psql/9.1/build-area/postgresql-9.1-9.1.4/build/../src/backend/tcop/postgres.c:3968
  #9  0x0060e731 in BackendRun (port=0x2a14800) at /home/martin/debian/psql/9.1/build-area/postgresql-9.1-9.1.4/build/../src/backend/postmaster/postmaster.c:3611
  #10 BackendStartup (port=0x2a14800) at /home/martin/debian/psql/9.1/build-area/postgresql-9.1-9.1.4/build/../src/backend/postmaster/postmaster.c:3296
  #11 ServerLoop () at /home/martin/debian/psql/9.1/build-area/postgresql-9.1-9.1.4/build/../src/backend/postmaster/postmaster.c:1460
  #12 0x0060f451 in PostmasterMain (argc=argc@entry=5, argv=argv@entry=0x29b1170) at /home/martin/debian/psql/9.1/build-area/postgresql-9.1-9.1.4/build/../src/backend/postmaster/postmaster.c:1121
  #13 0x00464bc9 in main (argc=5, argv=0x29b1170) at /home/martin/debian/psql/9.1/build-area/postgresql-9.1-9.1.4/build/../src/backend/main/main.c:199

Restarting the cluster seems to fix the condition in some cases, but I've seen the error persist over restarts, or reappear after some time even without a full disk. (That's also what the customer on the live system is seeing.)

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/

[Attachment: pg_notify_error.sh - Bourne shell script]
Re: [HACKERS] Covering Indexes
On Thu, Jun 28, 2012 at 7:02 AM, Rob Wultsch <wult...@gmail.com> wrote:
> On Thu, Jun 28, 2012 at 8:16 AM, David E. Wheeler <da...@justatheory.com> wrote:
> > Hackers,
> >
> > Very interesting design document for SQLite 4:
> > http://www.sqlite.org/src4/doc/trunk/www/design.wiki
> >
> > I'm particularly intrigued by covering indexes. For example:
> >
> >     CREATE INDEX cover1 ON table1(a,b) COVERING(c,d);
> >
> > This allows the following query to do an index-only scan:
> >
> >     SELECT c, d FROM table1 WHERE a=? AND b=?;
> >
> > Now that we have index-only scans in 9.2, I'm wondering if it would make sense to add covering index support, too, where additional, unindexed columns are stored alongside indexed columns. And I wonder if it would work well with expressions, too?
> >
> > David
>
> IIRC MS SQL also allows unindexed columns in the index.

MS SQL Server does support including non-key columns in indexes, allowing index-only scans for queries returning these columns. Their syntax is different from that proposed in the linked SQLite document.

In my experience, the advantages in SQL Server of such an index (as opposed to just adding the columns to the index normally) are as follows:

1. You can include extra columns in a unique index which do not participate in the uniqueness determination.
2. The non-key columns can be of types which normally cannot be included in a b-tree index due to lack of proper sorting operators.
3. The resulting index is smaller, because the non-key columns are only contained in leaf nodes, not in internal nodes.
4. The insert/update overhead is lower.

Of course, an implementation in a different database system, such as Postgres, may or may not have the same set of benefits. Number 4 in particular seems dependent on the details of the b-tree implementation. In my mind, numbers 1 and 2 are the most compelling arguments in favor of this feature. Whether it's worth the effort of coding the feature would depend on how well the above benefits (or any benefits I missed) hold true, and how useful such an index actually proved for index-only scans in Postgres.
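[Editor's illustration: a side-by-side sketch of the two syntaxes under discussion; the COVERING form is from the linked SQLite 4 design document, the INCLUDE form is SQL Server's.]

```sql
-- Proposed SQLite 4 syntax, from the design document:
CREATE INDEX cover1 ON table1(a,b) COVERING(c,d);

-- SQL Server equivalent: non-key columns c and d are carried
-- only in leaf pages and take no part in ordering or uniqueness.
CREATE INDEX cover1 ON table1 (a, b) INCLUDE (c, d);

-- Either form lets this query be answered from the index alone:
SELECT c, d FROM table1 WHERE a = ? AND b = ?;
```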
Re: [HACKERS] [ADMIN] pg_basebackup blocking all queries with horrible performance
On Wed, Jun 27, 2012 at 7:58 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On 27 June 2012 18:24, Fujii Masao <masao.fu...@gmail.com> wrote:
> > will never become sync standby even if their name is in synchronous_standby_names.
>
> I don't understand why you'd want that. What is wrong with removing the name from synchronous_standby_names if you don't like it?

I believe that's just fallout - the main point is that we don't want to become a sync standby when synchronous_standby_names is set to '*'.

-- 
Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
Re: [HACKERS] [ADMIN] pg_basebackup blocking all queries with horrible performance
On Wed, Jun 27, 2012 at 7:24 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Thu, Jun 21, 2012 at 3:18 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Jun 20, 2012 at 7:18 AM, Magnus Hagander <mag...@hagander.net> wrote:
>
> You agreed to add something like a NOSYNC option to the START_REPLICATION command?
>
> I'm on the fence. I was hoping somebody else would chime in with an opinion as well.
>
> +1 Nobody else with any opinion on this? :(
>
> I don't think we really need a NOSYNC flag at this point. Just not setting the flush location in clients that make a point of flushing in a timely fashion seems fine.
>
> Okay, I'm in the minority, so I'm writing the patch that way. WIP patch attached. In the patch, the pg_basebackup background process and pg_receivexlog always return an invalid location as the flush one, and will never become a sync standby even if their name is in synchronous_standby_names.

That doesn't match the patch, AFAICS. The patch always sets the correct write location, which means it can become a remote_write synchronous standby, no? It will only send it back when the timeout expires, but it will be sent back.

I wonder if that might actually be a more reasonable mode of operation in general:

* always send back the write position, at the write interval
* always send back the flush position, when we're flushing (meaning when we switch xlog)

and have an option that makes it possible to:

* always send back the write position as soon as it changes (making for a reasonable remote_write sync standby)
* actually flush the log after each write instead of at end of file (making for a reasonable full sync standby)

meaning you'd have something like pg_receivexlog --sync=write and pg_receivexlog --sync=flush controlling it instead.

And deal with the case where the user put * in synchronous_standby_names and accidentally got pg_receivexlog as the sync standby by more clearly warning people not to use * for that parameter... since it's simply dangerous :)

> The timing of their sending the reply depends on the standby_message_timeout specified in the -s option. So the write position may lag behind the true position. pg_receivexlog accepts a new option -S (better option character?). If this option is specified, pg_receivexlog returns the true flush position, and can become a sync standby. It sends back the reply to the master each time the write position changes or the timeout passes. If synchronous_commit is set to remote_write, synchronous replication to pg_receivexlog would work well.

Yeah, I hadn't considered the remote_write mode, but I guess that's why you have to track the current write position across loads, which first confused me.

Looking at some other use cases for this, I wonder if we should also force a status message whenever we switch xlog files, even if we aren't running in sync mode and even if the timeout hasn't expired. I think that would be a reasonable thing to do, since you often want to track things based on files.

> The patch needs more documentation. But I think that it's worth reviewing the code in advance, so I attached the WIP patch. Comments? Objections?

Looking at the code, what exactly prompts the changes to the backend side? That seems unrelated? Are we actually considering picking a standby with InvalidXlogRecPtr as a sync standby today? Isn't it enough to just send the proper write and flush locations from the frontend?

> The patch is based on current HEAD, i.e., 9.3dev. If the patch is applied, we need to write the backport version of the patch for 9.2.

Oh, conflicts with Heikki's xlog patches, right? Ugh. But yeah.

-- 
Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
Re: [HACKERS] Reporting hba lines
On Wed, Jun 27, 2012 at 4:27 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Magnus Hagander <mag...@hagander.net> writes:
> > On Wed, Jun 27, 2012 at 4:14 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> > > cases where people are modifying the wrong hba file. Can we show the source text of the hba line?
> >
> > We don't currently keep the full source text around - but we certainly could do that if we wanted to.
>
> If we're concerned about providing a message like this, I think it'd be well worthwhile. We presumably would only store the active lines, not comment lines, so the extra memory space would be negligible in just about any real use-case.

Turned out to be a bit more work than I thought, since the current parser reads pg_hba byte by byte, and not line by line. So I had to change that. See attached - seems reasonable?

-- 
Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

[Attachment: hba_line2.patch]
Re: [HACKERS] [PATCH 10/16] Introduce the concept that wal has a 'origin' node
On 2012-06-19 09:24, Andres Freund wrote:
> On Tuesday, June 19, 2012 04:12:47 AM Steve Singer wrote:
> > On 12-06-18 07:30 AM, Andres Freund wrote:
> > > Hrmpf #666. I will go through the series commit-by-commit again to make sure everything compiles again. Reordering this late definitely wasn't a good idea...
> > >
> > > I pushed a rebased version with all those fixups (and removal of the zeroRecPtr patch).
> >
> > Where did you push that rebased version to? I don't see an attachment, or an updated patch in the commitfest app, and your repo at http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=summary hasn't been updated in 5 days.
>
> To the 2ndquadrant internal repo. Which strangely doesn't help you. *Headdesk*. Pushed to the correct repo and manually verified.

Which repository is the correct one? http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git was refreshed 11 days ago. The patch taken from there fails with a reject in src/include/access/xlog.h.

-- 
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
Re: [HACKERS] [PATCH 11/16] Add infrastructure for manipulating multiple streams of wal on a segment handling level
Hi,

trying to review this one according to http://wiki.postgresql.org/wiki/Reviewing_a_Patch

# Is the patch in context diff format (http://en.wikipedia.org/wiki/Diff#Context_format)?

No. (Does this requirement still apply after PostgreSQL switched to Git?)

# Does it apply cleanly to the current git master?

No. Patches 01...09 in this series, taken from the mailing list, apply cleanly; 10 and 11 fail with rejects.

Best regards,
Zoltán Böszörményi

On 2012-06-13 13:28, Andres Freund wrote:
> From: Andres Freund <and...@anarazel.de>
>
> For that, add a 'node_id' parameter to most commands dealing with wal segments. A node_id that is 'InvalidMultimasterNodeId' references local wal; every other node_id refers to wal in a new pg_lcr directory.
>
> Using duplicated code would reduce the impact of that change but the long-term code-maintenance burden outweighs that by a far bit.
>
> Besides the decision to add a 'node_id' parameter to several functions the changes in this patch are fairly mechanical.
> ---
>  src/backend/access/transam/xlog.c           | 54 ---
>  src/backend/replication/basebackup.c        |  4 +-
>  src/backend/replication/walreceiver.c       |  2 +-
>  src/backend/replication/walsender.c         |  9 +++--
>  src/bin/initdb/initdb.c                     |  1 +
>  src/bin/pg_resetxlog/pg_resetxlog.c         |  2 +-
>  src/include/access/xlog.h                   |  2 +-
>  src/include/access/xlog_internal.h          | 13 +--
>  src/include/replication/logical.h           |  2 +
>  src/include/replication/walsender_private.h |  2 +-
>  10 files changed, 56 insertions(+), 35 deletions(-)
>
> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
> index 504b4d0..0622726 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -635,8 +635,8 @@ static bool XLogCheckBuffer(XLogRecData *rdata, bool doPageWrites,
>  static bool AdvanceXLInsertBuffer(bool new_segment);
>  static bool XLogCheckpointNeeded(uint32 logid, uint32 logseg);
>  static void XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool xlog_switch);
> -static bool InstallXLogFileSegment(uint32 *log, uint32 *seg, char *tmppath,
> -                bool find_free, int *max_advance,
> +static bool InstallXLogFileSegment(RepNodeId node_id, uint32 *log, uint32 *seg,
> +                char *tmppath, bool find_free, int *max_advance,
>                  bool use_lock);
>  static int XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli,
>                  int source, bool notexistOk);
> @@ -1736,8 +1736,8 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool xlog_switch)
>          /* create/use new log file */
>          use_existent = true;
> -        openLogFile = XLogFileInit(openLogId, openLogSeg,
> -                        use_existent, true);
> +        openLogFile = XLogFileInit(InvalidMultimasterNodeId, openLogId,
> +                        openLogSeg, use_existent, true);
>          openLogOff = 0;
>      }
> @@ -2376,6 +2376,9 @@ XLogNeedsFlush(XLogRecPtr record)
>   * place. This should be TRUE except during bootstrap log creation. The
>   * caller must *not* hold the lock at call.
>   *
> + * node_id: if != InvalidMultimasterNodeId this xlog file is actually a LCR
> + * file
> + *
>   * Returns FD of opened file.
>   *
>   * Note: errors here are ERROR not PANIC because we might or might not be
> @@ -2384,8 +2387,8 @@ XLogNeedsFlush(XLogRecPtr record)
>   * in a critical section.
>   */
>  int
> -XLogFileInit(uint32 log, uint32 seg,
> -        bool *use_existent, bool use_lock)
> +XLogFileInit(RepNodeId node_id, uint32 log, uint32 seg,
> +        bool *use_existent, bool use_lock)
>  {
>      char    path[MAXPGPATH];
>      char    tmppath[MAXPGPATH];
> @@ -2396,7 +2399,7 @@ XLogFileInit(uint32 log, uint32 seg,
>      int     fd;
>      int     nbytes;
> -    XLogFilePath(path, ThisTimeLineID, log, seg);
> +    XLogFilePath(path, ThisTimeLineID, node_id, log, seg);
>      /*
>       * Try to use existent file (checkpoint maker may have created it already)
> @@ -2425,6 +2428,11 @@ XLogFileInit(uint32 log, uint32 seg,
>       */
>      elog(DEBUG2, "creating and filling new WAL file");
> +    /*
> +     * FIXME: to be safe we need to create tempfile in the pg_lcr directory if
> +     * its actually an lcr file because pg_lcr might be in a different
> +     * partition.
> +     */
>      snprintf(tmppath, MAXPGPATH, XLOGDIR "/xlogtemp.%d", (int) getpid());
>      unlink(tmppath);
> @@ -2493,7 +2501,7 @@ XLogFileInit(uint32 log, uint32 seg,
>      installed_log = log;
>      installed_seg = seg;
>      max_advance = XLOGfileslop;
> -    if (!InstallXLogFileSegment(installed_log,
Re: [HACKERS] [PATCH 10/16] Introduce the concept that wal has a 'origin' node
On Friday, June 29, 2012 02:43:49 PM Boszormenyi Zoltan wrote:
> On 2012-06-19 09:24, Andres Freund wrote:
> > On Tuesday, June 19, 2012 04:12:47 AM Steve Singer wrote:
> > > Where did you push that rebased version to? I don't see an attachment, or an updated patch in the commitfest app, and your repo at http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=summary hasn't been updated in 5 days.
> >
> > To the 2ndquadrant internal repo. Which strangely doesn't help you. *Headdesk*. Pushed to the correct repo and manually verified.
>
> Which repository is the correct one? http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git was refreshed 11 days ago. The patch taken from there fails with a reject in src/include/access/xlog.h.

That's the right repository, but Heikki's recent changes (dfda6ebaec6763090fb78b458a979b558c50b39b and several following) moved quite a bit around on master, and I am currently rebasing and addressing review comments.

Greetings,

Andres
-- 
Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [PATCH 11/16] Add infrastructure for manipulating multiple streams of wal on a segment handling level
Hi,

On Friday, June 29, 2012 02:43:52 PM Boszormenyi Zoltan wrote:
> trying to review this one according to http://wiki.postgresql.org/wiki/Reviewing_a_Patch
>
> # Is the patch in context diff format (http://en.wikipedia.org/wiki/Diff#Context_format)?
> No. (Does this requirement still apply after PostgreSQL switched to Git?)

Many people seem to send patches in unified format, and just some days ago Tom said it doesn't matter to him. I still can't properly read context diffs, so I am using unified...

> # Does it apply cleanly to the current git master?
> No. Patches 01...09 in this series, taken from the mailing list, apply cleanly; 10 and 11 fail with rejects.

Yes, and even the patches before that need to be rebased; at least some of them won't compile even though they apply cleanly. I will produce a rebased version soon, but we haven't fully agreed on the preliminary patches to this one, so there doesn't seem to be much point in reviewing this one before the other stuff is clear. Marking the patch as Returned with Feedback for now. I have done the same with 13 and 15. Those seem to be too much in limbo for CF-style reviews.

Thanks!

Andres
Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)
On 2012-06-21 23:53, Simon Riggs wrote:
> On 21 June 2012 19:13, Jaime Casanova <ja...@2ndquadrant.com> wrote:
> > On Sun, Jun 10, 2012 at 4:15 AM, Kohei KaiGai <kai...@kaigai.gr.jp> wrote:
> > > 2012/6/8 Simon Riggs <si...@2ndquadrant.com>:
> > > > I have a prototype that has some of these characteristics, so I see our work as complementary. At present, I don't think this patch would be committable in CF1, but I'd like to make faster progress with it than that. Do you want to work on this more, or would you like me to merge our prototypes into a more likely candidate?
> > >
> > > I'm not in favor of duplicating similar efforts. If available, could you merge some ideas from my patch into your prototypes?
> >
> > So, are we waiting for a new patch? Is it coming from Simon or Kohei?
>
> There is an updated patch coming from me. I thought I would focus on review of other work first.

We have some use cases for this patch. When can you post a new version? I would test and review it.

Thanks in advance,
Zoltán Böszörményi

-- 
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
Re: [HACKERS] [PATCH 11/16] Add infrastructure for manipulating multiple streams of wal on a segment handling level
On 2012-06-29 15:01, Andres Freund wrote:
> Hi,
>
> On Friday, June 29, 2012 02:43:52 PM Boszormenyi Zoltan wrote:
> > # Is the patch in context diff format? No. (Does this requirement still apply after PostgreSQL switched to Git?)
>
> Many people seem to send patches in unified format and just some days ago Tom said it doesn't matter to him. I still can't properly read context diffs, so I am using unified...

Unified diffs are usually more readable for me too, after following Linux kernel development for years, and they are shorter than context diffs.

> > # Does it apply cleanly to the current git master? No. Patches 01...09 in this series apply cleanly; 10 and 11 fail with rejects.
>
> Yes, and even the patches before that need to be rebased; at least some of them won't compile even though they apply cleanly. I will produce a rebased version soon, but we haven't fully agreed on the preliminary patches to this one, so there doesn't seem to be much point in reviewing this one before the other stuff is clear. Marking the patch as Returned with Feedback for now. I have done the same with 13 and 15. Those seem to be too much in limbo for CF-style reviews.
>
> Thanks!

You're welcome.

-- 
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)
On 2012-04-25 11:40, Kohei KaiGai wrote:
> 2012/3/10 Simon Riggs <si...@2ndquadrant.com>:
> On Fri, Mar 9, 2012 at 6:51 PM, Andrew Dunstan <and...@dunslane.net> wrote:
> On 03/09/2012 01:40 PM, Robert Haas wrote:
> On Fri, Mar 9, 2012 at 12:02 PM, David E. Wheeler <da...@justatheory.com> wrote:
> On Mar 9, 2012, at 7:55 AM, Merlin Moncure wrote:
>
> 100% agree (having re-read the thread and Alvaro's idea having sunk in). Being able to set up daemon processes side by side with the postmaster would fit the bill nicely. It's pretty interesting to think of all the places you could go with it.
>
> pgAgent could use it *right now*. I keep forgetting to restart it after restarting PostgreSQL and finding after a day or so that no jobs have run.
>
> That can and should be fixed by teaching pgAgent that failing to connect to the server, or getting disconnected, is not a fatal error, but a reason to sleep and retry.
>
> Yeah. It's still not entirely clear to me what a postmaster-controlled daemon is going to be able to do that an external daemon can't.
>
> Start and stop at the same time as postmaster, without any pain. It's a considerable convenience to be able to design this aspect once and then have all things linked to the postmaster follow that. It means people will be able to write code that runs on all OSes easily, without everybody having similar but slightly different code about starting up, reading parameters, following security rules etc. Tight integration, with good usability.
>
> I tried to implement a patch according to the idea. It allows extensions to register an entry point for self-managed daemon processes, and the postmaster then starts and stops them in the normal manner.
>
> [kaigai@iwashi patch]$ ps ax | grep postgres
> 27784 pts/0    S     0:00 /usr/local/pgsql/bin/postgres
> 27786 ?        Ss    0:00 postgres: writer process
> 27787 ?        Ss    0:00 postgres: checkpointer process
> 27788 ?        Ss    0:00 postgres: wal writer process
> 27789 ?        Ss    0:00 postgres: autovacuum launcher process
> 27790 ?        Ss    0:00 postgres: stats collector process
> 27791 ?        Ss    0:00 postgres: auth_counter   <== (*)
>
> (*) The auth_counter included in this patch is just an example of this functionality. It does not have significant meaning; it just logs the number of authentication successes and failures at intervals.
>
> I'm motivated to define an extra daemon that attaches to the shared memory segment of PostgreSQL as a computing server, to avoid the limitation on the number of GPU code modules we can load concurrently.
>
> Thanks,

I have tested this original version. The patch has a single trivial reject; after fixing it, it compiled nicely. After adding shared_preload_libraries='$libdir/auth_counter', the extra daemon starts and stops nicely with pg_ctl start/stop. The auth_counter.c code is a fine minimalistic example of writing one's own daemon.

Thanks,
Zoltán Böszörményi

-- 
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
Re: [HACKERS] Reporting hba lines
Magnus Hagander <mag...@hagander.net> writes:
> Turned out to be a bit more work than I thought, since the current parser reads pg_hba byte by byte, and not line by line. So I had to change that. See attached, seems reasonable?

A couple of comments:

* In some places you have

        if ((c = *(*lineptr)++) != '\0')

  and in other places just

        if ((c = *(*lineptr)++))

  This should be consistent (and personally I prefer the first way).

* I'm not sure that this conversion is right:

  !     if (c != EOF)
  !         ungetc(c, fp);
  ---
  !     if (c != '\0')
  !         (*lineptr)--;

  In the file case, it's impossible to push back EOF, and unnecessary since another getc will produce EOF again anyway. In the string case, though, I think you might need to decrement the lineptr unconditionally, else the next call will run off the end of the string, no?

* This bit seems a bit ugly, and not Windows-aware either:

  !     /* We don't store the trailing newline */
  !     if (rawline[strlen(rawline)-1] == '\n')
  !         rawline[strlen(rawline)-1] = '\0';

  It might be better to strip trailing \n and \r from the line immediately upon read, rather than here.

            regards, tom lane
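[Editor's illustration: Tom's last point, stripping both \n and \r immediately upon read, can be sketched as a small helper. The function name is illustrative only, not taken from the patch.]

```c
#include <string.h>

/*
 * Remove any trailing newline and/or carriage return from a line buffer,
 * so the same code handles both Unix ("\n") and Windows ("\r\n") line
 * endings. Modifies the buffer in place.
 */
void
strip_trailing_newline(char *line)
{
    size_t  len = strlen(line);

    /* Trim from the end while we keep seeing '\n' or '\r'. */
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
        line[--len] = '\0';
}
```

Calling this on each raw line right after it is read would also make the quoted check safe for empty lines, where indexing rawline[strlen(rawline)-1] would otherwise read before the buffer.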
Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)
2012/6/29 Boszormenyi Zoltan z...@cybertec.at: 2012-04-25 11:40 keltezéssel, Kohei KaiGai írta: 2012/3/10 Simon Riggs si...@2ndquadrant.com: On Fri, Mar 9, 2012 at 6:51 PM, Andrew Dunstan and...@dunslane.net wrote: On 03/09/2012 01:40 PM, Robert Haas wrote: On Fri, Mar 9, 2012 at 12:02 PM, David E. Wheelerda...@justatheory.com wrote: On Mar 9, 2012, at 7:55 AM, Merlin Moncure wrote: 100% agree (having re-read the thread and Alvaro's idea having sunk in). Being able to set up daemon processes side by side with the postmaster would fit the bill nicely. It's pretty interesting to think of all the places you could go with it. pgAgent could use it *right now*. I keep forgetting to restart it after restarting PostgreSQL and finding after a day or so that no jobs have run. That can and should be fixed by teaching pgAgent that failing to connect to the server, or getting disconnected, is not a fatal error, but a reason to sleep and retry. Yeah. It's still not entirely clear to me what a postmaster-controlled daemon is going to be able to do that an external daemon can't. Start and stop at the same time as postmaster, without any pain. It's a considerable convenience to be able to design this aspect once and then have all things linked to the postmaster follow that. It means people will be able to write code that runs on all OS easily, without everybody having similar but slightly different code about starting up, reading parameters, following security rules etc.. Tight integration, with good usability. I tried to implement a patch according to the idea. It allows extensions to register an entry point of the self-managed daemon processes, then postmaster start and stop them according to the normal manner. [kaigai@iwashi patch]$ ps ax | grep postgres 27784 pts/0 S 0:00 /usr/local/pgsql/bin/postgres 27786 ? Ss 0:00 postgres: writer process 27787 ? Ss 0:00 postgres: checkpointer process 27788 ? Ss 0:00 postgres: wal writer process 27789 ? 
Ss 0:00 postgres: autovacuum launcher process 27790 ? Ss 0:00 postgres: stats collector process 27791 ? Ss 0:00 postgres: auth_counter == (*) The auth_counter being included in this patch is just an example of this functionality. It does not have significant meanings. It just logs number of authentication success and fails every intervals. I'm motivated to define an extra daemon that attach shared memory segment of PostgreSQL as a computing server to avoid limitation of number of GPU code that we can load concurrently. Thanks, I have tested this original version. The patch has a single trivial reject, after fixing it, it compiled nicely. After adding shared_preload_libraries='$libdir/auth_counter', the extra daemon start and stops nicely with pg_ctl start/stop. The auth_counter.c code is a fine minimalistic example on writing one's own daemon. Thanks for your testing. According to Simon's comment, I'm waiting for his integration of this patch with another implementation by him. The auth_counter is just an proof-of-concept patch, so, it is helpful if you could provide another use case that can make sense. Best regards, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpython issue with Win64 (PG 9.2)
Thank you. Please do let me know once the fix is checked in. I will test it and share feedback with you. Thanks. Best Regards, Asif Naeem On Fri, Jun 29, 2012 at 3:36 AM, Jan Urbański wulc...@wulczer.org wrote: On 27/06/12 13:57, Jan Urbański wrote: On 27/06/12 11:51, Asif Naeem wrote: Hi, On Windows 7 64bit, plpython is causing a server crash with the following test case:

CREATE PROCEDURAL LANGUAGE 'plpython3u';
CREATE OR REPLACE FUNCTION pymax (a integer, b integer)
RETURNS integer AS $$
if a > b:
    return a
return b
$$ LANGUAGE plpython3u;
SELECT pymax(1, 2);

I think the primary reason that triggers this issue is that PLyUnicode_Bytes() calls PyUnicode_AsEncodedString( ,WIN1252 /*Server encoding*/, ) and it fails with null. I built the latest pg 9.2 source code with python 3.2.2.3 by using Visual Studio 2010. Thanks. I'll try to reproduce this on Linux, which should be possible given the results of your investigation. Your analysis is correct, I managed to reproduce this by injecting serverenc = win1252; into PLyUnicode_Bytes. The comment in that function says that Python understands all PostgreSQL encoding names except for SQL_ASCII, but that's not really true. In your case GetDatabaseEncodingName() returns WIN1252 and Python accepts CP1252. I'm wondering how this should be fixed. Just by adding more special cases in PLyUnicode_Bytes? Even if we add a switch statement that would convert PG_WIN1250 into CP1250, Python can still raise an exception when encoding (for various reasons). How about replacing the PLy_elog there with just an elog? This loses traceback support and the Python exception message, which could be helpful for debugging (something like invalid character foo for encoding cp1250). OTOH, I'm uneasy about invoking the entire PLy_elog machinery from a function that's as low-level as PLyUnicode_Bytes. Lastly, we map SQL_ASCII to ascii, which is arguably wrong. 
The function is supposed to return bytes in the server encoding, and under SQL_ASCII that probably means we can return anything (i.e. use any encoding we deem useful). Using ascii as the Python codec name will raise an error on anything that has the high bit set. So: I'd add code to translate WINxxx into CPxxx when choosing the Python codec to use, change PLy_elog to elog in PLyUnicode_Bytes, and leave the SQL_ASCII case alone, as there were no complaints and people using SQL_ASCII are asking for it anyway. Cheers, Jan
Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)
2012-06-29 16:44 keltezéssel, Kohei KaiGai írta: 2012/6/29 Boszormenyi Zoltan z...@cybertec.at: 2012-04-25 11:40 keltezéssel, Kohei KaiGai írta: 2012/3/10 Simon Riggs si...@2ndquadrant.com: On Fri, Mar 9, 2012 at 6:51 PM, Andrew Dunstan and...@dunslane.net wrote: On 03/09/2012 01:40 PM, Robert Haas wrote: On Fri, Mar 9, 2012 at 12:02 PM, David E. Wheelerda...@justatheory.com wrote: On Mar 9, 2012, at 7:55 AM, Merlin Moncure wrote: 100% agree (having re-read the thread and Alvaro's idea having sunk in). Being able to set up daemon processes side by side with the postmaster would fit the bill nicely. It's pretty interesting to think of all the places you could go with it. pgAgent could use it *right now*. I keep forgetting to restart it after restarting PostgreSQL and finding after a day or so that no jobs have run. That can and should be fixed by teaching pgAgent that failing to connect to the server, or getting disconnected, is not a fatal error, but a reason to sleep and retry. Yeah. It's still not entirely clear to me what a postmaster-controlled daemon is going to be able to do that an external daemon can't. Start and stop at the same time as postmaster, without any pain. It's a considerable convenience to be able to design this aspect once and then have all things linked to the postmaster follow that. It means people will be able to write code that runs on all OS easily, without everybody having similar but slightly different code about starting up, reading parameters, following security rules etc.. Tight integration, with good usability. I tried to implement a patch according to the idea. It allows extensions to register an entry point of the self-managed daemon processes, then postmaster start and stop them according to the normal manner. 
[kaigai@iwashi patch]$ ps ax | grep postgres 27784 pts/0S 0:00 /usr/local/pgsql/bin/postgres 27786 ?Ss 0:00 postgres: writer process 27787 ?Ss 0:00 postgres: checkpointer process 27788 ?Ss 0:00 postgres: wal writer process 27789 ?Ss 0:00 postgres: autovacuum launcher process 27790 ?Ss 0:00 postgres: stats collector process 27791 ?Ss 0:00 postgres: auth_counter == (*) The auth_counter being included in this patch is just an example of this functionality. It does not have significant meanings. It just logs number of authentication success and fails every intervals. I'm motivated to define an extra daemon that attach shared memory segment of PostgreSQL as a computing server to avoid limitation of number of GPU code that we can load concurrently. Thanks, I have tested this original version. The patch has a single trivial reject, after fixing it, it compiled nicely. After adding shared_preload_libraries='$libdir/auth_counter', the extra daemon start and stops nicely with pg_ctl start/stop. The auth_counter.c code is a fine minimalistic example on writing one's own daemon. Thanks for your testing. According to Simon's comment, I'm waiting for his integration of this patch with another implementation by him. The auth_counter is just an proof-of-concept patch, so, it is helpful if you could provide another use case that can make sense. Well, we have two use cases that are more complex: - an SQL-driven scheduler, similar to pgAgent, it's generic enough, we might port it to this scheme and publish it - a huge volume importer daemon, it was written for a very specific purpose and for a single client, we cannot publish it. Both need database connections, the second needs more than one, so they need to link to the client side libpq, the way it was done for walreceiver can be done here as well. 
Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 13/16] Introduction of pair of logical walreceiver/sender
On 13.06.2012 14:28, Andres Freund wrote: A logical WALReceiver is started directly by Postmaster when we enter PM_RUN state and the new parameter multimaster_conninfo is set. For now only one of those is started, but the code doesn't rely on that. In future multiple ones should be allowed. Could the receiver-side of this be handled as an extra daemon: http://archives.postgresql.org/message-id/CADyhKSW2uyrO3zx-tohzRhN5-vaBEfKNHyvLG1yp7=cx_yh...@mail.gmail.com In general, I feel that the receiver-side could live outside core. The sender-side needs to be at least somewhat integrated into the walsender stuff, and there are changes to the WAL records etc. that are hard to do outside, but AFAICS the stuff to receive changes is pretty high-level stuff. As long as the protocol between the logical replication client and server is well-defined, it should be possible to write all kinds of clients. You could replay the changes to a MySQL database instead of PostgreSQL, for example, or send them to a message queue, or just log them to a log file for auditing purposes. None of that needs to be in implemented inside a PostgreSQL server. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 13/16] Introduction of pair of logical walreceiver/sender
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 13.06.2012 14:28, Andres Freund wrote: A logical WALReceiver is started directly by Postmaster when we enter PM_RUN state and the new parameter multimaster_conninfo is set. For now only one of those is started, but the code doesn't rely on that. In future multiple ones should be allowed. In general, I feel that the receiver-side could live outside core. The sender-side needs to be at least somewhat integrated into the walsender stuff, and there are changes to the WAL records etc. that are hard to do outside, but AFAICS the stuff to receive changes is pretty high-level stuff. It would be nice if there was at least a thin layer of the sender portion which could be used by a stand-alone program. I can think of lots of useful reasons to tee the WAL stream -- passing through the stream with little or no modification to at least one side. As just one example, I would like a program to write traditional WAL files to match what an archive on the sending side would look like while passing the stream through to an asynchronous hot standby. As long as the protocol between the logical replication client and server is well-defined, it should be possible to write all kinds of clients. You could replay the changes to a MySQL database instead of PostgreSQL, for example, or send them to a message queue, or just log them to a log file for auditing purposes. None of that needs to be implemented inside a PostgreSQL server. +1 -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 13/16] Introduction of pair of logical walreceiver/sender
On Friday, June 29, 2012 05:16:11 PM Heikki Linnakangas wrote: On 13.06.2012 14:28, Andres Freund wrote: A logical WALReceiver is started directly by Postmaster when we enter PM_RUN state and the new parameter multimaster_conninfo is set. For now only one of those is started, but the code doesn't rely on that. In future multiple ones should be allowed. Could the receiver-side of this be handled as an extra daemon: http://archives.postgresql.org/message-id/CADyhKSW2uyrO3zx-tohzRhN5-vaBEfKN HyvLG1yp7=cx_yh...@mail.gmail.com Well, I think it depends on what the protocol turns out to be. In the prototype we used the infrastructure from walreceiver, which reduced the required code considerably. In general, I feel that the receiver-side could live outside core. I think it should be possible to write receivers outside core, but one sensible implementation should be in-core. The sender-side needs to be at least somewhat integrated into the walsender stuff, and there are changes to the WAL records etc. that are hard to do outside, but AFAICS the stuff to receive changes is pretty high-level stuff. None of that needs to be implemented inside a PostgreSQL server. If you want robust and low-overhead crash recovery you need (at least I think so) tighter integration into postgres. To be sure that you pick up where you stopped after a crash you need to have a state synchronized to the commits on the receiving side. So you either always write to another table and analyze that afterwards - which imo sucks - or you integrate it with the commit record, which needs integration into pg. Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 13/16] Introduction of pair of logical walreceiver/sender
On 29.06.2012 18:28, Kevin Grittner wrote: It would be nice if there was at least a thin layer of the sender portion which could be used by a stand-alone program. I can think of lots of useful reasons to tee the WAL stream -- passing through the stream with little or no modification to at least one side. As just one example, I would like a program to write traditional WAL files to match what an archive on the sending side would look like while passing the stream through to an asynchronous hot standby. That isn't really related to the logical replication stuff, but I agree that would be cool. You can sort of do that with cascading replication, but a thin stand-alone program would be nicer. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)
On Fri, Jun 29, 2012 at 9:44 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote: The auth_counter is just a proof-of-concept patch, so it is helpful if you could provide another use case that can make sense. What about pgbouncer? -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade log files
Excerpts from Tom Lane's message of vie jun 29 01:04:13 -0400 2012: Alvaro Herrera alvhe...@alvh.no-ip.org writes: I propose this patch which echoes the commands to the respective log files. I would backpatch this to 9.2. OK, but the fflush just before fclose seems a bit pointless; fclose will do that anyway no? Bah, yes, thanks. Pushed with that change. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Update on the spinlock-pthread_mutex patch experimental: replace s_lock spinlock code with pthread_mutex on linux
Hi, I'll reply to Jeff with a brief thank you to Robert at the bottom. First of all, here's an update: I have slightly modified the patch, I'll attach what I have at the moment. The main differences are:

- Loops around the pthread_mutex calls: as the locking function signature is to return void at the moment, there is no error handling code in the callers (so, theoretically, there should be a chance for an infinite loop on a spinlock in the current code if you SIGKILL a spinlock holder (which you shouldn't do, sure); using robust mutexes, we could avoid this issue). Retrying is probably the best we can do without implementing error recovery in all callers.

- ereport(FATAL, ...) instead of assertions, which is really what we should do (imagine setting PTHREAD_PROCESS_SHARED fails and we still start up).

Some insights:

- I noticed that, for the simple pgbench tests I ran, PTHREAD_MUTEX_ADAPTIVE_NP yielded worse results than PTHREAD_MUTEX_NORMAL, which is somewhat counterintuitive, because _ADAPTIVE is closer to the current spinlock logic; yet syscalling in the first place seems to be more efficient than spinning a little first and then syscalling (for the contended case). The increase in usr/sys time for my tests was in the order of 10-20%.

- Also I noticed a general issue with linking to libpthread: my understanding is that this should also change the code to be reentrant when compiling with gcc (does anyone know precisely?), which we don't need - we only need the locking code, unless we want to roll our own futex implementation (see below). I am not sure if this is really root-caused because I have not fully understood what is going on, but when compiling with LDFLAGS=-lpthread for the top level Makefile, usr increases by some 10% for my tests. The code is more efficient when I simply leave out -lpthread; libpthread gets linked anyway. 
- I had a look at futex sample code, for instance http://locklessinc.com/articles/mutex_cv_futex/ and Ulrichs paper but I must say at this point I don't feel ready to roll own futex code for this most critical piece of code. There is simply too much which can go wrong and major mistakes are very hard to spot. I'd very much prefer to use an existing, proven implementation. At this point, I'd guess pulling in the relevant code from glibc/nptl would be one of the safest bets, but even this path is risky. On benchmarks: With the same pgbench parameters as before, I ended up with comparable results for unpatched and patched in terms of resource consumption: Test setup for both: for i in {1..10} ; do rsync -av --delete /tmp/test_template_data/ /tmp/data/ /usr/bin/time ./postgres -D /tmp/data -p 55502 ppid=$! pid=$(pgrep -P $ppid) sleep 15 ./pgbench -c 256 -t 20 -j 128 -p 55502 postgres kill $pid wait $ppid wait while pgrep -f 55502 ; do echo procs still running - hm sleep 1 done done unpatched (bins postgresql-server-91-9.1.3-1PGDG.rhel6.rpm) -bash-4.1$ grep elapsed /var/tmp/20120627_noslock_check/orig_code_2_perf 34.55user 20.07system 0:25.63elapsed 213%CPU (0avgtext+0avgdata 1360688maxresident)k 35.26user 19.90system 0:25.38elapsed 217%CPU (0avgtext+0avgdata 1360704maxresident)k 38.04user 21.68system 0:26.24elapsed 227%CPU (0avgtext+0avgdata 1360704maxresident)k 36.72user 21.95system 0:27.21elapsed 215%CPU (0avgtext+0avgdata 1360688maxresident)k 37.19user 22.00system 0:26.44elapsed 223%CPU (0avgtext+0avgdata 1360704maxresident)k 37.88user 22.58system 0:25.70elapsed 235%CPU (0avgtext+0avgdata 1360704maxresident)k 35.70user 20.90system 0:25.63elapsed 220%CPU (0avgtext+0avgdata 1360688maxresident)k 40.24user 21.65system 0:26.02elapsed 237%CPU (0avgtext+0avgdata 1360688maxresident)k 44.93user 22.96system 0:26.38elapsed 257%CPU (0avgtext+0avgdata 1360704maxresident)k 38.10user 21.51system 0:26.66elapsed 223%CPU (0avgtext+0avgdata 1360688maxresident)k -bash-4.1$ grep elapsed 
/var/tmp/20120627_noslock_check/orig_code_2_perf | tail -10 | sed 's:[^0-9. ]::g' | awk '{ u+=$1; s+=$2; c++;} END { print avg u/c s/c; }' avg 37.861 21.52 patched (based upon modified source rpm of the above) -bash-4.1$ egrep elapsed /var/tmp/20120627_noslock_check/with_slock_6_nocompile_without_top_-lpthread 42.32user 27.16system 0:28.18elapsed 246%CPU (0avgtext+0avgdata 2003488maxresident)k 39.14user 26.31system 0:27.24elapsed 240%CPU (0avgtext+0avgdata 2003504maxresident)k 38.81user 26.17system 0:26.67elapsed 243%CPU (0avgtext+0avgdata 2003520maxresident)k 41.04user 27.80system 0:29.00elapsed 237%CPU (0avgtext+0avgdata 2003520maxresident)k 35.41user 22.85system 0:27.15elapsed 214%CPU (0avgtext+0avgdata 2003504maxresident)k 32.74user 21.87system 0:25.62elapsed 213%CPU (0avgtext+0avgdata 2003504maxresident)k 35.68user 24.86system 0:27.16elapsed 222%CPU (0avgtext+0avgdata 2003520maxresident)k 32.10user
Re: [HACKERS] Posix Shared Mem patch
According to the Google, there is absolutely no way of getting MacOS X not to overcommit like crazy. Well, this is one of a long list of broken things about OSX. If you want to see *real* breakage, do some IO performance testing of HFS+. FWIW, I have this issue with Mac desktop applications on my MacBook, which will happily leak memory until I run out of swap space. You can read the amount of system memory by using sysctl() to fetch hw.memsize, but it's not really clear how much that helps. We could refuse to start up if the shared memory allocation is >= hw.memsize, but even an amount slightly less than that seems like enough to send the machine into a tailspin, so I'm not sure that really gets us anywhere. I still think it would help. User errors in allocating shmem are more likely to be order-of-magnitude errors (I meant 500MB, not 500GB!) than matters of being 20% of RAM over. One idea I had was to LOG the size of the shared memory allocation just before allocating it. That way, if your system goes into the tank, there will at least be something in the log. But that would be useless chatter for most users. Yes, but it would provide mailing list, IRC and StackExchange quick answers. I started up PostgreSQL and my MacBook crashed. Find the file postgres.log. What's the last 10 lines? So neither of those things *fixes* the problem ... ultimately, it's Apple's problem and we can't fix it ... but both of them make it somewhat better. The other thing which will avoid the problem for most Mac users is if we simply allocate 10% of RAM at initdb as a default. If we do that, then 90% of users will never touch shmem themselves, and not have the opportunity to mess up. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Update on the spinlock-pthread_mutex patch experimental: replace s_lock spinlock code with pthread_mutex on linux
On Friday, June 29, 2012 07:07:11 PM Nils Goroll wrote: Also, 20 transactions per connection is not enough of a run to make any evaluation on. As you can see I've repeated the tests 10 times. I've tested slight variations as mentioned above, so I was looking for quick results with acceptable variation. Running only 20 transactions is still meaningless. Quite often that will mean that no backends run concurrently, because starting up takes longer than processing those 20 transactions. You need at the very, very least 10s. Check out -T. Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Update on the spinlock-pthread_mutex patch experimental: replace s_lock spinlock code with pthread_mutex on linux
You need at the very, very least 10s. ok, thanks. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] change_varattnos_of_a_node versus whole-row Vars
change_varattnos_of_a_node(), which is used to adjust expressions referencing a parent relation or LIKE source relation to refer to the child relation, ignores whole-row Vars (those with attnum zero). Thus, after processing the expression, the whole-row Var still claims to have the rowtype (vartype) of the parent rel. This is utterly wrong in the LIKE case, as pointed out in http://archives.postgresql.org/pgsql-general/2012-06/msg00674.php and AFAICS it is wrong in the inheritance case too, since there's no guarantee that the parent and child rowtypes are binary-equivalent. My inclination, especially in the back branches, is to just throw error if we find a whole-row Var. ISTM the entire point of CREATE TABLE LIKE is that the two tables' rowtypes might diverge after the child is created, so it makes no sense to do anything that would presuppose that an expression involving the parent's rowtype will remain sensible for the child. The only caller not associated with CREATE TABLE LIKE is MergeAttributes, which is trying to adjust CHECK constraints for a parent rel to apply to a new inheritance child. In that case it would be sensible to expect a parent.* reference to be valid for the child (though a ConvertRowtypeExpr would still need to be thrown in to make this work). However, it's hard to implement because the Var would need to be modified to contain the rowtype OID for the child relation, which has not been assigned at this point. In principle we could rearrange CREATE TABLE processing enough so that inherited CHECK constraints aren't adjusted till after we know the rowtype OID, but I'm not personally volunteering to do that. There are a bunch of other things I don't like about change_varattnos_of_a_node, starting with the name, but those gripes are in the nature of cosmetics or inadequate error checking and wouldn't have any user-visible impact if changed. Any thoughts on the topic? 
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
Josh Berkus j...@agliodbs.com writes: The other thing which will avoid the problem for most Mac users is if we simply allocate 10% of RAM at initdb as a default. If we do that, then 90% of users will never touch Shmem themselves, and not have the opportunity to mess up. If we could do that on *all* platforms, I might be for it, but we only know how to get that number on some platforms. There's also the issue of whether we really want to assume that the machine is dedicated to Postgres, which IMO is an implicit assumption of any default that scales itself to physical RAM. For the moment I think we should just allow initdb to scale up a little bit more than where it is now, perhaps 128MB instead of 32. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
Tom, If we could do that on *all* platforms, I might be for it, but we only know how to get that number on some platforms. I don't see what's wrong with using it where we can get it, and not using it where we can't. There's also the issue of whether we really want to assume that the machine is dedicated to Postgres, which IMO is an implicit assumption of any default that scales itself to physical RAM. 10% isn't assuming dedicated. Assuming dedicated would be 20% or 25%. I was thinking 10%, with a ceiling of 512MB. For the moment I think we should just allow initdb to scale up a little bit more than where it is now, perhaps 128MB instead of 32. I wouldn't be opposed to that. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
Josh Berkus j...@agliodbs.com writes: If we could do that on *all* platforms, I might be for it, but we only know how to get that number on some platforms. I don't see what's wrong with using it where we can get it, and not using it where we can't. Because then we still need to define, and document, a sensible behavior on the machines where we can't get it. And document that we do it two different ways, and document which machines we do it which way on. There's also the issue of whether we really want to assume that the machine is dedicated to Postgres, which IMO is an implicit assumption of any default that scales itself to physical RAM. 10% isn't assuming dedicated. Really? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
10% isn't assuming dedicated. Really? Yes. As I said, the allocation for dedicated PostgreSQL servers is usually 20% to 25%, up to 8GB. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
Josh Berkus j...@agliodbs.com writes: 10% isn't assuming dedicated. Really? Yes. As I said, the allocation for dedicated PostgreSQL servers is usually 20% to 25%, up to 8GB. Any percentage is assuming dedicated, IMO. 25% might be the more common number, but you're still assuming that you can have your pick of the machine's resources. My idea of not dedicated is I can launch a dozen postmasters on this machine, and other services too, and it'll be okay as long as they're not doing too much. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Update on the spinlock-pthread_mutex patch experimental: replace s_lock spinlock code with pthread_mutex on linux
On Fri, Jun 29, 2012 at 12:11 PM, Andres Freund and...@2ndquadrant.com wrote: On Friday, June 29, 2012 07:07:11 PM Nils Goroll wrote: Also, 20 transactions per connection is not enough of a run to make any evaluation on. As you can see I've repeated the tests 10 times. I've tested slight variations as mentioned above, so I was looking for quick results with acceptable variation. Running only 20 transactions is still meaningless. Quite often that will mean that no backends run concurrently, because starting up takes longer than processing those 20 transactions. You need at the very, very least 10s. Check out -T. yeah. also, standard pgbench is typically very much i/o bound on typical hardware. it would be much more interesting to see performance in spinlock-heavy workloads -- the OP noted one when introducing the thread. would it be possible to simulate those conditions? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework
2012-06-27 10:34 keltezéssel, Boszormenyi Zoltan írta: 2012-06-26 18:49 keltezéssel, Alvaro Herrera írta: Excerpts from Boszormenyi Zoltan's message of mar jun 26 12:43:34 -0400 2012: So, should I keep the enum TimeoutName? Are global variables for keeping dynamically assigned values preferred over the enum? Currently we have 5 timeout sources in total, 3 of them are used by regular backends, the remaining 2 are used by replication standby. We can have a fixed size array (say with 8 or 16 elements) for future use and this would be plenty. Opinions? My opinion is that the fixed size array is fine. Attached is the version which uses a registration interface. Also, to further minimize knowledge of timeouts in timeout.c, all GUCs are moved back to proc.c I'll go set the patch waiting on author. Also, remember to review some other people's patches. I will look into it. Best regards, Zoltán Böszörményi Does anyone have a little time to look at the latest timeout framework with the registration interface and the 2nd patch too? I am at work until Friday next week, after that I will be on vacation for two weeks. Just in case there is anything that needs tweaking to make it more acceptable. Thanks in advance, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
My idea of not dedicated is I can launch a dozen postmasters on this machine, and other services too, and it'll be okay as long as they're not doing too much. Oh, 128MB then?
Re: [HACKERS] new --maintenance-db options
On Sun, Jun 24, 2012 at 01:26:58AM +0300, Peter Eisentraut wrote: About the new --maintenance-db options: What is the purpose of these options? The initial discussion was unclear on this. The documentation contains no explanation of why they should be used. If we want to really support the case where both postgres and template1 are removed, an environment variable might be more useful than requiring this to be typed out for every command. Yes, I had the same question about the usefulness/purpose of the option, but was out-voted. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] new --maintenance-db options
On Mon, Jun 25, 2012 at 11:57:36AM -0400, Robert Haas wrote: In retrospect, it seems as though it might have been a good idea to make the postgres database read-only and undroppable, so that all client utilities could count on being able to connect to it and get a list of databases in the cluster without the need for all this complexity. Or else having some other way for a client to authenticate and list out all the available databases. In the absence of such a mechanism, I don't think we can turn around and say that not having a postgres database is an unsupported configuration, and therefore we need some way to cope with it when it happens.

Well, we certainly don't allow 'template1' to be dropped:

	test=# DROP DATABASE template1;
	ERROR:  cannot drop a template database

so you could make the argument that making 'postgres' undroppable seems reasonable. I should point out that it was EnterpriseDB that complained about this, in relation to their Advanced Server product, which doesn't have a 'postgres' database but an 'edb' one. I said that was their problem, but when community users said they also dropped the 'postgres' database, it became a community problem too. Where are we going on this for PG 9.2? 9.3? I hate to ship options in 9.2 that will be gone in 9.3.

FYI, we do allow the 'template1' database to be renamed:

	test=# ALTER DATABASE template1 RENAME TO template2;
	ALTER DATABASE

Oops. TODO?

I think the original report that prompted this change was a complaint that pg_upgrade failed when the postgres database had been dropped. Now, admittedly, pg_upgrade fails for all kinds of crazy stupid reasons and the chances of fixing that problem completely any time in the next 5 years do not seem good, but that's not a reason not to keep plugging the holes we can. Anyhow, the same commit that introduced --maintenance-db fixed that problem by arranging to try both postgres and template1 before giving up... but having two hard-coded database names either of which can be dropped or renamed seems only marginally better than having one, hence the switch.

Actually, 'template1' can't be dropped like 'postgres' can, but it can be renamed (which I think needs fixing). I think falling back to template1 for a missing 'postgres' database was the goal there.

Really, I think pg_upgrade needs this option too, unless we're going to kill the problem at its root by providing a reliable way to enumerate database names without first knowing the name of one that you can connect to.

pg_upgrade doesn't use --maintenance-db because the tools now fall back to template1, which again brings up the question of the usefulness of the --maintenance-db options.
Re: [HACKERS] new --maintenance-db options
On Mon, Jun 25, 2012 at 02:58:25PM -0400, Robert Haas wrote: On Mon, Jun 25, 2012 at 2:49 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Robert Haas's message of lun jun 25 11:57:36 -0400 2012: Really, I think pg_upgrade needs this option too, unless we're going to kill the problem at its root by providing a reliable way to enumerate database names without first knowing the name of one that you can connect to. I think pg_upgrade could do this one task by using a standalone backend instead of a full-blown postmaster. It should be easy enough ... Maybe, but it seems like baking even more hackery into a tool that's already got too much hackery. It's also hard for pg_upgrade to know things like whether pg_hba.conf prohibits access to certain users/databases/etc. or just requires the use of authentication methods that happen to fail. From pg_upgrade's perspective, it would be nice to have a flag that starts the server in some mode where nobody but pg_upgrade can connect to it and all connections are automatically allowed, but it's not exactly clear how to implement "nobody but pg_upgrade can connect to it".

pg_upgrade already starts the postmaster with a -b option that disables non-super-user logins:

	/*
	 * Binary upgrades only allowed super-user connections
	 */
	if (IsBinaryUpgrade && !am_superuser)
	{
		ereport(FATAL,
				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
				 errmsg("must be superuser to connect in binary upgrade mode")));
	}

It also uses port 50432 by default. Not sure what else we can do.
Re: [HACKERS] new --maintenance-db options
On Mon, Jun 25, 2012 at 03:12:00PM -0400, Alvaro Herrera wrote: Excerpts from Robert Haas's message of lun jun 25 14:58:25 -0400 2012: On Mon, Jun 25, 2012 at 2:49 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Robert Haas's message of lun jun 25 11:57:36 -0400 2012: Really, I think pg_upgrade needs this option too, unless we're going to kill the problem at its root by providing a reliable way to enumerate database names without first knowing the name of one that you can connect to. I think pg_upgrade could do this one task by using a standalone backend instead of a full-blown postmaster. It should be easy enough ... Maybe, but it seems like baking even more hackery into a tool that's already got too much hackery. It's also hard for pg_upgrade to know things like whether pg_hba.conf prohibits access to certain users/databases/etc. or just requires the use of authentication methods that happen to fail. From pg_upgrade's perspective, it would be nice to have a flag that starts the server in some mode where nobody but pg_upgrade can connect to it and all connections are automatically allowed, but it's not exactly clear how to implement "nobody but pg_upgrade can connect to it". Well, have it specify a private socket directory, listen only on that (not TCP), and bypass all pg_hba rules. This could be added to the postmaster -b behavior, but I am concerned about the security of this. We suggest 'trust', but admins can adjust as needed.
Re: [HACKERS] Posix Shared Mem patch
Hi All,

In a *very* quick patch I tested using huge pages/MAP_HUGETLB for the mmap'ed memory. That gives around a 9.5% performance benefit in a read-only pgbench run (-n -S -j 64 -c 64 -T 10 -M prepared, scale 200, 6GB s_b, 8 cores, 24GB mem).

It also saves a bunch of memory per process due to the smaller page table (shared_buffers 6GB):

	cat /proc/$pid_of_pg_backend/status | grep VmPTE
	VmPTE:      6252 kB

vs

	VmPTE:        60 kB

Additionally it has the advantage that top/ps/... output under linux now looks like:

	  PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
	10603 andres   20   0 6381m 4924 1952 R   21  0.0  0:28.04 postgres

i.e. RES now actually shows something usable... Which is rather nice imo.

I don't have the time atm to make this something useable, maybe somebody else wants to pick it up? Looks pretty worthwhile investing some time. Because of the required setup we sure cannot make this the default but...

Greetings,

Andres

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index e040400..05bbdf6 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -54,7 +54,7 @@
 typedef int IpcMemoryId;	/* shared memory ID returned by shmget(2) */
 #define MAP_HASSEMAPHORE 0
 #endif
 
-#define PG_MMAP_FLAGS (MAP_SHARED|MAP_ANONYMOUS|MAP_HASSEMAPHORE)
+#define PG_MMAP_FLAGS (MAP_SHARED|MAP_ANONYMOUS|MAP_HASSEMAPHORE|MAP_HUGETLB)
 
 /* Some really old systems don't define MAP_FAILED. */
 #ifndef MAP_FAILED
@@ -407,6 +407,10 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port)
 {
 	long		pagesize = sysconf(_SC_PAGE_SIZE);
 
+	/* round up to hugetlb size on x86-64 linux */
+	if (pagesize < (1024*2048))
+		pagesize = 1024*2048;
+
 	/*
 	 * Ensure request size is a multiple of pagesize.
Re: [HACKERS] Posix Shared Mem patch
On Fri, Jun 29, 2012 at 2:52 PM, Andres Freund and...@2ndquadrant.com wrote: Hi All, In a *very* quick patch I tested using huge pages/MAP_HUGETLB for the mmap'ed memory. That gives around 9.5% performance benefit in a read-only pgbench run (-n -S -j 64 -c 64 -T 10 -M prepared, scale 200, 6GB s_b, 8 cores, 24GB mem). It also saves a bunch of memory per process due to the smaller page table (shared_buffers 6GB): cat /proc/$pid_of_pg_backend/status | grep VmPTE VmPTE: 6252 kB vs VmPTE: 60 kB Additionally it has the advantage that top/ps/... output under linux now looks like: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 10603 andres 20 0 6381m 4924 1952 R 21 0.0 0:28.04 postgres i.e. RES now actually shows something usable... Which is rather nice imo. I don't have the time atm to make this something useable, maybe somebody else wants to pick it up? Looks pretty worthwhile investing some time. Because of the required setup we sure cannot make this the default but... ... those results are just spectacular (IMO). nice! merlin
[HACKERS] elog/ereport noreturn decoration
There is continued interest in static analyzers (clang, coverity), and the main problem with those is that they don't know that elog and ereport with level >= ERROR don't return, leading to lots of false positives.

I looked in the archive; there were some attempts to fix this some time ago. One was putting an __attribute__((analyzer_noreturn)) on these functions, but that was decided to be incorrect (and it would only work for clang anyway). Another went along the lines of what I'm proposing here (but before ereport was introduced, if I read it correctly), but didn't go anywhere.

My proposal with ereport would be to do this:

diff --git i/src/include/utils/elog.h w/src/include/utils/elog.h
--- i/src/include/utils/elog.h
+++ w/src/include/utils/elog.h
@@ -104,7 +104,8 @@
  */
 #define ereport_domain(elevel, domain, rest) \
 	(errstart(elevel, __FILE__, __LINE__, PG_FUNCNAME_MACRO, domain) ? \
-	 (errfinish rest) : (void) 0)
+	 (errfinish rest) : (void) 0), \
+	(elevel >= ERROR ? abort() : (void) 0)

There are no portability problems with that. With elog, I would do this:

diff --git i/src/include/utils/elog.h w/src/include/utils/elog.h
@@ -196,7 +197,11 @@
  *		elog(ERROR, "portal \"%s\" not found", stmt->portalname);
  *--
  */
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+#define elog(elevel, ...)  elog_start(__FILE__, __LINE__, PG_FUNCNAME_MACRO), elog_finish(elevel, __VA_ARGS__), (elevel >= ERROR ? abort() : (void) 0)
+#else
 #define elog	elog_start(__FILE__, __LINE__, PG_FUNCNAME_MACRO), elog_finish
+#endif

I think the issue here was that if we support two separate code paths, we still need to do the actually unreachable /* keep compiler happy */ bits, and that compilers that know about elog not returning would complain about unreachable code. But I have never seen warnings like that, so maybe it's a nonissue. But it could be tested if there are concerns. Complete patch attached for easier applying and testing.

For clang aficionados: This reduces scan-build warnings from 1222 to 553 for me.

diff --git i/src/include/utils/elog.h w/src/include/utils/elog.h
index 93b141d..b32737e 100644
--- i/src/include/utils/elog.h
+++ w/src/include/utils/elog.h
@@ -104,7 +104,8 @@
  */
 #define ereport_domain(elevel, domain, rest) \
 	(errstart(elevel, __FILE__, __LINE__, PG_FUNCNAME_MACRO, domain) ? \
-	 (errfinish rest) : (void) 0)
+	 (errfinish rest) : (void) 0), \
+	(elevel >= ERROR ? abort() : (void) 0)
 
 #define ereport(elevel, rest) \
 	ereport_domain(elevel, TEXTDOMAIN, rest)
@@ -196,7 +197,11 @@ extern int	getinternalerrposition(void);
  *		elog(ERROR, "portal \"%s\" not found", stmt->portalname);
  *--
  */
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+#define elog(elevel, ...)  elog_start(__FILE__, __LINE__, PG_FUNCNAME_MACRO), elog_finish(elevel, __VA_ARGS__), (elevel >= ERROR ? abort() : (void) 0)
+#else
 #define elog	elog_start(__FILE__, __LINE__, PG_FUNCNAME_MACRO), elog_finish
+#endif
 
 extern void elog_start(const char *filename, int lineno, const char *funcname);
 extern void
Re: [HACKERS] Event Triggers reduced, v1
Hi, Here's an answer to your review (thanks!), with no patch attached yet even if I've been cleaning up most of what you reported. Incremental diff is available for browsing here: https://github.com/dimitri/postgres/compare/f99e8d93b7...8da156dc70 Robert Haas robertmh...@gmail.com writes: Some of the pg_dump hunks fail to apply for me; I guess this needs to be remerged. Done. There are a few remaining references to EVTG_FIRED_BEFORE / EVTG_FIRED_INSTEAD_OF which should be cleaned up. I suggest writing the grammar as CREATE EVENT TRIGGER name ON event_name EXECUTE... On a related note, evttype is still mentioned in the documentation, also. And CreateEventTrigStmt has a char timing field that can go. I didn't get the memo that we had made a decision here :) That said it will be possible to change our mind and introduce that instead of syntax if that's necessary later in the cycle, so I'll go clean up for the first commit. Incidentally, why do we not support an argument list here, as with ordinary triggers? Left for a follow-up patch. Do you want it already in this one? I think the below hunk should get removed. Or else it should be surrounded by #ifdef rather than commented out. Removed. Typo: Fixed. psql is now very confused about what the slash command is. It's actually implemented as \dy, but the comments say \dev in several places, and slashUsage() documents it as \dct. Fixed. InitializeEvtTriggerCommandCache still needs work. First, it ensures that CacheMemoryContext is non-NULL... after it's already stored the value of CacheMemoryContext into the HASHCTL. Obviously, the order there needs to be fixed. Also, I think you need to split this into two functions. There should be one function that gets called just once at startup time to CacheRegisterSyscacheCallback(), because we don't want to redo that every time the cache gets blown away. Then there should be another function that gets called when, and only when, someone needs to use the cache.
That should create and populate the hash table. CacheMemoryContext ordering fixed, I wrongly copied from attoptcache here even when the memory management is not really done the same. The only place I can see where to init the Event Trigger Cache is in InitPostgres() itself (in src/backend/utils/init/postinit.c), done now. I think that all event triggers should fire in exactly alphabetical order by name. Now that we have the concept of multiple firing points, there's really no reason to distinguish between any triggers and triggers on specific commands. Eliminating that distinction will eliminate a bunch of complexity while making things *more* flexible for the user, who will be able to get a command trigger for a specific command to fire either before or after an ANY command trigger he has also defined rather than having the system enforce policy on him. Internally we still need to keep the distinction to be able to fire ANY triggers on otherwise non supported commands. I agree that we should not concern our users with that, though. Done. plperl_validator still makes reference to CMDTRIGGER. In a comment, fixed. Let's simplify this down to an enum with just one element, since that's all we're going to support for starters, and remove the related parser support for everything but command_start: +typedef enum TrigEvent +{ + E_CommandStart = 1, + E_SecurityCheck = 10, + E_ConsistencyCheck = 15, + E_NameLookup = 20, + E_CommandEnd = 51 +} TrigEvent; I wanted to see where the current choice would lead us, I agree that we can remove that level of details for now, that also allows not to pick next event integration point names already :) Done. I think we should similarly rip out the vestigial support for supplying schema name, object name, and object ID. That's not going to be possible at command_start anyway; we can include that stuff in the patch that adds a later firing point (command end, or consistency check, perhaps). 
We got this part of the API fixed last round, so I would prefer not to dumb it down in this first patch. We know that we want to add exactly that specification later, don't we? I think standard_ProcessUtility should have a shortcut that bypasses setting up the event context if there are no event triggers at all in the entire system, so that we do no extra work unless there's some potential benefit. Setting up the event context is a very lightweight operation, and there's no way to know if the command is going to fire an event trigger without having done exactly what the InitEventContext() is doing. Maybe what we need to do here is rename it? Another problem with short cutting it aggressively is what happens if a new event trigger is created while the command is in flight. We have yet to discuss about that (as we only support a single timing point), but doing it the way you propose forecloses any
Re: [HACKERS] elog/ereport noreturn decoration
Peter Eisentraut pete...@gmx.net writes: I think the issue here was that if we support two separate code paths, we still need to do the actually unreachable /* keep compiler happy */ bits, and that compilers that know about elog not returning would complain about unreachable code.

Yes. The problem with trying to change that is that it's damned if you do and damned if you don't: compilers that are aware that abort() doesn't return will complain about unreachable code if we keep those extra variable initializations, while those that are not so aware will complain about uninitialized variables if we don't. I don't think that's going to be a step forward. IOW I am not on board with reducing the number of warnings in clang by increasing the number everywhere else.

Perhaps we could do something like

 			default:
 				elog(ERROR, "unrecognized drop object type: %d",
 					 (int) drop->removeType);
-				relkind = 0;		/* keep compiler quiet */
+				UNREACHABLE(relkind = 0);	/* keep compiler quiet */
 				break;

where UNREACHABLE(stuff) expands to the given statements (possibly empty) or to abort() depending on the compiler's properties. If we did something like that we'd not need to monkey with the definition of either elog or ereport, but instead we'd have to run around and fix everyplace that was emitting warnings of this sort.

			regards, tom lane
Re: [HACKERS] Event Triggers reduced, v1
Robert Haas robertmh...@gmail.com writes: ../../../src/include/catalog/pg_event_trigger.h:34: error: expected specifier-qualifier-list before ‘int2’ This needs to be changed to int16 as a result of commit b8b2e3b2deeaab19715af063fc009b7c230b2336. Done as part of the previous work. alter.c:73: warning: implicit declaration of function ‘RenameEventTrigger’ Done now. Please spell out DO_EVTTRIGGER as DO_EVENT_TRIGGER. Also done as part of the previous email. Another random warning: plpy_main.c:195: warning: ‘retval’ may be used uninitialized in this function Will do a whole warning check pass later. Can you give me your local Makefile trick to turn them into hard errors again please? :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] change_varattnos_of_a_node versus whole-row Vars
I wrote: change_varattnos_of_a_node(), which is used to adjust expressions referencing a parent relation or LIKE source relation to refer to the child relation, ignores whole-row Vars (those with attnum zero). ... My inclination, especially in the back branches, is to just throw error if we find a whole-row Var. ... There are a bunch of other things I don't like about change_varattnos_of_a_node, starting with the name, but those gripes are in the nature of cosmetics or inadequate error checking and wouldn't have any user-visible impact if changed. Attached is a draft patch that adds error reports for whole-row Vars and replaces change_varattnos_of_a_node with a more robust function. The latter makes it kind of a large patch, but it's difficult to make it a whole lot simpler unless we are willing to have the error messages say only cannot convert whole-row table reference, without any indication of where the problem is. That seems a tad unfriendly to me. Comments? regards, tom lane diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 70753e33e430237a66991bac0740b8761c1b4e4e..d69809a2f866de511e7ce278b8dca9bf9db33d40 100644 *** a/src/backend/commands/tablecmds.c --- b/src/backend/commands/tablecmds.c *** *** 68,73 --- 68,74 #include parser/parser.h #include rewrite/rewriteDefine.h #include rewrite/rewriteHandler.h + #include rewrite/rewriteManip.h #include storage/bufmgr.h #include storage/lmgr.h #include storage/lock.h *** static void truncate_check_rel(Relation *** 253,259 static List *MergeAttributes(List *schema, List *supers, char relpersistence, List **supOids, List **supconstr, int *supOidCount); static bool MergeCheckConstraint(List *constraints, char *name, Node *expr); - static bool change_varattnos_walker(Node *node, const AttrNumber *newattno); static void MergeAttributesIntoExisting(Relation child_rel, Relation parent_rel); static void MergeConstraintsIntoExisting(Relation child_rel, Relation parent_rel); static void 
StoreCatalogInheritance(Oid relationId, List *supers); --- 254,259 *** MergeAttributes(List *schema, List *supe *** 1496,1502 * parents after the first one, nor if we have dropped columns.) */ newattno = (AttrNumber *) ! palloc(tupleDesc-natts * sizeof(AttrNumber)); for (parent_attno = 1; parent_attno = tupleDesc-natts; parent_attno++) --- 1496,1502 * parents after the first one, nor if we have dropped columns.) */ newattno = (AttrNumber *) ! palloc0(tupleDesc-natts * sizeof(AttrNumber)); for (parent_attno = 1; parent_attno = tupleDesc-natts; parent_attno++) *** MergeAttributes(List *schema, List *supe *** 1510,1524 * Ignore dropped columns in the parent. */ if (attribute-attisdropped) ! { ! /* ! * change_varattnos_of_a_node asserts that this is greater ! * than zero, so if anything tries to use it, we should find ! * out. ! */ ! newattno[parent_attno - 1] = 0; ! continue; ! } /* * Does it conflict with some previously inherited column? --- 1510,1516 * Ignore dropped columns in the parent. */ if (attribute-attisdropped) ! continue; /* leave newattno entry as zero */ /* * Does it conflict with some previously inherited column? *** MergeAttributes(List *schema, List *supe *** 1656,1669 { char *name = check[i].ccname; Node *expr; /* ignore if the constraint is non-inheritable */ if (check[i].ccnoinherit) continue; ! /* adjust varattnos of ccbin here */ ! expr = stringToNode(check[i].ccbin); ! change_varattnos_of_a_node(expr, newattno); /* check for duplicate */ if (!MergeCheckConstraint(constraints, name, expr)) --- 1648,1677 { char *name = check[i].ccname; Node *expr; + bool found_whole_row; /* ignore if the constraint is non-inheritable */ if (check[i].ccnoinherit) continue; ! /* Adjust Vars to match new table's column numbering */ ! expr = map_variable_attnos(stringToNode(check[i].ccbin), ! 1, 0, ! newattno, tupleDesc-natts, ! found_whole_row); ! ! /* ! * For the moment we have to reject whole-row variables. ! 
* We could convert them, if we knew the new table's rowtype ! * OID, but that hasn't been assigned yet. ! */ ! if (found_whole_row) ! ereport(ERROR, ! (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), ! errmsg(cannot convert whole-row table reference), ! errdetail(Constraint \%s\ contains a whole-row reference to table \%s\., ! name, ! RelationGetRelationName(relation; /* check for duplicate */ if (!MergeCheckConstraint(constraints, name,
Re: [HACKERS] Posix Shared Mem patch
On Fri, Jun 29, 2012 at 1:00 PM, Merlin Moncure mmonc...@gmail.com wrote: On Fri, Jun 29, 2012 at 2:52 PM, Andres Freund and...@2ndquadrant.com wrote: Hi All, In a *very* quick patch I tested using huge pages/MAP_HUGETLB for the mmap'ed memory. That gives around 9.5% performance benefit in a read-only pgbench run (-n -S -j 64 -c 64 -T 10 -M prepared, scale 200, 6GB s_b, 8 cores, 24GB mem). It also saves a bunch of memory per process due to the smaller page table (shared_buffers 6GB): cat /proc/$pid_of_pg_backend/status | grep VmPTE VmPTE: 6252 kB vs VmPTE: 60 kB ... those results are just spectacular (IMO). nice! That is super awesome. Smallish databases with a high number of connections actually spend a considerable fraction of their otherwise-available-for-buffer-cache space on page tables in common cases currently. -- fdr
Re: [HACKERS] foreign key locks
Excerpts from Kevin Grittner's message of mié jun 27 08:40:58 -0400 2012: Alvaro Herrera wrote: here's fklocks v14, which also merges to new master as there were several other conflicts. It passes make installcheck-world now. Recent commits broke it again, so here's a rebased v15. (No changes other than attempting to merge your work with the pg_controldata and pg_resetxlog changes and wrapping that FIXME in a comment.) Here's v16, merged to latest stuff, before attempting to fix this: Using that patch, pg_upgrade completes, but pg_dumpall fails: + pg_ctl start -l /home/kevin/pg/master/contrib/pg_upgrade/log/postmaster2.log -w waiting for server to start done server started + pg_dumpall pg_dump: Dumping the contents of table hslot failed: PQgetResult() failed. pg_dump: Error message from server: ERROR: MultiXactId 115 does no longer exist -- apparent wraparound I was only testing migrating from an old version into patched, not same-version upgrades. I think I see what's happening here. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
[HACKERS] Can someone help me to get ODBC fdw running on windows?
I've tried to compile ODBC fdw on Win64 with all sorts of compilers without success (MinGW, gcc-win32, MS C++ 2005 32 and 64). I think I'm getting too old for this: so many switches, too many dependencies. Could a gentle soul help me get back on track, possibly providing precompiled binaries that I can use? Regards, Edson
Re: [HACKERS] elog/ereport noreturn decoration
On 29 June 2012 22:35, Tom Lane t...@sss.pgh.pa.us wrote: IOW I am not on board with reducing the number of warnings in clang by increasing the number everywhere else. I successfully lobbied the Clang developers to remove some warnings that came from certain sites where we use a single element array at the end of a struct as pseudo flexible arrays (Clang tried to do compile-time bounds checking, with predictable results). Now, I had to make a nuisance of myself in order to get that result, but you might consider trying that. I have been using the Clang static analyser some. I've seen all those false positives. The greater problem is that the analyser apparently won't work across translation units. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Support for array_remove and array_replace functions
On Thu, Jun 14, 2012 at 4:41 AM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: Hi, following Gabriele's email regarding our previous patch on Foreign Key Arrays[1], I am sending a subset of that patch which includes only two array functions which will be needed in that patch: array_remove (limited to single-dimensional arrays) and array_replace. The patch includes changes to the documentation. Hi, I've been reviewing this patch. Good documentation, and regression tests. The code looked fine but I didn't care for the code duplication between array_replace and array_remove so I merged those into a helper function, array_replace_internal(). Thoughts? Other than that it all looks good to me. array-functions_v2.patch.bz2 Description: BZip2 compressed data