Re: [HACKERS] Initial 9.2 pgbench write results
On Sun, Feb 19, 2012 at 4:17 AM, Robert Haas wrote: > Here's what's bugging me. Greg seemed to be assuming that the > business of the background writer might be the cause of the > performance drop-off he measured on certain test cases. But you and I > both seem to feel that the business of the background writer is > intentional and desirable. Supposing we're right, where's the > drop-off coming from? *scratches head* Any source of logical I/O becomes physical I/O when we run short of memory. So if we're using more memory for any reason, that will cause more swapping. Or if we are doing things like consulting the visibility map, that would also cause a problem. I notice the issue is not as bad for 9.2 in the scale 4000 case, so it seems more likely that we're just hitting the tipping point earlier on 9.2 and that scale 1000 is right in the middle of the tipping point. What it does show quite clearly is that the extreme high-end response time variability is still there. It also shows that insufficient performance testing has been done on this release so far. We may have "solved" some scalability problems, but we've completely ignored real-world performance issues and, as Greg says, we now get to pay the price for not having done that earlier. I've argued previously that we should have a performance tuning phase at the end of the release cycle; now it looks like that has become a necessity. Which will turn out to be a good thing in the end, I'm sure, even if it's a little worrying right now. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Initial 9.2 pgbench write results
On Tue, Feb 14, 2012 at 6:45 PM, Greg Smith wrote: > Minimal changes were made to the postgresql.conf. shared_buffers=2GB, > checkpoint_segments=64, and I left wal_buffers at its default so that 9.1 > got credit for that going up. See > http://highperfpostgres.com/results-write-9.2-cf4/541/pg_settings.txt for a > full list of changes, drive mount options, and important kernel settings. > Much of that data wasn't collected in last year's pgbench-tools runs. Please retest with wal_buffers 128MB, checkpoint_segments 1024. Best to remove any tunable resource bottlenecks before we attempt further analysis. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_upgrade --logfile option documentation
The documentation of the pg_upgrade -l/--logfile option never made much sense to me: -l, --logfile=FILENAME log session activity to file I don't know what "session" means for pg_upgrade, so I never used it. What it actually does is log the output of all the programs that pg_upgrade calls internally, such as pg_ctl, psql, vacuumdb, pg_resetxlog, to the specified file, which is quite useful for analyzing errors such as unable to connect to new postmaster started with the command: "/usr/lib/postgresql/9.1/bin/pg_ctl" -w -l "/dev/null" -D "/var/lib/postgresql/9.1/main" -o "-p 5433 -b" start >> "/dev/null" 2>&1 where -l would have put something in the place of /dev/null. So what might be a better wording for this option? Something like "log output of internally called programs to file"? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_restore ignores PGDATABASE
On Sun, Feb 19, 2012 at 1:18 AM, Erik Rijkers wrote: > On Sun, February 19, 2012 06:27, Robert Haas wrote: >> On Sat, Feb 18, 2012 at 11:58 AM, Erik Rijkers wrote: >>> pg_restore ignores environment variable PGDATABASE. >> >> What exactly do you mean by "ignores"? pg_restore prints results to >> standard output unless a database name is specified. AFAIK, there's >> no syntax to say "I want a direct-to-database restore to whatever you >> think the default database is". > > That's right, and that seems contradictory with: > > "This utility [pg_restore], like most other PostgreSQL utilities, also uses > the environment > variables supported by libpq (see Section 31.13)." > > as pg_restore does 'ignore' (for want of a better word) PGDATABASE. > > But I think I can conclude from your reply that that behaviour is indeed > intentional. It is, because we want there to be a way of converting a custom or tar format archive back to text. I think that probably works out for the best anyway, since pg_restore is a sufficiently dangerous operation that you want to be darn sure you're not doing it on the wrong database. dropdb also requires a database name, while createdb does not, for similar reasons... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wal_buffers
On 19-02-2012 02:24, Robert Haas wrote: > I have attached tps scatterplots. The obvious conclusion appears to > be that, with only 16MB of wal_buffers, the buffer "wraps around" with > some regularity > Isn't it useful to print some messages on the log when we have "wrap around"? In this case, we have an idea that wal_buffers needs to be increased. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_restore ignores PGDATABASE
On 02/19/2012 08:02 AM, Robert Haas wrote: > On Sun, Feb 19, 2012 at 1:18 AM, Erik Rijkers wrote: >> On Sun, February 19, 2012 06:27, Robert Haas wrote: >>> On Sat, Feb 18, 2012 at 11:58 AM, Erik Rijkers wrote: >>>> pg_restore ignores environment variable PGDATABASE. >>> What exactly do you mean by "ignores"? pg_restore prints results to >>> standard output unless a database name is specified. AFAIK, there's >>> no syntax to say "I want a direct-to-database restore to whatever you >>> think the default database is". >> That's right, and that seems contradictory with: >> "This utility [pg_restore], like most other PostgreSQL utilities, also uses the environment >> variables supported by libpq (see Section 31.13)." >> as pg_restore does 'ignore' (for want of a better word) PGDATABASE. >> But I think I can conclude from your reply that that behaviour is indeed intentional. > It is, because we want there to be a way of converting a custom or tar format archive back to > text. I think that probably works out for the best anyway, since pg_restore is a sufficiently > dangerous operation that you want to be darn sure you're not doing it on the wrong database. > dropdb also requires a database name, while createdb does not, for similar reasons... Right, I think we probably need to adjust the docs slightly to match this reality. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Thu, Feb 16, 2012 at 1:53 PM, Robert Haas wrote: > + /* > + * If we're in recovery we cannot dirty a page > because of a hint. > + * We can set the hint, just not dirty the page as a > result so > + * the hint is lost when we evict the page or > shutdown. > + * > + * See long discussion in bufpage.c > + */ > + if (RecoveryInProgress()) > + return; > > Doesn't this seem awfully bad for performance on Hot Standby servers? > I agree that it fixes the problem with un-WAL-logged pages there, but > I seem to recall some recent complaining about performance features > that work on the master but not the standby. Durable hint bits are > one such feature. It's impossible for it to work, in this case, since we cannot write new WAL to prevent torn pages. Note that hint bit setting on a dirty block is allowed, so many hints will still be set in Hot Standby. > + * Basically, we simply prevent the checkpoint WAL > record from > + * being written until we have marked the buffer > dirty. We don't > + * start the checkpoint flush until we have marked > dirty, so our > + * checkpoint must flush the change to disk > successfully or the > + * checkpoint never gets written, so crash recovery > will fix. > + * > + * It's possible we may enter here without an xid, so > it is > + * essential that CreateCheckpoint waits for virtual > transactions > + * rather than full transactionids. > + */ > + MyPgXact->delayChkpt = delayChkpt = true; > > I am slightly worried that this expansion in the use of this mechanism > (formerly called inCommit, for those following along at home) could > lead to checkpoint starvation. Suppose we've got one or two large > table scans wandering along, setting hint bits, and now suddenly it's > time to checkpoint. How long will it take the checkpoint process to > find a time when nobody's got delayChkpt set? We don't need to wait until nobody has it set, we just need to wait for the people that had it set when we first checked to be out of that state momentarily. 
> + #define PageSetChecksum(page) \ > + do \ > + { \ > + PageHeader p = (PageHeader) page; \ > + p->pd_flags |= PD_PAGE_VERSION_PLUS1; \ > + p->pd_flags |= PD_CHECKSUM1; \ > + p->pd_flags &= ~PD_CHECKSUM2; \ > + p->pd_verify.pd_checksum16 = PageCalcChecksum16(page); \ > + } while (0); > + > + /* ensure any older checksum info is overwritten with watermark */ > + #define PageResetVersion(page) \ > + do \ > + { \ > + PageHeader p = (PageHeader) page; \ > + if (!PageHasNoChecksum(p)) \ > + { \ > + p->pd_flags &= ~PD_PAGE_VERSION_PLUS1; \ > + p->pd_flags &= ~PD_CHECKSUM1; \ > + p->pd_flags &= ~PD_CHECKSUM2; \ > + PageSetPageSizeAndVersion(p, BLCKSZ, PG_PAGE_LAYOUT_VERSION); > \ > + } \ > + } while (0); > > So, when the page has a checksum, PD_CHECKSUM2 is not set, and when it > doesn't have a checksum, PD_CHECKSUM2 is not set? What good does that > do? As explained in detailed comments, the purpose of this is to implement Heikki's suggestion that we have a bit set to zero so we can detect failures that cause a run of 1s. > * PageGetPageSize > * Returns the page size of a page. > * > ! * Since PageSizeIsValid() when pagesize == BLCKSZ, just written BLCKSZ. > ! * This can be called on any page, initialised or not, in or out of buffers. > ! * You might think this can vary at runtime but you'd be wrong, since pages > ! * frequently need to occupy buffers and pages are copied from one to > another > ! * so there are many hidden assumptions that this simple definition is true. > */ > ! #define PageGetPageSize(page) (BLCKSZ) > > I think I agree that the current definition of PageGetPageSize seems > unlikely to come anywhere close to allowing us to cope with multiple > page sizes, but I think this method of disabling it is a hack. The > callers that want to know how big the page really is should just use > BLCKSZ instead of this macro, and those that want to know how big the > page THINKS it is (notably contrib/pageinspect) need a way to get that > information. Fair comment. 
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
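The "wait for the people that had it set when we first checked" logic can be sketched in miniature. This is a hypothetical simplification: the real code tracks virtual transaction ids via PGPROC/PGXACT and sleeps between rechecks, whereas here a plain array stands in for the per-backend flag.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BACKENDS 8

/* Simplified stand-in for the per-backend delayChkpt flag. */
static bool delayChkpt[MAX_BACKENDS];

/*
 * Snapshot the backends that currently have delayChkpt set.  The
 * checkpointer only needs each of *those* backends to drop the flag
 * once; backends that set the flag after the snapshot are ignored,
 * which is what prevents checkpoint starvation.
 */
static int
snapshot_delayed(int *ids, int nbackends)
{
    int n = 0;

    for (int i = 0; i < nbackends; i++)
        if (delayChkpt[i])
            ids[n++] = i;
    return n;
}

/* True once every snapshotted backend has cleared its flag. */
static bool
all_cleared(const int *ids, int n)
{
    for (int i = 0; i < n; i++)
        if (delayChkpt[ids[i]])
            return false;
    return true;
}
```

Note how a backend that sets the flag after the snapshot never delays this checkpoint, so a steady stream of hint-bit setters cannot starve it.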
Re: [HACKERS] wal_buffers
On Sun, Feb 19, 2012 at 9:46 AM, Euler Taveira de Oliveira wrote: > On 19-02-2012 02:24, Robert Haas wrote: >> I have attached tps scatterplots. The obvious conclusion appears to >> be that, with only 16MB of wal_buffers, the buffer "wraps around" with >> some regularity >> > Isn't it useful to print some messages on the log when we have "wrap around"? > In this case, we have an idea that wal_buffers needs to be increased. I was thinking about that. I think that what might be more useful than a log message is a counter somewhere in shared memory. Logging imposes a lot of overhead, which is exactly what we don't want here, and the volume might be quite high on a system that is bumping up against this problem. Of course then the question is... how would we expose the counter value? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
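A cheap shared-memory counter of the kind Robert describes could look roughly like this. This is a sketch using C11 atomics; the counter name, the call site, and any eventual pg_stat_* exposure are assumptions, not existing PostgreSQL APIs.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stats counter, assumed to live in shared memory. */
static atomic_ulong wal_buffer_full_waits;

/*
 * Hypothetically called from the slow path where a backend must wait
 * because the WAL buffer it needs to reuse has not been written yet.
 * A relaxed atomic increment keeps the hot path cheap: no logging,
 * no lock acquisition, just one interlocked add.
 */
static void
count_wal_buffer_wait(void)
{
    atomic_fetch_add_explicit(&wal_buffer_full_waits, 1,
                              memory_order_relaxed);
}

/* A monitoring view could later read the counter like this. */
static unsigned long
read_wal_buffer_waits(void)
{
    return atomic_load(&wal_buffer_full_waits);
}
```

The point of the design is that the write side costs a single atomic add, so unlike a log message it stays cheap even when the condition fires thousands of times per second.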
Re: [HACKERS] pg_upgrade --logfile option documentation
On Sun, Feb 19, 2012 at 6:13 AM, Peter Eisentraut wrote: > The documentation of the pg_upgrade -l/--logfile option never made much > sense to me: > > -l, --logfile=FILENAME log session activity to file > > I don't know what "session" means for pg_upgrade, so I never used it. > > What it actually does is log the output of all the programs that > pg_upgrade calls internally, such as pg_ctl, psql, vacuumdb, > pg_resetxlog, to the specified file, which is quite useful for analyzing > errors such as > > unable to connect to new postmaster started with the command: > "/usr/lib/postgresql/9.1/bin/pg_ctl" -w -l "/dev/null" -D > "/var/lib/postgresql/9.1/main" -o "-p 5433 -b" start >> "/dev/null" 2>&1 > > where -l would have put something in the place of /dev/null. > > So what might be a better wording for this option? Something like "log > output of internally called programs to file"? I don't think we should be that specific, because we might someday want pg_upgrade itself to write messages to that file as well, even if it doesn't today. I agree that the phrase "session activity" is a bit misleading. As a more general comment, I think that the way pg_upgrade does logging right now is absolutely terrible. IME, it is utterly impossible to understand what has gone wrong with pg_upgrade without looking at the log file. And by default, no log file is created. So typically what happens is:

1. I run pg_upgrade. It fails.
2. I rename the control file from the old cluster back to its original name.
3. I rerun pg_upgrade, this time with -l. It fails again.
4. I read the log file, figure out what the problem is, and correct it.
5. I rename the control file from the old cluster back to its original name, again.
6. I run pg_upgrade a third time.
7. On a good day, it works; otherwise, go to step 5.
One pretty obvious improvement would be: if pg_upgrade fails after renaming the control file for the old cluster out of the way - say, while loading the schema dump into the new cluster - have it RENAME THE OLD CONTROL FILE BACK before exiting. But I also think the logging needs improvement. Right now, we studiously redirect both stdout and stderr to /dev/null; maybe it would be better to redirect stdout to /dev/null and NOT redirect stderr. If that generates too much chatter in non-failure cases, then let's adjust the output of the commands pg_upgrade is invoking until it doesn't. The actual cause of the failure, rather than pg_upgrade's fairly-useless gloss on it, ought to be visible right away, at least IMHO. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wal_buffers
Robert Haas writes: > On Sun, Feb 19, 2012 at 9:46 AM, Euler Taveira de Oliveira > wrote: >> Isn't it useful to print some messages on the log when we have "wrap around"? >> In this case, we have an idea that wal_buffers needs to be increased. > I was thinking about that. I think that what might be more useful > than a log message is a counter somewhere in shared memory. Logging > imposes a lot of overhead, which is exactly what we don't want here, > and the volume might be quite high on a system that is bumping up > against this problem. Of course then the question is... how would we > expose the counter value? Why do you need a counter, other than the current LSN? Surely the number of WAL buffer ring cycles can be deduced directly from that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
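Tom's arithmetic can be illustrated directly: the number of complete trips around the WAL buffer ring is bounded by the bytes of WAL generated divided by the size of wal_buffers. This hypothetical helper treats LSNs as plain byte positions; it gives an upper bound on cycles, not a count of actual waits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Upper bound on complete trips around the WAL buffer ring between
 * two points in time, derived purely from WAL byte positions
 * (hypothetical helper; all values in bytes).
 */
static uint64_t
wal_ring_cycles(uint64_t current_lsn, uint64_t start_lsn,
                uint64_t wal_buffers_bytes)
{
    return (current_lsn - start_lsn) / wal_buffers_bytes;
}
```

For example, with wal_buffers = 16MB, generating 1.6GB of WAL implies at most 100 ring cycles, though it says nothing about whether any backend actually had to wait on one.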
Re: [HACKERS] wal_buffers
On Sun, Feb 19, 2012 at 6:33 PM, Tom Lane wrote: > Robert Haas writes: >> On Sun, Feb 19, 2012 at 9:46 AM, Euler Taveira de Oliveira >> wrote: >>> Isn't it useful to print some messages on the log when we have "wrap >>> around"? >>> In this case, we have an idea that wal_buffers needs to be increased. > >> I was thinking about that. I think that what might be more useful >> than a log message is a counter somewhere in shared memory. Logging >> imposes a lot of overhead, which is exactly what we don't want here, >> and the volume might be quite high on a system that is bumping up >> against this problem. Of course then the question is... how would we >> expose the counter value? > > Why do you need a counter, other than the current LSN? Surely the > number of WAL buffer ring cycles can be deduced directly from that. The problem isn't how many times it's cycled; the issue is whether there was a wait induced by needing to flush WAL buffers because of too many writes. You can't count those waits in the way you suggest; you can calculate an upper limit on them, but that's not very useful. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Reducing bgwriter wakeups
Recent changes for power reduction mean that we now issue a wakeup call to the bgwriter every time we set a hint bit. However cheap that is, it's still overkill.

My proposal is that we wake the bgwriter whenever a backend is forced to write a dirty buffer, a job the bgwriter should have been doing. This significantly reduces the number of wakeup calls and allows the bgwriter to stay asleep even when very light traffic happens, which is good because the bgwriter is often the last process to sleep.

Seems useful to have an explicit discussion on this point, especially in view of recent performance results.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 1adb6d3..310cd95 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -654,6 +654,10 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 			FlushBuffer(buf, NULL);
 			LWLockRelease(buf->content_lock);
 
+			/* The bgwriter may need to be woken. */
+			if (ProcGlobal->bgwriterLatch)
+				SetLatch(ProcGlobal->bgwriterLatch);
+
 			TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_DONE(forkNum, blockNum,
 							smgr->smgr_rnode.node.spcNode,
 							smgr->smgr_rnode.node.dbNode,
@@ -2368,9 +2372,6 @@ SetBufferCommitInfoNeedsSave(Buffer buffer)
 				VacuumPageDirty++;
 				if (VacuumCostActive)
 					VacuumCostBalance += VacuumCostPageDirty;
-				/* The bgwriter may need to be woken. */
-				if (ProcGlobal->bgwriterLatch)
-					SetLatch(ProcGlobal->bgwriterLatch);
 			}
 		}
 	}

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] patch: autocomplete for functions
Hello

I think this extremely simple patch should be useful. It helps with the pattern SELECT fx();

There was a thread about it earlier.

Regards

Pavel

*** ./src/bin/psql/tab-complete.c.orig	2012-02-05 11:28:48.0 +0100
--- ./src/bin/psql/tab-complete.c	2012-02-19 20:05:05.241626625 +0100
***************
*** 2555,2562 ****
  		COMPLETE_WITH_CONST("IS");
  
  /* SELECT */
! 	/* naah . . . */
! 
  /* SET, RESET, SHOW */
  	/* Complete with a variable name */
  	else if ((pg_strcasecmp(prev_wd, "SET") == 0 &&
--- 2555,2562 ----
  		COMPLETE_WITH_CONST("IS");
  
  /* SELECT */
! 	else if (pg_strcasecmp(prev_wd, "SELECT") == 0)
! 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_functions, NULL);
  /* SET, RESET, SHOW */
  	/* Complete with a variable name */
  	else if ((pg_strcasecmp(prev_wd, "SET") == 0 &&

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Sun, Feb 19, 2012 at 4:35 PM, Simon Riggs wrote: > We don't need to wait until nobody has it set, we just need to wait > for the people that had it set when we first checked to be out of that > state momentarily. I've just finished doing some performance analysis on various aspects of hint bit setting for this patch. I've seen as high as 14% of transactions writing full pages during a pgbench run. That sounds quite bad, but on pgbench at least all of those are associated with UPDATEs, which would dirty the page anyway, so there aren't any more full page writes overall. Checkpoints would be delayed only until a virtual transaction ends or a virtual transaction comes out of DelayCkpt state. If a virtual transaction was long running it wouldn't spend much time in the delaying state, especially if we take into account I/O requirements. So although I'm not exactly happy with the overheads, they don't seem to be as big a problem as they sound. Plus this is all optional and avoidable. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] patch: CREATE OR REPLACE FUNCTION autocomplete
Hello

Another very simple patch: it enhances autocomplete to support the CREATE OR REPLACE FUNCTION statement.

Regards

Pavel Stehule

*** ./src/bin/psql/tab-complete.c.orig	2012-02-19 20:05:05.0 +0100
--- ./src/bin/psql/tab-complete.c	2012-02-19 20:20:43.817202512 +0100
***************
*** 644,649 ****
--- 644,650 ----
  	{"INDEX", NULL, &Query_for_list_of_indexes},
  	{"OPERATOR", NULL, NULL},	/* Querying for this is probably not such a
  								 * good idea. */
+ 	{"OR REPLACE FUNCTION", NULL, &Query_for_list_of_functions},
  	{"OWNED", NULL, NULL, THING_NO_CREATE},	/* for DROP OWNED BY ... */
  	{"PARSER", Query_for_list_of_ts_parsers, NULL, THING_NO_SHOW},
  	{"ROLE", Query_for_list_of_roles},

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reducing bgwriter wakeups
On Sun, Feb 19, 2012 at 1:53 PM, Simon Riggs wrote: > Recent changes for power reduction mean that we now issue a wakeup > call to the bgwriter every time we set a hint bit. > > However cheap that is, its still overkill. > > My proposal is that we wakeup the bgwriter whenever a backend is > forced to write a dirty buffer, a job the bgwriter should have been > doing. > > This significantly reduces the number of wakeup calls and allows the > bgwriter to stay asleep even when very light traffic happens, which is > good because the bgwriter is often the last process to sleep. > > Seems useful to have an explicit discussion on this point, especially > in view of recent performance results. I don't see what this has to do with recent performance results, so please elaborate. Off-hand, I don't see any point in getting cheap. It seems far more important to me that the background writer become active when needed than that we save some trivial amount of power by waiting longer before activating it. If we're concerned about saving power, then IMHO what we should be worried about is that the wal writer is still waking up 5x/s. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reducing bgwriter wakeups
On Sun, Feb 19, 2012 at 8:15 PM, Robert Haas wrote: > On Sun, Feb 19, 2012 at 1:53 PM, Simon Riggs wrote: >> Recent changes for power reduction mean that we now issue a wakeup >> call to the bgwriter every time we set a hint bit. >> >> However cheap that is, its still overkill. >> >> My proposal is that we wakeup the bgwriter whenever a backend is >> forced to write a dirty buffer, a job the bgwriter should have been >> doing. >> >> This significantly reduces the number of wakeup calls and allows the >> bgwriter to stay asleep even when very light traffic happens, which is >> good because the bgwriter is often the last process to sleep. >> >> Seems useful to have an explicit discussion on this point, especially >> in view of recent performance results. > > I don't see what this has to do with recent performance results, so > please elaborate. Off-hand, I don't see any point in getting cheap. > It seems far more important to me that the background writer become > active when needed than that we save some trivial amount of power by > waiting longer before activating it. Then you misunderstand, since I am advocating waking it when needed. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Improve pretty printing of viewdefs.
On 02/19/2012 04:18 PM, Andrew Dunstan wrote: [redirecting to -hackers] Arghh, this time redirecting ... On 02/19/2012 12:04 PM, Pavel Stehule wrote: > Hello, could this functionality be used for queries too? Something like > pg_pretty_query('SELECT ... ', 80), now that we have this functionality? It would probably be possible to leverage some of this for that, but it's certainly not part of the present piece of work. All the logic is there in get_query_def() and friends. There would need to be a wrapper that called the parser to get a query object from the input string and then called get_query_def() to get back the reformatted output, and there are probably any number of wrinkles I haven't thought of. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Sun, Feb 19, 2012 at 11:35 AM, Simon Riggs wrote: >> Doesn't this seem awfully bad for performance on Hot Standby servers? >> I agree that it fixes the problem with un-WAL-logged pages there, but >> I seem to recall some recent complaining about performance features >> that work on the master but not the standby. Durable hint bits are >> one such feature. > > It's impossible for it to work, in this case, since we cannot write > new WAL to prevent torn pages. > > Note that hint bit setting on a dirty block is allowed, so many hints > will still be set in Hot Standby. To me, it seems that you are applying a double standard. You have twice attempted to insist that I do extra work to make major features that I worked on - unlogged tables and index-only scans - work in Hot Standby mode, despite the existence of significant technological obstacles. But when it comes to your own feature, you simply state that it cannot be done, and therefore we need not do it. Of course, this feature, like those, CAN be made to work. It just involves solving difficult problems that have little to do with the primary purpose of the patch. To be honest, I don't use Hot Standby enough to care very much about this particular issue, and I'm not saying we should reject it on these grounds. But I do think it's a mistake to dismiss it entirely, since every limitation of Hot Standby narrows the set of cases in which it can be deployed. And at any rate, I want the same standard applied to my work as to yours. >> I am slightly worried that this expansion in the use of this mechanism >> (formerly called inCommit, for those following along at home) could >> lead to checkpoint starvation. Suppose we've got one or two large >> table scans wandering along, setting hint bits, and now suddenly it's >> time to checkpoint. How long will it take the checkpoint process to >> find a time when nobody's got delayChkpt set? 
> > We don't need to wait until nobody has it set, we just need to wait > for the people that had it set when we first checked to be out of that > state momentarily. Ah... good point. >> So, when the page has a checksum, PD_CHECKSUM2 is not set, and when it >> doesn't have a checksum, PD_CHECKSUM2 is not set? What good does that >> do? > > As explained in detailed comments, the purpose of this is to implement > Heikki's suggestion that we have a bit set to zero so we can detect > failures that cause a run of 1s. I think it's nonsensical to pretend that there's anything special about that particular bit. If we want to validate the page header before trusting the lack of a checksum bit, we can do that far more thoroughly than just checking that one bit. There are a whole bunch of bits that ought to always be zero, and there are other things we can validate as well (e.g. LSN not in future). If we're concerned about the checksum-enabled bit getting flipped (and I agree that we should be), we can check some combination of that stuff in the hope of catching it, and that'll be a far better guard than just checking one arbitrarily selected bit. That having been said, I don't feel very good about the idea of relying on the contents of the page to tell us whether or not the page has a checksum. There's no guarantee that an error that flips the has-checksum bit will flip any other bit on the page, or that it won't flip everything else we're relying on as a backstop in exactly the manner that foils whatever algorithm we put in place. Random corruption is, perhaps, unlikely to do that, but somehow I feel like it'll happen more often than random chance suggests. Things never fail the way you want them to. Another disadvantage of the current scheme is that there's no particularly easy way to know that your whole cluster has checksums. No matter how we implement checksums, you'll have to rewrite every table in the cluster in order to get them fully turned on. 
But with the current design, there's no easy way to know how much of the cluster is actually checksummed. If you shut checksums off, they'll linger until those pages are rewritten, and there's no easy way to find the relations from which they need to be removed, either. I'm tempted to suggest a relation-level switch: when you want checksums, you use ALTER TABLE to turn them on, and when you don't want them any more you use ALTER TABLE to shut them off again, in each case rewriting the table. That way, there's never any ambiguity about what's in the data pages in a given relation: either they're all checksummed, or none of them are. This moves the decision about whether checksums are enabled or disabled quite a bit further away from the data itself, and also allows you to determine (by catalog inspection) which parts of the cluster do in fact have checksums. It might be kind of a pain to implement, though: you'd have to pass the information about how any given relation was configured down to the place where we validate page sanity. I'm not sure whether th
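The multi-field sanity check Robert sketches in words might look like this. This is an illustrative toy, not the actual PageHeaderData layout or any real PostgreSQL validation routine: the struct, the flag mask, and the function are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down page header (not the real PageHeaderData). */
typedef struct
{
    uint64_t pd_lsn;     /* LSN of the last WAL record for this page */
    uint16_t pd_flags;   /* flag bits */
} MiniPageHeader;

/* Assumed: only these flag bits are defined; the rest must be zero. */
#define PD_VALID_FLAG_BITS  0x000F

/*
 * Instead of trusting one designated must-be-zero bit, cross-check
 * several invariants that should all hold on a sane page: no
 * undefined flag bits are set (a run of 1s trips this immediately),
 * and the page LSN does not lie in the future.
 */
static bool
header_looks_sane(const MiniPageHeader *p, uint64_t current_insert_lsn)
{
    if (p->pd_flags & ~PD_VALID_FLAG_BITS)
        return false;           /* undefined flag bits are set */
    if (p->pd_lsn > current_insert_lsn)
        return false;           /* LSN from the future */
    return true;
}
```

The idea is that a single flipped bit is hard to detect on its own, but corruption that produces an impossible combination across several independent fields is much more likely to be caught.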
Re: [HACKERS] Reducing bgwriter wakeups
On Sun, Feb 19, 2012 at 4:11 PM, Simon Riggs wrote: > On Sun, Feb 19, 2012 at 8:15 PM, Robert Haas wrote: >> On Sun, Feb 19, 2012 at 1:53 PM, Simon Riggs wrote: >>> Recent changes for power reduction mean that we now issue a wakeup >>> call to the bgwriter every time we set a hint bit. >>> >>> However cheap that is, its still overkill. >>> >>> My proposal is that we wakeup the bgwriter whenever a backend is >>> forced to write a dirty buffer, a job the bgwriter should have been >>> doing. >>> >>> This significantly reduces the number of wakeup calls and allows the >>> bgwriter to stay asleep even when very light traffic happens, which is >>> good because the bgwriter is often the last process to sleep. >>> >>> Seems useful to have an explicit discussion on this point, especially >>> in view of recent performance results. >> >> I don't see what this has to do with recent performance results, so >> please elaborate. Off-hand, I don't see any point in getting cheap. >> It seems far more important to me that the background writer become >> active when needed than that we save some trivial amount of power by >> waiting longer before activating it. > > Then you misunderstand, since I am advocating waking it when needed. Well, I guess that depends on when it's actually needed. You haven't presented any evidence one way or the other. I mean, let's suppose that a sudden spike of activity hits a previously-idle system. If we wait until all of shared_buffers is dirty before waking up the background writer, it seems possible that the background writer is going to have a hard time catching up. If we wake it immediately, we don't have that problem. Also, in general, I think that it's not a good idea to let dirty data sit in shared_buffers forever. I'm unhappy about the change this release cycle to skip checkpoints if we've written less than a full WAL segment, and this seems like another step in that direction. It's exposing us to needless risk of data loss. 
In 9.1, if you process a transaction and, an hour later, the disk where pg_xlog is written melts into a heap of molten slag, your transaction will be there, even if you end up having to run pg_resetxlog. In 9.2, it may well be that xlog contains the only record of that transaction, and you're hosed. The more work we do to postpone writing the data until the absolutely last possible moment, the more likely it is that it won't be on disk when we need it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
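The tradeoff being argued in this subthread can be made concrete with a toy simulation. The Python sketch below is purely illustrative (the event names and workload figures are invented; this is not the backend's C code): it contrasts waking the bgwriter on every hint-bit set with Simon's proposal of waking it only when a backend is forced to write a dirty buffer itself.

```python
# Toy model of two bgwriter wakeup policies (illustrative only; the
# event names and numbers are invented, not PostgreSQL internals).

def simulate(events, policy):
    """Count wakeups for a stream of events.

    events: list of 'hint' (a hint bit was set) or 'evict' (a backend
    had to write a dirty buffer itself -- the bgwriter's job).
    policy: 'every_hint' wakes on any event; 'on_evict' wakes only on
    a forced eviction.
    """
    wakeups = 0
    for ev in events:
        if policy == 'every_hint':
            wakeups += 1
        elif policy == 'on_evict' and ev == 'evict':
            wakeups += 1
    return wakeups

# Light, read-mostly traffic: many hint-bit sets, one forced eviction.
workload = ['hint'] * 99 + ['evict']
print(simulate(workload, 'every_hint'))  # 100
print(simulate(workload, 'on_evict'))    # 1
```

Under read-mostly traffic the second policy sleeps through almost everything, which is Simon's point; whether it then wakes soon enough when a sudden burst dirties much of shared_buffers is exactly Robert's objection.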
[HACKERS] leakproof
I missed all the fun while the "leakproof" addition to function attributes was being decided, so I know I'm late to the party. Today I had to go and look up what it actually meant. I have to say that I was a bit surprised. I expected it to refer to memory management in some way. I don't honestly think "leakproof" as a term is going to convey much to lots of people. Can we come up with a more descriptive term? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Future of our regular expression code
On 19 February 2012 15:49, Tom Lane wrote: > That sounds great. > > BTW, if you don't have it already, I'd highly recommend getting a copy > of Friedl's "Mastering Regular Expressions". It's aimed at users not > implementers, but there is a wealth of valuable context information in > there, as well as a really good not-too-technical overview of typical > implementation techniques for RE engines. You'd probably still want one > of the more academic presentations such as the dragon book for > reference, but I think Friedl's take on it is extremely useful. Thanks for the recommendations, Tom. I've now got Friedl, and there's a dead-tree copy of 'Compilers' making its gradual way to me (no ebook). I've also been reading the article series by Russ Cox linked upthread -- it's good stuff. Are you far enough into the backrefs bug that you'd prefer to see it through, or would you like me to pick it up? Cheers, BJ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reducing bgwriter wakeups
On Sun, Feb 19, 2012 at 2:18 PM, Robert Haas wrote: > > Also, in general, I think that it's not a good idea to let dirty data > sit in shared_buffers forever. I'm unhappy about the change this > release cycle to skip checkpoints if we've written less than a full > WAL segment, and this seems like another step in that direction. It's > exposing us to needless risk of data loss. In 9.1, if you process a > transaction and, an hour later, the disk where pg_xlog is written > melts into a heap of molten slag, your transaction will be there, even > if you end up having to run pg_resetxlog. Would the log really have been archived in 9.1? I don't think checkpoint_timeout caused a log switch, just a checkpoint which could happily be in the same file as the previous checkpoint. > In 9.2, it may well be that > xlog contains the only record of that transaction, and you're hosed. > The more work we do to postpone writing the data until the absolutely > last possible moment, the more likely it is that it won't be on disk > when we need it. Isn't that what archive_timeout is for? Should archive_timeout default to something like 5 min, rather than 0? Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Sun, Feb 19, 2012 at 10:04 PM, Robert Haas wrote: > To me, it seems that you are applying a double standard. You have > twice attempted to insist that I do extra work to make major features > that I worked on - unlogged tables and index-only scans - work in Hot > Standby mode, despite the existence of significant technological > obstacles. But when it comes to your own feature, you simply state > that it cannot be done, and therefore we need not do it. Of course, > this feature, like those, CAN be made to work. Vitriol aside, if you would be so kind as to explain how it is possible, as you claim, I'll look into making it work. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Future of our regular expression code
Brendan Jurd writes: > Are you far enough into the backrefs bug that you'd prefer to see it > through, or would you like me to pick it up? Actually, what I've been doing today is a brain dump. This code is never going to be maintainable by anybody except its original author without some internals documentation, so I've been trying to write some based on what I've managed to reverse-engineer so far. It's not very complete, but I do have some words about the DFA/NFA stuff, which I will probably revise and fill in some more as I work on the backref fix, because that's where that bug lives. I have also got a bunch of text about the colormap management code, which I think is interesting right now because that is what we are going to have to fix if we want decent performance for Unicode \w and related classes (cf the other current -hackers thread about regexes). I was hoping to prevail on you to pick that part up as your first project. I will commit what I've got in a few minutes --- look for src/backend/regex/README in that commit. I encourage you to add to that file as you figure stuff out. We could stand to upgrade a lot of the code comments too, of course, but I think a narrative description is pretty useful before diving into code. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Sun, Feb 19, 2012 at 10:04 PM, Robert Haas wrote: >> As explained in detailed comments, the purpose of this is to implement >> Heikki's suggestion that we have a bit set to zero so we can detect >> failures that cause a run of 1s. > > I think it's nonsensical to pretend that there's anything special > about that particular bit. If we want to validate the page header > before trusting the lack of a checksum bit, we can do that far more > thoroughly than just checking that one bit. There are a whole bunch > of bits that ought to always be zero, and there are other things we > can validate as well (e.g. LSN not in future). If we're concerned > about the checksum-enabled bit getting flipped (and I agree that we > should be), we can check some combination of that stuff in the hope of > catching it, and that'll be a far better guard than just checking one > arbitrarily selected bit. I thought it was a reasonable and practical idea from Heikki. The bit is not selected arbitrarily, it is by design adjacent to one of the other bits. So overall, 3 bits need to be set to a precise value and a run of 1s or 0s will throw an error. > That having been said, I don't feel very good about the idea of > relying on the contents of the page to tell us whether or not the page > has a checksum. There's no guarantee that an error that flips the > has-checksum bit will flip any other bit on the page, or that it won't > flip everything else we're relying on as a backstop in exactly the > manner that foils whatever algorithm we put in place. Random > corruption is, perhaps, unlikely to do that, but somehow I feel like > it'll happen more often than random chance suggests. Things never > fail the way you want them to. You're right. This patch is not the best possible world, given a clean slate. But we don't have a clean slate. The fact is this patch will detect corruptions pretty well and that's what Postgres users want. While developing this, many obstacles could have been blockers.
I think we're fairly lucky that I managed to find a way through the minefield of obstacles. > Another disadvantage of the current scheme is that there's no > particularly easy way to know that your whole cluster has checksums. > No matter how we implement checksums, you'll have to rewrite every > table in the cluster in order to get them fully turned on. But with > the current design, there's no easy way to know how much of the > cluster is actually checksummed. You can read every block and check. > If you shut checksums off, they'll > linger until those pages are rewritten, and there's no easy way to > find the relations from which they need to be removed, either. We can't have it both ways. Either we have an easy upgrade, or we don't. I'm told that an easy upgrade is essential. > I'm tempted to suggest a relation-level switch: when you want > checksums, you use ALTER TABLE to turn them on, and when you don't > want them any more you use ALTER TABLE to shut them off again, in each > case rewriting the table. That way, there's never any ambiguity about > what's in the data pages in a given relation: they're either > all checksummed, or none of them are. This moves the decision about > whether checksums are enabled or disabled quite a bit further away > from the data itself, and also allows you to determine (by catalog > inspection) which parts of the cluster do in fact have checksums. It > might be kind of a pain to implement, though: you'd have to pass the > information about how any given relation was configured down to the > place where we validate page sanity. I'm not sure whether that's > practical. It's not practical as the only mechanism, given the requirement for upgrade. As I mention in the docs, if you want that, use VACUUM FULL. So there is a mechanism if you want it. > I also think that the question of where exactly the checksum ought to > be put might bear some more thought.
Tom expressed some concern about > stealing the page version field, and it seems to me we could work > around that by instead stealing something less valuable. The top > eight bits of the pd_pagesize_version field probably meet that > criteria, since in practice basically everybody uses 8K blocks, and > the number of errors that will be caught by checksums is probably much > larger than the number of errors that will be caught by the page-size > cross-check. But maybe the other 8 bits should come from somewhere > else, maybe pd_tli or pd_flags. For almost all practical purposes, > storing only the low-order 8 bits of the TLI ought to provide just as > much of a safety check as storing the low-order 16 bits, so I think > that might be the way to go. We could set it up so that we always > check only the low 8 bits against the TLI, regardless of whether > checksums are enabled; then we basically free up that bit space at no > cost in code complexity. The version stuff is catered for by the current patch. TLI isn't something I want
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On 18/02/12 21:18, Jan Urbański wrote:
> On 18/02/12 21:17, Tom Lane wrote:
>> Jan Urbański writes:
>>> On 18/02/12 20:30, Tom Lane wrote: Dave Malcolm at Red Hat has been working on a static code analysis tool for Python-related C code. He reports here on some preliminary results for plpython.c: https://bugzilla.redhat.com/show_bug.cgi?id=795011
>>
>> If you find any live bugs, it'd likely be better to deal with them as
>> a separate patch so that we can back-patch ...
>
> Sure, I meant to say I'll look at these as well, but will make them into
> a separate patch.

Here's a patch that fixes everything I was sure was an actual bug. The rest of the warnings seem to be caused by the tool not knowing that elog(ERROR) throws a longjmp and things like "we never unref this object, so it can't disappear mid-execution". Attached are patches for HEAD and for 9.1.x (since splitting plpython.c in 9.2 was kind of my idea I felt bad about you having to back-patch this so I tried to do the necessary legwork myself; I hope the attached is what you need). BTW, that tool is quite handy, I'll have to try running it over psycopg2.

Cheers,
Jan

diff --git a/src/pl/plpython/plpython.c b/src/pl/plpython/plpython.c
index 530b5f0..2b064c5 100644
*** a/src/pl/plpython/plpython.c
--- b/src/pl/plpython/plpython.c
*************** PLyList_FromArray(PLyDatumToOb *arg, Dat
*** 2345,2350 ****
--- 2345,2352 ----
  	length = ARR_DIMS(array)[0];
  	lbound = ARR_LBOUND(array)[0];
  	list = PyList_New(length);
+ 	if (list == NULL)
+ 		elog(ERROR, "could not transform Python list to array");
  	for (i = 0; i < length; i++)
  	{
*************** PLy_spi_execute_query(char *query, long
*** 3664,3670 ****
  	int			rv;
  	volatile MemoryContext oldcontext;
  	volatile ResourceOwner oldowner;
! 	PyObject   *ret;
  
  	oldcontext = CurrentMemoryContext;
  	oldowner = CurrentResourceOwner;
--- 3666,3672 ----
  	int			rv;
  	volatile MemoryContext oldcontext;
  	volatile ResourceOwner oldowner;
! 	PyObject   *ret = NULL;
  
  	oldcontext = CurrentMemoryContext;
  	oldowner = CurrentResourceOwner;
*************** PLy_spi_execute_query(char *query, long
*** 3727,3732 ****
--- 3729,3735 ----
  	if (rv < 0)
  	{
+ 		Py_XDECREF(ret);
  		PLy_exception_set(PLy_exc_spi_error,
  						  "SPI_execute failed: %s",
  						  SPI_result_code_string(rv));
*************** PLy_generate_spi_exceptions(PyObject *mo
*** 3967,3973 ****
--- 3970,3982 ----
  		PyObject   *sqlstate;
  		PyObject   *dict = PyDict_New();
  
+ 		if (dict == NULL)
+ 			elog(ERROR, "could not generate SPI exceptions");
+ 
  		sqlstate = PyString_FromString(unpack_sql_state(exception_map[i].sqlstate));
+ 		if (sqlstate == NULL)
+ 			elog(ERROR, "could not generate SPI exceptions");
+ 
  		PyDict_SetItemString(dict, "sqlstate", sqlstate);
  		Py_DECREF(sqlstate);
  		exc = PyErr_NewException(exception_map[i].name, base, dict);
*************** PLy_add_exceptions(PyObject *plpy)
*** 4008,4013 ****
--- 4017,4027 ----
  	PLy_exc_fatal = PyErr_NewException("plpy.Fatal", NULL, NULL);
  	PLy_exc_spi_error = PyErr_NewException("plpy.SPIError", NULL, NULL);
  
+ 	if (PLy_exc_error == NULL ||
+ 		PLy_exc_fatal == NULL ||
+ 		PLy_exc_spi_error == NULL)
+ 		elog(ERROR, "could not create the base SPI exceptions");
+ 
  	Py_INCREF(PLy_exc_error);
  	PyModule_AddObject(plpy, "Error", PLy_exc_error);
  	Py_INCREF(PLy_exc_fatal);
*************** PLy_init_interp(void)
*** 4124,4129 ****
--- 4138,4145 ----
  	Py_INCREF(mainmod);
  	PLy_interp_globals = PyModule_GetDict(mainmod);
  	PLy_interp_safe_globals = PyDict_New();
+ 	if (PLy_interp_safe_globals == NULL)
+ 		PLy_elog(ERROR, "could not create globals");
  	PyDict_SetItemString(PLy_interp_globals, "GD", PLy_interp_safe_globals);
  	Py_DECREF(mainmod);
  	if (PLy_interp_globals == NULL || PyErr_Occurred())
*************** PLy_init_plpy(void)
*** 4164,4169 ****
--- 4180,4187 ----
  	main_mod = PyImport_AddModule("__main__");
  	main_dict = PyModule_GetDict(main_mod);
  	plpy_mod = PyImport_AddModule("plpy");
+ 	if (plpy_mod == NULL)
+ 		elog(ERROR, "could not initialize plpy");
  	PyDict_SetItemString(main_dict, "plpy", plpy_mod);
  	if (PyErr_Occurred())
  		elog(ERROR, "could not initialize plpy");
*************** PLy_output(volatile int level, PyObject
*** 4231,4238 ****
  		 * decoration.
  		 */
  		PyObject   *o;
  
- 		PyArg_UnpackTuple(args, "plpy.elog", 1, 1, &o);
  		so = PyObject_Str(o);
  	}
  	else
--- 4249,4260 ----
  		 * decoration.
  		 */
  		PyObject   *o;
+ 		int			result;
+ 
+ 		result = PyArg_UnpackTuple(args, "plpy.elog", 1, 1, &o);
+ 		if (!result)
+ 			elog(ERROR, "could not unpack arguments in plpy.elog");
  		so = PyObject_Str(o);
  	}
  	else
*************** get_source_line(const char *src, int lin
*** 4554,4559 ****
--- 4576,4585 ----
  	const char *next = src;
  	int			current = 0;
  
+ 	/* sanity check */
+ 	if (lineno <= 0)
+ 		return NULL;
+ 
  	while (current < lineno)
  	{
Re: [HACKERS] pl/python long-lived allocations in datum->dict transformation
On 14/02/12 01:35, Tom Lane wrote:
> Jan Urbański writes:
>> It's not very comfortable, but
>> I think PLyDict_FromTuple can be allowed to be non-reentrant.
>
> I think that's pretty short-sighted. Even if it's safe today (which
> I am not 100% convinced of), there are plenty of foreseeable reasons
> why it might^Wwill break in the future.
>
>> OTOH if we want to make it reentrant, some more tinkering would be in order.
>
> I think that's in order.

Here are the results of the tinkering. I came up with a stack of context structures that gets pushed when a PL/Python function starts being executed and popped when it returns. At first they contained just a scratch memory context used by PLyDict_FromTuple. Then, under the premise of confirming the usefulness of introducing such contexts, I removed the global PLy_curr_procedure variable and changed all users to get the current procedure from the context. It seems to have worked, so the total count of global variables is unchanged - hooray!

While testing I found one more leak, this time caused by allocating a structure for caching array type I/O functions and never freeing it. Attached as separate patch.

Cheers,
Jan

diff --git a/src/pl/plpython/plpy_cursorobject.c b/src/pl/plpython/plpy_cursorobject.c
index 4226dc7..46930b0 100644
*** a/src/pl/plpython/plpy_cursorobject.c
--- b/src/pl/plpython/plpy_cursorobject.c
***************
*** 14,19 ****
--- 14,20 ----
  #include "plpy_cursorobject.h"
  #include "plpy_elog.h"
+ #include "plpy_main.h"
  #include "plpy_planobject.h"
  #include "plpy_procedure.h"
  #include "plpy_resultobject.h"
*************** PLy_cursor_query(const char *query)
*** 121,126 ****
--- 122,128 ----
  {
  	SPIPlanPtr	plan;
  	Portal		portal;
+ 	PLyExecutionContext *exec_ctx = PLy_current_execution_context();
  
  	pg_verifymbstr(query, strlen(query), false);
*************** PLy_cursor_query(const char *query)
*** 129,136 ****
  		elog(ERROR, "SPI_prepare failed: %s",
  			 SPI_result_code_string(SPI_result));
  
  	portal = SPI_cursor_open(NULL, plan, NULL, NULL,
! 							 PLy_curr_procedure->fn_readonly);
  
  	SPI_freeplan(plan);
  
  	if (portal == NULL)
--- 131,140 ----
  		elog(ERROR, "SPI_prepare failed: %s",
  			 SPI_result_code_string(SPI_result));
  
+ 	Assert(exec_ctx->curr_proc != NULL);
+ 
  	portal = SPI_cursor_open(NULL, plan, NULL, NULL,
! 							 exec_ctx->curr_proc->fn_readonly);
  
  	SPI_freeplan(plan);
  
  	if (portal == NULL)
*************** PLy_cursor_plan(PyObject *ob, PyObject *
*** 210,215 ****
--- 214,220 ----
  	Portal		portal;
  	char	   *volatile nulls;
  	volatile int j;
+ 	PLyExecutionContext *exec_ctx = PLy_current_execution_context();
  
  	if (nargs > 0)
  		nulls = palloc(nargs * sizeof(char));
*************** PLy_cursor_plan(PyObject *ob, PyObject *
*** 252,259 ****
  		}
  	}
  
  	portal = SPI_cursor_open(NULL, plan->plan, plan->values, nulls,
! 							 PLy_curr_procedure->fn_readonly);
  	if (portal == NULL)
  		elog(ERROR, "SPI_cursor_open() failed: %s",
  			 SPI_result_code_string(SPI_result));
--- 257,266 ----
  		}
  	}
  
+ 	Assert(exec_ctx->curr_proc != NULL);
+ 
  	portal = SPI_cursor_open(NULL, plan->plan, plan->values, nulls,
! 							 exec_ctx->curr_proc->fn_readonly);
  	if (portal == NULL)
  		elog(ERROR, "SPI_cursor_open() failed: %s",
  			 SPI_result_code_string(SPI_result));
diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c
index 741980c..9909f23 100644
*** a/src/pl/plpython/plpy_elog.c
--- b/src/pl/plpython/plpy_elog.c
***************
*** 12,17 ****
--- 12,18 ----
  #include "plpy_elog.h"
+ #include "plpy_main.h"
  #include "plpy_procedure.h"
*************** PLy_traceback(char **xmsg, char **tbmsg,
*** 260,265 ****
--- 261,267 ----
  	char	   *line;
  	char	   *plain_filename;
  	long		plain_lineno;
+ 	PLyExecutionContext *exec_ctx = PLy_current_execution_context();
  
  	/*
  	 * The second frame points at the internal function, but to mimick
*************** PLy_traceback(char **xmsg, char **tbmsg,
*** 270,276 ****
  	else
  		fname = PyString_AsString(name);
  
! 	proname = PLy_procedure_name(PLy_curr_procedure);
  	plain_filename = PyString_AsString(filename);
  	plain_lineno = PyInt_AsLong(lineno);
--- 272,280 ----
  	else
  		fname = PyString_AsString(name);
  
! 	Assert(exec_ctx->curr_proc != NULL);
! 
! 	proname = PLy_procedure_name(exec_ctx->curr_proc);
  	plain_filename = PyString_AsString(filename);
  	plain_lineno = PyInt_AsLong(lineno);
*************** PLy_traceback(char **xmsg, char **tbmsg,
*** 287,293 ****
  	 * function code object was compiled with "<string>" as the
  	 * filename
  	 */
! 	if (PLy_curr_procedure && plain_filename != NULL &&
  		strcmp(plain_filename, "<string>") == 0)
  	{
  		/*
--- 291,297 ----
! 	if (exec_ctx->curr_proc && plain_filename != N
Re: [HACKERS] leakproof
On Sun, Feb 19, 2012 at 5:29 PM, Andrew Dunstan wrote: > I missed all the fun while the "leakproof" addition to function attributes > was being decided, so I know I'm late to the party. Today I had to go and > look up what it actually meant. I have to say that I was a bit surprised. I > expected it to refer to memory management in some way. I don't honestly > think "leakproof" as a term is going to convey much to lots of people. Can > we come up with a more descriptive term? We bikeshedded on that topic a while back and nobody suggested anything that got more than 1 or 2 votes. But I'm still happy to rename it if we can come up with something better, because I'm not in love with it either. Having now spent far too much time in bed with that patch, I'm feeling like the concept that we are really looking for there is what some languages call "pure" - that is, there must be no side effects, whether by throwing exceptions or otherwise. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reducing bgwriter wakeups
On Sun, Feb 19, 2012 at 5:56 PM, Jeff Janes wrote: > Would the log really have been archived in 9.1? I don't think > checkpoint_timeout caused a log switch, just a checkpoint which could > happily be in the same file as the previous checkpoint. The log segment doesn't need to get archived - it's sufficient that the dirty buffers get written to disk. >> In 9.2, it may well be that >> xlog contains the only record of that transaction, and you're hosed. >> The more work we do to postpone writing the data until the absolutely >> last possible moment, the more likely it is that it won't be on disk >> when we need it. > > Isn't that what archive_timeut is for? > > Should archive_timeout default to something like 5 min, rather than 0? I dunno. I think people are doing replication are probably mostly using streaming replication these days, in which case archive_timeout won't matter one way or the other. But if you're not doing replication, your only hope of recovering from a trashed pg_xlog is that PostgreSQL wrote the buffers and (in the case of an OS crash) the OS wrote them to disk. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Sun, Feb 19, 2012 at 6:33 PM, Simon Riggs wrote: > On Sun, Feb 19, 2012 at 10:04 PM, Robert Haas wrote: >> To me, it seems that you are applying a double standard. You have >> twice attempted to insist that I do extra work to make major features >> that I worked on - unlogged tables and index-only scans - work in Hot >> Standby mode, despite the existence of significant technological >> obstacles. But when it comes to your own feature, you simply state >> that it cannot be done, and therefore we need not do it. Of course, >> this feature, like those, CAN be made to work. > > Vitriol aside, If you would be so kind as to explain how it is > possible, as you claim, I'll look into making it work. It would require a double-write buffer of some kind. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Future of our regular expression code
On 20 February 2012 10:42, Tom Lane wrote: > I have also got > a bunch of text about the colormap management code, which I think > is interesting right now because that is what we are going to have > to fix if we want decent performance for Unicode \w and related > classes (cf the other current -hackers thread about regexes). > I was hoping to prevail on you to pick that part up as your first > project. I will commit what I've got in a few minutes --- look > for src/backend/regex/README in that commit. Okay, I've read through your README content, it was very helpful. I'll now go chew through some more reading material and then start studying our existing regex source code. Once I'm firing on all cylinders with this stuff, I'll begin to tackle the colormap. Cheers, BJ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Future of our regular expression code
Tom, I did a google search, and found the following: http://www.arglist.com/regex/ Which states that Tcl uses the same library from Henry. Maybe someone involved with that project would help explain the library? Also I noticed at the url above is a few ports people did from Henry's code. I didn't download and analyze their code, but maybe they have made some comments that could help, or maybe have some improvements to the code.. Just a thought.. :) Billy Earney On Sun, Feb 19, 2012 at 5:42 PM, Tom Lane wrote: > Brendan Jurd writes: > > Are you far enough into the backrefs bug that you'd prefer to see it > > through, or would you like me to pick it up? > > Actually, what I've been doing today is a brain dump. This code is > never going to be maintainable by anybody except its original author > without some internals documentation, so I've been trying to write > some based on what I've managed to reverse-engineer so far. It's > not very complete, but I do have some words about the DFA/NFA stuff, > which I will probably revise and fill in some more as I work on the > backref fix, because that's where that bug lives. I have also got > a bunch of text about the colormap management code, which I think > is interesting right now because that is what we are going to have > to fix if we want decent performance for Unicode \w and related > classes (cf the other current -hackers thread about regexes). > I was hoping to prevail on you to pick that part up as your first > project. I will commit what I've got in a few minutes --- look > for src/backend/regex/README in that commit. I encourage you to > add to that file as you figure stuff out. We could stand to upgrade > a lot of the code comments too, of course, but I think a narrative > description is pretty useful before diving into code. > >regards, tom lane > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
Re: [HACKERS] 16-bit page checksums for 9.2
On Sun, Feb 19, 2012 at 6:57 PM, Simon Riggs wrote: > I thought it was a reasonable and practical idea from Heikki. The bit > is not selected arbitrarily, it is by design adjacent to one of the > other bits. So overall, 3 bits need to be set to a precise value and a > run of 1s or 0s will throw an error. Sure, but who is to say that a typical error will look anything like that? Anyway, you could check even more bits just as easily; we know exactly which ones have a plausible reason for being non-zero. >> That having been said, I don't feel very good about the idea of >> relying on the contents of the page to tell us whether or not the page >> has a checksum. There's no guarantee that an error that flips the >> has-checksum bit will flip any other bit on the page, or that it won't >> flip everything else we're relying on as a backstop in exactly the >> manner that foils whatever algorithm we put in place. Random >> corruption is, perhaps, unlikely to do that, but somehow I feel like >> it'll happen more often than random chance suggests. Things never >> fail the way you want them to. > > You're right. This patch is not the best possible world, given a clean > slate. But we don't have a clean slate. > > The fact is this patch will detect corruptions pretty well and that's > what Postgres users want. > > While developing this, many obstacles could have been blockers. I > think we're fairly lucky that I managed to find a way through the > minefield of obstacles. I think we could do worse than this patch, but I don't really believe it's ready for commit. We don't have a single performance number showing how much of a performance regression this causes, either on the master or on the standby, on any workload, much less those where a problem is reasonably forseeable from the design you've chosen.
Many controversial aspects of the patch, such as the way you're using the buffer header spinlocks as a surrogate for x-locking the buffer, or the right way of obtaining the bit-space the patch requires, haven't really been discussed, and to the extent that they have been discussed, they have not been agreed. On the former point, you haven't updated src/backend/storage/buffer/README at all; but updating is not by itself sufficient. Before we change the buffer-locking rules, we ought to have some discussion of whether it's OK to do that, and why it's necessary. "Why it's necessary" would presumably include demonstrating that the performance of the straightforward implementation stinks, and that with this change is better. You haven't made any effort to do that whatsoever, or if you have, you haven't posted the results here. I'm pretty un-excited by that change, first because I think it's a modularity violation and possibly unsafe, and second because I believe we've already got a problem with buffer header spinlock contention which this might exacerbate. >> Another disadvantage of the current scheme is that there's no >> particularly easy way to know that your whole cluster has checksums. >> No matter how we implement checksums, you'll have to rewrite every >> table in the cluster in order to get them fully turned on. But with >> the current design, there's no easy way to know how much of the >> cluster is actually checksummed. > > You can read every block and check. Using what tool? >> If you shut checksums off, they'll >> linger until those pages are rewritten, and there's no easy way to >> find the relations from which they need to be removed, either. > > We can't have it both ways. Either we have an easy upgrade, or we > don't. I'm told that an easy upgrade is essential. Easy upgrade and removal of checksums are unrelated issues AFAICS. 
>> I'm tempted to suggest a relation-level switch: when you want >> checksums, you use ALTER TABLE to turn them on, and when you don't >> want them any more you use ALTER TABLE to shut them off again, in each >> case rewriting the table. That way, there's never any ambiguity about >> what's in the data pages in a given relation: they're either >> all checksummed, or none of them are. This moves the decision about >> whether checksums are enabled or disabled quite a bit further away >> from the data itself, and also allows you to determine (by catalog >> inspection) which parts of the cluster do in fact have checksums. It >> might be kind of a pain to implement, though: you'd have to pass the >> information about how any given relation was configured down to the >> place where we validate page sanity. I'm not sure whether that's >> practical. > > It's not practical as the only mechanism, given the requirement for upgrade. Why not? > The version stuff is catered for by the current patch. Your patch reuses the version number field for an unrelated purpose and includes some vague hints of how we might do versioning using some of the page-level flag bits. Since bit-space is fungible, I think it makes more sense to keep the version number where it is and ca
Re: [HACKERS] Future of our regular expression code
Billy Earney writes: > I did a google search, and found the following: > http://www.arglist.com/regex/ Hmm ... might be worth looking at those two pre-existing attempts at making a standalone library from Henry's code, just to see what choices they made. > Which states that Tcl uses the same library from Henry. Maybe someone > involved with that project would help explain the library? Um ... did you see the head message in this thread? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] leakproof
Robert Haas writes: > On Sun, Feb 19, 2012 at 5:29 PM, Andrew Dunstan wrote: >> Can we come up with a more descriptive term? > We bikeshed on that topic a while back and nobody suggested anything > that got more than 1 or 2 votes. But I'm still happy to rename it if > we can come up with something better, because I'm not in love with it > either. > Having now spent far too much time in bed with that patch, I'm feeling > like the concept that we are really looking for there is what some > languages call "pure" - that is, there must be no side effects, > whether by throwing exceptions or otherwise. Hmm, "pure" doesn't sound bad to me. Nice and short. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Future of our regular expression code
Thanks Tom. I looked at the code in the libraries I referred to earlier, and it looks like the code in the regex directory is exactly the same as Walter Waldo's version, which has at least one comment from the middle of last decade (~ 2003). Have people thought about migrating to the pcre library? It seems to have a lot of neat features, and also has a JIT, and it looks like it is being actively maintained and has decent comments. On Sun, Feb 19, 2012 at 7:40 PM, Tom Lane wrote: > Billy Earney writes: > > I did a google search, and found the following: > > http://www.arglist.com/regex/ > > Hmm ... might be worth looking at those two pre-existing attempts at > making a standalone library from Henry's code, just to see what choices > they made. > > > Which states that Tcl uses the same library from Henry. Maybe someone > > involved with that project would help explain the library? > > Um ... did you see the head message in this thread? > >regards, tom lane >
Re: [HACKERS] Future of our regular expression code
Billy, * Billy Earney (billy.ear...@gmail.com) wrote: > Thanks Tom. I looked at the code in the libraries I referred to earlier, > and it looks like the code in the regex directory is exactly the same as > Walter Waldo's version, which has at least one comment from the middle of > last decade (~ 2003). Has people thought about migrating to the pcre > library? It seems to have a lot of neat features, and also has a jit, and > it looks like it is being actively maintained and has decent comments. It strikes me that you might benefit from reading the full thread. As Tom mentioned previously, pcre would require user-visible changes in behavior, including cases where things which work today wouldn't work. That requires a pretty high bar and I don't think we're anywhere near there with this. Thanks, Stephen
Re: [HACKERS] leakproof
On Feb 19, 2012, at 5:42 PM, Tom Lane wrote: > Robert Haas writes: >> Having now spent far too much time in bed with that patch, I'm feeling >> like the concept that we are really looking for there is what some >> languages call "pure" - that is, there must be no side effects, >> whether by throwing exceptions or otherwise. > > Hmm, "pure" doesn't sound bad to me. Nice and short. > Technically, "pure" is stronger than "has no side effects": http://en.wikipedia.org/wiki/Pure_function Result can't depend on state (for instance, database contents), either. This is the typical definition used in functional programming. gcc extends this to allow use of global variables in a "pure" function (the stricter definition is met by "const" functions). PG has "immutable", so a slightly weaker "pure" probably wouldn't be terribly confusing given the gcc precedent (probably across their family of compilers). "D" adopts the stricter definition of "pure". So there's some confusion around the term. But … I picked up this thread after "leakproof" was settled on and was curious as to what "leakproof" was supposed to be as I didn't read the earlier posts. I assumed it meant "doesn't leak memory", which seems admirable and typical and not needful of an attribute on the function declaration. "pure" is definitely less confusing IMO, if it's congruent with the weaker sense of "pure" that's found in some languages/implementations. Don Baccus http://donb.photo.net http://birdnotes.net http://openacs.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
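The three-way distinction Don is pointing at — strictly pure, merely side-effect-free, and impure — is easy to pin down with a toy example. This is hypothetical illustration code, nothing to do with PostgreSQL's implementation:

```python
counter = {"calls": 0}   # mutable global state
config = {"scale": 10}

def pure_square(x):
    # Strictly pure: the result depends only on the argument and there
    # are no side effects.  This matches gcc's stricter
    # __attribute__((const)).
    return x * x

def scaled(x):
    # Side-effect-free, but NOT pure in the strict sense: the result
    # depends on mutable global state (analogous to a function that
    # reads database contents).  gcc's weaker __attribute__((pure))
    # still permits this.
    return x * config["scale"]

def impure_square(x):
    # Not pure at all: mutates global state as a side effect.
    counter["calls"] += 1
    return x * x
```

In PostgreSQL terms, scaled() is roughly what a STABLE function looks like, while pure_square() behaves like an IMMUTABLE one — which is why the thread keeps circling around whether "pure" should imply immutability.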
Re: [HACKERS] leakproof
I suspect this is wrong for similar reasons as "pure" but I'll throw it out there: "hermetic" -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] leakproof
Don Baccus writes: > On Feb 19, 2012, at 5:42 PM, Tom Lane wrote: >> Hmm, "pure" doesn't sound bad to me. Nice and short. > Technically, "pure" is stronger than "has no side effects": > http://en.wikipedia.org/wiki/Pure_function > Result can't depend on state (for instance, database contents), either. This > is the typical definition used in functional programming. Well, that condition is subsumed in our idea of an immutable function. It's not clear to me whether pure/leakproof functions are meant to be a strict subset of immutable functions, but if they are then they meet this stricter definition. On the other hand, if pure/leakproof functions don't have to be immutable but only stable, then the stricter definition corresponds to "pure immutable". That still doesn't sound too bad, as long as we define our terms clearly in the docs. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Future of our regular expression code
On Sat, Feb 18, 2012 at 6:15 PM, Tom Lane wrote: > A larger point is that it'd be a real shame > for the Spencer regex engine to die off, because it is in fact one of > the best pieces of regex technology on the planet. ... > Another possible long-term answer is to finish the work Henry never did, > that is make the code into a standalone library. That would make it > available to more projects and perhaps attract other people to help > maintain it. However, that looks like a lot of work too, with distant > and uncertain payoff. I can't see how your first claim that the Spencer code is worth keeping around because it's just a superior regex implementation has much force unless we can accomplish the latter. If the library can be split off into a standalone library then it might have some longevity. But if we're the only ones maintaining it then it's just prolonging the inevitable. I can't see Postgres having its own special brand of regexes that nobody else uses being an acceptable situation forever. One thing that concerns me more and more is that most sufficiently powerful regex implementations are susceptible to DoS attacks. A database application is quite likely to allow users to decide directly or indirectly what regexes to apply, and it can be hard to predict which regexes will cause which implementations to explode in CPU or memory requirements. We need a library that can be used to defend against malicious regexes, and I suspect neither Perl's nor Python's library will suffice for this. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
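The DoS risk Greg describes is easy to demonstrate with a classic "evil" pattern against a backtracking engine. The sketch below uses Python's re module, with input sizes kept deliberately tiny so it terminates quickly — around 30 characters this would effectively hang:

```python
# A classic catastrophic-backtracking pattern: (a+)+ followed by a
# character the input never contains.  A backtracking engine explores
# roughly 2**(n-1) ways to partition n 'a's before concluding there is
# no match, so the cost doubles with each extra input character.
import re
import time

EVIL = re.compile(r"(a+)+b")

def time_match(n):
    start = time.perf_counter()
    result = EVIL.match("a" * n)   # no 'b' present: can never match
    return result, time.perf_counter() - start

for n in (10, 14, 18):
    result, elapsed = time_match(n)
    assert result is None          # never matches, but the cost explodes
```

A pure NFA/DFA engine of the kind Russ Cox advocates answers the same query in time linear in the input, which is the provable-resource-bound property Greg is after.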
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
Jan Urbański writes: >> On 18/02/12 21:17, Tom Lane wrote: >>> Dave Malcolm at Red Hat has been working on a static code analysis tool >>> for Python-related C code. He reports here on some preliminary results >>> for plpython.c: >>> https://bugzilla.redhat.com/show_bug.cgi?id=795011 > Here's a patch that fixes everything I was sure was an actual bug. The > rest of the warnings seem to be caused by the tool not knowing that > elog(ERROR) throws a longjmp and things like "we never unref this > object, so it can't disappear mid-execution". This looks pretty sane to me, but it would probably be better if one of the more python-savvy committers took responsibility for final review. My only comment is whether elog(ERROR) is appropriate, ie, do we consider these to be internal errors that users will never see in practice? If there's a significant risk of the error being thrown in the field, it might be better to use ereport, to expose the message for translation. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Future of our regular expression code
Greg, * Greg Stark (st...@mit.edu) wrote: > I can't see how your first claim that the Spencer code is worth > keeping around because it's just a superior regex implementation has > much force unless we can accomplish the latter. If the library can be > split off into a standalone library then it might have some longevity. > But if we're the only ones maintaining it then it's just prolonging > the inevitable. I can't see Postgres having its own special brand of > regexes that nobody else uses being an acceptable situation forever. > > One thing that concerns me more and more is that most sufficiently > powerful regex implementations are susceptible to DOS attacks. A > database application is quite likely to allow users to decide directly > or indirectly what regexes to apply and it can be hard to predict > which regexes will cause which implementations to explode its cpu or > memory requirements. We need a library that can be used to defend > against malicious regexes and i suspect neither Perl's nor Python's > library will suffice for this. Alright, I'll bite.. Which existing regexp implementation that's well written, well maintained, and which is well protected against malicious regexes should we be considering then? While we might not be able to formalize the regex code as a stand-alone library, my bet would be that the Tcl folks (and anyone else using this code..) will be paying attention to the changes and improvments we're making. Sure, it'd be easier for them to incorporate those changes if they could just pull in a new version of the library, but we can't all have our cake and eat it too. Thanks, Stephen
Re: [HACKERS] Future of our regular expression code
Greg Stark writes: > ... We need a library that can be used to defend > against malicious regexes and i suspect neither Perl's nor Python's > library will suffice for this. Yeah. Did you read the Russ Cox papers referenced upthread? One of the things Google wanted was provably limited resource consumption, which motivated them going with a pure-DFA-no-exceptions implementation. However, they gave up backrefs to get that, which is probably a compromise we're not willing to make. One thing that's been bothering me for awhile is that we don't have any CHECK_FOR_INTERRUPTS or equivalent in the library's NFA search loops. It wouldn't be hard to add one but that'd be putting PG-specific code into the very heart of the library, which is something I've tried to resist. One of the issues we'll have to face if we do try to split it out as a standalone library is how that type of requirement can be met. (And, BTW, that's the kind of hack that we would probably not get to make at all with any other library, so the need for it is not evidence that getting away from Spencer's code would be a good thing.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
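The kind of hook that could meet Tom's requirement — letting the embedding application register the interrupt check instead of baking CHECK_FOR_INTERRUPTS (a PostgreSQL macro) into the library — might look schematically like this. A hypothetical sketch of the callback pattern, not the actual regex engine:

```python
class Cancelled(Exception):
    """Raised by the host's callback to abort a long-running search."""
    pass

class Matcher:
    def __init__(self, check_interrupt=None, check_every=1000):
        # check_interrupt is supplied by the host application; the
        # library itself stays host-agnostic.
        self.check_interrupt = check_interrupt
        self.check_every = check_every

    def search(self, haystack, needle):
        steps = 0
        for i in range(len(haystack) - len(needle) + 1):
            steps += 1
            # Periodically give the host a chance to cancel us, the
            # way CHECK_FOR_INTERRUPTS would inside an NFA search loop.
            if self.check_interrupt and steps % self.check_every == 0:
                self.check_interrupt()   # may raise Cancelled
            if haystack[i:i + len(needle)] == needle:
                return i
        return -1
```

PostgreSQL would register a callback that runs CHECK_FOR_INTERRUPTS; other embedders could supply their own cancellation policy.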
Re: [HACKERS] wal_buffers
On Mon, Feb 20, 2012 at 3:08 AM, Robert Haas wrote: > On Sun, Feb 19, 2012 at 9:46 AM, Euler Taveira de Oliveira > wrote: >> On 19-02-2012 02:24, Robert Haas wrote: >>> I have attached tps scatterplots. The obvious conclusion appears to >>> be that, with only 16MB of wal_buffers, the buffer "wraps around" with >>> some regularity >>> >> Isn't it useful to print some messages on the log when we have "wrap around"? >> In this case, we have an idea that wal_buffers needs to be increased. > > I was thinking about that. I think that what might be more useful > than a log message is a counter somewhere in shared memory. Logging > imposes a lot of overhead, which is exactly what we don't want here, > and the volume might be quite high on a system that is bumping up > against this problem. Of course then the question is... how would we > expose the counter value? There is no existing statistics view suitable to include such information. What about defining pg_stat_xlog or something? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
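Robert's "counter in shared memory" idea amounts to the pattern below: bump a cheap shared counter at the point where a log message would otherwise be emitted, and let a stats view read the accumulated total later. Threads stand in here for PostgreSQL backends sharing memory, and all names are invented for illustration:

```python
import threading

class WrapStats:
    """A cheap shared counter (stand-in for a value in shared memory)."""
    def __init__(self):
        self._lock = threading.Lock()
        self.wal_buffer_wraps = 0

    def record_wrap(self):
        # Far cheaper than formatting and writing a log line per event,
        # which matters on a system hitting this problem at high volume.
        with self._lock:
            self.wal_buffer_wraps += 1

def backend(stats, n_wraps):
    # Each "backend" records the wrap-arounds it observes.
    for _ in range(n_wraps):
        stats.record_wrap()
```

A hypothetical pg_stat_xlog view, as Fujii suggests, would then simply expose the counter value without any per-event overhead on the insert path.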
Re: [HACKERS] Initial 9.2 pgbench write results
On 02/18/2012 02:35 PM, Robert Haas wrote: I see CheckpointWriteDelay calling BgBufferSync in 9.1. Background writing would stop during the sync phase and perhaps slow down a bit during checkpoint writing, but I don't think it was stopped completely. The sync phase can be pretty long here--that's where the worst-case latency figures lasting many seconds are coming from. When checkpoints are happening every 60 seconds as in some of these cases, that can represent a decent percentage of time. Similarly, when the OS cache fills, the write phase might block for a larger period of time than normally expected. But, yes, you're right that my "BGW is active twice as much in 9.2" comments are overstating the reality here. I'm collecting one last bit of data before posting another full set of results, but I'm getting more comfortable the issue here is simply changes in the BGW behavior. The performance regression tracks the background writer maximum intensity. I can match the original 9.1 performance just by dropping bgwriter_lru_maxpages, in cases where TPS drops significantly between 9.2 and 9.1. At the same time, some cases that improve between 9.1 and 9.2 perform worse if I do that. If whether 9.2 gains or loses compared to 9.1 is adjustable with a tunable parameter, with some winning and other losing at the defaults, that path forward is reasonable to deal with. The fact that pgbench is an unusual write workload is well understood, and I can write something documenting this possibility before 9.2 is officially released. I'm a lot less stressed that there's really a problem here now. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Initial 9.2 pgbench write results
On 02/19/2012 05:37 AM, Simon Riggs wrote: Please retest with wal_buffers 128MB, checkpoint_segments 1024 The test parameters I'm using aim to run through several checkpoint cycles in 10 minutes of time. Bumping up against the ugly edges of resource bottlenecks is part of the test. Increasing checkpoint_segments like that would lead to time driven checkpoints, either 1 or 2 of them during 10 minutes. I'd have to increase the total testing time by at least 5X to get an equal workout of the system. That would be an interesting data point to collect if I had a few weeks to focus just on that test. I think that's more than pgbench testing deserves though. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wal_buffers
On 02/19/2012 12:24 AM, Robert Haas wrote: I think we might want to consider adjusting our auto-tuning formula for wal_buffers to allow for a higher cap, although this is obviously not enough data to draw any firm conclusions. That's an easy enough idea to throw into my testing queue. The 16MB auto-tuning upper bound was just the easiest number to suggest that was obviously useful and unlikely to be wasteful. One of the reasons wal_buffers remains a user-visible parameter was that no one ever really did an analysis of what its useful upper bound was--and that number might move up as other bottlenecks are smashed too. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
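For reference, the auto-tuning rule under discussion (wal_buffers = -1, the default since 9.1, picks 1/32 of shared_buffers, clamped between 64kB and one WAL segment) works out as follows — a sketch of the documented rule, not the actual guc.c code:

```python
def auto_tuned_wal_buffers_kb(shared_buffers_kb,
                              wal_segment_kb=16 * 1024,
                              floor_kb=64):
    # wal_buffers = -1 selects 1/32 of shared_buffers, but never less
    # than 64kB nor more than one WAL segment (16MB by default).
    return max(floor_kb, min(shared_buffers_kb // 32, wal_segment_kb))
```

With Greg's shared_buffers = 2GB, 1/32 would be 64MB, so the 16MB segment-size cap is what actually takes effect — that cap is the upper bound being questioned here.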
Re: [HACKERS] Future of our regular expression code
On 02/19/2012 10:28 PM, Greg Stark wrote: One thing that concerns me more and more is that most sufficiently powerful regex implementations are susceptible to DOS attacks. There's a list of "evil regexes" at http://en.wikipedia.org/wiki/ReDoS The Perl community's reaction to Russ Cox's regex papers has some interesting comments along these lines too: http://www.perlmonks.org/?node_id=597262 That brings up the backreferences concerns Tom already mentioned. Someone also points out the Thompson NFA that Cox advocates in his first article can use an excessive amount of memory when processing Unicode: http://www.perlmonks.org/?node_id=597312 Aside--Cox's "Regular Expression Matching with a Trigram Index" is an interesting intro to trigram use for FTS purposes, and might have some inspirational ideas for further progress in that area: http://swtch.com/~rsc/regexp/regexp4.html -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] leakproof
On Feb 19, 2012, at 7:24 PM, Tom Lane wrote: > Don Baccus writes: >> On Feb 19, 2012, at 5:42 PM, Tom Lane wrote: >>> Hmm, "pure" doesn't sound bad to me. Nice and short. > >> Technically, "pure" is stronger than "has no side effects": >> http://en.wikipedia.org/wiki/Pure_function >> Result can't depend on state (for instance, database contents), either. >> This is the typical definition used in functional programming. > > Well, that condition is subsumed in our idea of an immutable function. Yes, I said that myself, perhaps you didn't bother to read closely? > It's not clear to me whether pure/leakproof functions are meant to be a > strict subset of immutable functions Superset, not subset, unless my guessing is wrong. How could "pure" be a subset of "immutable"? OK, at this point, proponents will explain why ... But if you're not clear as to what a "leakproof" function is meant to be, then I suggest the term must be defined very clearly, so everyone understands what it is meant to be. > , but if they are then they meet > this stricter definition. On the other hand, if pure/leakproof functions > don't have to be immutable but only stable, then the stricter definition > corresponds to "pure immutable". That still doesn't sound too bad, as > long as we define our terms clearly in the docs. Sure, let those making the proposal make things clear. Just speaking as a gadfly who's not posted here for probably close on 10 years … Don Baccus http://donb.photo.net http://birdnotes.net http://openacs.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Future of our regular expression code
Stephen Frost wrote: Alright, I'll bite.. Which existing regexp implementation that's well written, well maintained, and which is well protected against malicious regexes should we be considering then? FWIW, there's a benchmark here that compares a number of regexp engines, including PCRE, TRE and Russ Cox's RE2: http://lh3lh3.users.sourceforge.net/reb.shtml The fastest backtracking-style engine seems to be oniguruma, which is native to Ruby 1.9 and thus not only supports Unicode but I'd bet performs pretty well on it, on account of its being developed in Japan. But it goes pathological on regexen containing '|'; the only safe choice among PCRE-style engines is RE2, but of course that doesn't support backreferences. Russ's page on re2 (http://code.google.com/p/re2/) says: "If you absolutely need backreferences and generalized assertions, then RE2 is not for you, but you might be interested in irregexp, Google Chrome's regular expression engine." That's here: http://blog.chromium.org/2009/02/irregexp-google-chromes-new-regexp.html Sadly, it's in JavaScript. Seems like if you need a safe, performant regexp implementation, your choice is (a) finish PLv8 and support it on all platforms, or (b) add backreferences to RE2 and precompile it to C with Comeau (if that's still around), or... Jay -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] leakproof
On Sun, Feb 19, 2012 at 10:24 PM, Tom Lane wrote: > Don Baccus writes: >> On Feb 19, 2012, at 5:42 PM, Tom Lane wrote: >>> Hmm, "pure" doesn't sound bad to me. Nice and short. > >> Technically, "pure" is stronger than "has no side effects": >> http://en.wikipedia.org/wiki/Pure_function >> Result can't depend on state (for instance, database contents), either. >> This is the typical definition used in functional programming. > > Well, that condition is subsumed in our idea of an immutable function. > It's not clear to me whether pure/leakproof functions are meant to be a > strict subset of immutable functions, but if they are then they meet > this stricter definition. On the other hand, if pure/leakproof functions > don't have to be immutable but only stable, then the stricter definition > corresponds to "pure immutable". That still doesn't sound too bad, as > long as we define our terms clearly in the docs. For the present application (pushdown into security views), we really only care whether the function has side effects, such as throwing an error or mutating global state. So, in theory, even a volatile function could be leakproof - it could read (but not write) some piece of global, volatile state. In practice, I'm not sure those cases are important at all. Right now, the only things marked as leakproof are relational operators that might be indexable, precisely so that we might be able to push an indexable qual down far enough to allow an index scan, even in the presence of an intervening security view. Maybe someone will want to push down a qual like x > now() or x > clock_timestamp(), but I guess I can't get that excited about that. There are so few leakproof functions that the chances of making pushdown work safely for much of anything beyond col = const seem remote. So, my tea leaves are telling me that if we want to make pure a subset of immutable, that probably isn't going to cause a problem. However, I am not a CTLR (certified tea leaf reader). 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Initial 9.2 pgbench write results
On Sun, Feb 19, 2012 at 11:12 PM, Greg Smith wrote: > I'm collecting one last bit of data before posting another full set of > results, but I'm getting more comfortable the issue here is simply changes > in the BGW behavior. The performance regression tracks the background > writer maximum intensity. That's really quite fascinating... but it seems immensely counterintuitive. Any idea why? BufFreelist contention between the background writer and regular backends leading to buffer allocation stalls, maybe? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: XML helper functions
But by using the above code: how do we deal with multiple matching values? For example:

java
c++

In this case, perhaps I would want something like:

-------------+------
 my_question | java
 my_question | c++

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Proposal-XML-helper-functions-tp2018975p5497064.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
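The behavior being asked for — one output row per matching XML value — can be sketched standalone like this. The element names and document shape are guesses, since the list archive mangled the original XML example:

```python
import xml.etree.ElementTree as ET

doc = """
<survey>
  <my_question>java</my_question>
  <my_question>c++</my_question>
</survey>
"""

root = ET.fromstring(doc)
# One (name, value) row per matching element, in document order --
# analogous to what an xpath()-style helper would return as a set.
rows = [(el.tag, el.text) for el in root.iter("my_question")]
```

In PostgreSQL, the built-in xpath() function similarly returns an array of all matches, which can be turned into rows with unnest().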
Re: Scaling XLog insertion (was Re: [HACKERS] Moving more work outside WALInsertLock)
I was trying to understand this patch and had a few doubts: 1. In PerformXLogInsert(), why is there a need to check freespace when the space has already been reserved during ReserveXLogInsertLocation()? Is it possible for the record size to be more than what was actually calculated in ReserveXLogInsertLocation()? If so, what I understand is that it moves to the next page to write; however, isn't it possible that some other backend had already reserved that space? 2. In function WaitForXLogInsertionSlotToBecomeFree(), chances are that when nextslot equals lastslot, all new backends trying to reserve a slot will start waiting on the same last slot, which can lead to serialization for those backends and can impact latency. 3. GetXlogBuffer - This will get called twice, once for the normal buffer, and a second time when there is not enough space in the current page, and both times it can lead to I/O, whereas in the earlier algorithm the chance of I/O is only once. -Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Heikki Linnakangas Sent: Friday, February 17, 2012 9:07 PM To: Fujii Masao Cc: Jeff Janes; Robert Haas; PostgreSQL-development Subject: Re: Scaling XLog insertion (was Re: [HACKERS] Moving more work outside WALInsertLock) On 17.02.2012 07:27, Fujii Masao wrote: > Got another problem: when I ran pg_stop_backup to take an online > backup, it got stuck until I had generated new WAL record. This > happens because, in the patch, when pg_stop_backup forces a switch to > new WAL file, old WAL file is not marked as archivable until next new > WAL record has been inserted, but pg_stop_backup keeps waiting for that WAL file to be archived. > OTOH, without the patch, WAL file is marked as archivable as soon as > WAL file switch occurs. > > So, in short, the patch seems to handle the WAL file switch logic incorrectly. Yep.
For a WAL-switch record, XLogInsert returns the location of the end of the record, not the end of the empty padding space. So when the caller flushed up to that point, it didn't flush the empty space and therefore didn't notify the archiver. Attached is a new version, fixing that, and off-by-one bug you pointed out in the slot wraparound handling. I also moved code around a bit, I think this new division of labor between the XLogInsert subroutines is more readable. Thanks for the testing! -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
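For readers following along, the central idea of the patch — reserve a byte range under a brief lock, then copy the record into the buffer without holding that lock — can be caricatured in a few lines. This is a sketch of the pattern only (hypothetical names, no page boundaries, no insertion slots), not of xlog.c:

```python
import threading

class Log:
    def __init__(self, size):
        self.buf = bytearray(size)
        self.reserve_lock = threading.Lock()
        self.insert_pos = 0

    def reserve(self, nbytes):
        # Fast path: only the position bump is serialized, so the
        # critical section is tiny regardless of record size.
        with self.reserve_lock:
            start = self.insert_pos
            self.insert_pos += nbytes
        return start

    def insert(self, record):
        start = self.reserve(len(record))
        # The copy itself proceeds in parallel with other writers,
        # since each writer owns a disjoint byte range.
        self.buf[start:start + len(record)] = record
        return start
```

The real patch also has to handle page boundaries, buffer wrap-around, and waiting for slow writers at flush time, which is roughly where questions like Amit's come in.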
Re: [HACKERS] Displaying accumulated autovacuum cost
On Sat, Feb 18, 2012 at 2:16 AM, Robert Haas wrote: > On Fri, Feb 17, 2012 at 5:04 AM, Fujii Masao wrote: >> Here are review comments: >> >> The document about EXPLAIN needs to be updated. >> >> You forgot to add the long-integer-valued property of >> shared/local_blks_dirtied. >> So when I ran EXPLAIN and used json as a format, no information about >> blks_dirtied >> was reported. > > Thanks for the review. Updated patch attached. Thanks for updating the patch! The patch looks good to me. But I have three minor comments: In pg_stat_statements--1.1.sql +/* contrib/pg_stat_statements/pg_stat_statements--1.0.sql */ Typo: s/1.0/1.1 In pg_stat_statements--1.0--1.1.sql, we should complain if script is sourced in psql, as follows? \echo Use "ALTER EXTENSION pg_stat_statements UPDATE TO '1.1'" to load this file. \quit +DATA = pg_stat_statements--1.1.sql pg_stat_statements--1.0--1.1.sql \ + pg_stat_statements--unpackaged--1.0.sql Though I'm not familiar with CREATE EXTENSION. Why did you exclude 1.0.sql from DATA? In hstore/Makefile, 1.0.sql is included. You think we should prevent old version (i.e., 1.0) of pg_stat_statements from being used in 9.2? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers