Re: [HACKERS] regression in analyze
Hi, Attached test shows a regression in the ANALYZE command: the expected row count for an empty table is 2140 even after ANALYZE is executed.

Doesn't seem to be a regression to me, as I've just checked that 8.0 behaved the same. However, the question was also raised a few days ago on the Italian mailing list and I couldn't find a reasonable explanation for it.

Cheers -- Matteo Beccati OpenX - http://www.openx.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
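The report can be reproduced with a session along these lines (a sketch; the exact estimate depends on row width and version, so rows=2140 is illustrative rather than guaranteed):

```sql
-- Hypothetical reproduction of the behavior discussed above.
CREATE TABLE empty_test (id integer, t text);
ANALYZE empty_test;
EXPLAIN SELECT * FROM empty_test;
-- The plan still shows a sizable row estimate (e.g. rows=2140):
-- the planner sees relpages = 0, cannot tell a truly empty table
-- from one that was never vacuumed or analyzed, and falls back to
-- a default size assumption instead of trusting zero.
```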
Re: [HACKERS] Synchronous replication patch v1
On Thu, Nov 6, 2008 at 3:59 PM, Fujii Masao [EMAIL PROTECTED] wrote:

1) Start postgres in the primary
2) Get an online-backup in the primary
3) Locate the online-backup in the standby
4) Start postgres (with walreceiver) in the standby
   # Configure restore_command, host of the primary and port in recovery.conf
5) Manual operation
   # If there are missing files for PITR in the standby, copy them from somewhere (archive location of the primary, tape backup, etc). The missing files might be xlog or history files. Since the xlog file segment is switched when replication starts, the missing xlog files would basically exist in the archive location of the primary.

More precisely, since the startup process and walreceiver decide the timeline ID from the history files, all of them need to exist in the standby (copy them if missing) before starting postgres in 4). If a database whose timeline is the same as the primary's exists in the standby, getting a new online-backup in 2) and 3) is not necessary. For example, after the standby falls down, its database at that time is applicable for restarting it.

Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
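Step 4) might look roughly like the following recovery.conf sketch on the standby. The primary host/port settings are assumptions based on the description above ("host of the primary and port in recovery.conf"), not the patch's actual parameter names:

```
# recovery.conf on the standby -- a sketch, not the patch's exact syntax
restore_command = 'cp /mnt/archive/%f "%p"'   # fetch missing WAL from the archive
primary_host = 'primary.example.com'          # assumed name for the primary's host setting
primary_port = '5432'                         # assumed name for the primary's port setting
```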
[HACKERS] Assorted contrib infrastructures patch
Hello, I'm submitting 2 contrib modules and the 3 core patches for them from me and Martin, but they conflict with each other and there are already some failed hunks and rejections. Here is a combined patch. Can I ask you to review the patches in this form?
- Martin's querydesc patch http://archives.postgresql.org/message-id/[EMAIL PROTECTED]
- My patch for contrib/auto_explain http://archives.postgresql.org/message-id/[EMAIL PROTECTED]
- My patch for contrib/pg_stat_statements http://archives.postgresql.org/message-id/[EMAIL PROTECTED]

This is a list of the modifications made by the patch:
- Add a sourceText field in QueryDesc, to handle the query texts of nested statements in ExecutorRun_hook.
- Add a DefineCustomVariable(type, variable) function: a new API to define a custom GUC variable that opens config_group and flags to developers.
- Add an ExplainOnePlan(outStr, queryDesc, ...) function, enabling access to EXPLAIN output by plugin modules.
- Add a force_instrument variable. If the value is true, executor states are always initialized with instrumentation. Used by auto_explain.
- Add startup_hook, called on server startup by the startup process where LoadFreeSpaceMap() in 8.3 had been called.
- Add shutdown_hook, called on server shutdown by the bgwriter where DumpFreeSpaceMap() in 8.3 had been called.
- shared_preload_libraries are loaded by auxiliary processes too; the Windows port requires it.

Regards, --- ITAGAKI Takahiro NTT Open Source Software Center contrib_infrastructures.patch Description: Binary data
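As a rough illustration of how a module might consume this infrastructure (a sketch only; the hook and variable names follow the list above, but the signatures and assignment style are assumptions, not taken from the actual patch):

```c
/* Sketch of a contrib module using the proposed infrastructure.
 * Hook names follow the list above; signatures are assumed. */
static void
my_startup_hook(void)
{
    /* runs once at server startup, where LoadFreeSpaceMap() used to */
}

void
_PG_init(void)
{
    startup_hook = my_startup_hook;  /* assumed assignment-style hook */
    force_instrument = true;         /* always set up instrumentation,
                                      * as auto_explain would need */
}
```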
Re: [HACKERS] Patch for ALTER DATABASE WITH TABLESPACE
Tom Lane wrote: Bernd Helmle [EMAIL PROTECTED] writes: * We really should error out when trying to copy into the same tablespace the database already lives in. No, I think that should just be a no-op. We don't, for instance, throw an error when you ALTER OWNER to the existing owner. Moreover, ALTER TABLE SET TABLESPACE is silent when a user tries to move an object to the tablespace it already belongs to. * The current implementation cannot merge a tablespace used by some database objects already, for example: Hmm --- there's more there than meets the eye. To handle that case correctly, you'd have to go into the DB's pg_class and change the recorded tablespace for those objects to zero. (Fail to do so, and you've got a mess when you move the DB to yet another tablespace.) I tend to agree that throwing an error is sufficient, as long as it's a clear error message.

OK. I added code that checks the existence of the target tablespace directory before executing copydir. If it finds an empty directory, it deletes it. The error message looks like this:

postgres=# alter database test set tablespace db2;
ERROR: some relations are already in the target tablespace db2
HINT: You need to move them back to the default tablespace before using this command.

Here is the complete test case:

postgres=# create database bernd;
CREATE DATABASE
postgres=# create database test;
CREATE DATABASE
postgres=# create tablespace db1 location '/home/guillaume/postgresql_tblspc/db1';
CREATE TABLESPACE
postgres=# create tablespace db2 location '/home/guillaume/postgresql_tblspc/db2';
CREATE TABLESPACE
postgres=# \c test
psql (8.4devel)
You are now connected to database test.
test=# create table foo(id integer) tablespace db2;
CREATE TABLE
test=# \c bernd
psql (8.4devel)
You are now connected to database bernd.
bernd=# alter database test set tablespace db2;
ERROR: some relations are already in the target tablespace db2
HINT: You need to move them back to the default tablespace before using this command.
bernd=# \c test
psql (8.4devel)
You are now connected to database test.
test=# alter table foo set tablespace pg_default;
ALTER TABLE
test=# \c bernd
psql (8.4devel)
You are now connected to database bernd.
bernd=# alter database test set tablespace db2;
ALTER DATABASE

v4 patch attached. Thanks. -- Guillaume. http://www.postgresqlfr.org http://dalibo.com alterdb_tablespace_v4.patch.bz2 Description: application/bzip
[HACKERS] No write stats in pg_statio system views
Hi, What is the reason for not having heap_blks_write-style stats in the pg_statio system views? Is it because the bgwriter does the writing (we do have bgwriter stats there) most of the time? Wouldn't write stats help to get the complete I/O picture for the relation? Regards, Nikhils -- http://www.enterprisedb.com
Re: [HACKERS] Patch for ALTER DATABASE WITH TABLESPACE
Guillaume Lelarge wrote: v4 patch attached.

v5 patch attached. Fixes two issues:
* I forgot about Bernd's advice: And i think we can avoid to call database_file_update_needed() in this case then. This is fixed.
* I forgot to remove a debug ereport.
Sorry about this. -- Guillaume. http://www.postgresqlfr.org http://dalibo.com alterdb_tablespace_v5.patch.bz2 Description: application/bzip
[HACKERS] question about large object
Hi, I am reading the large object code of pgsql and have a question. In pg_largeobject.h:

CATALOG(pg_largeobject,2613) BKI_WITHOUT_OIDS
{
    Oid     loid;       /* Identifier of large object */
    int4    pageno;     /* Page number (starting from 0) */
    bytea   data;       /* Data for page (may be zero-length) */
} FormData_pg_largeobject;

What is the meaning of pageno? What does a "page" of a large object refer to? Does this page (pageno) refer to a chunk (chunk number) of the LOB, as opposed to a real data page? (Or is one data page used to store one chunk of the LOB?) Thanks! Jiong
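For what it's worth, pageno is the sequence number of a chunk within a given large object, not a physical heap page number: each large object is split into chunks of at most LOBLKSIZE bytes (BLCKSZ/4, i.e. 2048 by default), and each chunk is stored as one row of pg_largeobject. A quick way to see the chunking (hypothetical session; assumes an existing large object and sufficient privileges to read the catalog):

```sql
-- Each row of pg_largeobject is one chunk ("page") of some large
-- object; pageno counts chunks within a single loid, starting at 0.
SELECT loid, pageno, length(data) AS chunk_bytes
FROM pg_largeobject
ORDER BY loid, pageno;
-- chunk_bytes never exceeds LOBLKSIZE (2048 by default), so a
-- 5000-byte large object would occupy pageno 0, 1 and 2.
```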
Re: [HACKERS] regression in analyze
On Thu, Nov 6, 2008 at 9:24 AM, Matteo Beccati [EMAIL PROTECTED] wrote: Hi, Attached test shows a regression in analyze command. Expected rows in an empty table is 2140 even after an ANALYZE is executed Doesn't seem to be a regression to me, as I've just checked that 8.0 did behave the same. However the question also was raised a few days ago on the italian mailing list and I couldn't find a reasonable explanation for it. mmm yeah! i'm seeing the same at 8.3 too :( -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157
Re: [HACKERS] pointer scope and memory contexts
Tim Keitt [EMAIL PROTECTED] writes: [questions] Switching memory contexts, in itself, only switches which context a bare palloc() will allocate from (as opposed to MemoryContextAlloc). It cannot have some magic impact on the validity of existing pointers. One last question: if I call SPI_finish, on the first call, do I need to switch contexts in the per-call section? (I saw some example code that suggested one needs to switch contexts back to multi_call_memory_ctx after SPI_finish.) I believe SPI_finish will switch back to the context that was current when SPI_connect was called. regards, tom lane
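The usual pattern in a set-returning function looks roughly like this (a minimal outline under the standard funcapi.h SRF support, with the tuple-building details elided, not a complete function):

```c
/* Sketch: allocating cross-call state in multi_call_memory_ctx. */
#include "postgres.h"
#include "funcapi.h"

Datum
my_srf(PG_FUNCTION_ARGS)
{
    FuncCallContext *funcctx;

    if (SRF_IS_FIRSTCALL())
    {
        MemoryContext oldctx;

        funcctx = SRF_FIRSTCALL_INIT();
        /* Anything that must survive across calls is allocated in
         * multi_call_memory_ctx; a bare palloc() now draws from it. */
        oldctx = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
        funcctx->user_fctx = palloc0(sizeof(int));
        MemoryContextSwitchTo(oldctx);
        /* Pointers into multi_call_memory_ctx remain valid after the
         * switch back; switching contexts never invalidates memory. */
    }

    funcctx = SRF_PERCALL_SETUP();
    /* ... per-call work would go here ... */
    SRF_RETURN_DONE(funcctx);
}
```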
Re: [HACKERS] plperl needs upgrade for Fedora 10
Andrew Dunstan [EMAIL PROTECTED] writes: OK, I have got to the bottom of this. It appears that the Fedora people have for some reason best known to themselves decided to stop bundling the ExtUtils::Embed module with base perl, as it was before. That's been true since F-9, so I'm not quite sure why Pavel's build only broke at F-10. FWIW the postgresql Fedora RPMs have BuildRequires: perl(ExtUtils::MakeMaker) glibc-devel bison flex autoconf gawk BuildRequires: perl(ExtUtils::Embed), perl-devel The extra Requires for MakeMaker has been there even longer. Meanwhile, I think we should make our call to it in the config file more robust, so we detect the call failure. +1. Would be a good idea to check for MakeMaker too. regards, tom lane
Re: [HACKERS] The suppress_redundant_updates_trigger() works incorrectly
KaiGai Kohei [EMAIL PROTECTED] writes: Andrew Dunstan wrote: Wouldn't this omit comparing the null bitmap? Oops, I added the comparison of null bitmap here. That's really, really ugly code. Why would it be necessary anyway? Shouldn't the security tag be expected to match? I suppose that it should be possible to alter a security tag with UPDATE, and that means it cannot work the way OID does anyway. In a sane implementation the field would already be valid before the triggers fire. regards, tom lane
Re: [HACKERS] [WIP] In-place upgrade
Robert Haas [EMAIL PROTECTED] writes: To spell this out in more detail: Suppose page 123 is a V3 page containing 6 tuples A, B, C, D, E, and F. We examine the page and determine that if we convert this to a V4 page, only five tuples will fit. So we need to get rid of one of the tuples. We begin a transaction and choose F as the victim. Searching the FSM, we discover that page 456 is a V4 page with available free space. We pin and lock pages 123 and 456 just as if we were doing a heap_update. We create F', the V4 version of F, and write it onto page 456. We set xmax on the original F. We perform the corresponding index updates and commit the transaction. Time passes. Eventually F becomes dead. We reclaim the space previously used by F, and page 123 now contains only 5 tuples. This is exactly what we needed in order to convert page 123 to a V4 page, so we do. That's all fine and dandy, except that it presumes that you can perform SELECT/UPDATE/DELETE on V3 tuple versions; you can't just pretend that A-E aren't there until they get converted. Which is exactly the overhead we were looking to avoid. (Another small issue is exactly when you convert the index entries, should you be faced with an upgrade that requires that.) regards, tom lane
Re: [HACKERS] Inefficiency in InitIndexFreeSpaceMap
Heikki Linnakangas [EMAIL PROTECTED] writes: Tom Lane wrote: Why is InitIndexFreeSpaceMap coded to test for the FSM file already existing? AFAICS it cannot yet exist and it should be an error anyway if it does. Hmm. The FSM file can exist, if the index isn't created anew, but truncated and rebuilt. However, we normally create a new relfilenode in that case, so the only place where that actually happens is with a temporary ON COMMIT DELETE ROWS table. Hm. I would say that the brokenness in RelationTruncateIndexes is that it truncates the main fork and not the others. This is unlike other places that do such things. The smgrexists probe is hardly free, so losing it would be good. Well, it's only done in index build, so I'm not too worried. See Kevin Grittner's gripe about the speed of temp table creation. I'm already worried that the FSM changes will have a huge negative impact on that. Adding extra filesystem operations that don't have to be there doesn't help. regards, tom lane
Re: [HACKERS] Synchronous replication patch v1
On Thu, Nov 6, 2008 at 2:12 PM, Fujii Masao [EMAIL PROTECTED] wrote: If the database whose timeline is the same as the primary's exists in the standby, 2)3) getting new online-backup is not necessary. For example, after the standby falls down, the database at that time is applicable to restart it. If I remember correctly, when postgres finishes its recovery, it increments the timeline. If this is true, whenever ACT fails and SBY becomes primary, SBY would increment its timeline. So when the former ACT comes back and joins the replication as SBY, would it need to get a fresh backup before it can join as SBY ? Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
[HACKERS] Timing problem in DROP TABLESPACE?
I've run regression tests many, many times recently, usually in the form of a looping installcheck, rather than just make check. On a recent test the last command of the last test has failed: DROP TABLESPACE testspace; ERROR: tablespace testspace is not empty The directory was confirmed as non-empty, though when running the command again it worked successfully. However, I've run it many times and it hasn't happened at all, so it's sporadic and fairly rare. When I isolate just that test case in a tight loop, I've seen no problems. Now I'm not sure what causes that and although I am working on patches I don't think I'm causing it myself. This error happens on the *master*, not on the standby server. Is there a timing problem? I see that we attempt to do a checkpoint to clean up deleted files. Is that working? Is it ignoring certain messages? -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions
Jonah H. Harris [EMAIL PROTECTED] writes: While working on the join elimination patch, I was going through the trigger code and found quite a bit of nastiness in regard to naming and variable repurposing related to the addition of replication roles in 8.3. The most obvious issue is that tgenabled was switched from a bool to char to support replication roles. From a naming standpoint, the term enabled generally implies boolean and is fairly consistently used as such in other functions within the core. My initial preference would be to return tgenabled to its original boolean for use only in enabling/disabling triggers. It would have been useful to make this criticism before 8.3 was released. I don't think it's reasonable to change it now. regards, tom lane
Re: [HACKERS] pointer scope and memory contexts
On Thu, Nov 6, 2008 at 7:01 AM, Tim Keitt [EMAIL PROTECTED] wrote: I am working on a set returning function and have a question about switching memory contexts. Basically, what I want to know is whether memory allocated in one context can be referenced when a different context is current. You can safely refer to memory allocated in a different context as long as that memory context is still alive. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] pg_dump roles support [Review]
Hi, Thanks for your review. I created an updated patch according to your notes.
1 - Patch does not apply cleanly on the latest git repository; although no hunk failed, there are some hunk succeeded messages. Rebased to the current HEAD.
2 - Patch contains unnecessary spaces and tabs which make the patch unnecessarily big. IMHO please read the patch before sending and make sure that the patch only contains the changes you intended to send. Yes, there were trailing whitespaces in the original files which were removed by the previous patch. The attached version leaves them as is.
3 - We should follow the coding standards of existing code I tried, of course, but this escaped my observation.
4 - pg_restore gives an error while restoring a custom format backup
5 - Do you really want to write this code like this
Fixed.

I also need some feedback about the role support in pg_restore (not implemented yet). Currently pg_restore sets the role during the restore process according to the TOC entry in the archive. It may also support the --role option (just like pg_dump). If specified, it can be used to cancel the effect of the TOC entry and force the emitting of the SET ROLE ... command. With an empty argument it can be used to omit the SET ROLE even if it is specified in the archive. What do you think? Thank you again.
 doc/src/sgml/ref/pg_dump.sgml        |   16 ++
 doc/src/sgml/ref/pg_dumpall.sgml     |   15 +
 src/bin/pg_dump/pg_backup.h          |    2 +
 src/bin/pg_dump/pg_backup_archiver.c |   36 +-
 src/bin/pg_dump/pg_dump.c            |   53 ++
 src/bin/pg_dump/pg_dumpall.c         |   23 ++
 6 files changed, 143 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2e30906..de139c3 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -698,6 +698,22 @@ PostgreSQL documentation
       </para>
      </listitem>
     </varlistentry>
+
+    <varlistentry>
+     <term><option>--role=<replaceable class="parameter">rolename</replaceable></option></term>
+     <listitem>
+      <para>
+       Specifies the user identifier used by the dump session. This cause
+       <application>pg_dump</application> to issue a
+       <command>SET ROLE TO <replaceable class="parameter">rolename</replaceable></command>
+       command just after a successful database connection. It is useful in cases when
+       the logged in user specified by the -U option has not enough privileges needed by
+       <application>pg_dump</application> but can switch to a role with the needed rights.
+       The SET ROLE command is reserved in the archive because most of the time this
+       user identifier also needed for the restore to succeed.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
   </para>
  </refsect1>

diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index ec40890..e3016cd 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -417,6 +417,21 @@ PostgreSQL documentation
       </para>
      </listitem>
     </varlistentry>
+
+    <varlistentry>
+     <term><option>--role=<replaceable class="parameter">rolename</replaceable></option></term>
+     <listitem>
+      <para>
+       Specifies the user identifier used by the dump session. This option is passed
+       to <application>pg_dump</application> too and cause these applications to issue a
+       <command>SET ROLE TO <replaceable class="parameter">rolename</replaceable></command>
+       command just after a successful database connection. It is useful in cases when
+       the logged in user specified by the -U option has not enough privileges needed by
+       <application>pg_dumpall</application> but can switch to a role with the needed rights.
+       The SET ROLE command is reserved in the archive by <application>pg_dump</application>.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
   </para>
  </refsect1>

diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index c57bb22..cbe4d46 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -70,6 +70,8 @@ typedef struct _Archive
 	int			encoding;		/* libpq code for client_encoding */
 	bool		std_strings;	/* standard_conforming_strings */
 
+	const char *rolename;		/* role name */
+
 	/* error handling */
 	bool		exit_on_error;	/* whether to exit on SQL errors... */
 	int			n_errors;		/* number of errors (if no die) */

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 7bd44f2..53a469d 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -56,6 +56,7 @@ static void _selectOutputSchema(ArchiveHandle *AH, const char *schemaName);
 static void _selectTablespace(ArchiveHandle *AH, const char *tablespace);
 static void processEncodingEntry(ArchiveHandle *AH, TocEntry *te);
 static void
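With the patch applied, the new option would presumably be invoked like this (hypothetical database, user, and role names; only --role comes from the patch itself):

```shell
# Connect as a low-privilege login role, then have pg_dump issue
# SET ROLE to one that can read everything (names are hypothetical):
pg_dump --username=bob --role=dump_role mydb > mydb.sql
```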
Re: [HACKERS] SE-PostgreSQL wiki updates, but ...
KaiGai Kohei wrote: KaiGai Kohei wrote: Simon Riggs wrote: On Tue, 2008-10-21 at 18:48 +0900, KaiGai Kohei wrote: I started to rework the SE-PostgreSQL documentation to catch up with the latest implementation, because the existing PDF documents are too outdated to be updated. In addition, I moved them to the wiki site for easier future updates. http://code.google.com/p/sepgsql/wiki/TheSepgsqlDocument I've forwarded this on as promised, with a request for comments. I'm now reworking it at: http://wiki.postgresql.org/wiki/SEPostgreSQL Now I have completed chapters 1, 2 and 6, and am working on chapter 3. If you can comment on anything in this state, please feel free to tell me. http://wiki.postgresql.org/wiki/SEPostgreSQL Now I have completed the planned work. If you can comment on anything, please feel free to tell me or update them. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei [EMAIL PROTECTED]
Re: [HACKERS] question about large object
On Thu, 6 Nov 2008, xie jiong [EMAIL PROTECTED] writes: what's mean of pageno? or what 's page of a large object refer to? is this page(pageno) refer to chunk(chunk number) of lob, as opposed to real data page? (or just one data page to store one chunk of lob) Have you checked the explanation [1] in the documentation? Regards. [1] http://www.postgresql.org/docs/current/static/catalog-pg-largeobject.html
Re: [HACKERS] The suppress_redundant_updates_trigger() works incorrectly
Tom Lane wrote: KaiGai Kohei [EMAIL PROTECTED] writes: Andrew Dunstan wrote: Wouldn't this omit comparing the null bitmap? Oops, I added the comparison of null bitmap here. That's really, really ugly code. Why would it be necessary anyway? Shouldn't the security tag be expected to match? I suppose that it should be possible to alter a security tag with UPDATE, and that means it cannot work the way OID does anyway. In a sane implementation the field would already be valid before the triggers fire. OK, I'll put code to preserve it somewhere prior to the triggers firing. # Maybe ExecBRUpdateTriggers() However, I wonder whether the oid field should also be preserved at the same place, not inside a specific trigger function. What is your opinion? Thanks, -- KaiGai Kohei [EMAIL PROTECTED]
Re: [HACKERS] [WIP] In-place upgrade
Tom Lane wrote: Robert Haas [EMAIL PROTECTED] writes: To spell this out in more detail: Suppose page 123 is a V3 page containing 6 tuples A, B, C, D, E, and F. We examine the page and determine that if we convert this to a V4 page, only five tuples will fit. So we need to get rid of one of the tuples. We begin a transaction and choose F as the victim. Searching the FSM, we discover that page 456 is a V4 page with available free space. We pin and lock pages 123 and 456 just as if we were doing a heap_update. We create F', the V4 version of F, and write it onto page 456. We set xmax on the original F. We perform the corresponding index updates and commit the transaction. Time passes. Eventually F becomes dead. We reclaim the space previously used by F, and page 123 now contains only 5 tuples. This is exactly what we needed in order to convert page 123 to a V4 page, so we do. That's all fine and dandy, except that it presumes that you can perform SELECT/UPDATE/DELETE on V3 tuple versions; you can't just pretend that A-E aren't there until they get converted. Which is exactly the overhead we were looking to avoid. We want to avoid overhead on V$latest$ tuples, but I guess a small performance gap on old tuples is acceptable. The only way (which I see now) it could work is to have multi-page-version processing, with an old tuple being converted when PageGetHeapTuple is called. However, as Heikki mentioned, tuple and page conversion is basic and the same for all upgrade methods, and it should be done first. Zdenek -- Zdenek Kotala Sun Microsystems Prague, Czech Republic http://sun.com/postgresql
Re: [HACKERS] The suppress_redundant_updates_trigger() works incorrectly
KaiGai Kohei [EMAIL PROTECTED] writes: However, I wonder if the oid field should be also preserved at same place, not inside a specific trigger function. Possibly. I wasn't planning to mess with it now; but if you've fixed the other problems with assigning to a system column then maybe we should allow it for OIDs too. regards, tom lane
Re: [HACKERS] [WIP] In-place upgrade
That's all fine and dandy, except that it presumes that you can perform SELECT/UPDATE/DELETE on V3 tuple versions; you can't just pretend that A-E aren't there until they get converted. Which is exactly the overhead we were looking to avoid. I don't understand this comment at all. Unless you have some sort of magical wand in your back pocket that will instantaneously transform the entire database, there is going to be a period of time when you have to cope with both V3 and V4 pages. ISTM that what we should be talking about here is: (1) How are we going to do that in a way that imposes near-zero overhead once the entire database has been converted? (2) How are we going to do that in a way that is minimally invasive to the code? (3) Can we accomplish (1) and (2) while still retaining somewhat reasonable performance for V3 pages? Zdenek's initial proposal did this by replacing all of the tuple header macros with functions that were conditionalized on page version. I think we agree that's not going to work. That doesn't mean that there is no approach that can work, and we were discussing possible ways to make it work upthread until the thread got hijacked to discuss the right way of handling page expansion. Now that it seems we agree that a transaction can be used to move tuples onto new pages, I think we'd be well served to stop talking about page expansion and get back to the original topic: where and how to insert the hooks for V3 tuple handling. (Another small issue is exactly when you convert the index entries, should you be faced with an upgrade that requires that.) Zdenek set out his thoughts on this point upthread, no need to rehash here. ...Robert
Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions
On Thu, Nov 6, 2008 at 9:01 AM, Tom Lane [EMAIL PROTECTED] wrote: It would have been useful to make this criticism before 8.3 was released. I don't think it's reasonable to change it now. Well, I didn't have time to review code back in the 8.3 days, and ugly is ugly regardless of when it was originally committed. I'm not saying it needs to be an 8.4 fix, just that as a whole, several of the components of that patch (including rewrite) seem to be a little hackish and that they could be cleaned up in 8.5. I would imagine someone will be working on trigger-related code in 8.5, and just thought it would be nice to clean it up if one had the time to do so. -- Jonah H. Harris, Senior DBA myYearbook.com
Re: [HACKERS] Timing problem in DROP TABLESPACE?
Simon Riggs wrote: I've run regression tests many, many times recently, usually in the form of a looping installcheck, rather than just make check. On a recent test the last command of the last test has failed: DROP TABLESPACE testspace; ERROR: tablespace testspace is not empty The directory was confirmed as non-empty, though when running the command again it worked successfully. Maybe it is failing due to files that are scheduled to be deleted by next checkpoint? If it runs a checkpoint internally, perhaps there's a race condition in the code that waits until the files are gone. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions
Jonah H. Harris [EMAIL PROTECTED] writes: On Thu, Nov 6, 2008 at 9:01 AM, Tom Lane [EMAIL PROTECTED] wrote: It would have been useful to make this criticism before 8.3 was released. I don't think it's reasonable to change it now. Well, I didn't have time to review code back in the 8.3 days, and ugly is ugly regardless of when it was originally committted. I'm not saying it needs to be an 8.4 fix, just that as a whole, several of the components of that patch (including rewrite) seem to be a little hackish and that they could be cleaned up in 8.5. I have no objection to cleaning up the backend internals, but system catalog definitions are client-visible. I don't think we should thrash the catalog definitions for minor aesthetic improvements. Since 8.3 is already out, that means client-side code (like pg_dump and psql, and probably other programs we don't control) is going to have to deal with the existing definition for the foreseeable future. Dealing with this definition *and* a slightly cleaner one isn't a net improvement from the client standpoint. regards, tom lane
Re: [HACKERS] Timing problem in DROP TABLESPACE?
Alvaro Herrera [EMAIL PROTECTED] writes: Simon Riggs wrote: On a recent test the last command of the last test has failed: DROP TABLESPACE testspace; ERROR: tablespace testspace is not empty Maybe it is failing due to files that are scheduled to be deleted by next checkpoint? If it runs a checkpoint internally, perhaps there's a race condition in the code that waits until the files are gone. The buildfarm has shown this type of error occasionally, though AFAIR only on Windows boxen. I had assumed it was a Windows-specific issue. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Bitmap index - first look
http://archives.postgresql.org/message-id/[EMAIL PROTECTED] 1) Sometimes index doesn't find all matching rows: postgres=# SELECT * FROM qq WHERE t ='asd'; i | t ---+- 2 | asd 1 | asd 2 | asd (3 rows) postgres=# SET enable_seqscan=off; SET postgres=# SELECT * FROM qq WHERE t ='asd'; i | t ---+- 2 | asd (1 row) How to reproduce: DROP TABLE IF EXISTS qq; CREATE TABLE qq ( i int, t text ); INSERT INTO qq VALUES (1, 'qwe'); INSERT INTO qq VALUES (2, 'asd'); CREATE INDEX qqidx ON qq USING bitmap (i,t); INSERT INTO qq VALUES (1, 'asd'); INSERT INTO qq VALUES (2, 'asd'); SELECT * FROM qq; SELECT * FROM qq WHERE t ='asd'; SET enable_seqscan=off; SELECT * FROM qq WHERE t ='asd'; 2) Why is pg_am.amstrategies set to 5 while index supports only equal operation? 3) Typo in bmbulkdelete: /* allocate stats if first time through, else re-use existing struct */ if (result == NULL) result = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult)); result = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult)); 'result' is allocated twice. 4) Bitmap index is marked with pg_am.amcanorder = 'f', so you don't need to support ammarkpos/amrestrpos - see http://archives.postgresql.org/pgsql-hackers/2008-10/msg00862.php -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Upgrading Postgres versions question
Hello all, ( I posted already in pgsql-general and got no replies. ) I am in the process of updating my DB on Postgres 8.1.11 to 8.3.4. I also use Slony 1.2.14 for replication. Is there a safe path on how to accomplish this? Please advise on what steps I will need to consider. Bear in mind that I am planning to skip from Postgres 8.1.x to 8.3.x ( without going into the intermediate version 8.2.x ) and I use Slony to replicate my production DB into two more boxes simultaneously. Thanks, Tony Fernandez
[HACKERS] patch to fix client only builds
I'm trying to do client only builds on a bunch of legacy platforms and noticed that the include path is messed up...if keywords.o is not already built, it fails to build because src/backend/parser but not src/backend is in the include path. (keywords.c includes parser/gram.h). The following fixes it. Probably not the right thing exactly but it works: Index: Makefile === RCS file: /projects/cvsroot/pgsql/src/backend/parser/Makefile,v retrieving revision 1.48 diff -r1.48 Makefile 13c13 override CPPFLAGS := -I$(srcdir) $(CPPFLAGS) --- override CPPFLAGS := -I$(subdir) -I.. $(CPPFLAGS) This would be a nice backpatch to 8.3 (and possibly earlier, I didn't check). merlin
Re: [HACKERS] RAM-only temporary tables
Alvaro Herrera [EMAIL PROTECTED] wrote: Kevin Grittner wrote: An idea for a possible enhancement to PostgreSQL: allow creation of a temporary table without generating any disk I/O. (Creating and dropping a three-column temporary table within a database transaction currently generates about 150 disk writes). Most of these are catalog updates. A trace of WAL logs including only heap inserts says that to create a temp table with 3 columns (2 int, 1 text) and no indexes there are this many inserts: 3 1247 (pg_type) 20 1249 (pg_attribute) 3 1259 (pg_class) 7 2608 (pg_depend) 1 2610 (pg_index) Note the excess of pg_attribute entries! There are 3 in the table, 3 in the toast table, and then there are 14 extra attrs which are for system columns (7 for the main table, 7 for the toast table). Just getting rid of pg_attribute entries for those would probably prove to be an important gain. (Don't forget the index updates for each of those heap inserts; for pg_type it's 2 btree inserts for each index insert.) If you do this, you've shaved 42 of those 150 writes. Note that the 150 disk writes were for the CREATE and the DROP. Does that mean that we'd actually shave 84 of 150 writes? Also, if you're looking to account for all the writes, it's worth noting that my test declared a one-column primary key (on an integer column) in the CREATE TEMPORARY TABLE statement. In suggesting this enhancement, my hope is that each session could check for a referenced table as a temporary in RAM before going to the system tables, in a manner vaguely similar to how space reserved by the temp_buffers GUC is used for temp table data. I, of course, am suggesting this from a position of blissful ignorance of the actual complexity of making such a change. -Kevin
Re: [HACKERS] plperl needs upgrade for Fedora 10
Tom Lane escribió: Andrew Dunstan [EMAIL PROTECTED] writes: OK, I have got to the bottom of this. It appears that the Fedora people have for some reason best known to themselves decided to stop bundling the ExtUtils::Embed module with base perl, as it was before. That's been true since F-9, so I'm not quite sure why Pavel's build only broke at F-10. FWIW the postgresql Fedora RPMs have BuildRequires: perl(ExtUtils::MakeMaker) glibc-devel bison flex autoconf gawk BuildRequires: perl(ExtUtils::Embed), perl-devel The extra Requires for MakeMaker has been there even longer. Huh, but the requirement for ExtUtils::Embed is at runtime, so shouldn't it be a plain Requires instead of BuildRequires? -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch to fix client only builds
Merlin Moncure [EMAIL PROTECTED] writes: I'm trying to do client only builds on a bunch of legacy platforms and noticed that the include path is messed up...if keywords.o is not already built, it fails to build be because src/backend/parser but not src/backend is in the include path. (keywords.c includes parser/gram.h). Hmm, but nobody should be including gram.h directly out of backend/parser anyway. They should be getting it via the symlink in src/include/parser. I think the real problem must be that that symlink isn't being created during a client-only build? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RAM-only temporary tables
Alvaro Herrera [EMAIL PROTECTED] wrote: Kevin Grittner wrote: Note that the 150 disk writes were for the CREATE and the DROP. Does that mean that we'd actually shave 84 of 150 writes? Hmm, you'd shave more than 42 but not 84, because index entries are not deleted until a later vacuum. (I'd say about 56 -- 42 plus the 14 heap deletions). Also, if you're looking to account for all the writes, it's worth noting that my test declared a one-column primary key (on an integer column) in the CREATE TEMPORARY TABLE statement. That probably makes up for the extra few writes that I didn't see in my quick test. It sounds like you were counting the 8kB pages pushed from the PostgreSQL cache to the OS cache, and I was counting the 1kB blocks pushed from the OS cache to the RAID controller cache. By watching vmstat results after pushing this to a more-or-less steady state, I was probably picking up the results of autovacuum runs, but multiple writes to a single page were often combined by the OS. If we match, it's really just a coincidence. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperl needs upgrade for Fedora 10
2008/11/6 Tom Lane [EMAIL PROTECTED]: Andrew Dunstan [EMAIL PROTECTED] writes: OK, I have got to the bottom of this. It appears that the Fedora people have for some reason best known to themselves decided to stop bundling the ExtUtils::Embed module with base perl, as it was before. That's been true since F-9, so I'm not quite sure why Pavel's build only broke at F-10. FWIW the postgresql Fedora RPMs have I skip F9. So I never used it. Pavel BuildRequires: perl(ExtUtils::MakeMaker) glibc-devel bison flex autoconf gawk BuildRequires: perl(ExtUtils::Embed), perl-devel The extra Requires for MakeMaker has been there even longer. Meanwhile, I think we should make our call to it in the config file more robust, so we detect the call failure. +1. Would be a good idea to check for MakeMaker too. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperl needs upgrade for Fedora 10
Alvaro Herrera wrote: Tom Lane escribió: Andrew Dunstan [EMAIL PROTECTED] writes: OK, I have got to the bottom of this. It appears that the Fedora people have for some reason best known to themselves decided to stop bundling the ExtUtils::Embed module with base perl, as it was before. That's been true since F-9, so I'm not quite sure why Pavel's build only broke at F-10. FWIW the postgresql Fedora RPMs have BuildRequires: perl(ExtUtils::MakeMaker) glibc-devel bison flex autoconf gawk BuildRequires: perl(ExtUtils::Embed), perl-devel The extra Requires for MakeMaker has been there even longer. Huh, but the requirement for ExtUtils::Embed is at runtime, so shouldn't it be a plain Requires instead of BuildRequires? No. We need ExtUtils::Embed to get the linkage flags for the build. See config/perl.m4 cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ARRAY vars (was Enable pl/python to return records based on multiple OUT params)
On Tue, Nov 4, 2008 at 4:17 PM, Hannu Krosing [EMAIL PROTECTED] wrote: One open question is how to translate arrays with non-default subscript values Quote: Subscripted assignment allows creation of arrays that do not use one-based subscripts. For example one might assign to myarray[-2:7] to create an array with subscript values running from -2 to 7. Should I just shift it to standard python tuple, or would it be better to return it as a dictionary with keys from -2 to 7 I think changing the base type is bound to cause issues. For example, suppose someone expects to be able to simply iterate over the array. If they're assuming it's a list, they will expect the values to be returned. If it's a dictionary, the keys will be. If you're going to do that, you'd need to do a custom dict class that iterated over the values I think. David Blewett -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
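David's idea of a custom dict class that iterates over the values could be sketched roughly like this (a hypothetical illustration only, not part of the pl/python patch; the class name and sample data are made up):

```python
class SubscriptedArray(dict):
    """Hypothetical mapping that keeps PostgreSQL array subscripts as keys
    (e.g. -2..7 for myarray[-2:7]) but iterates over values, list-style."""

    def __iter__(self):
        # Walk the keys in subscript order, yielding values rather than keys,
        # so plain iteration behaves like iterating a list.
        for key in sorted(dict.keys(self)):
            yield self[key]

# A PostgreSQL array assigned as myarray[-2:0] might map to:
arr = SubscriptedArray({-2: 'a', -1: 'b', 0: 'c'})
print(list(arr))   # values in subscript order: ['a', 'b', 'c']
print(arr[-2])     # the original subscript still works: 'a'
```

This preserves the non-default subscripts for callers who want them, while code that simply iterates over the array sees values, as it would for a list.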
Re: [HACKERS] RAM-only temporary tables
Kevin Grittner [EMAIL PROTECTED] writes: Note that the 150 disk writes were for the CREATE and the DROP. Does that mean that we'd actually shave 84 of 150 writes? It really shouldn't be the case that each system catalog tuple insertion generates a separate write --- especially not for multiple insertions into the same catalog, which we could expect to go into the same page or few pages. I think a large fraction of the writes you're measuring are coming from the file create/unlink operations. It would certainly be important to identify where the bulk of the cost *really* is before we start expending effort on a solution. In suggesting this enhancement, my hope is that each session could check for a referenced table as a temporary in RAM before going to the system tables, in a manner vaguely similar to how space reserved by the temp_buffers GUC is used for temp table data. This isn't very workable. For one thing, client-side operations such as psql's \dt still need to see catalog entries for temp tables. There's been some handwaving about keeping catalog entries for temp tables in temp children of the main system catalogs, but it hasn't got past the handwaving stage. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bitmap index - first look
One more point on pg_am: amsearchnull is equal to f however the index stores and could find nulls perfectly. Regards, Vladimir Sitnikov
Re: [HACKERS] RAM-only temporary tables
Kevin Grittner wrote: It sounds like you were counting the 8kB pages pushed from the PostgreSQL cache to the OS cache, and I was counting the 1kB blocks pushed from the OS cache to the RAID controller cache. By watching vmstat results after pushing this to a more-or-less steady state, I was probably picking up the results of autovacuum runs, but multiple writes to a single page were often combined by the OS. If we match, it's really just a coincidence. I was counting WAL inserts actually, so logical database operations. I left out storage items (filesystem actions), so it was mainly just catalog changes. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Patch for ISO-8601-Interval Input and output.
Brendan Jurd wrote: I've applied them with a couple minor changes. * If ISO 8601 5.5.3.1.d's statement The designator T shall be absent if all of the time components are absent. also applies to 5.5.4.2.2; then I think the 'T' needed to be inside the optional tags, so I moved it there. The link to the spec is below[1]. Hmm, okay. When I was running my tests in psql I came away with the impression that the T was required in the alternative format. I might be mistaken. I'll run some further tests a little later on. Indeed that's a bug in my code; where I was sometimes requiring the 'T' (in the ISO8601 alternative format) and sometimes not (in the ISO8601 format from 5.5.4.2.1). Below's a test case. If I read the spec[1] right both of those should mean 1 day. I'll update git and post a new patch now. If people think I read the specs wrong, I'll undo this change and fix the docs. == [2]lt:/home/ramayer/proj/pg% ./psql regression psql (8.4devel) Type help for help. regression=# select interval 'P1D'; interval -- 1 day (1 row) regression=# select interval 'P-00-01'; ERROR: invalid input syntax for type interval: P-00-01 LINE 1: select interval 'P-00-01'; ^ == [1] http://isotc.iso.org/livelink/livelink/4021199/ISO_8601_2004_E.zip?func=doc.Fetchnodeid=4021199
Re: [HACKERS] RAM-only temporary tables
Alvaro Herrera [EMAIL PROTECTED] writes: Right -- I don't expect we can make use of such an idea readily. Not creating unnecessary pg_attribute entries for system columns is probably a lot easier to do. I seem to recall having proposed that in the past, and getting shot down on the basis that clients might be depending on those pg_attribute entries being there. I'm not sure how big a risk there really is --- most of the code I've seen explicitly selects attnum 0 --- but it's a consideration. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperl needs upgrade for Fedora 10
Alvaro Herrera [EMAIL PROTECTED] writes: Tom Lane escribió: BuildRequires: perl(ExtUtils::MakeMaker) glibc-devel bison flex autoconf gawk BuildRequires: perl(ExtUtils::Embed), perl-devel Huh, but the requirement for ExtUtils::Embed is at runtime, so shouldn't it be a plain Requires instead of BuildRequires? Really? I'm pretty sure I recall the RPM build failing when they changed that. (But it's possible that the regression test step is what failed, I don't remember.) I'd think I'd have heard about it if there were a missing runtime dependency. BTW, Andrew was wondering *why* Fedora isn't bundling these anymore. The CVS logs mention something about versioned perl, so it's possible that they had to split out some modules to support multiple Perl installations cleanly. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperl needs upgrade for Fedora 10
Tom Lane escribió: Alvaro Herrera [EMAIL PROTECTED] writes: Tom Lane escribió: BuildRequires: perl(ExtUtils::MakeMaker) glibc-devel bison flex autoconf gawk BuildRequires: perl(ExtUtils::Embed), perl-devel Huh, but the requirement for ExtUtils::Embed is at runtime, so shouldn't it be a plain Requires instead of BuildRequires? Really? I'm pretty sure I recall the RPM build failing when they changed that. Actually, seeing Andrew's response I think I'm probably wrong. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Patch for ISO-8601-Interval Input and output.
Ron Mayer wrote: Brendan Jurd wrote: 'T' ... optional Indeed that's a bug in my code; where I was sometimes requiring the 'T' (in the ISO8601 alternative format) and sometimes not (in the ISO8601 format from 5.5.4.2.1). Below's a test case. If I read the spec[1] right both of those should mean 1 day. I'll update git and post a new patch now. If people think I read the specs wrong, I'll undo this change and fix the docs. I think I updated the web site and git now, and 'P-00-01' is now accepted. It might be useful if someone double checked my reading of the spec, tho. [1] http://isotc.iso.org/livelink/livelink/4021199/ISO_8601_2004_E.zip?func=doc.Fetchnodeid=4021199 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
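For anyone double-checking the reading of the spec, the two interval styles under discussion can be sketched with a toy parser (an illustration only, not PostgreSQL's interval code) in which the 'T' of the alternative format is optional when all time components are absent, so that 'P1D' and 'P-00-01' both come out as one day:

```python
import re

# Toy sketch of the two ISO 8601 interval styles discussed in this thread.
# Format with designators (5.5.4.2.1): PnYnMnD[TnHnMnS].
# Alternative format (5.5.4.2.2): P[years]-[months]-[days][T[hh]:[mm]:[ss]],
# with 'T' optional when all time components are absent (per 5.5.3.1.d).
DESIGNATOR = re.compile(
    r'^P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?'
    r'(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$')
ALTERNATIVE = re.compile(
    r'^P(\d*)-(\d*)-(\d*)(?:T(\d*):(\d*):(\d*))?$')

def parse_interval(s):
    """Return (years, months, days, hours, minutes, seconds), or None."""
    m = DESIGNATOR.match(s) or ALTERNATIVE.match(s)
    if m is None:
        return None
    # Unmatched optional groups count as zero.
    return tuple(int(g) if g else 0 for g in m.groups())
```

Under this reading, parse_interval('P1D') and parse_interval('P-00-01') both return (0, 0, 1, 0, 0, 0), matching the behavior Ron says the updated patch now has.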
Re: [HACKERS] RAM-only temporary tables
Tom Lane [EMAIL PROTECTED] wrote: I think a large fraction of the writes you're measuring are coming from the file create/unlink operations. It would certainly be important to identify where the bulk of the cost *really* is before we start expending effort on a solution. Any ideas on a good way to gather that information? Given the temp_buffers space, would it make sense to defer the creation of the actual file until there is actually a need to spill data to the disk? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] EnableDisableTrigger Cleanup Questions
On Thu, Nov 6, 2008 at 10:08 AM, Tom Lane [EMAIL PROTECTED] wrote: I have no objection to cleaning up the backend internals, but system catalog definitions are client-visible. I don't think we should thrash the catalog definitions for minor aesthetic improvements. Since 8.3 is already out, that means client-side code (like pg_dump and psql, and probably other programs we don't control) is going to have to deal with the existing definition for the foreseeable future. Dealing with this definition *and* a slightly cleaner one isn't a net improvement from the client standpoint. Well, it didn't seem like anyone had an issue changing the definition at 8.3 time. As for pg_dump/psql, those changes are fairly simple. And, there aren't that many PG utilities out there. PGAdmin looks like it would require a 1-3 line change (depending on coding preferences) and I don't see anything that checks it in Slony. I'm fine with cleaning up the internal-side, I just don't think there's that much relying on tgenabled. In fact, Google code search seems to show more things relying on a boolean tgenabled rather than the current implementation. Oh well, it was just a thought. -- Jonah H. Harris, Senior DBA myYearbook.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch to fix client only builds
On Thu, Nov 6, 2008 at 11:09 AM, Tom Lane [EMAIL PROTECTED] wrote: Merlin Moncure [EMAIL PROTECTED] writes: I'm trying to do client only builds on a bunch of legacy platforms and noticed that the include path is messed up...if keywords.o is not already built, it fails to build be because src/backend/parser but not src/backend is in the include path. (keywords.c includes parser/gram.h). Hmm, but nobody should be including gram.h directly out of backend/parser anyway. They should be getting it via the symlink in src/include/parser. I think the real problem must be that that symlink isn't being created during a client-only build? ah, correct. the symlink is setup in src/backend/Makefile psql Makefile pulls in submake-backend to grab keywords.o. submake-backend: $(MAKE) -C $(top_builddir)/src/backend/parser keywords.o so psql build is trying to build directly in backend which breaks (so is pg_dump, the offender is a backend dependency in pg_dump from symlinked in dumputils.c). This looks caused by a change to clean up some quoting issues: revision 1.36 date: 2007/06/18 21:40:58; author: tgl; state: Exp; lines: +18 -5 Arrange for quote_identifier() and pg_dump to not quote keywords that are unreserved according to the grammar. The list of unreserved words has gotten extensive enough that the unnecessary quoting is becoming a bit of an eyesore. To do this, add knowledge of the keyword category to keywords.c's table. (Someday we might be able to generate keywords.c's table and the keyword lists in gram.y from a common source.) snip IMO, the client only build should be fixed, so we can: *) put the makefile hack I proposed in *) implement the keywords.c change suggested above *) set up backend/parser symlink from different/additional place. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] per-database locale: createdb switches
Hi, I just noticed that the interface for choosing a different locale at db creation time is createdb --lc-collate=X --lc-ctype=X. Is there a reason for having these two separate switches? It seems awkward; why can't we just have a single --locale switch that selects both settings at once? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] RAM-only temporary tables
Kevin Grittner [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] wrote: I think a large fraction of the writes you're measuring are coming from the file create/unlink operations. It would certainly be important to identify where the bulk of the cost *really* is before we start expending effort on a solution. Any ideas on a good way to gather that information? I had done some preliminary trials using strace (you need to trace the active backend, the bgwriter, and the wal writer process to be sure you see everything going on). However it's difficult to tell how much physical I/O results from the create or unlink syscalls. It might be interesting to make a test program that just creates 4000 files and then removes them again, and see what sort of load you see from that. Given the temp_buffers space, would it make sense to defer the creation of the actual file until there is actually a need to spill data to the disk? No, because that opens us to problems with reuse of relfilenode numbers. One place that I've always wanted to look at was suppressing the creation of a btree metapage until there's some useful data in the table. We managed to avoid creating a root page until there's data, but at the time avoiding the metapage seemed too invasive. (Admittedly, though, if one assumes that your real world case does involve putting some data in the tables, this wouldn't actually save anything...) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
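Tom's suggested measurement, a test program that just creates a few thousand files and removes them again, might look roughly like this (an illustrative sketch; the file count and 8kB size are arbitrary stand-ins for empty relation segments, and the resulting physical write load would be observed with vmstat or iostat while it runs):

```python
import os
import tempfile
import time

def churn_files(n=4000, size=8192):
    """Create n small files and unlink them again; return elapsed seconds.

    Run vmstat/iostat in another terminal to see how much write activity
    the create/unlink metadata traffic alone generates."""
    d = tempfile.mkdtemp(prefix='file_churn_')
    start = time.time()
    for i in range(n):
        with open(os.path.join(d, 'seg%d' % i), 'wb') as f:
            f.write(b'\0' * size)   # one 8kB "page" per file
    for i in range(n):
        os.unlink(os.path.join(d, 'seg%d' % i))
    os.rmdir(d)
    return time.time() - start

if __name__ == '__main__':
    print('create+unlink took %.2fs' % churn_files())
```

Comparing this against the write counts from the temp-table test would show how much of the 150 writes is filesystem create/unlink overhead rather than catalog updates.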
Re: [HACKERS] RAM-only temporary tables
Tom Lane wrote: One place that I've always wanted to look at was suppressing the creation of a btree metapage until there's some useful data in the table. We managed to avoid creating a root page until there's data, but at the time avoiding the metapage seemed too invasive. (Admittedly, though, if one assumes that your real world case does involve putting some data in the tables, this wouldn't actually save anything...) Agreed on the parenthesized comment -- it'd be just benchmark optimization. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] plperl needs upgrade for Fedora 10
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: Tom Lane escribió: BuildRequires: perl(ExtUtils::MakeMaker) glibc-devel bison flex autoconf gawk BuildRequires: perl(ExtUtils::Embed), perl-devel Huh, but the requirement for ExtUtils::Embed is at runtime, so shouldn't it be a plain Requires instead of BuildRequires? Really? I'm pretty sure I recall the RPM build failing when they changed that. (But it's possible that the regression test step is what failed, I don't remember.) I'd think I'd have heard about it if there were a missing runtime dependency. BTW, Andrew was wondering *why* Fedora isn't bundling these anymore. The CVS logs mention something about versioned perl, so it's possible that they had to split out some modules to support multiple Perl installations cleanly. My F9 instance has the module, from a separate RPM package, but I'm fairly sure it came as part of the normal install. But the F10 install DVD doesn't have it at all. I had to grab it from CPAN and build/install manually to check that it was what was missing. The really bad thing about this mess is that it doesn't make the build fail - we don't get a failure until runtime, so unless the RPM build checks run the Postgres PL install checks it wouldn't be caught. I'm thinking of something like this change to config/perl.m4: Index: config/perl.m4 === RCS file: /cvsroot/pgsql/config/perl.m4,v retrieving revision 1.3 diff -c -r1.3 perl.m4 *** config/perl.m4 29 Nov 2003 19:51:17 - 1.3 --- config/perl.m4 6 Nov 2008 17:14:34 - *** *** 32,35 pgac_tmp2=`$PERL -MConfig -e 'print $Config{ccdlflags}'` perl_embed_ldflags=`echo X$pgac_tmp1 | sed s/^X//;s%$pgac_tmp2%%` AC_SUBST(perl_embed_ldflags)dnl ! AC_MSG_RESULT([$perl_embed_ldflags])]) --- 32,41 pgac_tmp2=`$PERL -MConfig -e 'print $Config{ccdlflags}'` perl_embed_ldflags=`echo X$pgac_tmp1 | sed s/^X//;s%$pgac_tmp2%%` AC_SUBST(perl_embed_ldflags)dnl ! if test -z $perl_embed_ldflags ; then ! AC_MSG_RESULT(no) ! 
AC_MSG_ERROR([unable to determine flags to link embedded Perl]) ! else ! AC_MSG_RESULT([$perl_embed_ldflags]) ! fi ! ])# PGAC_CHECK_PERL_EMBED_LDFLAGS cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] My review for the patch Table command
Unicron, 4. Since it is just an alternative to select * from Table, I think this feature is unneccessary. Heh. I agree, but tell that to the SQL committee. I don't think we need to argue out the merits of adding standard syntax. This patch is Ready for Code Review. --Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch to fix client only builds
Merlin Moncure [EMAIL PROTECTED] writes: This looks caused by a change to clean up some quoting issues: No, that patch is unrelated --- it didn't modify the inclusion situation at all. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperl needs upgrade for Fedora 10
Andrew Dunstan [EMAIL PROTECTED] writes: I'm thinking of something like this change to config/perl.m4: ! if test -z $perl_embed_ldflags ; then ! AC_MSG_RESULT(no) ! AC_MSG_ERROR([unable to determine flags to link embedded Perl]) Hm, is it certain that empty is never a valid value for $perl_embed_ldflags? In any case I'm a bit confused how this fixes the problem --- it looks like the test is just relying on Config not Embed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch to fix client only builds
On Thu, Nov 6, 2008 at 12:26 PM, Tom Lane [EMAIL PROTECTED] wrote: Merlin Moncure [EMAIL PROTECTED] writes: This looks caused by a change to clean up some quoting issues: No, that patch is unrelated --- it didn't modify the inclusion situation at all. oops, right again. cvs annotate claims psql Makefile was modified in 1.56 (dec-05)? It's not completely clear why. It doesn't really matter...the dependency is clearly there. merlin
Re: [HACKERS] patch to fix client only builds
Merlin Moncure [EMAIL PROTECTED] writes: IMO, the client only build should be fixed, so we can: *) put the makefile hack I proposed in *) implement the keywords.c change suggested above *) set up backend/parser symlink from different/additional place. The last of these seems the correct fix. A minimal change would be something like submake-backend: + $(MAKE) -C $(top_builddir)/src/backend $(top_builddir)/src/include/parser/gram.h $(MAKE) -C $(top_builddir)/src/backend/parser keywords.o (in at least three different Makefiles: pg_dump, psql, scripts). But I can't help feeling that some Makefile-refactoring seems called for here. Peter, what do you think? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] My review for the patch Table command
Josh Berkus [EMAIL PROTECTED] writes: Unicron, 4. Since it is just an alternative to select * from Table, I think this feature is unneccessary. Heh. I agree, but tell that to the SQL committee. I don't think we need to argue out the merits of adding standard syntax. We can, however, argue the merits of making extensive documentation changes to document such a useless feature. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperl needs upgrade for Fedora 10
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: I'm thinking of something like this change to config/perl.m4: ! if test -z $perl_embed_ldflags ; then ! AC_MSG_RESULT(no) ! AC_MSG_ERROR([unable to determine flags to link embedded Perl]) Hm, is it certain that empty is never a valid value for $perl_embed_ldflags? Yes. If it's empty we don't even link against libperl at all. That can't possibly be right. In any case I'm a bit confused how this fixes the problem --- it looks like the test is just relying on Config not Embed. No, we get the ldopts from Embed and then *remove* the ccldflags from Config from that string. What is left is set as perl_embed_flags, and that's what mustn't be empty. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] broken URL in commitfest page
On Wed, 2008-11-05 at 13:23 -0500, Jonah H. Harris wrote: On Wed, Nov 5, 2008 at 12:35 PM, Simon Riggs [EMAIL PROTECTED] wrote: The Join Removal item fails to point to a patch, also. I've removed that entry now. The patch was being worked on by Jonah but it looks like we didn't make the deadline. Well, what is the official deadline on it? It, like several other patches on the wiki, was a WIP. I'm hopeful that RI-based join elimination for JOIN_INNER should be ready tonight based on your and Tom's comments. I think you should post what you have now as WIP. It will make it easier to discuss the questions you raise above. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperl needs upgrade for Fedora 10
Andrew Dunstan [EMAIL PROTECTED] writes: No, we get the ldopts from Embed and then *remove* the ccldflags from Config from that string. What is left is set as perl_embed_flags, and that's what mustn't be empty. Got it. Sounds good then. What about the MakeMaker dependency? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperl needs upgrade for Fedora 10
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: No, we get the ldopts from Embed and then *remove* the ccldflags from Config from that string. What is left is set as perl_embed_flags, and that's what mustn't be empty. Got it. Sounds good then. What about the MakeMaker dependency? The call to ldopts will fail if MakeMaker is not present, so this will cover it. It's very unlikely to be absent - it's required to build almost every Perl module known to man. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RAM-only temporary tables
Tom Lane [EMAIL PROTECTED] wrote: it's difficult to tell how much physical I/O results from the create or unlink syscalls. It might be interesting to make a test program that just creates 4000 files We use xfs with noatime for our databases. In a fresh subdirectory of such a mountpoint: for ((i=0 ; i < 4000 ; ++i)) ; do touch $i ; done causes 44,969 block writes and then removes them again, and see what sort of load you see from that. rm * causes 26,820 block writes That would make the file creation and unlink just under half the load. -Kevin
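For anyone who wants to replay the shape of Kevin's experiment without shell, here is a rough Python equivalent. The directory handling is hypothetical; the block-write counts quoted above come from OS I/O counters (e.g. iostat/vmstat on the mount being tested), which this sketch does not measure.

```python
# Create N empty files in a scratch directory, then remove them, mirroring
# the touch-loop and "rm *" in the test above. Watch OS I/O counters while
# this runs to see the write load the create/unlink syscalls generate.
import os
import tempfile

def create_files(directory, n):
    """Create n empty files named 0..n-1, as 'touch' would."""
    for i in range(n):
        with open(os.path.join(directory, str(i)), "w"):
            pass
    return len(os.listdir(directory))

def remove_files(directory):
    """Remove every file in the directory, as 'rm *' would."""
    names = os.listdir(directory)
    for name in names:
        os.remove(os.path.join(directory, name))
    return len(names)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        created = create_files(d, 4000)
        removed = remove_files(d)
        print(created, removed)  # 4000 4000
```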
Re: [HACKERS] [WIP] In-place upgrade
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I envision a similar system where we have utilities to guarantee all pages have enough free space, and all pages are the current version, before allowing an upgrade-in-place to the next version. Such a consistent API will make the job for users easier and our job simpler, and with upgrade-in-place, where we have limited time and resources to code this for each release, simplicity is important. An external utility doesn't seem like the right way to approach it. For example, given the need to ensure X amount of free space in each page, the only way to guarantee that would be to shut down the database while you run the utility over all the pages --- otherwise somebody might fill some page up again. And that completely defeats the purpose, which is to have minimal downtime during upgrade. I think we can have a notion of pre-upgrade maintenance, but it would have to be integrated into normal operations. For instance, if conversion to 8.4 requires extra free space, we'd make late releases of 8.3.x not only be able to force that to occur, but also tweak the normal code paths to maintain that minimum free space. The full concept as I understood it (dunno why Bruce left all these details out of his message) went like this: Exactly. I didn't go into the implementation details to make it easier for people to see my general goals. Tom's implementation steps are the correct approach, assuming we can get agreement on the general goals. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] plperl needs upgrade for Fedora 10
Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: What about the MakeMaker dependency? The call to ldopts will fail if MakeMaker is not present, so this will cover it. It's very unlikely to be absent - it's required to build almost every Perl module known to man. I see. I think then the error message should read something like AC_MSG_ERROR([could not determine flags for linking embedded Perl. This probably means that ExtUtils::Embed or ExtUtils::MakeMaker is not installed.]) Otherwise, looks good. regards, tom lane
Re: [HACKERS] per-database locale: createdb switches
Alvaro Herrera wrote: Hi, I just noticed that the interface for choosing a different locale at db creation time is createdb --lc-collate=X --lc-ctype=X. Is there a reason for having these two separate switches? It seems awkward; why can't we just have a single --locale switch that selects both settings at once? Sometimes it's necessary to use a C collation with a non-C ctype, but for most users a single --locale switch is enough. What about a [--locale=X|--lc-collate=X --lc-ctype=X] option?
Re: [HACKERS] RAM-only temporary tables
Kevin Grittner [EMAIL PROTECTED] writes: That would make the file creation and unlink just under half the load. Worst possible case :-( ... means that we wouldn't get much improvement without addressing both aspects. It strikes me however that this does put some urgency into the question of how much per-relation FSM is going to cost us. For short-lived temp tables the FSM is never going to have any usefulness at all, but in the current HEAD code it'll double the create/unlink load. Heikki, would it be reasonable to fix things so that a nonexistent FSM fork is semantically the same as an empty one, and not create FSM until there's actually something to put in it? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
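The semantics Tom is asking Heikki about could be sketched like this (a toy Python model, not the actual PostgreSQL C implementation; the class and method names here are invented for illustration):

```python
# Treat a relation whose FSM fork does not exist as if the FSM were empty,
# and only materialize the fork on the first write. Short-lived temp tables
# that never record free space then never pay the fork-creation I/O.
class Relation:
    def __init__(self):
        self.fsm = None  # no FSM fork created yet

    def fsm_exists(self):
        return self.fsm is not None

    def get_free_space(self, block):
        # A nonexistent fork answers exactly like an empty one.
        if self.fsm is None:
            return 0
        return self.fsm.get(block, 0)

    def record_free_space(self, block, free_bytes):
        # The first real entry creates the fork lazily.
        if self.fsm is None:
            self.fsm = {}
        self.fsm[block] = free_bytes

rel = Relation()
print(rel.fsm_exists(), rel.get_free_space(0))  # False 0
rel.record_free_space(0, 512)
print(rel.fsm_exists(), rel.get_free_space(0))  # True 512
```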
Re: [HACKERS] per-database locale: createdb switches
Teodor Sigaev [EMAIL PROTECTED] writes: Alvaro Herrera wrote: It seems awkward; why can't we just have a single --locale switch that selects both settings at once? Sometimes it's needed to use C-collate with non-C-ctype. But for most users it's enough just a locale switch. What about [--locale=X|--lc-collate=X --lc-ctype=X] option? Seems to me there's one there already. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] per-database locale: createdb switches
Tom Lane wrote: Teodor Sigaev [EMAIL PROTECTED] writes: Alvaro Herrera wrote: It seems awkward; why can't we just have a single --locale switch that selects both settings at once? Sometimes it's needed to use C-collate with non-C-ctype. But for most users it's enough just a locale switch. What about [--locale=X|--lc-collate=X --lc-ctype=X] option? Seems to me there's one there already. You're thinking of initdb maybe? I'm talking about createdb.

$ LC_ALL=C createdb --version
createdb (PostgreSQL) 8.4devel
$ LC_ALL=C createdb --help
createdb creates a PostgreSQL database.

Usage:
  createdb [OPTION]... [DBNAME] [DESCRIPTION]

Options:
  -D, --tablespace=TABLESPACE  default tablespace for the database
  -E, --encoding=ENCODING      encoding for the database
      --lc-collate=LOCALE      LC_COLLATE setting for the database
      --lc-ctype=LOCALE        LC_CTYPE setting for the database
  -O, --owner=OWNER            database user to own the new database
  -T, --template=TEMPLATE      template database to copy
  -e, --echo                   show the commands being sent to the server
  --help                       show this help, then exit
  --version                    output version information, then exit

Connection options:
  -h, --host=HOSTNAME          database server host or socket directory
  -p, --port=PORT              database server port
  -U, --username=USERNAME      user name to connect as
  -W, --password               force password prompt

By default, a database with the same name as the current user is created.

Report bugs to [EMAIL PROTECTED]. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] RAM-only temporary tables
Kevin Grittner wrote: Note that the 150 disk writes were for the CREATE and the DROP. Does that mean that we'd actually shave 84 of 150 writes? Hmm, you'd shave more than 42 but not 84, because index entries are not deleted until a later vacuum. (I'd say about 56 -- 42 plus the 14 heap deletions). Also, if you're looking to account for all the writes, it's worth noting that my test declared a one-column primary key (on an integer column) in the CREATE TEMPORARY TABLE statement. That probably makes up for the extra few writes that I didn't see in my quick test. In suggesting this enhancement, my hope is that each session could check for a referenced table as a temporary in RAM before going to the system tables, in a manner vaguely similar to how space reserved by the temp_buffers GUC is used for temp table data. I, of course, am suggesting this from a position of blissful ignorance of the actual complexity of making such a change. Right -- I don't expect we can make use of such an idea readily. Not creating unnecessary pg_attribute entries for system columns is probably a lot easier to do. The idea of uncatalogued temp tables has been suggested and rejected several times in the past. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] per-database locale: createdb switches
Alvaro Herrera [EMAIL PROTECTED] writes: Tom Lane wrote: Seems to me there's one there already. You're thinking of initdb maybe? I'm talking about createdb. Oh, okay. But how often is someone going to be changing locales during createdb? I think the most common case might well be like Teodor said, where you need to tweak them individually anyway. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [WIP] In-place upgrade
An external utility doesn't seem like the right way to approach it. For example, given the need to ensure X amount of free space in each page, the only way to guarantee that would be to shut down the database while you run the utility over all the pages --- otherwise somebody might fill some page up again. And that completely defeats the purpose, which is to have minimal downtime during upgrade. Agreed. I think we can have a notion of pre-upgrade maintenance, but it would have to be integrated into normal operations. For instance, if conversion to 8.4 requires extra free space, we'd make late releases of 8.3.x not only be able to force that to occur, but also tweak the normal code paths to maintain that minimum free space. 1. This seems to fly in the face of the sort of thing we've traditionally back-patched. The code to make pages ready for upgrade to the next major release will not necessarily be straightforward (in fact it probably isn't, otherwise we wouldn't have insisted on a two-stage conversion process), which turns a seemingly safe minor upgrade into a potentially dangerous operation. 2. Just because I want to upgrade to 8.3.47 and get the latest bug fixes does not mean that I have any intention of upgrading to 8.4, and yet you've rearranged all of my pages to have useless free space in them (possibly at considerable and unexpected I/O cost for at least as long as the conversion is running). The second point could probably be addressed with a GUC but the first one certainly can't. 3. What about multi-release upgrades? Say someone wants to upgrade from 8.3 to 8.6. 8.6 only knows how to read pages that are 8.5-and-a-half or better, 8.5 only knows how to read pages that are 8.4-and-a-half or better, and 8.4 only knows how to read pages that are 8.3-and-a-half or better. So the user will have to upgrade to 8.3.MAX, then 8.4.MAX, then 8.5.MAX, and then 8.6. 
It seems to me that if there is any way to put all of the logic to handle old page versions in the new code that would be much better, especially if it's an optional feature that can be compiled in or not. Then when it's time to upgrade from 8.3 to 8.6 you could do: ./configure --with-upgrade-83 --with-upgrade-84 --with-upgrade-85 but if you don't need the code to handle old page versions you can: ./configure --without-upgrade-85 Admittedly, this requires making the new code capable of rearranging pages to create free space when necessary, and to be able to continue to execute queries while doing it, but ways of doing this have been proposed. The only uncertainty is as to whether the performance and code complexity can be kept manageable, but I don't believe that question has been explored to the point where we should be ready to declare defeat. ...Robert
Re: [HACKERS] [WIP] In-place upgrade
Robert Haas wrote: The second point could probably be addressed with a GUC but the first one certainly can't. 3. What about multi-release upgrades? Say someone wants to upgrade from 8.3 to 8.6. 8.6 only knows how to read pages that are 8.5-and-a-half or better, 8.5 only knows how to read pages that are 8.4-and-a-half or better, and 8.4 only knows how to read pages that are 8.3-and-a-half or better. So the user will have to upgrade to 8.3.MAX, then 8.4.MAX, then 8.5.MAX, and then 8.6. Yes. It seems to me that if there is any way to put all of the logic to handle old page versions in the new code that would be much better, especially if it's an optional feature that can be compiled in or not. Then when it's time to upgrade from 8.3 to 8.6 you could do: ./configure --with-upgrade-83 --with-upgrade-84 --with-upgrade-85 but if you don't need the code to handle old page versions you can: ./configure --without-upgrade-85 Admittedly, this requires making the new code capable of rearranging pages to create free space when necessary, and to be able to continue to execute queries while doing it, but ways of doing this have been proposed. The only uncertainty is as to whether the performance and code complexity can be kept manageable, but I don't believe that question has been explored to the point where we should be ready to declare defeat. And almost guarantee that the job will never be completed, or tested fully. Remember that in-place upgrades would be pretty painless so doing multiple major upgrades should not be a difficult requirement, or they can dump/reload their data to skip it. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] [WIP] In-place upgrade
Tom Lane wrote: I think we can have a notion of pre-upgrade maintenance, but it would have to be integrated into normal operations. For instance, if conversion to 8.4 requires extra free space, we'd make late releases of 8.3.x not only be able to force that to occur, but also tweak the normal code paths to maintain that minimum free space. Agreed, the backend needs to be modified to reserve the space. The full concept as I understood it (dunno why Bruce left all these details out of his message) went like this: * Add a format serial number column to pg_class, and probably also pg_database. Rather like the frozenxid columns, this would have the semantics that all pages in a relation or database are known to have at least the specified format number. * There would actually be two serial numbers per release, at least for releases where pre-update prep work is involved --- for instance, between 8.3 and 8.4 there'd be an 8.3-and-a-half format which is 8.3 but known ready to update to 8.4 (eg, enough free space available). Minor releases of 8.3 that appear with or subsequent to 8.4 release understand the half format number and how to upgrade to it. * VACUUM would be empowered, in the same way as it handles frozenxid maintenance, to update any less-than-the-latest-version pages and then fix the pg_class and pg_database entries. * We could mechanically enforce that you not update until the database is ready for it by checking pg_database.datformatversion during postmaster startup. Adding catalog columns seems rather complicated, and not back-patchable. Not backpatchable means that we'd need to be sure now that the format serial numbers are enough for the upcoming 8.4-8.5 upgrade. I imagined that you would have just a single cluster-wide variable, a GUC perhaps, indicating how much space should be reserved by updates/inserts. 
Then you'd have an additional program, perhaps a new contrib module, that sets the variable to the right value for the version you're upgrading, and scans through all tables, moving tuples so that every page has enough free space for the upgrade. After that's done, it'd set a flag in the data directory indicating that the cluster is ready for upgrade. The tool could run concurrently with normal activity, so you could just let it run for as long as it takes. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
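As a toy model of the tool Heikki describes (made-up page and tuple sizes; the real tool would operate on heap pages concurrently with normal activity, not on Python lists):

```python
# Given a reservation size (the GUC in Heikki's sketch), walk the pages
# and move tuples until every page keeps at least that many bytes free,
# so a later page-format conversion cannot run out of room.
PAGE_SIZE = 8192

def free_space(page):
    return PAGE_SIZE - sum(page)

def reserve_space(pages, reserve):
    """Relocate tuples so each page keeps >= reserve bytes free.
    Returns the pages, possibly with new pages appended at the end."""
    overflow = []
    for page in pages:
        while free_space(page) < reserve and page:
            overflow.append(page.pop())  # evict a tuple for relocation
    for tup in overflow:
        target = next((p for p in pages
                       if free_space(p) - tup >= reserve), None)
        if target is None:
            target = []          # extend the relation with a fresh page
            pages.append(target)
        target.append(tup)
    return pages

pages = [[4000, 4000], [100, 200]]        # first page has only 192 free
pages = reserve_space(pages, reserve=1000)
print(all(free_space(p) >= 1000 for p in pages))  # True
```

The same scan could set the "ready for upgrade" flag once it has visited every page without finding one below the reservation.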
Re: [HACKERS] [WIP] In-place upgrade
Robert Haas wrote: That's all fine and dandy, except that it presumes that you can perform SELECT/UPDATE/DELETE on V3 tuple versions; you can't just pretend that A-E aren't there until they get converted. Which is exactly the overhead we were looking to avoid. I don't understand this comment at all. Unless you have some sort of magical wand in your back pocket that will instantaneously transform the entire database, there is going to be a period of time when you have to cope with both V3 and V4 pages. ISTM that what we should be talking about here is: (1) How are we going to do that in a way that imposes near-zero overhead once the entire database has been converted? (2) How are we going to do that in a way that is minimally invasive to the code? (3) Can we accomplish (1) and (2) while still retaining somewhat reasonable performance for V3 pages? Zdenek's initial proposal did this by replacing all of the tuple header macros with functions that were conditionalized on page version. I think we agree that's not going to work. That doesn't mean that there is no approach that can work, and we were discussing possible ways to make it work upthread until the thread got hijacked to discuss the right way of handling page expansion. Now that it seems we agree that a transaction can be used to move tuples onto new pages, I think we'd be well served to stop talking about page expansion and get back to the original topic: where and how to insert the hooks for V3 tuple handling. I think the above is a good summary. For me, the problem with any approach that has information about prior-version block formats in the main code path is code complexity, and secondarily performance. I know there is concern that converting all blocks on read-in might expand the page beyond 8k in size. 
One idea Heikki had was to require that some tool be run on minor releases before a major upgrade to guarantee there is enough free space to convert the block to the current format on read-in, which would localize the information about prior block formats. We could release the tool in minor branches around the same time as a major release. Also consider that there are very few releases that expand the page size. For these reasons, the expand-the-page-beyond-8k problem should not be dictating what approach we take for upgrade-in-place because there are workarounds for the problem, and the problem is rare. I would like us to again focus on converting the pages to the current version format on read-in, and perhaps a tool to convert all old pages to the new format. FYI, we are also going to need the ability to convert all pages to the current format for multi-release upgrades. For example, if you did upgrade-in-place from 8.2 to 8.3, you are going to need to update all pages to the 8.3 format before doing upgrade-in-place to 8.4; perhaps vacuum can do something like this on a per-table basis, and we can record that status in a pg_class column. Also, consider that when we did PITR, we required commands before and after the tar so that there was a consistent API for PITR, and later had to add capabilities to those functions, but the user API didn't change. I envision a similar system where we have utilities to guarantee all pages have enough free space, and all pages are the current version, before allowing an upgrade-in-place to the next version. Such a consistent API will make the job for users easier and our job simpler, and with upgrade-in-place, where we have limited time and resources to code this for each release, simplicity is important. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup.
+
Re: [HACKERS] [WIP] In-place upgrade
And almost guarantee that the job will never be completed, or tested fully. Remember that in-place upgrades would be pretty painless so doing multiple major upgrades should not be a difficult requirement, or they can dump/reload their data to skip it. Regardless of what design is chosen, there's no requirement that we support in-place upgrade from 8.3 to 8.6, or even 8.4 to 8.6, in one shot. But the design that you and Tom are proposing pretty much ensures that it will be impossible. But that's certainly the least important reason not to do it this way. I think this comment from Heikki is pretty revealing: Adding catalog columns seems rather complicated, and not back-patchable. Not backpatchable means that we'd need to be sure now that the format serial numbers are enough for the upcoming 8.4-8.5 upgrade. That means, in essence, that the earliest possible version that could be in-place upgraded would be an 8.4 system - we are giving up completely on in-place upgrade to 8.4 from any earlier version (which personally I thought was the whole point of this feature in the first place). And we'll only be able to in-place upgrade to 8.5 if the unproven assumption that these catalog changes are sufficient turns out to be true, or if whatever other changes turn out to be necessary are back-patchable. ...Robert
Re: [HACKERS] BufferAccessStrategy for bulk insert
Robert Haas [EMAIL PROTECTED] writes: OK, here's an updated version... Applied with some small stylistic revisions. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] postgres buildfarm member dugong
Hi, I figure that you're the maintainer for the dugong Postgres buildfarm member. I noticed that this member does not run tests for 8.3 or older stable branches, only CVS HEAD. Is there a reason for this? I was just checking IA64 machines and this seems to be the only one; it would be very useful for it to run tests for older branches. Thanks, -- Alvaro Herrera http://www.amazon.com/gp/registry/5ZYLFMCVHXC The West won the world not by the superiority of its ideas or values or religion but rather by its superiority in applying organized violence. Westerners often forget this fact, non-Westerners never do. (Samuel P. Huntington) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [WIP] In-place upgrade
Bruce Momjian [EMAIL PROTECTED] writes: I envision a similar system where we have utilities to guarantee all pages have enough free space, and all pages are the current version, before allowing an upgrade-in-place to the next version. Such a consistent API will make the job for users easier and our job simpler, and with upgrade-in-place, where we have limited time and resources to code this for each release, simplicity is important. An external utility doesn't seem like the right way to approach it. For example, given the need to ensure X amount of free space in each page, the only way to guarantee that would be to shut down the database while you run the utility over all the pages --- otherwise somebody might fill some page up again. And that completely defeats the purpose, which is to have minimal downtime during upgrade. I think we can have a notion of pre-upgrade maintenance, but it would have to be integrated into normal operations. For instance, if conversion to 8.4 requires extra free space, we'd make late releases of 8.3.x not only be able to force that to occur, but also tweak the normal code paths to maintain that minimum free space. The full concept as I understood it (dunno why Bruce left all these details out of his message) went like this: * Add a format serial number column to pg_class, and probably also pg_database. Rather like the frozenxid columns, this would have the semantics that all pages in a relation or database are known to have at least the specified format number. * There would actually be two serial numbers per release, at least for releases where pre-update prep work is involved --- for instance, between 8.3 and 8.4 there'd be an 8.3-and-a-half format which is 8.3 but known ready to update to 8.4 (eg, enough free space available). Minor releases of 8.3 that appear with or subsequent to 8.4 release understand the half format number and how to upgrade to it. 
* VACUUM would be empowered, in the same way as it handles frozenxid maintenance, to update any less-than-the-latest-version pages and then fix the pg_class and pg_database entries. * We could mechanically enforce that you not update until the database is ready for it by checking pg_database.datformatversion during postmaster startup. So the update process would require users to install a suitably late version of 8.3, vacuum everything over a suitable maintenance window, then install 8.4, then perhaps vacuum everything again if they want to try to push page update work into specific maintenance windows. But the DB is up and functioning the whole time. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
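The bookkeeping Tom outlines might be modeled like this (a sketch only; the format numbers, class names, and the single-number "8.3-and-a-half" encoding are purely illustrative):

```python
# Each page carries a format number; pg_class-style metadata tracks the
# minimum format present in the relation; VACUUM upgrades pages and then
# raises that minimum; postmaster startup refuses to run new binaries
# until every relation has reached the required "ready" format.
FMT_83, FMT_83_HALF, FMT_84 = 830, 835, 840

class Catalog:
    def __init__(self, pages):
        self.pages = pages                  # per-page format numbers
        self.relformatversion = min(pages)  # pg_class-style summary

    def vacuum(self, target):
        """Upgrade every page to target, then record it in the catalog."""
        self.pages = [max(p, target) for p in self.pages]
        self.relformatversion = min(self.pages)

def can_start(catalogs, required):
    # The startup check: every relation must be at least at the
    # required format before the new major version may run.
    return all(c.relformatversion >= required for c in catalogs)

rel = Catalog([FMT_83, FMT_83, FMT_83])
print(can_start([rel], FMT_83_HALF))  # False: prep work not done yet
rel.vacuum(FMT_83_HALF)
print(can_start([rel], FMT_83_HALF))  # True: safe to install 8.4
```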
Re: [HACKERS] [WIP] In-place upgrade
Robert Haas [EMAIL PROTECTED] writes: That means, in essence, that the earliest possible version that could be in-place upgraded would be an 8.4 system - we are giving up completely on in-place upgrade to 8.4 from any earlier version (which personally I thought was the whole point of this feature in the first place). Quite honestly, given where we are in the schedule and the lack of consensus about how to do this, I think we would be well advised to decide right now to forget about supporting in-place upgrade to 8.4, and instead work on allowing in-place upgrades from 8.4 onwards. Shooting for a general-purpose does-it-all scheme that can handle old versions that had no thought of supporting such updates is likely to ensure that we end up with *NOTHING*. What Bruce is proposing, I think, is that we intentionally restrict what we want to accomplish to something that might be within reach now and also sustainable over the long term. Planning to update any version to any other version is *not* sustainable --- we haven't got the resources nor the interest to create large amounts of conversion code. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [WIP] In-place upgrade
Heikki Linnakangas [EMAIL PROTECTED] writes: Adding catalog columns seems rather complicated, and not back-patchable. Agreed, we'd not be able to make them retroactively appear in 8.3. I imagined that you would have just a single cluster-wide variable, a GUC perhaps, indicating how much space should be reserved by updates/inserts. Then you'd have an additional program, perhaps a new contrib module, that sets the variable to the right value for the version you're upgrading, and scans through all tables, moving tuples so that every page has enough free space for the upgrade. After that's done, it'd set a flag in the data directory indicating that the cluster is ready for upgrade. Possibly that could work. The main thing is to have a way of being sure that the prep work has been completed on every page of the database. The disadvantage of not having catalog support is that you'd have to complete the entire scan operation in one go to be sure you'd hit everything. Another thought here is that I don't think we are yet committed to any changes that require extra space between 8.3 and 8.4, are we? The proposed addition of CRC words could be put off to 8.5, for instance. So it seems at least within reach to not require any preparatory steps for 8.3-to-8.4, and put the infrastructure in place now to support such steps in future go-rounds. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Final /contrib cleanup -- yes/no?
All, Looking at my old thread I realized I never got an answer on whether people agreed with these two items: 1) Take the SET search_path=public out of all contrib SQL scripts so that DBAs can determine the correct schema by using PGOPTIONS. 2) Add BEGIN/COMMIT to all SQL scripts. --Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final /contrib cleanup -- yes/no?
Josh Berkus [EMAIL PROTECTED] writes: 1) Take the SET search_path=public out of all contrib SQL scripts so that DBAs can determine the correct schema by using PGOPTIONS. I don't recall that having been proposed, and I don't think it's really a good idea. We intentionally put those SETs in, not that long ago. 2) Add BEGIN/COMMIT to all SQL scripts. The effects of that haven't been debated, either. Are you sure none of those scripts rely on surviving errors? What about the possibility of other scripts including them when already inside a BEGIN block? The thing we really need to make that stuff nice is a proper module facility. Changing stuff at the margins in the meantime doesn't really do much except create more different possible behaviors that people will have to deal with. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final /contrib cleanup -- yes/no?
Tom, I don't recall that having been proposed, and I don't think it's really a good idea. We intentionally put those SETs in, not that long ago. I haven't been able to find any reasoning on any list why those SETs were a good idea. Bruce put them in, but apparently without discussion. Unless you have a link for something I can't find in search? The way the SQL scripts currently work, there is no way to manage what schema the contrib modules get built in *except* to edit the scripts. In fact, because of the SET statements, a DBA who might *reasonably* expect that setting PGOPTIONS would allow him to determine that will be unpleasantly surprised when the module ends up in public anyway. For that matter, I really don't see the point of explicitly setting the default schema (public) in the scripts. Why bother? The effects of that haven't been debated, either. Are you sure none of those scripts rely on surviving errors? What about the possibility of other scripts including them when already inside a BEGIN block? Hmmm, I can see that. Not that important given that we have the remove scripts. I need to finish testing whether the remove scripts actually remove everything, though. The thing we really need to make that stuff nice is a proper module facility. Changing stuff at the margins in the meantime doesn't really do much except create more different possible behaviors that people will have to deal with. Yeah, but we're clearly not getting that done for 8.4, so I'm trying to do a little admin cleanup to live with for the next year. This isn't based on idle conjecture; this came up again because I'm writing scripts to automatically build PostgreSQL servers, and the SET search_path thing keeps biting me on the tuchas. --Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final /contrib cleanup -- yes/no?
Josh Berkus [EMAIL PROTECTED] writes: The way the SQL scripts currently work, there is no way to manage what schema the contrib modules get built in *except* to edit the scripts. Right, that's the intended and documented way to do it. In fact, because of the SET statements, a DBA who might *reasonably* expect that setting PGOPTIONS would allow him to determine that will be unpleasantly surprised when the module ends up in public anyway. I don't see that this is a reasonable expectation; it has never worked in any previous release, and the documentation explicitly says to do the other. Also, at least some of the proposed forms of a module facility would have the effect of overriding any such approach anyhow. Again, I'm not for whacking around the procedures for dealing with contrib each time we make a release. We should change it once when we have a shot at getting it right. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
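For what it's worth, the documented route Tom refers to (editing the script) is easy to automate in a build script. A hedged sketch of that rewrite step: the script text and the target schema name "myschema" are invented stand-ins for a real contrib .sql file.

```python
import re

# Rewrite the hard-coded schema in a contrib install script before
# feeding it to psql.  "myschema" and the script body are examples
# only; in practice you would read and write the real .sql file.
script = 'SET search_path = public;\nCREATE TABLE t (x int);\n'
patched = re.sub(r'^SET search_path = public;',
                 'SET search_path = myschema;',
                 script, flags=re.MULTILINE)
```

This is the kind of one-liner an automated server-build script can apply to every contrib module before installation, without touching the shipped files.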
Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
On Wed, Nov 5, 2008 at 5:06 PM, Bryce Cutt [EMAIL PROTECTED] wrote: The error is caused by my asserting against the wrong variable. I never noticed this as I apparently did not have assertions turned on on my development machine. That is fixed now, and with the new patch version I have attached, all assertions are passing with your query and my test queries. I added another assertion to that section of the code so that it is a bit more rigorous in confirming the hash table partition is correct. It does not change the operation of the code. There are two partition counts. One holds the maximum number of buckets in the hash table and the other counts the number of actual buckets created for hash values. I was incorrectly testing against the second one because that was valid before I started using a hash table to store the buckets. The enable_hashjoin_usestatmcvs flag was valuable for my own research and tests and likely useful for your review, but Tom is correct that it can be removed in the final version. - Bryce Cutt Well, that builds nicely, lets me import the data, and I've seen a performance improvement with enable_hashjoin_usestatmcvs on vs. off. I plan to test that more formally (though probably not fully to the extent you did in your paper; just enough to feel comfortable that I'm getting similar results). Then I'll spend some time poking in the code, for the relatively little good I feel I can do in that capacity, and I'll also investigate scenarios with particularly inaccurate statistics. Stay tuned. - Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
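For readers following the thread, the core idea of the skew patch, as the authors describe it, is to keep build-side tuples whose join keys are most-common values (MCVs) of the probe side in a dedicated in-memory table, so the frequent probe tuples never go to disk. The sketch below is an illustrative assumption of that scheme, not the patch's actual code; every name and structural choice is invented.

```python
def skew_partition(build_tuples, probe_mcvs, num_batches, key=lambda t: t[0]):
    """Build phase: tuples whose join key is an MCV of the probe
    relation go into an in-memory skew table; the rest are hashed
    into regular batches, which may spill to disk."""
    skew_table = {}
    batches = [[] for _ in range(num_batches)]
    mcvs = set(probe_mcvs)
    for t in build_tuples:
        k = key(t)
        if k in mcvs:
            skew_table.setdefault(k, []).append(t)  # stays in RAM
        else:
            batches[hash(k) % num_batches].append(t)
    return skew_table, batches

def probe_skew(skew_table, probe_tuples, key=lambda t: t[0]):
    """Probe phase: tuples hitting the skew table are joined at once
    and never written to a batch file."""
    for t in probe_tuples:
        for b in skew_table.get(key(t), ()):
            yield (b, t)
```

Under skew, most probe tuples take the fast path, which is why the win grows with the skewness of the data.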
Re: [HACKERS] Final /contrib cleanup -- yes/no?
On Thu, 2008-11-06 at 17:24 -0500, Tom Lane wrote: Josh Berkus [EMAIL PROTECTED] writes: The way the SQL scripts currently work, there is no way to manage what schema the contrib modules get built in *except* to edit the scripts. Right, that's the intended and documented way to do it. I believe the intention is a bad one. They should be installed per the settings of the user installing them. Whether that be through an ALTER ROLE/USER or PGOPTIONS. Joshua D. Drake -- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final /contrib cleanup -- yes/no?
Joshua D. Drake wrote: On Thu, 2008-11-06 at 17:24 -0500, Tom Lane wrote: Josh Berkus [EMAIL PROTECTED] writes: The way the SQL scripts currently work, there is no way to manage what schema the contrib modules get built in *except* to edit the scripts. Right, that's the intended and documented way to do it. I believe the intention is a bad one. They should be installed per the settings of the user installing them. Whether that be through an ALTER ROLE/USER or PGOPTIONS. Eh, Tom has a point. If we build module loading for 8.5, we shouldn't change the functionality in the interim for 8.4. Annoying as it is. --Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
On Thu, 2008-11-06 at 15:33 -0700, Joshua Tolley wrote: Stay tuned. Minor question on this patch. AFAICS there is another patch that seems to be aiming at exactly the same use case. Jonah's Bloom filter patch. Shouldn't we have a dust off to see which one is best? Or at least a discussion to test whether they overlap? Perhaps you already did that and I missed it because I'm not very tuned in on this thread. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
On Thu, Nov 6, 2008 at 3:52 PM, Simon Riggs [EMAIL PROTECTED] wrote: [snip] We haven't had that discussion AFAIK, and definitely should. First glance suggests they could coexist peacefully, with proper coaxing. If I understand things properly, Jonah's patch filters tuples early in the join process, and this patch tries to ensure that hash join batches are kept in RAM when they're most likely to be used. So they're orthogonal in purpose, and the patches actually apply *almost* cleanly together. Jonah, any comments? If I continue to have some time to devote, and get through all I think I can do to review this patch, I'll gladly look at Jonah's too, FWIW. - Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [WIP] In-place upgrade
On Thu, 6 Nov 2008, Tom Lane wrote: Another thought here is that I don't think we are yet committed to any changes that require extra space between 8.3 and 8.4, are we? The proposed addition of CRC words could be put off to 8.5, for instance. I was just staring at that code as you wrote this thinking about the same thing. CRCs are a great feature I'd really like to see. On the other hand, announcing that 8.4 features in-place upgrades for 8.3 databases, and that the project has laid the infrastructure such that future releases will also upgrade in-place, would IMHO be the biggest positive announcement of the new release by a large margin. At least then new large (1TB) installs could kick off on either the stable 8.3 or 8.4 knowing they'd never be forced to deal with dump/reload, whereas right now there is no reasonable solution for them that involves PostgreSQL (I just crossed 3TB on a system last month and I'm not looking forward to its future upgrades). Two questions come to mind here: -If you reduce the page layout upgrade problem to convert from V4 to V5 adding support for CRCs, is there a worthwhile simpler path to handling that without dragging the full complexity of the older page layout changes in? -Is it worth considering making CRCs an optional compile-time feature, and that (for now at least) you couldn't get them and the in-place upgrade at the same time? Stepping back for a second, the idea that in-place upgrade is only worthwhile if it yields zero downtime isn't necessarily the case. Even having an offline-only upgrade tool to handle the more complicated situations where tuples have to be squeezed onto another page would still be a major improvement over the current situation. The thing that you have to recognize here is that dump/reload is extremely slow because of bottlenecks in the COPY process. That makes for a large amount of downtime--many hours isn't unusual. 
If older-version upgrade downtime were reduced to how long it takes to run a "must scan every page and fiddle with it if full" tool, that would still be a giant improvement over the current state of things. If Zdenek's figures that only a small percentage of pages will need such adjustment hold up, that should take only some factor longer than a sequential scan of the whole database. That's not instant, but it's at least an order of magnitude faster than a dump/reload on a big system. The idea that you're going to get in-place upgrade all the way back to 8.2 without taking the database down for even a little bit to run such a utility is hard to pull off, and it's impressive that Zdenek and everyone else involved have gotten so close to doing it. I personally am on the fence as to whether it's worth paying even the 1% penalty for that implementation all the time just to get in-place upgrades. If an offline utility with reasonable (scan instead of dump/reload) downtime and closer to zero overhead when finished were available instead, that might be a more reasonable trade-off to make for handling older releases. There are so many bottlenecks in the older versions that you're less likely to find a database too large to dump and reload there anyway. It would also be the case that improvements to that offline utility could continue after 8.4 proper was completely frozen. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
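Greg's order-of-magnitude claim is easy to sanity-check with back-of-envelope arithmetic. The throughput figures below (200 MB/s for a sequential fix-up pass, 20 MB/s effective for a COPY-bound dump/reload) are assumptions for illustration, not measurements; plug in your own numbers. For a 1 TB cluster they give roughly an hour and a half versus more than half a day.

```python
def hours(nbytes, mb_per_sec):
    # Transfer time in hours for nbytes at a sustained MB/s rate.
    return nbytes / (mb_per_sec * 1024 ** 2) / 3600

one_tb = 1024 ** 4
scan_pass = hours(one_tb, 200)    # read-and-fix pass, assumed 200 MB/s
dump_reload = hours(one_tb, 20)   # COPY-bound dump/reload, assumed 20 MB/s
```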
Re: [HACKERS] Patch for ALTER DATABASE WITH TABLESPACE
--On Thursday, November 06, 2008 11:35:54 +0100 Guillaume Lelarge [EMAIL PROTECTED] wrote: Guillaume Lelarge wrote: v4 patch attached. v5 patch attached. Thanks Guillaume. Maybe this is nit-picking, but I see that you have to rmdir() an existing empty tablespace directory to use copydir() afterwards. Maybe we can teach copydir() to error out when trying to mkdir() an existing directory, but only when forced to by the caller? I see copydir() used in four places, so the impact of this change would be minimal. -- Thanks Bernd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
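Bernd's suggestion amounts to giving copydir() an opt-in collision check. A Python sketch of the behavior only: the real copydir() is C code in the backend, and the parameter name here is invented for illustration.

```python
import os
import shutil

def copydir(src, dst, error_on_existing=False):
    """Recursively copy src into dst.  With error_on_existing=True the
    copy refuses to run if dst is already there -- the check a caller
    such as the ALTER DATABASE ... SET TABLESPACE path would force,
    instead of rmdir()'ing the directory first."""
    if os.path.isdir(dst):
        if error_on_existing:
            raise FileExistsError('directory "%s" already exists' % dst)
    else:
        os.makedirs(dst)
    for name in os.listdir(src):
        s, d = os.path.join(src, name), os.path.join(dst, name)
        if os.path.isdir(s):
            copydir(s, d, error_on_existing)
        else:
            shutil.copyfile(s, d)
```

Defaulting the flag to off keeps the other call sites unchanged, which is why the impact of the change would be minimal.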
Re: [HACKERS] [WIP] In-place upgrade
The idea that you're going to get in-place upgrade all the way back to 8.2 without taking the database down for even a little bit to run such a utility is hard to pull off, and it's impressive that Zdenek and everyone else involved have gotten so close to doing it. I think we should at least wait to see what the next version of his patch looks like before making any final decisions. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
-Original Message- [snip] The skew patch and bloom filter patch are orthogonal and can both be applied. The bloom filter patch is a great idea, and it is used in many other database systems. You can use the TPC-H data set to demonstrate that the bloom filter patch will significantly improve performance of multi-batch joins (with or without data skew). Any query that filters a build table before joining on the probe table will show improvements with a bloom filter. For example, select * from customer, orders where customer.c_nationkey = 10 and customer.c_custkey = orders.o_custkey The bloom filter on customer would allow us to avoid probing with orders tuples that cannot possibly find a match due to the selection criteria. This is especially beneficial for multi-batch joins where an orders tuple must be written to disk if its corresponding customer batch is not the in-memory batch.
I have no experience reviewing patches, but I would be happy to help contribute/review the bloom filter patch as best I can. -- Dr. Ramon Lawrence Assistant Professor, Department of Computer Science, University of British Columbia Okanagan E-mail: [EMAIL PROTECTED] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
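Ramon's TPC-H example is easy to demonstrate in miniature. Below is a toy Bloom filter (the sizes and hash construction are invented for illustration and are not Jonah's actual patch) built over the customer keys that survive the c_nationkey = 10 filter; probe-side orders whose o_custkey cannot possibly match are discarded before they would ever be written to a batch file.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: k bit positions derived from SHA-256."""
    def __init__(self, nbits=1024, nhashes=3):
        self.nbits, self.nhashes = nbits, nhashes
        self.bits = bytearray(nbits // 8)

    def _positions(self, key):
        for i in range(self.nhashes):
            h = hashlib.sha256(('%d:%s' % (i, key)).encode()).digest()
            yield int.from_bytes(h[:4], 'big') % self.nbits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        # False means definitely absent; True may be a false positive.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))

# Build side: c_custkey values of customers passing c_nationkey = 10
bf = BloomFilter()
for custkey in (7, 42, 99):
    bf.add(custkey)

# Probe side: (o_orderkey, o_custkey) pairs; non-matching orders are
# dropped here instead of being written to an on-disk batch file.
orders = [(1, 7), (2, 13), (3, 42)]
surviving = [o for o in orders if bf.might_contain(o[1])]
```

Since a Bloom filter can report false positives but never false negatives, the join result is unchanged; only wasted batch-file I/O is saved.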
Re: [HACKERS] No write stats in pg_statio system views
Nikhil Sontakke [EMAIL PROTECTED] wrote: What is the reason for not having heap_blks_write kind of stats in the pg_statio system views? Is it because bgwriter does the writing (we do have bg stats there) most of the time? Wouldn't the write stats help to get the complete IO picture for the relation? That's because we don't have relfilenode-based statistics. We have only relation-oid-based statistics, and the oid information is not available at write time (including in the bgwriter). If we can change statistics to be relfilenode-based, heap_blks_write would be available. It might be worth trying, but some work is needed. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
On Thu, Nov 6, 2008 at 5:31 PM, Lawrence, Ramon [EMAIL PROTECTED] wrote: [snip] I have no experience reviewing patches, but I would be happy to help contribute/review the bloom filter patch as best I can. I've no patch review experience, either -- this is my first one. See http://wiki.postgresql.org/wiki/Reviewing_a_Patch for details on what a reviewer ought to do in general; various patch review discussions on the -hackers list have also proven helpful. As regards this patch specifically, it seems we could merge the two patches into one and consider them together. However, the bloom filter patch is listed as a Work in Progress on http://wiki.postgresql.org/wiki/CommitFest_2008-11. Perhaps it needs more work before being considered seriously? Jonah, what do you think would be most helpful? - Josh / eggyknap -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final /contrib cleanup -- yes/no?
Josh Berkus [EMAIL PROTECTED] writes: Eh, Tom has a point. If we build module loading for 8.5, we shouldn't change the functionality in the interim for 8.4. Annoying as it is. The main reason I'm concerned about it is that when we do modules (which I certainly hope happens for 8.5) we would then have two different old behaviors to worry about compatibility with. I'm afraid of painting ourselves into a corner. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [WIP] In-place upgrade
Greg Smith [EMAIL PROTECTED] writes: On Thu, 6 Nov 2008, Tom Lane wrote: Another thought here is that I don't think we are yet committed to any changes that require extra space between 8.3 and 8.4, are we? The proposed addition of CRC words could be put off to 8.5, for instance. I was just staring at that code as you wrote this thinking about the same thing. ... -Is it worth considering making CRCs an optional compile-time feature, and that (for now at least) you couldn't get them and the in-place upgrade at the same time? Hmm ... might be better than not offering them in 8.4 at all, but the thing is that then you are asking packagers to decide for their customers which is more important. And I'd bet you anything you want that in-place upgrade would be their choice. Also, having such an option would create extra complexity for 8.4-to-8.5 upgrades. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers