Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian wrote:
> > Well, we might eventually allow addition of values to enums too; the
> > fact that it's not implemented outside pg_migrator right now doesn't
> > mean we won't ever think of a solution.  In any case I'm not persuaded
> > that a zero-element enum is totally without value.  Think of it like a
> > domain with a "must be null" constraint.
>
> OK, but that is going to expand my patch.  I will probably implement
> zero-element enums first and then go ahead and do the binary upgrade
> part.  Zero-element enums will simplify the pg_dump code.

I have implemented the zero-value option to CREATE TYPE ENUM with the
attached patch.

-- 
  Bruce Momjian			http://momjian.us
  EnterpriseDB			http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Index: doc/src/sgml/ref/create_type.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/create_type.sgml,v
retrieving revision 1.79
diff -c -c -r1.79 create_type.sgml
*** doc/src/sgml/ref/create_type.sgml	30 Nov 2008 19:01:29 -0000	1.79
--- doc/src/sgml/ref/create_type.sgml	25 Dec 2009 06:15:56 -0000
***************
*** 25,31 ****
  ( attribute_name data_type [, ... ] )
  
  CREATE TYPE name AS ENUM
!     ( 'label' [, ... ] )
  
  CREATE TYPE name (
      INPUT = input_function,
--- 25,31 ----
  ( attribute_name data_type [, ... ] )
  
  CREATE TYPE name AS ENUM
!     ( [ 'label' [, ... ] ] )
  
  CREATE TYPE name (
      INPUT = input_function,

Index: src/backend/parser/gram.y
===================================================================
RCS file: /cvsroot/pgsql/src/backend/parser/gram.y,v
retrieving revision 2.699
diff -c -c -r2.699 gram.y
*** src/backend/parser/gram.y	23 Dec 2009 17:41:43 -0000	2.699
--- src/backend/parser/gram.y	25 Dec 2009 06:15:56 -0000
***************
*** 297,303 ****
  				TableFuncElementList opt_type_modifiers
  				prep_type_clause execute_param_clause using_clause returning_clause
! 				enum_val_list table_func_column_list
  				create_generic_options alter_generic_options
  				relation_expr_list dostmt_opt_list
--- 297,303 ----
  				TableFuncElementList opt_type_modifiers
  				prep_type_clause execute_param_clause using_clause returning_clause
! 				opt_enum_val_list enum_val_list table_func_column_list
  				create_generic_options alter_generic_options
  				relation_expr_list dostmt_opt_list
***************
*** 3623,3629 ****
  				n->coldeflist = $6;
  				$$ = (Node *)n;
  			}
! 		| CREATE TYPE_P any_name AS ENUM_P '(' enum_val_list ')'
  			{
  				CreateEnumStmt *n = makeNode(CreateEnumStmt);
  				n->typeName = $3;
--- 3623,3629 ----
  				n->coldeflist = $6;
  				$$ = (Node *)n;
  			}
! 		| CREATE TYPE_P any_name AS ENUM_P '(' opt_enum_val_list ')'
  			{
  				CreateEnumStmt *n = makeNode(CreateEnumStmt);
  				n->typeName = $3;
***************
*** 3715,3720 ****
--- 3715,3725 ----
  			}
  		;
  
+ opt_enum_val_list:
+ 		enum_val_list							{ $$ = $1; }
+ 		| /*EMPTY*/								{ $$ = NIL; }
+ 		;
+ 
  enum_val_list:	Sconst
  				{ $$ = list_make1(makeString($1)); }
  			| enum_val_list ',' Sconst

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Streaming Rep - 2-phase backups and reducing time to full replication
On Thu, Dec 24, 2009 at 6:03 PM, Simon Riggs wrote:
> I see it would work like this: Add a new option to recovery.conf,
> perhaps two_phase_backup = on. Startup creates a file called
> backup_in_progress then waits. When second phase of backup is complete
> (7b), delete the file and then Startup process will continue. Very few
> lines of code to make this work.

Where do you think the WAL files shipped before doing (7b) are stored?
If it's pg_xlog, a disk-full failure could occur on the standby. If
it's an archive, a restore_command would have to be supplied, the same
as in my idea.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian writes:
> > > Tom Lane wrote:
> > >> The reason I don't want to do it that way is that then you need two
> > >> ugly kluges in the backend, not just one.  With the zero-and-add-one
> > >> approach there is no need to have a "next enum oid" variable at all.
> >
> > > Uh, I still need that variable because that is how we are going to set
> > > the oid in EnumValuesCreate(), unless we want to add dummy oid-value
> > > arguments to that function for use only by the binary upgrade
> > > server-side function.
> >
> > Please go back and re-read what I suggested: you need a function along
> > the lines of
> > 	add_enum_member(enum-type, 'value name', value-oid)
> > and then there's no need for any saved state.  So what if it has a
> > different signature from the other pg_migrator special functions?
> > It's not doing the same thing.
>
> OK, right, I can get rid of the enum function that just sets the next
> oid value if I do all the enum value creation via function calls.  I
> will work in that direction then.

There is only one call to EnumValuesCreate() so maybe adding a
binary-upgrade-only parameter to the function will be the cleanest
approach.

-- 
  Bruce Momjian			http://momjian.us
  EnterpriseDB			http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] info about patch: using parametrised query in psql
On Thu, Dec 24, 2009 at 2:45 AM, Pavel Stehule wrote:
> Hello
>
> I try to explain my motivation for creating this patch
> https://commitfest.postgresql.org/action/patch_view?id=224 .
>
> Parametrised queries have been supported in PostgreSQL for a long time.
> Using parametrised queries is a staple of advice about good programming
> style. At the application level it is protection against SQL injection.
> At a lower level it is protection against some potential quoting or
> escaping bugs. It is a paradox, then, that PostgreSQL doesn't use this
> technique in its own applications - mainly in psql.
>
> In psql we have no quoting or escaping functionality. We have to use
> external tools like awk or sed:
>
> http://www.redhat.com/docs/manuals/database/RHDB-2.1-Manual/admin_user/r1-app-psql-4.html
>
>> testdb=> \set content '\'' `cat my_file.txt` '\''
>> testdb=> INSERT INTO my_table VALUES (:content);
>>
>> One possible problem with this approach is that my_file.txt might contain
>> single quotes. These need to be escaped so that they do not cause a
>> syntax error when the third line is processed. You can do this with the
>> program sed:
>>
>> testdb=> \set content `sed -e "s/'/\\'/g" < my_file.txt`
>
> Similar problems can be avoided by using parametrised queries in psql.
>
> With this parametrised queries feature you can write:
>
> \set content `cat my_file.txt`
> INSERT INTO my_table VALUES(:content);
>
> and this command will be correct regardless of the content of the
> my_file.txt file.
>
> This is more robust, more secure, and simpler.
>
> My motivation is to simplify life for people who use psql for
> scripting. For internal use SQL injection isn't too terrible a threat.
> The problem is the obscure identifiers used by some users. Today you
> have to carefully check every value if your scripts are to be robust.
>
> The patch doesn't change the default behaviour. You have to explicitly
> activate it.

This makes sense now that you've explained it.
Personally, I would not choose to use psql as a scripting language, and
I think there has been some controversy on that point in the past,
though I don't remember the details. In spite of that, though, it seems
to me that it does make some sense to provide a mechanism for escaping
the value stored in a psql variable, since - if nothing else - someone
might easily want to do the sort of thing you're describing here in an
interactive session.

However, I think the approach you've taken in this patch is a
non-starter. You've basically added a global flag that will cause ALL
variables to be passed in a way that removes the need for them to be
escaped. That seems pretty inconvenient and awkward. What happens if
someone wants to do "INSERT INTO :foo VALUES (:bar)"? They're out of
luck. Furthermore, if a psql script that expects the pexec flag to be
set one way is run with it set the other way, it may either work fine,
work OK but with a potential security hole, or fail spectacularly.

I think maybe what we need here is a piece of syntax to indicate that a
specific parameter should be substituted after first being passed
through PQescapeStringConn.

Other thoughts?

...Robert
Re: [HACKERS] Small change of the HS document
On Tue, Dec 22, 2009 at 12:28 PM, Simon Riggs wrote:
> On Tue, 2009-12-22 at 11:25 +0900, Fujii Masao wrote:
>
>> I found there is no "primary" tag for the HS parameters
>> in config.sgml. Attached patch adds that tag.
>
> Thanks

Committed.

...Robert
Re: [HACKERS] creating index names automatically?
On Thu, Dec 24, 2009 at 12:07 AM, Tom Lane wrote:
> Robert Haas writes:
>> It compiles without warnings for me.  There's only one production that
>> allows exactly one word between INDEX and ON.
>
> In that case you broke something.  I'm too tired to work out exactly
> what.

Heh. Well, I almost certainly did, since it wasn't a complete patch and
I didn't test it, but I am not sure that proves the idea was bad.

Upthread Greg said:

> I suppose we could fix this by specifying a precedence and then
> explicitly checking if you're trying to make an index named
> concurrently and fixing it up later.

And your response was:

> No, not really.  Past the grammar there is no way to tell concurrently
> from "concurrently", ie, if we did it like that then you couldn't even
> use double quotes to get around it.

But it is merely an accident of the way the grammar happens to be built
that CONCURRENTLY and "concurrently" happen to evaluate to equivalent
values. It's easy to make a set of productions that treats them
differently, which is what I did here. It doesn't even require
precedence.

AIUI, there are four constructs that we wish to support:

1. CREATE INDEX ON table (columns);
2. CREATE INDEX CONCURRENTLY ON table (columns);
3. CREATE INDEX index_name ON table (columns);
4. CREATE INDEX CONCURRENTLY index_name ON table (columns);

If we create these as four separate productions, then after shifting
CREATE INDEX CONCURRENTLY and seeing that the next token is ON, we
don't know whether to reduce CONCURRENTLY to index_name or shift. But
if we unify (2) and (3) into a single production and sort it out when
we reduce the whole statement, then we end up with:

1. CREATE INDEX ON table (columns);
2/3. CREATE INDEX tricky_index_name ON table (columns);
4. CREATE INDEX CONCURRENTLY index_name ON table (columns);

Unless I'm missing something, this eliminates the problem.
Now, after shifting CREATE INDEX CONCURRENTLY, if the next token is ON,
we reduce (matching case 2/3); otherwise, we shift again (hoping to
match case 4). The remaining problem is to define tricky_index_name in
a way that allows us to distinguish CONCURRENTLY from "concurrently",
which is easy enough to do.

Still another way to solve this problem would be to create a production
called unreserved_keywords_except_concurrently, so that index_name
could be defined not to include an unquoted CONCURRENTLY as one of the
possibilities. But I think this way is cleaner.

Having said all this, I don't really object to the alternate proposal
of creating a set of words that are reserved as relation names but not
as column names, either, especially if it would allow us to make some
other existing keywords less reserved. But I don't really understand
the justification for thinking that CONCURRENTLY is OK to make more
reserved, but, say, EXPLAIN would not be. This is one pretty marginal
production - there's nothing else in the grammar that even uses
CONCURRENTLY, let alone needs it to be reserved.

The whole problem here comes from what seems like a pretty poor choice
about where to put the word CONCURRENTLY. It would have been a lot more
robust to put it in a section of the statement where any additional
verbiage was inevitably going to be introduced by a keyword, like just
before or after the storage parameters. I think what we should learn
from this case, as well as the recent changes to EXPLAIN, COPY, and
VACUUM syntax, is that adding options to commands by creating keywords
is not very scalable, and that putting the modifier immediately after
the command name is an especially poor position. Without explicit
delimiters, it's easy to get parser conflicts, and as the number of
options grows (even to a relatively modest value like 2 or 3), the fact
that they have to appear in a fixed order becomes a real pain.
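The unify-then-disambiguate scheme described above could be sketched roughly as a grammar fragment. This is a hypothetical sketch, not the actual gram.y rules; the nonterminal names (tricky_index_name and friends) come from the discussion and are invented for illustration:

```yacc
/* Hypothetical sketch of the productions discussed above, not actual
 * gram.y code.  tricky_index_name lets the parser absorb either a real
 * index name or the bare CONCURRENTLY keyword, deferring the decision
 * until the whole statement is reduced. */
IndexStmt:
	  CREATE INDEX ON relation_name '(' index_params ')'	/* case 1 */
	| CREATE INDEX tricky_index_name
		  ON relation_name '(' index_params ')'				/* cases 2 and 3 */
	| CREATE INDEX CONCURRENTLY index_name
		  ON relation_name '(' index_params ')'				/* case 4 */
	;

tricky_index_name:
	  index_name		/* ordinary name, including quoted "concurrently" */
	| CONCURRENTLY		/* bare keyword: treat as concurrent build, no name */
	;
```

Because LALR(1) reduce decisions can consult one token of lookahead, seeing ON after CREATE INDEX CONCURRENTLY is enough to pick case 2/3 over case 4, which is the crux of the argument above.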
...Robert
Re: [HACKERS] unicode questions
On Dec 24, 2009, at 4:14 PM, Andrew Dunstan wrote:

>> 2) How far is normalization support in PG? When I checked a long time
>> ago, there was no such support. Now that the SQL standard mandates a
>> NORMALIZE function that may have changed. Any updates?
>
> Creating such a function shouldn't be terribly hard AIUI, if someone
> wants to submit a patch. It was raised about three months ago but
> nobody actually volunteered unless I missed that.

I wrote a similar function in PL/Perl:

http://justatheory.com/computers/databases/postgresql/unicode-normalization.html

Best,

David
Re: [HACKERS] unicode questions
> 2) How far is normalization support in PG? When I checked a long time
> ago, there was no such support. Now that the SQL standard mandates a
> NORMALIZE function that may have changed. Any updates?

Creating such a function shouldn't be terribly hard AIUI, if someone
wants to submit a patch. It was raised about three months ago but
nobody actually volunteered unless I missed that.

cheers

andrew
Re: [HACKERS] Streaming Rep - 2-phase backups and reducing time to full replication
On Thu, 24 Dec 2009 09:58:20 + Simon Riggs wrote:
> On Tue, 2009-12-22 at 15:33 -0600, decibel wrote:
>
>> Dumb question: could the WAL streaming code be made to also ship base
>> files? That would make setting up a streaming replica super-simple.
>
> That was a possible design, but that's not what will be there for this
> release.
>
> I opposed adding the "we do the base backup for you" feature because
> there are many ways of doing a base backup and it would likely have
> restricted your options to do so. One issue that would cause is
> limiting the speed of the base backup to a single libpq connection,
> which would cause performance problems. So yes, super-simple, but not
> super-good for many big users.

The big users will always have other options. But for normal users who
just want to enable replication, this feature would be awesome: the
entire replication setup is done by the database.

So +1 for integrating such a feature in a future version.


Bye

-- 
				Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project
Re: [HACKERS] Removing pg_migrator limitations
Tom Lane wrote:
> Bruce Momjian writes:
> > Tom Lane wrote:
> >> The reason I don't want to do it that way is that then you need two
> >> ugly kluges in the backend, not just one.  With the zero-and-add-one
> >> approach there is no need to have a "next enum oid" variable at all.
>
> > Uh, I still need that variable because that is how we are going to set
> > the oid in EnumValuesCreate(), unless we want to add dummy oid-value
> > arguments to that function for use only by the binary upgrade
> > server-side function.
>
> Please go back and re-read what I suggested: you need a function along
> the lines of
> 	add_enum_member(enum-type, 'value name', value-oid)
> and then there's no need for any saved state.  So what if it has a
> different signature from the other pg_migrator special functions?
> It's not doing the same thing.

OK, right, I can get rid of the enum function that just sets the next
oid value if I do all the enum value creation via function calls.  I
will work in that direction then.

-- 
  Bruce Momjian			http://momjian.us
  EnterpriseDB			http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian writes:
> Tom Lane wrote:
>> The reason I don't want to do it that way is that then you need two
>> ugly kluges in the backend, not just one.  With the zero-and-add-one
>> approach there is no need to have a "next enum oid" variable at all.

> Uh, I still need that variable because that is how we are going to set
> the oid in EnumValuesCreate(), unless we want to add dummy oid-value
> arguments to that function for use only by the binary upgrade
> server-side function.

Please go back and re-read what I suggested: you need a function along
the lines of

	add_enum_member(enum-type, 'value name', value-oid)

and then there's no need for any saved state.  So what if it has a
different signature from the other pg_migrator special functions?
It's not doing the same thing.

			regards, tom lane
Re: [HACKERS] Removing pg_migrator limitations
Tom Lane wrote:
> Bruce Momjian writes:
> > Tom Lane wrote:
> > >> The approach I originally suggested was to create the enum type with
> > >> *no* members, and then add the values one at a time.
>
> > Well, I was hesitant to modify the grammar, unless we want the ability
> > to create enums with zero values.  Doing enum with only one value will
> > not be too complex for me and I don't think binary upgrade should affect
> > the grammar unless there are other reasons we want to change.
>
> The reason I don't want to do it that way is that then you need two
> ugly kluges in the backend, not just one.  With the zero-and-add-one
> approach there is no need to have a "next enum oid" variable at all.

Uh, I still need that variable because that is how we are going to set
the oid in EnumValuesCreate(), unless we want to add dummy oid-value
arguments to that function for use only by the binary upgrade
server-side function.  I have actually coded the variable case already
so you can see how it looks; attached.  Most of the patch is just
indenting of the existing oid assignment block.

> > We do allow tables with no columns, but we allow the addition of columns
> > to a table, so it makes more sense there.
>
> Well, we might eventually allow addition of values to enums too; the
> fact that it's not implemented outside pg_migrator right now doesn't
> mean we won't ever think of a solution.  In any case I'm not persuaded
> that a zero-element enum is totally without value.  Think of it like a
> domain with a "must be null" constraint.

OK, but that is going to expand my patch.  I will probably implement
zero-element enums first and then go ahead and do the binary upgrade
part.  Zero-element enums will simplify the pg_dump code.

-- 
  Bruce Momjian			http://momjian.us
  EnterpriseDB			http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/backend/catalog/pg_enum.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/catalog/pg_enum.c,v
retrieving revision 1.11
diff -c -c -r1.11 pg_enum.c
*** src/backend/catalog/pg_enum.c	24 Dec 2009 22:17:58 -0000	1.11
--- src/backend/catalog/pg_enum.c	24 Dec 2009 22:29:17 -0000
***************
*** 25,30 ****
--- 25,32 ----
  
  static int	oid_cmp(const void *p1, const void *p2);
  
+ Oid			binary_upgrade_next_pg_enum_oid = InvalidOid;
+ 
  
  /*
   * EnumValuesCreate
***************
*** 58,82 ****
  	tupDesc = pg_enum->rd_att;
  
  	/*
! 	 * Allocate oids.  While this method does not absolutely guarantee that we
! 	 * generate no duplicate oids (since we haven't entered each oid into the
! 	 * table before allocating the next), trouble could only occur if the oid
! 	 * counter wraps all the way around before we finish.  Which seems
! 	 * unlikely.
  	 */
  	oids = (Oid *) palloc(num_elems * sizeof(Oid));
! 	for (elemno = 0; elemno < num_elems; elemno++)
  	{
  		/*
! 		 * The pg_enum.oid is stored in user tables.  This oid must be
! 		 * preserved by binary upgrades.
  		 */
! 		oids[elemno] = GetNewOid(pg_enum);
  	}
  
- 	/* sort them, just in case counter wrapped from high to low */
- 	qsort(oids, num_elems, sizeof(Oid), oid_cmp);
- 
  	/* and make the entries */
  	memset(nulls, false, sizeof(nulls));
--- 60,94 ----
  	tupDesc = pg_enum->rd_att;
  
  	/*
! 	 * Allocate oids
  	 */
  	oids = (Oid *) palloc(num_elems * sizeof(Oid));
! 	if (num_elems == 1 && OidIsValid(binary_upgrade_next_pg_enum_oid))
! 	{
! 		oids[0] = binary_upgrade_next_pg_enum_oid;
! 		binary_upgrade_next_pg_enum_oid = InvalidOid;
! 	}
! 	else
  	{
  		/*
! 		 * While this method does not absolutely guarantee that we generate
! 		 * no duplicate oids (since we haven't entered each oid into the
! 		 * table before allocating the next), trouble could only occur if
! 		 * the oid counter wraps all the way around before we finish.  Which
! 		 * seems unlikely.
  		 */
! 		for (elemno = 0; elemno < num_elems; elemno++)
! 		{
! 			/*
! 			 * The pg_enum.oid is stored in user tables.  This oid must be
! 			 * preserved by binary upgrades.
! 			 */
! 			oids[elemno] = GetNewOid(pg_enum);
! 		}
! 
! 		/* sort them, just in case counter wrapped from high to low */
! 		qsort(oids, num_elems, sizeof(Oid), oid_cmp);
  	}
  
  	/* and make the entries */
  	memset(nulls, false, sizeof(nulls));
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian wrote:
> As far as the ability to add enum values using ALTER TYPE, it seems we
> would need a pg_enum.enumnum column like we do for pg_attribute.attnum
> and order on that rather than pg_enum.oid.  (Binary upgrade would still
> need to preserve oids.)

I don't think that's necessarily a good way to go - being able to sort
by the actual stored value is an efficiency point. I think we might
need to look at implementing a more extensible enum type, which would
allow new values to be appended to and possibly inserted into the list
of labels, but anyway that's really a separate subject from
pg_migrator.

cheers

andrew
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian writes:
> Tom Lane wrote:
>> The approach I originally suggested was to create the enum type with
>> *no* members, and then add the values one at a time.

> Well, I was hesitant to modify the grammar, unless we want the ability
> to create enums with zero values.  Doing enum with only one value will
> not be too complex for me and I don't think binary upgrade should affect
> the grammar unless there are other reasons we want to change.

The reason I don't want to do it that way is that then you need two
ugly kluges in the backend, not just one.  With the zero-and-add-one
approach there is no need to have a "next enum oid" variable at all.

> We do allow tables with no columns, but we allow the addition of columns
> to a table, so it makes more sense there.

Well, we might eventually allow addition of values to enums too; the
fact that it's not implemented outside pg_migrator right now doesn't
mean we won't ever think of a solution.  In any case I'm not persuaded
that a zero-element enum is totally without value.  Think of it like a
domain with a "must be null" constraint.

			regards, tom lane
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian wrote:
> I have completed the attached patch which assigns oids for all pg_type
> rows when pg_dump --binary-upgrade is used.  This allows user-defined
> arrays and composite types to be migrated cleanly.  I tested a reload of
> the regression database with --binary-upgrade and all the pg_type oids
> were identical.  The pg_migrator changes required to use this feature
> are trivial.

Applied.

-- 
  Bruce Momjian			http://momjian.us
  EnterpriseDB			http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Removing pg_migrator limitations
Tom Lane wrote:
> Bruce Momjian writes:
> > I thought of a cleaner approach.  CREATE TYPE ENUM will create one enum
> > with the specified oid, and then a server-side function will call
> > EnumValuesCreate() to add each additional enum value with a specified
> > oid --- no deleting necessary.  I will start working on a patch for
> > this.
>
> The approach I originally suggested was to create the enum type with
> *no* members, and then add the values one at a time.  It might take a
> tweak to the CREATE TYPE AS ENUM grammar to allow zero members, but
> I don't see any logical problem with such a thing.

Well, I was hesitant to modify the grammar, unless we want the ability
to create enums with zero values.  Doing enum with only one value will
not be too complex for me and I don't think binary upgrade should affect
the grammar unless there are other reasons we want to change.  I think
it will look like:

	-- For binary upgrade, must preserve pg_enum oids
	SELECT binary_upgrade.set_next_pg_enum_oid('27258'::pg_catalog.oid);
	CREATE TYPE empstatus AS ENUM('hired');
	SELECT binary_upgrade.set_next_pg_enum_oid('27259'::pg_catalog.oid);
	SELECT binary_upgrade.add_pg_enum_value('42143'::pg_catalog.oid, 'retired');

We do allow tables with no columns, but we allow the addition of columns
to a table, so it makes more sense there.

As far as the ability to add enum values using ALTER TYPE, it seems we
would need a pg_enum.enumnum column like we do for pg_attribute.attnum
and order on that rather than pg_enum.oid.  (Binary upgrade would still
need to preserve oids.)

-- 
  Bruce Momjian			http://momjian.us
  EnterpriseDB			http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Cancelling idle in transaction state
On Sun, Dec 6, 2009 at 4:23 PM, Tom Lane wrote:
> We are using NOTICE, not NOTIFY, assuming that we use anything at all
> (which I still regard as unnecessary).  Please stop injecting confusion
> into the discussion.

Attached is a minimal POC patch that allows cancelling an idle
transaction with SIGINT. The HS patch also allows this in its current
form, but as Simon points out, the client gets out of sync with it. The
proposal is to send an additional NOTICE to the client and abort all
open transactions and subtransactions (this is what I got from the
previous discussion).

I had to write an additional function AbortAnyTransaction() which
aborts all transactions and subtransactions and leaves the transaction
in the aborted state; is there an existing function to do this?

We'd probably also want to add a timeout for idle transactions (which
has been a wishlist item for quite some time) and could also offer user
functions like pg_cancel_idle_transaction().

Along with this we might need to add internal reasons like we do for
SIGUSR1, because we are now multiplexing different functionality onto
the SIGINT signal. One might want to cancel an idle transaction only
and not a running query; without keeping track of internal reasons, one
risks cancelling a legitimate query if that backend has started to work
on a query again.

Comments?
Joachim

diff -cr cvs/src/backend/access/transam/xact.c cvs.build/src/backend/access/transam/xact.c
*** cvs/src/backend/access/transam/xact.c	2009-12-24 13:55:12.0 +0100
--- cvs.build/src/backend/access/transam/xact.c	2009-12-24 20:55:17.0 +0100
***************
*** 2692,2697 ****
--- 2692,2735 ----
  }
  
  /*
+  * AbortAnyTransaction
+  */
+ void
+ AbortAnyTransaction(void)
+ {
+ 	TransactionState s = CurrentTransactionState;
+ 
+ 	switch (s->blockState)
+ 	{
+ 		case TBLOCK_DEFAULT:
+ 		case TBLOCK_STARTED:
+ 		case TBLOCK_BEGIN:
+ 		case TBLOCK_INPROGRESS:
+ 		case TBLOCK_END:
+ 		case TBLOCK_ABORT:
+ 		case TBLOCK_SUBABORT:
+ 		case TBLOCK_ABORT_END:
+ 		case TBLOCK_ABORT_PENDING:
+ 		case TBLOCK_PREPARE:
+ 		case TBLOCK_SUBABORT_END:
+ 		case TBLOCK_SUBABORT_RESTART:
+ 			AbortCurrentTransaction();
+ 			break;
+ 
+ 		case TBLOCK_SUBINPROGRESS:
+ 		case TBLOCK_SUBBEGIN:
+ 		case TBLOCK_SUBEND:
+ 		case TBLOCK_SUBABORT_PENDING:
+ 		case TBLOCK_SUBRESTART:
+ 			AbortSubTransaction();
+ 			CleanupSubTransaction();
+ 			AbortAnyTransaction();
+ 			break;
+ 	}
+ }
+ 
+ 
+ /*
   * PreventTransactionChain
   *
   * This routine is to be called by statements that must not run inside
diff -cr cvs/src/backend/tcop/postgres.c cvs.build/src/backend/tcop/postgres.c
*** cvs/src/backend/tcop/postgres.c	2009-12-24 13:55:18.0 +0100
--- cvs.build/src/backend/tcop/postgres.c	2009-12-24 20:55:17.0 +0100
***************
*** 2637,2643 ****
  	if (!proc_exit_inprogress)
  	{
  		InterruptPending = true;
! 		QueryCancelPending = true;
  
  		/*
  		 * If it's safe to interrupt, and we're waiting for a lock, service
--- 2637,2647 ----
  	if (!proc_exit_inprogress)
  	{
  		InterruptPending = true;
! 
! 		if (DoingCommandRead)
! 			TransactionCancelPending = true;
! 		else
! 			QueryCancelPending = true;
  
  		/*
  		 * If it's safe to interrupt, and we're waiting for a lock, service
***************
*** 2789,2794 ****
--- 2793,2821 ----
  			 errmsg("canceling statement due to user request")));
  		}
  	}
+ 	if (TransactionCancelPending)
+ 	{
+ 		QueryCancelPending = false;
+ 		ImmediateInterruptOK = false;	/* not idle anymore */
+ 
+ 		if (!IsTransactionOrTransactionBlock())
+ 			return;
+ 
+ 		if (IsAbortedTransactionBlockState())
+ 			return;
+ 
+ 		DisableNotifyInterrupt();
+ 		DisableCatchupInterrupt();
+ 
+ 		ereport(NOTICE,
+ 				(errcode(ERRCODE_QUERY_CANCELED),
+ 				 errmsg("canceling transaction due to user request")));
+ 
+ 		AbortAnyTransaction();
+ 
+ 		set_ps_display("idle in transaction (aborted)", false);
+ 		pgstat_report_activity(" in transaction (aborted)");
+ 	}
  
  	/* If we get here, do nothing (probably, QueryCancelPending was reset) */
  }
diff -cr cvs/src/backend/utils/init/globals.c cvs.build/src/backend/utils/init/globals.c
*** cvs/src/backend/utils/init/globals.c	2009-12-09 11:24:42.0 +0100
--- cvs.build/src/backend/utils/init/globals.c	2009-12-24 20:55:17.0 +0100
***************
*** 27,32 ****
--- 27,33 ----
  
  volatile bool InterruptPending = false;
  volatile bool QueryCancelPending = false;
+ volatile bool TransactionCancelPending = false;
  volatile bool ProcDiePending = false;
  volatile bool ImmediateInterruptOK = false;
  volatile uint32 InterruptHoldoffCount = 0;
diff -cr cvs/src/include/access/xact.h cvs.build/src/include/access/xact.h
*** cvs/src/include/access/xact.h	2009-12-24 13:55:28.0 +0100
--- cvs.build/src/include/access/xact.h	2009-12-24 20:55:17.0 +0100
***************
*** 204,209 ****
--- 204,210 ----
  extern bool IsTransactionOrTransactionBlock(void);
  extern char TransactionBlockStatusCode(vo
Re: [HACKERS] Corrupt WAL production possible in gistxlog.c
Yoichi Hirai writes:
> I was reading GiST core codes when I found an XLogInsert()
> call that can produce a corrupt WAL record.

Thanks for the report!  (We didn't really need nine copies though ;-))

Applied back to 8.2.

			regards, tom lane
Re: [HACKERS] unicode questions
- - wrote:

> Dear PG hackers, I have two questions regarding Unicode support in PG:
>
> 1) If I set my database and connection encoding to UTF-8, does PG (and future versions of it) guarantee that Unicode code points are stored unmodified? Or could it be that PG does some Unicode normalization/manipulation on them before storing a string, or when retrieving one? The reason I'm asking is that I've built a little program that reads in and stores text and explicitly analyzes the text at a later point in time, including things like whether the text is in NFC, NFD or neither. Since I want to store the strings in the database, it is very important for PG not to fiddle with the normalization unless my program explicitly tells it to.
>
> 2) How far along is normalization support in PG? When I checked a long time ago, there was no such support. Now that the SQL standard mandates a NORMALIZE function, that may have changed. Any updates?

We don't do any normalization. If the client gives us UTF8 then we store exactly what it gives us, and return exactly that.

(This question is not really a -hackers question. The correct forum is pgsql-general. Please make sure you use the correct forum in future.)

cheers

andrew
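Since the server stores and returns UTF-8 byte-for-byte, any NFC/NFD analysis of the kind the poster describes has to happen on the client side. A minimal sketch using Python's standard unicodedata module (purely illustrative; nothing here is PostgreSQL code):

```python
import unicodedata

def normalization_forms(s):
    """Return the normalization forms that the string already satisfies."""
    return [form for form in ("NFC", "NFD", "NFKC", "NFKD")
            if unicodedata.normalize(form, s) == s]

# U+00E9 is the precomposed "e with acute" (NFC form); U+0065 U+0301 is
# the decomposed equivalent (NFD form). They are canonically equivalent
# but the server stores them as different byte sequences.
composed = "caf\u00e9"
decomposed = "cafe\u0301"
```

Because the two spellings compare unequal as raw strings, a client that cares about normalization must normalize before INSERT (or record which form it used), exactly as the poster intends.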
[HACKERS] unicode questions
Dear PG hackers,

I have two questions regarding Unicode support in PG:

1) If I set my database and connection encoding to UTF-8, does PG (and future versions of it) guarantee that Unicode code points are stored unmodified? Or could it be that PG does some Unicode normalization/manipulation on them before storing a string, or when retrieving one? The reason I'm asking is that I've built a little program that reads in and stores text and explicitly analyzes the text at a later point in time, including things like whether the text is in NFC, NFD or neither. Since I want to store the strings in the database, it is very important for PG not to fiddle with the normalization unless my program explicitly tells it to.

2) How far along is normalization support in PG? When I checked a long time ago, there was no such support. Now that the SQL standard mandates a NORMALIZE function, that may have changed. Any updates?
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian wrote:

> I now think the easiest solution will be to have pg_dump create the enum with a single dummy value, delete the pg_enum dummy row, and then call a modified version of EnumValuesCreate() to insert row by row into pg_enum, with specified oids.

> I thought of a cleaner approach. CREATE TYPE ENUM will create one enum with the specified oid, and then a server-side function will use EnumValuesCreate() to add each additional enum with a specified oid --- no deleting necessary. I will start working on a patch for this.

Either that or Tom's suggested approach of being able to create an empty enum type would be much cleaner than the dummy row suggestion.

cheers

andrew
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian writes: > I thought of a cleaner approach. CREATE TYPE ENUM will create one enum > with the specified oid, and then a server-side function will call > EnumValuesCreate() be used to add each additional enum with a specified > oid --- no deleting necessary. I will start working on a patch for > this. The approach I originally suggested was to create the enum type with *no* members, and then add the values one at a time. It might take a tweak to the CREATE TYPE AS ENUM grammar to allow zero members, but I don't see any logical problem with such a thing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Update ppport.h in plperl
Tim Bunce wrote: I'm about ready to post the next draft of my plperl feature patch. When I looked at the diff I saw ~7800 lines of it were just due to updating ppport.h. So I've broken that out into this large but totally trivial patch. I'm going to apply this pretty soon to get some buildfarm coverage on it. We had a little trouble last time we messed with ppport.h, ISTR, so we need to make sure nothing breaks. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian wrote:
> Bruce Momjian wrote:
> > I looked at DefineEnum() and basically adding the ability to add enum values would put the new enum after the existing ones unless the OID counter has wrapped around and is less than the OID counter at the time the enum type was created, in which case it will be listed before the existing values. I wasn't aware enum ordering is something we tried to maintain. One issue is that we are not supporting the addition of enum values even for people who don't care about the ordering of enums (which I bet is the majority).
> >
> > I can think of a few approaches for pg_migrator:
> >
> > 1) Create an oid array in a permanent memory context and have DefineEnum() read from that.
> > 2) Renumber the enum entries after they are created but before any of their oids are stored in user tables.
> >
> > Both can be done by pg_dump with proper server-side functions. The problem with #2 is cases where the old and new oid ranges overlap, e.g.:
>
> I now think the easiest solution will be to have pg_dump create the enum with a single dummy value, delete the pg_enum dummy row, and then call a modified version of EnumValuesCreate() to insert row by row into pg_enum, with specified oids.

I thought of a cleaner approach. CREATE TYPE ENUM will create one enum with the specified oid, and then a server-side function will use EnumValuesCreate() to add each additional enum with a specified oid --- no deleting necessary. I will start working on a patch for this.

--
Bruce Momjian  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
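The ordering hazard Bruce describes comes from enum values sorting by their pg_enum row OIDs. A toy Python model (names and numbers are illustrative, not the real catalog code) of why appending a value after OID wraparound breaks the sort order:

```python
def create_enum(labels, next_oid):
    """Assign each label an 'oid' from a counter, like pg_enum rows."""
    mapping = {}
    for label in labels:
        mapping[label] = next_oid
        next_oid += 1
    return mapping, next_oid

def enum_sorted(mapping):
    # Enum values sort by their assigned oid, mirroring how enum
    # comparison worked at the time.
    return sorted(mapping, key=mapping.get)

# Normal case: a value added later gets a higher oid and sorts last.
m, ctr = create_enum(["small", "medium"], next_oid=5000)
m["large"] = ctr
assert enum_sorted(m) == ["small", "medium", "large"]

# After OID wraparound the counter restarts low, so the same addition
# now sorts *before* the original members.
m2, _ = create_enum(["small", "medium"], next_oid=5000)
m2["large"] = 100
assert enum_sorted(m2) == ["large", "small", "medium"]
```

This is why pg_migrator needs to recreate enum values with their original OIDs rather than simply appending them.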
Re: [HACKERS] Backup history file should be replicated in Streaming Replication?
On Thu, Dec 24, 2009 at 5:38 PM, Fujii Masao wrote:
> On Thu, Dec 24, 2009 at 1:39 PM, Fujii Masao wrote:
>> On Wed, Dec 23, 2009 at 7:50 PM, Heikki Linnakangas wrote:
>>> Ok. How about writing the history file in pg_stop_backup() for informational purposes only. Ie. never read it, but rely on the WAL records instead.
>>
>> Sounds good. I'll make such change as a self-contained patch.
>
> Done. Please see the attached patch.

I included this and the following patches in my 'replication' branch. Also I got rid of the capability to replicate a backup history file, which made the SR code simpler.

http://archives.postgresql.org/pgsql-hackers/2009-12/msg01931.php

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] About the CREATE TABLE LIKE indexes vs constraints issue
Hi, sorry for the posting style.

On 23 Dec 2009, at 23:58, Jeff Davis wrote:
> Honestly, I've never used LIKE in a table definition aside from one-off design experiments. For that kind of thing, what I want is to just get everything (except perhaps FKs if the above situation applies), and I adjust it from there. Are there people out there who use LIKE in their production schema files?

I do use LIKE in scripts for adding providers of federated data. In some cases you want to INHERIT, in some others you want to move incoming data to another set of tables.

Regards,
--
dim
Re: [HACKERS] Streaming Rep - 2-phase backups and reducing time to full replication
On Tue, 2009-12-22 at 15:33 -0600, decibel wrote:
> Dumb question: could the WAL streaming code be made to also ship base files? That would make setting up a streaming replica super-simple.

That was a possible design, but it's not what will be there for this release. I opposed adding a "we do the base backup for you" feature because there are many ways of doing a base backup, and it would likely have restricted your options for doing one. One issue it would cause is limiting the speed of the base backup to a single libpq connection, which would cause performance problems.

So yes, super-simple, but not super-good for many big users.

--
Simon Riggs
www.2ndQuadrant.com
[HACKERS] Re: Streaming Rep - 2-phase backups and reducing time to full replication
On Thu, 2009-12-24 at 14:40 +0900, Fujii Masao wrote:
> On Wed, Dec 23, 2009 at 3:54 AM, Simon Riggs wrote:
> > 9. Create a recovery command file in the standby server with parameters required for streaming replication.
> >
> > 7. (a) Make a base backup of minimal essential files from the primary server, and load this data onto the standby.
> >
> > 10. Start postgres in the standby server. It will start streaming replication.
> >
> > 7. (b) Continue with the second phase of the base backup, copying all remaining files, ending with pg_stop_backup()
>
> This manual procedure seems to be more complicated for me than just using restore_command, which retrieves the WAL files from the master's archival area. Supposing that we adopt this idea, we should implement an automatic replication of a base backup at the same time, to prevent the users from having to do the complicated two-phase backup?

Either way works for async, but only the two-phase backup works for sync rep. It's slightly more complicated for the base backup, but not so much that it isn't still very practical. The list of files is very short: *.conf and the global directory. But overall, ISTM it would actually be easier to do this than to set up both streaming and restore_command.

I see it would work like this: add a new option to recovery.conf, perhaps two_phase_backup = on. Startup creates a file called backup_in_progress, then waits. When the second phase of the backup is complete (7b), delete the file and the startup process will continue. Very few lines of code to make this work.

--
Simon Riggs
www.2ndQuadrant.com
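The flag-file handshake Simon sketches is small enough to model directly. A hypothetical Python rendering of the proposed behaviour (backup_in_progress and two_phase_backup are names from the proposal, not an implemented feature):

```python
import os
import tempfile
import threading
import time

def wait_for_backup_completion(flag_path, poll_interval=0.05):
    """Sketch of the proposed startup behaviour: block until the flag
    file created at recovery start is removed by the backup's second
    phase (names follow the proposal above; nothing here is real
    PostgreSQL code)."""
    while os.path.exists(flag_path):
        time.sleep(poll_interval)
    # Flag gone: remaining files have been copied, recovery may continue.

# Recovery start: the startup process creates "backup_in_progress".
flag = os.path.join(tempfile.mkdtemp(), "backup_in_progress")
open(flag, "w").close()

# Simulate step 7(b): the second backup phase finishes elsewhere and
# deletes the flag, releasing the startup process.
threading.Timer(0.2, os.remove, args=[flag]).start()
wait_for_backup_completion(flag)
```

The design choice here is that the standby can begin streaming replication before the full base backup is complete, while the flag file keeps it from declaring itself ready too early.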
Re: [HACKERS] Backup history file should be replicated in Streaming Replication?
On Thu, Dec 24, 2009 at 5:35 PM, Simon Riggs wrote:
> The code has always been capable of starting without this, which was considered a feature, to be able to start from a hot copy. I would like to do as you suggest, but it would remove the feature. This would be a great example of why I don't want too many ways to start HS.

Please tell me what "a hot copy" is. Do you mean a standalone hot backup?

http://developer.postgresql.org/pgdocs/postgres/continuous-archiving.html#BACKUP-STANDALONE

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Backup history file should be replicated in Streaming Replication?
On Thu, Dec 24, 2009 at 1:39 PM, Fujii Masao wrote:
> On Wed, Dec 23, 2009 at 7:50 PM, Heikki Linnakangas wrote:
>> Ok. How about writing the history file in pg_stop_backup() for informational purposes only. Ie. never read it, but rely on the WAL records instead.
>
> Sounds good. I'll make such change as a self-contained patch.

Done. Please see the attached patch.

Design:
* pg_stop_backup writes the backup-end xlog record, which contains the backup starting point.
* In archive recovery, the startup process doesn't mark the database as consistent until it has read the backup-end record.
* A backup history file is still created as in the past, but is never used.

Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 448,453 ****
  static TimeLineID lastPageTLI = 0;
--- 448,454 ----
  static XLogRecPtr minRecoveryPoint; /* local copy of
  									 * ControlFile->minRecoveryPoint */
  static bool updateMinRecoveryPoint = true;
+ static bool reachedBackupEnd = false;
  static bool InRedo = false;
***************
*** 515,522 ****
  static void xlog_outrec(StringInfo buf, XLogRecord *record);
  #endif
  static void issue_xlog_fsync(void);
  static void pg_start_backup_callback(int code, Datum arg);
  static bool read_backup_label(XLogRecPtr *checkPointLoc,
! XLogRecPtr *minRecoveryLoc);
  static void rm_redo_error_callback(void *arg);
  static int get_sync_bit(int method);
--- 516,522 ----
  #endif
  static void issue_xlog_fsync(void);
  static void pg_start_backup_callback(int code, Datum arg);
!
  static bool read_backup_label(XLogRecPtr *checkPointLoc);
  static void rm_redo_error_callback(void *arg);
  static int get_sync_bit(int method);
***************
*** 5355,5361 **** StartupXLOG(void)
  bool haveBackupLabel = false;
  XLogRecPtr RecPtr,
  checkPointLoc,
- backupStopLoc,
  EndOfLog;
  uint32 endLogId;
  uint32 endLogSeg;
--- 5355,5360 ----
***************
*** 5454,5460 **** StartupXLOG(void)
  recoveryTargetTLI,
  ControlFile->checkPointCopy.ThisTimeLineID)));
! if (read_backup_label(&checkPointLoc, &backupStopLoc))
  {
  /*
  * When a backup_label file is present, we want to roll forward from
--- 5453,5459 ----
  recoveryTargetTLI,
  ControlFile->checkPointCopy.ThisTimeLineID)));
! if (read_backup_label(&checkPointLoc))
  {
  /*
  * When a backup_label file is present, we want to roll forward from
***************
*** 5597,5606 ****
  ControlFile->prevCheckPoint = ControlFile->checkPoint;
  ControlFile->checkPoint = checkPointLoc;
  ControlFile->checkPointCopy = checkPoint;
! if (backupStopLoc.xlogid != 0 || backupStopLoc.xrecoff != 0)
  {
! if (XLByteLT(ControlFile->minRecoveryPoint, backupStopLoc))
! ControlFile->minRecoveryPoint = backupStopLoc;
  }
  ControlFile->time = (pg_time_t) time(NULL);
  /* No need to hold ControlFileLock yet, we aren't up far enough */
--- 5596,5610 ----
  ControlFile->prevCheckPoint = ControlFile->checkPoint;
  ControlFile->checkPoint = checkPointLoc;
  ControlFile->checkPointCopy = checkPoint;
! if (InArchiveRecovery)
! {
! if (XLByteLT(ControlFile->minRecoveryPoint, checkPoint.redo))
! ControlFile->minRecoveryPoint = checkPoint.redo;
! }
! else
  {
! XLogRecPtr InvalidXLogRecPtr = {0, 0};
! ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
  }
  ControlFile->time = (pg_time_t) time(NULL);
  /* No need to hold ControlFileLock yet, we aren't up far enough */
***************
*** 5703,5717 ****
  InRedo = true;
! if (minRecoveryPoint.xlogid == 0 && minRecoveryPoint.xrecoff == 0)
! ereport(LOG,
! (errmsg("redo starts at %X/%X",
! ReadRecPtr.xlogid, ReadRecPtr.xrecoff)));
! else
! ereport(LOG,
! (errmsg("redo starts at %X/%X, consistency will be reached at %X/%X",
! ReadRecPtr.xlogid, ReadRecPtr.xrecoff,
! minRecoveryPoint.xlogid, minRecoveryPoint.xrecoff)));
  /*
  * Let postmaster know we've started redo now, so that it can
--- 5707,5715 ----
  InRedo = true;
! ereport(LOG,
! (errmsg("redo starts at %X/%X",
! ReadRecPtr.xlogid, ReadRecPtr.xrecoff)));
  /*
  * Let postmaster know we've started redo now, so that it can
***************
*** 5771,5777 ****
  * Have we passed our safe starting point?
  */
  if (!reachedMinRecoveryPoint &&
! XLByteLE(minRecoveryPoint, EndRecPtr))
  {
  reachedMinRecoveryPoint = true;
  ereport(LOG,
--- 5769,5776 ----
  * Have we passed our safe starting point?
  */
  if (!reachedMinRecoveryPoint &&
! XLByteLE(minRecoveryPoint, EndRecPtr) &&
! reachedBackupEnd)
  {
  reachedMinRecoveryPoint = true;
  ereport(LOG,
*** *
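The core of the patch is the extra condition in StartupXLOG for declaring the database consistent. Reduced to pseudo-Python with LSNs as plain integers (a sketch of the logic, not the actual C code):

```python
def reached_consistency(end_rec_ptr, min_recovery_point, reached_backup_end):
    """Mirror of the patched check in StartupXLOG: consistency requires
    replay past minRecoveryPoint AND having replayed the backup-end
    record written by pg_stop_backup (LSNs reduced to integers)."""
    return min_recovery_point <= end_rec_ptr and reached_backup_end

# Passing the old minRecoveryPoint alone is no longer sufficient ...
assert not reached_consistency(200, 100, reached_backup_end=False)
# ... the backup-end record must have been replayed as well,
assert reached_consistency(200, 100, reached_backup_end=True)
# and replay must still have progressed far enough.
assert not reached_consistency(50, 100, reached_backup_end=True)
```

This is what closes Heikki's "never called pg_stop_backup" hole: the safe starting point is now proven by a WAL record rather than by the presence of a history file.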
Re: [HACKERS] Backup history file should be replicated in Streaming Replication?
On Wed, 2009-12-23 at 12:50 +0200, Heikki Linnakangas wrote:
> I just realized that the current history file fails to recognize this scenario:
>
> 1. pg_start_backup()
> 2. cp -a $PGDATA data-backup
> 3. create data-backup/recovery.conf
> 4. postmaster -D data-backup
>
> That is, starting postmaster on a data directory, without ever calling pg_stop_backup(). Because pg_stop_backup() was not called, the history file is not there, and recovery won't complain about not reaching the safe starting point.
>
> That is of course a case of "don't do that!", but perhaps we should refuse to start up if the backup history file is not found? At least in the WAL-based approach, I think we should refuse to start up if we don't see the pg_stop_backup WAL record.

The code has always been capable of starting without this, which was considered a feature, to be able to start from a hot copy. I would like to do as you suggest, but it would remove the feature. This would be a great example of why I don't want too many ways to start HS.

--
Simon Riggs
www.2ndQuadrant.com
[HACKERS] info about patch: using parametrised query in psql
Hello

Let me explain my motivation for creating this patch: https://commitfest.postgresql.org/action/patch_view?id=224 .

Parametrised queries have been supported in PostgreSQL for a long time, and using them is a standard piece of advice about good programming style. At the application level they are protection against SQL injection; at a lower level they are protection against potential quoting or escaping bugs. It is a paradox that PostgreSQL doesn't use this technique in its own applications - mainly in psql. In psql we have no quoting or escaping functionality, so we have to use external tools like awk and sed: http://www.redhat.com/docs/manuals/database/RHDB-2.1-Manual/admin_user/r1-app-psql-4.html

> testdb=> \set content '\'' `cat my_file.txt` '\''
> testdb=> INSERT INTO my_table VALUES (:content);
>
> One possible problem with this approach is that my_file.txt might contain single quotes. These need to be escaped so that they do not cause a syntax error when the third line is processed. You can do this with the program sed:
>
> testdb=> \set content `sed -e "s/'/\\'/g" < my_file.txt`

Such problems go away when psql uses parametrised queries. With this feature you can write:

\set content `cat my_file.txt`
INSERT INTO my_table VALUES(:content);

and the command will be correct regardless of the content of my_file.txt. This is more robust, more secure, and simpler.

My motivation is to simplify life for people who use psql for scripting. For internal use, SQL injection isn't too terrible a threat; the problem is obscure identifiers and values supplied by users. Today you have to check every value carefully if your scripts are to be robust.

The patch doesn't change the default behaviour; you have to activate it explicitly.

Regards, and merry Christmas

Pavel Stehule
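The contrast Pavel describes can be seen by comparing the two ways a value reaches the server. A small Python illustration (the $1 placeholder follows PostgreSQL's protocol-level convention; quote_literal here is a hypothetical stand-in for what the sed recipe attempts):

```python
def quote_literal(value):
    """Client-side escaping: double embedded single quotes (what the
    sed recipe above tries to achieve, and what psql cannot do itself)."""
    return "'" + value.replace("'", "''") + "'"

content = "O'Reilly's notes"

# Escaping route: the value travels inside the SQL text, so safety
# depends on the quoting being exactly right for every input.
sql_escaped = "INSERT INTO my_table VALUES (%s);" % quote_literal(content)
assert sql_escaped == "INSERT INTO my_table VALUES ('O''Reilly''s notes');"

# Parametrised route: the SQL text is fixed and the value travels
# out of band, so no character in the value can alter the statement.
sql_param = "INSERT INTO my_table VALUES ($1);"
params = (content,)
```

With the parametrised route there is simply no string-assembly step in which a stray quote could change the meaning of the statement, which is the robustness the patch aims to bring to psql scripting.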
[HACKERS] Corrupted WAL production possible in gistxlog.c
Hello,

I was reading GiST core codes when I found an XLogInsert() call that can produce a corrupted WAL record.

== Summary ==

There is an execution path that produces a WAL record whose xl_info indicates XLOG_GIST_PAGE_UPDATE while the record actually contains a gistxlogPageSplit structure.

== Details ==

(Line numbers are for HEAD as of Wed Dec 23 19:42:15 2009 +0000.)

The problematic XLogInsert() call is on gistxlog.c, line 770:

    recptr = XLogInsert(RM_GIST_ID, XLOG_GIST_PAGE_UPDATE, rdata);

where the last argument rdata has a pointer assigned either on line 741 or on line 752. When rdata comes from formSplitRdata() at line 741, rdata contains a reference to a gistxlogPageSplit structure. This is inconsistent with the second argument XLOG_GIST_PAGE_UPDATE.

== Importance ==

I think this poses possible data loss under multiple consecutive crashes.

== Fix ==

I include a simple patch (for HEAD as of the datetime above) that, I suppose, prevents the corrupt WAL production. I would be glad if you liked it.

Please note that the execution path exists at least in current HEAD, REL8_2_STABLE and the branches in between.

Sincerely,
Yoichi Hirai
Dept. of Computer Science, The University of Tokyo

diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index adf27ff..74e72f0 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -654,6 +654,7 @@ gistContinueInsert(gistIncompleteInsert *insert)
 		XLogRecPtr	recptr;
 		Buffer		tempbuffer = InvalidBuffer;
 		int			ntodelete = 0;
+		uint8		info;
 
 		numbuffer = 1;
 		buffers[0] = ReadBuffer(index, insert->path[i]);
@@ -738,6 +739,7 @@ gistContinueInsert(gistIncompleteInsert *insert)
 			for (j = 0; j < ntodelete; j++)
 				PageIndexTupleDelete(pages[0], todelete[j]);
 
+			info = XLOG_GIST_PAGE_SPLIT;
 			rdata = formSplitRdata(index->rd_node, insert->path[i],
 								   false, &(insert->key),
 								   gistMakePageLayout(buffers, numbuffer));
@@ -751,6 +753,7 @@ gistContinueInsert(gistIncompleteInsert *insert)
 				PageIndexTupleDelete(pages[0], todelete[j]);
 			gistfillbuffer(pages[0], itup, lenitup, InvalidOffsetNumber);
 
+			info = XLOG_GIST_PAGE_UPDATE;
 			rdata = formUpdateRdata(index->rd_node, buffers[0],
 									todelete, ntodelete,
 									itup, lenitup, &(insert->key));
@@ -767,7 +770,7 @@ gistContinueInsert(gistIncompleteInsert *insert)
 				GistPageGetOpaque(pages[j])->rightlink = InvalidBlockNumber;
 			MarkBufferDirty(buffers[j]);
 		}
-		recptr = XLogInsert(RM_GIST_ID, XLOG_GIST_PAGE_UPDATE, rdata);
+		recptr = XLogInsert(RM_GIST_ID, info, rdata);
 		for (j = 0; j < numbuffer; j++)
 		{
 			PageSetLSN(pages[j], recptr);
[HACKERS] Re: Streaming Rep - 2-phase backups and reducing time to full replication
On Wed, Dec 23, 2009 at 3:54 AM, Simon Riggs wrote: > 9. Create a recovery command file in the standby server with parameters > required for streaming replication. > > 7. (a) Make a base backup of minimal essential files from primary > server, load this data onto the standby. > > 10. Start postgres in the standby server. It will start streaming > replication. > > 7. (b) Continue with second phase of base backup, copying all remaining > files, ending with pg_stop_backup() This manual procedure seems to be more complicated for me than just using restore_command which retrieves the WAL files from the master's archival area. Supposing that we adopt this idea, we should implement an automatic replication of a base backup at the same time, to prevent the users from doing the complicated two-phase-backup? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Corrupted WAL production possible in gistxlog.c
Hello,

I was reading the GiST core code when I found an XLogInsert() call that can produce a corrupt WAL record.

== Summary ==

There is an execution path that produces a WAL record whose xl_info indicates XLOG_GIST_PAGE_UPDATE while the record actually contains a gistxlogPageSplit structure.

== Details ==

(Line numbers are for HEAD as of Wed Dec 23 19:42:15 2009 +.)

The problematic XLogInsert() call is on gistxlog.c, line 770:

    recptr = XLogInsert(RM_GIST_ID, XLOG_GIST_PAGE_UPDATE, rdata);

where the last argument rdata has a pointer assigned either on line 741 or on line 752. When rdata comes from formSplitRdata() on line 741, rdata contains a reference to a gistxlogPageSplit structure, which is inconsistent with the second argument XLOG_GIST_PAGE_UPDATE.

== Importance ==

I think this poses a risk of data loss under multiple consecutive crashes.

== Fix ==

I attach a simple patch (for HEAD as of the datetime above) that, I believe, prevents the corrupt WAL production. I would be glad if you would consider it.

Please note that the problematic execution path exists at least in current HEAD, REL8_2_STABLE and the branches in between.

Sincerely,

Yoichi Hirai
Dept.
of Computer Science, The University of Tokyo

diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index adf27ff..74e72f0 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -654,6 +654,7 @@ gistContinueInsert(gistIncompleteInsert *insert)
 		XLogRecPtr	recptr;
 		Buffer		tempbuffer = InvalidBuffer;
 		int			ntodelete = 0;
+		uint8		info;
 
 		numbuffer = 1;
 		buffers[0] = ReadBuffer(index, insert->path[i]);
@@ -738,6 +739,7 @@ gistContinueInsert(gistIncompleteInsert *insert)
 			for (j = 0; j < ntodelete; j++)
 				PageIndexTupleDelete(pages[0], todelete[j]);
 
+			info = XLOG_GIST_PAGE_SPLIT;
 			rdata = formSplitRdata(index->rd_node, insert->path[i],
 								   false, &(insert->key),
 								   gistMakePageLayout(buffers, numbuffer));
@@ -751,6 +753,7 @@ gistContinueInsert(gistIncompleteInsert *insert)
 				PageIndexTupleDelete(pages[0], todelete[j]);
 			gistfillbuffer(pages[0], itup, lenitup, InvalidOffsetNumber);
 
+			info = XLOG_GIST_PAGE_UPDATE;
 			rdata = formUpdateRdata(index->rd_node, buffers[0],
 									todelete, ntodelete,
 									itup, lenitup, &(insert->key));
@@ -767,7 +770,7 @@ gistContinueInsert(gistIncompleteInsert *insert)
 			GistPageGetOpaque(pages[j])->rightlink = InvalidBlockNumber;
 			MarkBufferDirty(buffers[j]);
 		}
-		recptr = XLogInsert(RM_GIST_ID, XLOG_GIST_PAGE_UPDATE, rdata);
+		recptr = XLogInsert(RM_GIST_ID, info, rdata);
 		for (j = 0; j < numbuffer; j++)
 		{
 			PageSetLSN(pages[j], recptr);
Re: [HACKERS] creating index names automatically?
Robert Haas writes:
> It compiles without warnings for me.  There's only one production that
> allows exactly one word between INDEX and ON.

In that case you broke something.  I'm too tired to work out exactly what.

			regards, tom lane
Re: [HACKERS] Streaming Rep - 2-phase backups and reducing time to full replication
On Wed, Dec 23, 2009 at 6:33 AM, decibel wrote:
> Dumb question: could the WAL streaming code be made to also ship base files?

No.

> That would make setting up a streaming replica super-simple.

Agreed, but that's not easy to implement, so I think it's out of scope for v8.5 development. Of course, anyone is welcome to try it ;-)

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Removing pg_migrator limitations
Bruce Momjian wrote:
> I looked at DefineEnum() and basically adding the ability to add enums
> would put the new enum after the existing ones unless the OID counter
> has wrapped around and is less than the oid counter at the time the enum
> type was created, in which case it will be listed as before the existing
> values.

I wasn't aware enum ordering was something we tried to maintain.

> One issue is that we are not supporting the addition of enum values even
> for people who don't care about the ordering of enums (which I bet is
> the majority).
>
> I can think of a few approaches for pg_migrator:
>
>	1) Create an oid array in a permanent memory context and have
>	   DefineEnum() read from that.
>	2) Renumber the enum entries after they are created but before
>	   any of their oids are stored in user tables.
>
> Both can be done by pg_dump with proper server-side functions.  The
> problem with #2 is cases where the old and new oid ranges overlap,
> e.g.:

I now think the easiest solution will be to have pg_dump create the enum with a single dummy value, delete the pg_enum dummy row, and then call a modified version of EnumValuesCreate() to insert rows one by one into pg_enum, with specified oids.

-- 
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
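In SQL terms, the dump-side sequence being proposed might look roughly like this. The helper function name binary_upgrade_insert_enum_value and its signature are hypothetical illustrations, not anything that exists yet; the real work would be a C-language server-side function wrapping a modified EnumValuesCreate():

```sql
-- Create the type empty, relying on the zero-element enum support
-- discussed upthread, then insert pg_enum rows with the old cluster's
-- oids so stored enum values in user tables remain valid.
CREATE TYPE colors AS ENUM ();

-- Hypothetical binary-upgrade helper: (enum type, desired pg_enum oid, label)
SELECT binary_upgrade_insert_enum_value('colors'::regtype, 16401, 'red');
SELECT binary_upgrade_insert_enum_value('colors'::regtype, 16402, 'green');
```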
Re: [HACKERS] Backup history file should be replicated in Streaming Replication?
On Wed, Dec 23, 2009 at 7:50 PM, Heikki Linnakangas wrote:
> Ok. How about writing the history file in pg_stop_backup() for
> informational purposes only, i.e. never read it, but rely on the WAL
> records instead.

Sounds good. I'll make that change as a self-contained patch.

> I just realized that the current history file fails to recognize this
> scenario:
>
> 1. pg_start_backup()
> 2. cp -a $PGDATA data-backup
> 3. create data-backup/recovery.conf
> 4. postmaster -D data-backup
>
> That is, starting postmaster on a data directory, without ever calling
> pg_stop_backup(). Because pg_stop_backup() was not called, the history
> file is not there, and recovery won't complain about not reaching the
> safe starting point.
>
> That is of course a case of "don't do that!", but perhaps we should
> refuse to start up if the backup history file is not found? At least in
> the WAL-based approach, I think we should refuse to start up if we don't
> see the pg_stop_backup WAL record.

Agreed.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
[HACKERS] updateMinRecoveryPoint bug?
Hi,

In UpdateMinRecoveryPoint() and XLogNeedsFlush(), updateMinRecoveryPoint is used to let us short-circuit future checks during a crash recovery. But it doesn't seem to play that role in a crash recovery that follows an archive recovery, because such a crash recovery always starts from a *valid* minRecoveryPoint, i.e., updateMinRecoveryPoint is never set to FALSE.

How about always resetting ControlFile->minRecoveryPoint to {0, 0} at the beginning of a crash recovery, to fix the bug?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
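The proposed reset can be sketched as below. The struct and function here are deliberately simplified stand-ins, not the backend's actual definitions (the real fields live in pg_control and xlog.c); this only illustrates the intended behavior:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the backend's XLogRecPtr. */
typedef struct XLogRecPtr
{
	unsigned int xlogid;	/* log file # */
	unsigned int xrecoff;	/* byte offset within log file */
} XLogRecPtr;

/* Hypothetical, simplified stand-in for ControlFileData. */
typedef struct ControlFileData
{
	XLogRecPtr	minRecoveryPoint;
} ControlFileData;

ControlFileData ControlFile;
bool		updateMinRecoveryPoint = true;

/*
 * Proposed behavior: on entering *crash* recovery (as opposed to archive
 * recovery), forget any minRecoveryPoint left over from a previous
 * archive recovery, so that UpdateMinRecoveryPoint() advances it again
 * instead of treating the stale value as already satisfied.
 */
void
StartRecovery(bool inArchiveRecovery)
{
	if (!inArchiveRecovery)
	{
		ControlFile.minRecoveryPoint.xlogid = 0;
		ControlFile.minRecoveryPoint.xrecoff = 0;
		updateMinRecoveryPoint = true;
	}
}
```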