Re: [HACKERS] Incomplete description of pg_start_backup?
I still don't get it. Suppose we have a data file whose blocks all hold important (non-empty) data: A B C D.

1. I call pg_start_backup().
2. Tar starts to copy block A to the destination archive...
3. During this copying, somebody removes data from a table which lives in block B. This data becomes subject to vacuuming, and the block is marked as free space.
4. Somebody writes data to a table, and this data is placed into the free space - into block B. This is also added to the WAL log (so the data is stored in 2 places: in block B and in WAL).
5. Tar (at last!) finishes copying block A and begins to copy block B.
6. It finishes, then it copies C and D to the archive too.
7. Then we call pg_stop_backup() and also archive the collected WAL (which contains the new data of block B, as we saw above).

The question is - *where is the OLD data of block B in this scheme?* It seems it is NOT in the backup! So it cannot be restored. (In a scheme where we never overwrite blocks between pg_start_backup...pg_stop_backup, but always append new data, this would not be a problem.) It seems to me this is not documented at all! That is what my initial e-mail was about.

(I have one hypothesis about this, but I am not sure. Here it is: does vacuum save ALL deleted data of block B to WAL on step 3 prior to deletion? If yes, it is, of course, part of the backup. But that would waste a lot of space...)

On Tue, May 14, 2013 at 6:05 PM, Jeff Janes jeff.ja...@gmail.com wrote:

On Mon, May 13, 2013 at 4:31 PM, Dmitry Koterov dmi...@koterov.ru wrote:

Could you please provide a bit more detailed explanation of how it works? And how could postgres write into the middle of files being archived during an active pg_start_backup? If it could, there might be a case when part of an archived data file contains overwritten information from the future,

The data files cannot contain information from the future.
If the backup is restored, it must be restored to the time of pg_stop_backup (at least), which means the data would at that point be from the past/present, not the future. Cheers, Jeff
Re: [HACKERS] Incomplete description of pg_start_backup?
Could you please provide a bit more detailed explanation of how it works? And how could postgres write into the middle of files being archived during an active pg_start_backup? If it could, there might be a case when part of an archived data file contains overwritten information from the future, while WAL files contain only information like "I want to write X to Z", not "I've overwritten the following X with Y at position Z". Appending is much better here, because unexpected appended data from the future may simply be ignored.

On Wednesday, May 1, 2013, Jeff Janes wrote:

On Tue, Apr 30, 2013 at 3:24 PM, Dmitry Koterov dmi...@koterov.ru wrote:

I think that at http://www.postgresql.org/docs/current/static/functions-admin.html and http://www.postgresql.org/docs/current/static/continuous-archiving.html two important points about how pg_start_backup() works are missing:

1. Between pg_start_backup() and pg_stop_backup(), VACUUM is denied (e.g. autovacuum is turned off), so new data is always appended to data files and is never written into their middle.

This is not the case. Autovacuum continues to run during the backup.

This allows the data directory to be archived using any copying tools (rsync, tar, cp etc.). If you forget to call pg_stop_backup() by accident, data files will grow forever. So pg_start_backup() switches the database to an append-only mode which is safe to back up without stopping (data files temporarily become append-only; WAL logs are always append-only).

No, it doesn't work that way. I don't know why appending would be any safer than normal updates would be anyway. WAL replay fixes up any problems that might arise.

2. Between pg_start_backup() and pg_stop_backup(), full_page_writes is forced to be ON.

Effectively yes; this is documented in one of your links above (and is one of the reasons vacuuming during the backup is not a problem).

Cheers, Jeff
[HACKERS] Incomplete description of pg_start_backup?
I think that at http://www.postgresql.org/docs/current/static/functions-admin.html and http://www.postgresql.org/docs/current/static/continuous-archiving.html two important points about how pg_start_backup() works are missing:

1. Between pg_start_backup() and pg_stop_backup(), VACUUM is denied (e.g. autovacuum is turned off), so new data is always appended to data files and is never written into their middle. This allows the data directory to be archived using any copying tools (rsync, tar, cp etc.). If you forget to call pg_stop_backup() by accident, data files will grow forever. So pg_start_backup() switches the database to an append-only mode which is safe to back up without stopping (data files temporarily become append-only; WAL logs are always append-only).

2. Between pg_start_backup() and pg_stop_backup(), full_page_writes is forced to be ON.

BTW, are these points fully correct? If yes, it would be great to update the documentation, because on Google there are a lot of questions about how exactly backup with pg_start_backup() works and why cp, tar etc. are safe after pg_start_backup(), but no clear answers. If no, could you please give a couple of comments on all of this?
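For context, the procedure under discussion is the documented low-level base backup; a minimal sketch (the label string is arbitrary):

```sql
-- Enter backup mode; the server keeps accepting writes.
SELECT pg_start_backup('nightly');

-- ...from a shell, copy the data directory with tar/rsync/cp
--    while the server is still running...

-- Leave backup mode; then archive the WAL segments generated
-- in between, since restoring requires replaying them.
SELECT pg_stop_backup();
```

The copied data files alone are inconsistent; it is WAL replay up to (at least) the point of pg_stop_backup() that makes the restored cluster consistent.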
Re: [HACKERS] Reproducible Bus error in 9.2.3 during database dump restoration (Ubuntu Server 12.04 LTS)
x86_64; PostgreSQL 9.2 is run within an OpenVZ container and generates SIGBUS. PostgreSQL 9.1 has no such problem. (OpenVZ is Linux kernel-level virtualization which adds namespaces for processes, networking, quotas etc. Unlike e.g. Xen or VMware, all containers share the same kernel.)

On Wed, Mar 6, 2013 at 7:51 AM, Merlin Moncure mmonc...@gmail.com wrote:

On Tue, Mar 5, 2013 at 3:04 PM, Kevin Grittner kgri...@ymail.com wrote:

Dmitry Koterov dmi...@koterov.ru wrote:

LOG: server process (PID 18705) was terminated by signal 7: Bus error

So far I have only heard of this sort of error when PostgreSQL is running in a virtual machine and the VM software is buggy. If you are not running in a VM, my next two suspects would be hardware/BIOS configuration issues, or an antivirus product.

for posterity, what's the hardware platform? software bus errors are more likely on non-x86 hardware.

merlin
[HACKERS] Reproducible Bus error in 9.2.3 during database dump restoration (Ubuntu Server 12.04 LTS)
Hello. I have a database dump file (unfortunately with proprietary information) which leads to the following error in the logs during its restoration (even after initdb - it is stably reproducible, at the same large table, at the same time):

LOG: server process (PID 18705) was terminated by signal 7: Bus error
DETAIL: Failed process was running: COPY br_agent_log (id, agent_id, stamp, trace, message) FROM stdin;
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process

... and then, after recovery:

... redo done at 0/12DDB7A8 ...
LOG: database system is ready to accept connections
ERROR: could not read block 1 in file base/57390/11783: read only 4448 of 8192 bytes at character 39

Could this be memory corruption in PG? BTW, 9.1.8 does not have this problem - the restoration is OK. Could I help with investigating this crash? What is the best way to do that? Maybe you have a tutorial article about it which shows the preferred error-reporting format?
Re: [HACKERS] pg_dump --split patch
To me, this is a wonderful feature, thanks! I think many people would be happy if this patch were included in the mainline (and it is quite short and simple).

About name ordering - I think the problem exists for these objects:

1. Stored functions.
2. Foreign keys/triggers (objects which have owning objects).

It is wonderful that you store all functions with the same name in the same file. To order them within this file we may simply compare the first definition line lexicographically (or the first line which differs between one function definition and another). The foreign key/trigger ordering problem is described by me at http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg114586.html

The big problem is with triggers: many triggers may have the same name but be bound to different tables. It would be great to include these triggers in the table's definition or, at least, to have a separate file for each trigger+table pair.

On Wed, Dec 29, 2010 at 6:21 PM, Joel Jacobson j...@gluefinance.com wrote:

2010/12/29 Tom Lane t...@sss.pgh.pa.us

I think they're fundamentally different things, because the previously proposed patch is an extension of the machine-readable archive format, and has to remain so because of the expectation that people will want to use parallel restore with it. Joel is arguing for a split-up of the text dump format.

Yes, exactly. My patch is of course also a lot smaller :-)

pg_dump-directory.diff.........................: 112 853 bytes
pg-dump-split-plain-text-files-9.1devel.patch..: 5 579 bytes

I just tried the pg_dump-directory.diff patch. The only thing it has in common with my patch is that it writes data to different files, and it's only the data which is split into different files; the schema appears to go into the single TOC file.

Example, pg_dump-directory.diff:

$ ./pg_dump -f /crypt/dirpatch -F d -s glue
$ ls -la /crypt/dirpatch/
TOC (1 file)
$ rm -rf /crypt/dirpatch
$ ./pg_dump -f /crypt/dirpatch -F d glue
$ ls /crypt/dirpatch/
6503.dat 6504.dat ...lots of files...
6871.dat 6872.dat 6873.dat 6874.dat TOC

Example, pg_dump --split patch:

$ pg_dump -f /crypt/splitpatch -F p --split -s glue
$ ls /crypt/splitpatch*
/crypt/splitpatch (file)
/crypt/splitpatch-split: (directory)
myschema1 myschema2 public
$ ls /crypt/splitpatch-split/public/
AGGREGATE CONSTRAINT FK_CONSTRAINT FUNCTION INDEX SEQUENCE TABLE TRIGGER TYPE VIEW
$ ls /crypt/splitpatch-split/public/FUNCTION/
myfunc.sql otherfunc.sql
$ cat /crypt/splitpatch
--
-- PostgreSQL database dump
--
SET statement_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET escape_string_warning = off;
...etc...
\i /crypt/splitpatch-split/public/FUNCTION/myfunc.sql
\i /crypt/splitpatch-split/public/FUNCTION/otherfunc.sql

--
Best regards, Joel Jacobson, Glue Finance
Re: [HACKERS] Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?
Dmitry Koterov dmi...@koterov.ru writes:

No, I meant that in the case of the row (1, NULL, NULL, 2, 3, NULL):
- the corresponding NULL bitmap is (100110...)
- the corresponding tuple is (1, 2, 3)
- t_natts=3 (if I am not wrong here)

You are wrong --- t_natts would be six here. In general the length of the null bitmap in a tuple (if it has one at all) is always exactly equal to its t_natts value.

And so, the real number of values in the tuple - (1, 2, 3) above - is equal to the number of 1-bits in the NULL bitmap, and the size of the NULL bitmap is held in t_natts. That is what I meant when I said "thanks to the NULL bitmap, adding a new nullable column is cheap". :-) And, of course, thanks to t_natts (the HeapTupleHeaderGetNatts macro) too.
Re: [HACKERS] Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?
On Thu, May 21, 2009 at 6:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Sam Mason s...@samason.me.uk writes:

On Thu, May 21, 2009 at 12:06:29PM +0400, Dmitry Koterov wrote:

ALTER TABLE ... ADD COLUMN ... NULL; (nullable without a default value). This is because of the NULL bitmap in tuples. And it's a great feature for a developer!

I don't think this is because of the NULL bitmap.

No, it isn't. It's because each tuple includes the actual count of fields it contains (t_natts or HeapTupleHeaderGetNatts), and the value extraction routines are coded to assume that references to fields beyond that number should yield NULL. So the ALTER can just leave the existing rows alone --- only when you update a row will it change to include the newly added field(s).

No, I meant that in the case of the row (1, NULL, NULL, 2, 3, NULL):
- the corresponding NULL bitmap is (100110...)
- the corresponding tuple is (1, 2, 3)
- t_natts=3 (if I am not wrong here)

And in the case of the row (5, 6, NULL, 7, 8, 9):
- the corresponding NULL bitmap is (110111...)
- the corresponding tuple is (5, 6, 7, 8, 9)
- t_natts=5

So, without a NULL bitmap, we could not handle this kind of row at all with t_natts alone. The NULL bitmap plays a very important role in tuple storage, and I meant exactly that point.
[HACKERS] Fast ALTER TABLE ... ADD COLUMN ... DEFAULT xxx?
Hello. PostgreSQL is very fast when we perform (even on a huge table):

ALTER TABLE ... ADD COLUMN ... NULL;

(nullable without a default value). This is because of the NULL bitmap in tuples. And it's a great feature for a developer!

But another very common query like

ALTER TABLE ... ADD COLUMN ... BOOLEAN NOT NULL DEFAULT false;

or

ALTER TABLE ... ADD COLUMN ... INT NOT NULL DEFAULT 0;

performs very slowly on a huge table - this is because PostgreSQL has to re-create all tuples, assigning the default value to them. If I have a table with 1 billion rows (for example), I have no chance of performing this query at all - too slow. (In most cases NOT NULL DEFAULT xxx fields are BOOLEAN flags: it is not handy to have 3-way flags.)

So, are there plans to optimize this kind of query? It could be done in many ways:

1. Store the DEFAULT flag directly in the NULL bitmap (add a bit to the NULL bitmap not only for NULLable fields, but also for NOT NULL DEFAULT ... fields).
2. Add another bitmap to each tuple (a DEFAULT bitmap). Bit value 0 means there is a real value in a cell, 1 - that the value is the default.
3. The same as (1), but always force the default value to be 0 (or false, or any other value meaning zero) and optimize only these cases.
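For concreteness, the contrast looks like this (table and column names are only illustrative):

```sql
-- Instant even on a huge table: existing tuples are not rewritten;
-- reads of the missing attribute simply yield NULL.
ALTER TABLE big ADD COLUMN note TEXT NULL;

-- Very slow on a huge table: every existing row must be rewritten
-- to physically store the non-NULL default.
ALTER TABLE big ADD COLUMN is_deleted BOOLEAN NOT NULL DEFAULT false;
```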
[HACKERS] PostgreSQL 8.3.4 reproducible crash
Hello. Here is the SQL to reproduce the server crash:

CREATE SCHEMA bug1 AUTHORIZATION postgres;
SET search_path = bug1, pg_catalog;

CREATE FUNCTION bug1.domain_check (integer) RETURNS boolean AS
$body$ SELECT $1 > 0 $body$
LANGUAGE sql IMMUTABLE STRICT;

CREATE DOMAIN bug1.domain AS integer
CONSTRAINT "check" CHECK (bug1.domain_check(VALUE));

CREATE TYPE bug1.composite AS (
  id bug1.domain
);

SELECT '(1)'::bug1.composite;
Re: [HACKERS] ALTER composite type does not work, but ALTER TABLE which ROWTYPE is used as a type - works fine
Could you please say whether ALTER TYPE ... ADD COLUMN is planned for a future PostgreSQL version?

On Fri, Dec 5, 2008 at 4:08 PM, Merlin Moncure [EMAIL PROTECTED] wrote:

On Thu, Dec 4, 2008 at 9:53 AM, Dmitry Koterov [EMAIL PROTECTED] wrote:

Hello. I can successfully ALTER a NON-EMPTY table (ct) whose ROWTYPE is used as a column type for another table (dt):

CREATE TABLE ct (id INTEGER);
CREATE TABLE dt (id INTEGER, c ct);
INSERT INTO dt VALUES(1, '(666)');
SELECT * FROM dt; -- (1, '(666)')
ALTER TABLE ct ADD COLUMN n INTEGER;
SELECT * FROM dt; -- (1, '(666,)')

You see, '(666,)' means that the new field was added successfully. But if I declare ct as a COMPOSITE type (not a table), it is not permitted to ALTER this type (Postgres says that there are dependencies on ct). Why?

Because of this there is no reason to ever use 'create type'... always use 'create table'. 'alter type' can't add/remove columns anyway.

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
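For the record, this capability did land in later PostgreSQL releases (9.1 and up), under the name ADD ATTRIBUTE rather than ADD COLUMN; a sketch, assuming ct had been created as a composite type:

```sql
-- Supported in PostgreSQL 9.1 and later (not available at the time of
-- this thread); type name taken from the example above, assuming
-- CREATE TYPE ct AS (id integer):
ALTER TYPE ct ADD ATTRIBUTE n integer;
```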
[HACKERS] ALTER composite type does not work, but ALTER TABLE which ROWTYPE is used as a type - works fine
Hello. I can successfully ALTER a NON-EMPTY table (ct) whose ROWTYPE is used as a column type for another table (dt):

CREATE TABLE ct (id INTEGER);
CREATE TABLE dt (id INTEGER, c ct);
INSERT INTO dt VALUES(1, '(666)');
SELECT * FROM dt; -- (1, '(666)')
ALTER TABLE ct ADD COLUMN n INTEGER;
SELECT * FROM dt; -- (1, '(666,)')

You see, '(666,)' means that the new field was added successfully. But if I declare ct as a COMPOSITE type (not a table), it is not permitted to ALTER this type (Postgres says that there are dependencies on ct). Why?
Re: [HACKERS] Sometimes pg_dump generates dump which is not restorable
Oh, now I understand you. Clearly, to be sure of avoiding any side effects in pg_dump, all IMMUTABLE functions must explicitly set search_path at development time. It's not obvious, so I propose to mention this in the CONSTRAINT ... CHECK and CREATE INDEX documentation. :-) Or - raise a NOTICE if an IMMUTABLE function is used in a CHECK or INDEX but does not define search_path in its settings. Thanks!

On Fri, Nov 14, 2008 at 7:25 PM, Tom Lane [EMAIL PROTECTED] wrote:

Dmitry Koterov [EMAIL PROTECTED] writes:

Thank you for a possible solution. But what about a database which exists and works correctly (and conforms to all the standards from the documentation), but for which the dump+restore sequence fails? Doesn't that mean pg_dump should be improved so the dump+restore sequence passes?

No matter what pg_dump does, it can never guarantee that a non-immutable check constraint will still pass at restore time ... and that's basically what you've got, if the check function is search-path-sensitive.

regards, tom lane
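Tom's suggestion of "attaching a local search_path setting" to the function can be sketched like this (function and schema names are taken from the example elsewhere in this thread; untested against the original schema):

```sql
-- Pin the function's own search_path, so a() resolves b() the same
-- way at restore time as it does in the live database:
ALTER FUNCTION a(integer) SET search_path = nsp, public, pg_catalog;
```

With this setting, COPY during restore invokes the CHECK function under the pinned path, regardless of whatever search_path the dump script has set.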
Re: [HACKERS] Sometimes pg_dump generates dump which is not restorable
Thank you for a possible solution. But what about a database which exists and works correctly (and conforms to all the standards from the documentation), but for which the dump+restore sequence fails? Doesn't that mean pg_dump should be improved so the dump+restore sequence passes? Besides that, pg_dump already has the corresponding behaviour for CONSTRAINT = FOREIGN KEY; for CONSTRAINT = CHECK it hasn't.

On Thu, Nov 13, 2008 at 9:07 PM, Tom Lane [EMAIL PROTECTED] wrote:

Dmitry Koterov [EMAIL PROTECTED] writes:

3. The function a() calls some OTHER function b() from ANOTHER namespace (or uses operators from other namespaces), but does not specify the schema name, because it is in the database search_path:

CREATE FUNCTION a(i integer) RETURNS boolean AS $$
BEGIN
  PERFORM b(); -- b() is from the nsp schema
  RETURN true;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

I think your function is broken. You might want to fix it by attaching a local search_path setting to it.

regards, tom lane
[HACKERS] Sometimes pg_dump generates dump which is not restorable
Hello. Why does pg_dump dump CONSTRAINT ... CHECK together with the CREATE TABLE queries, but NOT at the end of the dump file (as it does for FOREIGN KEY)? Sometimes this causes the generation of invalid dumps which cannot be restored. Details follow.

1. I use a database-dedicated search_path:

ALTER DATABASE d SET search_path TO nsp, public, pg_catalog;

2. I have a CHECK on table1 which calls a stored function:

CREATE TABLE table1 (
  i integer,
  CONSTRAINT table1_chk CHECK ((a(i) = true))
);

3. The function a() calls some OTHER function b() from ANOTHER namespace (or uses operators from other namespaces), but does not specify the schema name, because it is in the database search_path:

CREATE FUNCTION a(i integer) RETURNS boolean AS $$
BEGIN
  PERFORM b(); -- b() is from the nsp schema
  RETURN true;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

4. If I dump such a schema using pg_dump, later this dump cannot be restored. Look at the following piece of the generated dump:

SET search_path = public, pg_catalog;
COPY table1 (i) FROM stdin;
1
\.

You see, when COPY is executed, data is inserted and the CHECK is called. So function a() is called with the "public, pg_catalog" search_path. That is erroneous!

Possible solutions:

1. When generating the CREATE TABLE dump query, DO NOT include CONSTRAINT ... CHECK clauses in it. Instead, use ALTER TABLE to add all checks AT THE END of the dump, the same as is done for foreign keys. I have already proposed this above. Additionally, it seems to me it would speed up dump restoration.

2. Replace "SET search_path = public, pg_catalog" with "SET search_path = public, pg_catalog, <all other database-dedicated search_paths>". This is a worse way, kind of a hack.
Re: [HACKERS] Review Report: propose to include 3 new functions into intarray and intagg
No problem, I have time for the cleanup. But are these functions guaranteed to be included in contrib? If there is no guarantee, it seems the cleanup time would be wasted. (Five years ago I already cleaned up an open-source library on demand, and after that it was not approved for the PEAR repository, so now I am more circumspect, sorry :-). So - is the approval procedure finished? If not, who can make the final decision (to be or not to be)? Sorry, I don't yet know the patch proposal procedure well...

P.S.

1. Unfortunately GROUP BY + ORDER BY is sometimes 1000 times (!) slower than _int_group_count_sort. That is why I created this function.
2. I have to assume that the input is sorted in these functions, because otherwise performance drops very much. A sort operation is quite expensive; checking whether an array is sorted or not is also quite expensive... So I think it is not a good idea to make the functions independent of sort order.
3. It seems the conversion of the functions to anyarray/anyelement is much more time-expensive than simply including them in the specialized intarray/intagg. Is it absolutely necessary?

On Mon, Sep 15, 2008 at 4:38 PM, Markus Wanner [EMAIL PROTECTED] wrote:

Hi, sorry for not having completed this review yet. As you are obviously looking at the patch as well, I'll try to quickly write down my points so far.

Trying to compile the intarray module, I now receive an error: error: 'INT4OID' undeclared (first use in this function). That can be solved by including catalog/pg_type.h from contrib/intarr/_int_op.c. The PG_FUNCTION_INFO_V1 and prototype definition certainly belong at the top of the file, where all the others are. Some lines are longer than 80 columns, and again comments are a bit sparse or even useless (no "additional things", please).

Heikki Linnakangas wrote:

I find it a bit unfriendly to have a function that depends on sorted input, but doesn't check it. But that's probably not a good enough reason to reject an otherwise simple and useful function.
Also, we already have uniq, which doesn't strictly speaking require sorted input, but it does if you want to eliminate all duplicates from the array. I think it's a performance optimization which is absolutely required in some cases. Some time ago I've also had to rip out the sorting step from certain intarray module functions to save processing time. One option already mentioned somewhere would be saving a 'sorted' property for the array. Then again, I think such a thing would certainly have to be done globally, for all kinds of arrays. _int_group_count_sort seems a bit special purpose. Why does it bother to sort the output? That's wasted time if you don't need sorted output, or if you want the array sorted by the integer value instead of frequency. If you want sorted output, you can just sort it afterwards. Agreed. IMO the normal GROUP BY and ORDER BY stuff of the database itself should be used for such a thing. However, that means turning an array into a set of rows... Also, it's requiring sorted input for a small performance gain, but there's a lot more precedence in the existing intarray functions to not require sorted input, but to sort the input instead (union, intersect, same, overlap). ..and exactly these are the functions I had to wrap again to strip the sorting step, due to poor performance for known-sorted arrays. I realize that the current implementation is faster for the use case where the input is sorted, and output needs to be sorted, but if we go down that path we'll soon have dozens of different variants of various functions, with different ordering requirements of inputs and outputs. Agreed. However, given the OP is using that in production, there seems to be a use case for the optimization, where we have none for the same function without it. So, I'd suggest changing _int_group_count_sort so that it doesn't require sorted input, and doesn't sort the output. 
The binary search function looks good to me (I think I'd prefer naming it bsearch(), though I can see that it was named bidx in reference to the existing idx function). Also, as Markus pointed out, the SGML docs need to be updated. As is, it should probably also carry the '_int' prefix, because it's not a general-purpose array function. So I propose to name it '_int_bsearch'.

Overall I think these functions are overly specialized and should be replaced by more general counterparts in core. However, until we have that, it's hard to refuse such a thing for contrib.

By that reasoning, however, the intarray module would have to provide methods for sorted as well as unsorted input arrays.

I'm closing my part of the review of this patch now. Dmitry, how do you want to proceed with these patches? Do you have time for some cleanup and writing documentation?

Regards
Markus Wanner
Re: [HACKERS] Foreign key constraint for array-field?
Normalization is not a panacea here. Sometimes such normalization creates too much overhead and a lot of additional code (especially if there are a lot of such dependencies). Array support in Postgres is quite handy; in my practice, moving from an a_b_map table to arrays saved hundreds of lines of stored procedure and calling application code. Triggers are not very helpful here, because it is too tedious to ensure that all the relevant tables have the appropriate triggers (we need N + 1 triggers with unique code, where N is the number of referring tables). So built-in support looks much more interesting...

On Sun, Sep 21, 2008 at 8:46 AM, Joshua D. Drake [EMAIL PROTECTED] wrote:

David Fetter wrote:

On Sun, Sep 21, 2008 at 04:38:56AM +0400, Dmitry Koterov wrote:

Hello. Is it possible to create a foreign key constraint for ALL elements of an array field?

Whether it's possible or not--it probably is--it's a very bad idea. Just normalize :)

+1

Cheers, David.
Re: [HACKERS] Predictable order of SQL commands in pg_dump
Unfortunately, I cannot reproduce this 100% of the time. But from time to time I run the diff utility on a database dump and notice that two or more trigger or constraint definitions (or something else) are permuted. Something like this:

+ALTER TABLE ONLY a
+ADD CONSTRAINT fk_b_Id FOREIGN KEY (b_id) REFERENCES b(id) MATCH FULL;
-ALTER TABLE ONLY a
-ADD CONSTRAINT fk_b_id FOREIGN KEY (b_id) REFERENCES b(id) MATCH FULL;
-ALTER TABLE ONLY a
+ALTER TABLE ONLY a
 ADD CONSTRAINT fk_c_id FOREIGN KEY (c_id) REFERENCES c(id) MATCH FULL;

Or this:

 CREATE TRIGGER t000_set_id
-BEFORE INSERT OR DELETE OR UPDATE ON a
+BEFORE INSERT OR DELETE OR UPDATE ON b
 FOR EACH ROW EXECUTE PROCEDURE i_trg();
 CREATE TRIGGER t000_set_id
-BEFORE INSERT OR DELETE OR UPDATE ON b
+BEFORE INSERT OR DELETE OR UPDATE ON a
 FOR EACH ROW EXECUTE PROCEDURE i_trg();

You see, the object names are the same, but the ordering is mixed. It seems pg_dump orders objects without caring about their dependencies? So, if object names are the same, it dumps them in an unpredictable order, no matter what their contents are...

On Sun, Sep 21, 2008 at 5:28 AM, Tom Lane [EMAIL PROTECTED] wrote:

Dmitry Koterov [EMAIL PROTECTED] writes:

Utility pg_dump does not always dump identical database schemas identically: sometimes it changes the order of SQL statements.

Please provide a concrete example. The dump order for modern servers (ie, since 7.3) is by object type, and within a type by object name, except where another order is forced by dependencies. And there is no random component to the dependency solver ;-). So it should be behaving the way you want.

regards, tom lane
Re: [HACKERS] Foreign key constraint for array-field?
I strongly suspect you'd benefit a lot more by learning database best practices rather than assuming, as you appear to be doing, that you are dealing with a new field and that you know it best. Neither is true.

Of course, you are absolutely right. I venerate you! O! :-)
Re: [HACKERS] Predictable order of SQL commands in pg_dump
Great! Will it be implemented in a future version? It seems it would be very helpful, especially for people who commit the database structure to CVS/SVN once per minute to track the history of changes (or similar)...

On Sun, Sep 21, 2008 at 11:57 PM, Tom Lane [EMAIL PROTECTED] wrote:

Dmitry Koterov [EMAIL PROTECTED] writes:

 CREATE TRIGGER t000_set_id
-BEFORE INSERT OR DELETE OR UPDATE ON a
+BEFORE INSERT OR DELETE OR UPDATE ON b
 FOR EACH ROW EXECUTE PROCEDURE i_trg();
 CREATE TRIGGER t000_set_id
-BEFORE INSERT OR DELETE OR UPDATE ON b
+BEFORE INSERT OR DELETE OR UPDATE ON a
 FOR EACH ROW EXECUTE PROCEDURE i_trg();

You see, the object names are the same, but the ordering is mixed.

Yeah, because the sort is just on object name. For objects of the same type I suppose we could take a look at their owning object's name too ...

regards, tom lane
[HACKERS] Foreign key constraint for array-field?
Hello. Is it possible to create a foreign key constraint for ALL elements of an array field?

CREATE TABLE a(id INTEGER);
CREATE TABLE b(id INTEGER, a_ids INTEGER[]);

Field b.a_ids contains a list of IDs from table a. I want to ensure that each element in b.a_ids exists in a at all times. Is it possible to create an automatic foreign key? According to http://www.postgresql.org/docs/current/static/catalog-pg-constraint.html, it seems to me it is possible if I create a custom entry in pg_constraint with my custom conpfeqop, conppeqop and conffeqop fields. Am I right?
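Lacking such a constraint, a trigger can approximate one direction of the check (rejecting bad inserts/updates on b; deletions from a would still need separate triggers). A sketch with invented names; unnest() requires PostgreSQL 8.4+:

```sql
-- Hypothetical helper: reject rows of b whose a_ids contain values
-- missing from a. Note: this does NOT catch later deletions from a.
CREATE FUNCTION b_check_a_ids() RETURNS trigger AS $$
BEGIN
    IF EXISTS (
        SELECT 1 FROM unnest(NEW.a_ids) AS x(id)   -- unnest() needs 8.4+
        WHERE NOT EXISTS (SELECT 1 FROM a WHERE a.id = x.id)
    ) THEN
        RAISE EXCEPTION 'a_ids contains values not present in a';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER b_check_a_ids
    BEFORE INSERT OR UPDATE ON b
    FOR EACH ROW EXECUTE PROCEDURE b_check_a_ids();
```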
[HACKERS] Predictable order of SQL commands in pg_dump
Hello. Utility pg_dump does not always dump identical database schemas identically: sometimes it changes the order of SQL statements. E.g.:

1. Dump of database A:

ALTER TABLE xxx ADD CONSTRAINT ...;
ALTER TABLE yyy ADD CONSTRAINT ...;

2. Dump of database B, which has an identical structure to A (pg_dump A | psql -d B was executed):

ALTER TABLE yyy ADD CONSTRAINT ...;
ALTER TABLE xxx ADD CONSTRAINT ...;

This behaviour is not good, because I cannot run diff to visually inspect what has changed between databases A and B. (I use this diff only for visual inspection; please don't assume I want to use it for schema migration - I don't!) Is it possible to make pg_dump more predictable in its SQL ordering? What order does it use by default?
Re: [HACKERS] Review Report: propose to include 3 new functions into intarray and intagg
OK, thank you for your review. I'll correct everything and send a patch in a couple of days. Are you completely sure that this patch will be included? If not, it seems the patch standardization work has a much lower priority, and I will not hurry so much. But what about the intarray patch? Does somebody plan to review it? I'd prefer to include it too. If you approve, I'll correct the code style in the intarray contrib patch as well.

On Sat, Sep 6, 2008 at 3:34 PM, Markus Wanner [EMAIL PROTECTED] wrote:

Hi, this is my first official review. I've tried to follow the "Review a patch" guidelines from the wiki - thanks Simon, that was pretty helpful. This review covers only the intagg additions.

Dmitry Koterov wrote:

Here are these functions with detailed documentation: http://en.dklab.ru/lib/dklab_postgresql_patch/

Submission review: we generally prefer having patches archived on our mailing lists, so please just send future patches or revisions of this patch to our lists (I prefer -hackers, but probably -patches is still the official one). The documentation should go into a README file of the contrib module or some such. Tests are missing, but that's the case for the intagg contrib module anyway. The patch applies and is in context diff format, good. It adds an unrelated newline, which should generally be avoided.

Usability review: functional additions look good and usable, certainly within a contrib module. I don't think it's compliant to any spec, but that's not required for contrib, IMO.

Functional test: works as advertised on the accompanying website. I've tested int_array_append_aggregate somewhat, see [1].

Performance testing: I've tested with 10 mio rows of arrays of size 1 and compared against core's int_array_aggregate function, see [1] again. In this simple test it took roughly 50% longer, which seems okay. Memory consumption looks sane as well.

Coding review: style seems fine for contrib, though lines longer than 80 cols should be broken up.
Comments in the code are sparse, some even counter-productive (marking something as "additional things" certainly doesn't help).

Code and architecture review: the new int_array_append_aggregate() function itself seems fine to me.

Summary: my general feeling is that this patch should be applied after minor code style corrections. As a longer-term goal I think intagg should be integrated into core, since it's very basic functionality. TODO entries for things like an array_accum() aggregate already exist. Adding this patch to contrib now might be a step in the right direction.

Dmitry, can you please apply these small corrections and re-submit the patch?

Regards

Markus Wanner

P.S.: I dislike intagg's use of PGARRAY, but that's nothing to do with this patch. Shouldn't this better use a real composite type as the aggregate's state type? I'd propose to clean up the intagg contrib module and prepare it for inclusion into core.

[1]: functional and performance testing session:

On a database with (patched) intagg and intarr contrib modules:

markus=# CREATE TABLE test (id INT NOT NULL, arr INT[] NOT NULL);
CREATE TABLE
markus=# INSERT INTO test VALUES (1, ARRAY[1,2,3]), (2, ARRAY[4,5]), (3, ARRAY[3,2,1]);
INSERT 0 3
markus=# SELECT * FROM test;
 id |   arr
----+---------
  1 | {1,2,3}
  2 | {4,5}
  3 | {3,2,1}
(3 rows)

markus=# SELECT int_array_append_aggregate(arr) FROM test;
 int_array_append_aggregate
----------------------------
 {1,2,3,4,5,3,2,1}
(1 row)

markus=# SELECT * FROM test;
 id |   arr
----+---------
  1 | {1,2,3}
  2 | {4,5}
  3 | {3,2,1}
(3 rows)

markus=# SELECT int_array_aggregate(id) AS ids, int_array_append_aggregate(arr) FROM test GROUP BY (id / 2);
  ids  | int_array_append_aggregate
-------+----------------------------
 {1}   | {1,2,3}
 {2,3} | {4,5,3,2,1}
(2 rows)

markus=# SELECT int_array_aggregate(id) AS ids, int_array_append_aggregate(arr) FROM test GROUP BY (id % 2);
  ids  | int_array_append_aggregate
-------+----------------------------
 {2}   | {4,5}
 {1,3} | {1,2,3,3,2,1}
(2 rows)

markus=# INSERT INTO test VALUES (4, NULL);
INSERT 0 1
markus=# SELECT int_array_append_aggregate(arr) FROM test;
 int_array_append_aggregate
----------------------------
 {1,2,3,4,5,3,2,1}
(1 row)

markus=# SELECT id, int_array_append_aggregate(arr) FROM test GROUP BY id;
 id | int_array_append_aggregate
----+----------------------------
  4 | {}
  2 | {4,5}
  3 | {3,2,1}
  1 | {1,2,3}
(4 rows)

-- switching to performance testing
markus=# \timing
Timing is on.
markus=# DELETE FROM test;
DELETE 4
Time: 9.037 ms
markus=# INSERT INTO test SELECT generate_series(1, 1000), array[round(random() * 100)]::int[];
INSERT 0 1000
Time: 53321.186 ms
markus=# SELECT icount(int_array_aggregate(id)) AS count FROM test;
 count
-------
 1000
(1 row)

Time: 2493.184 ms
markus=# SELECT icount(int_array_append_aggregate(arr
[HACKERS] Patch: propose to include 3 new functions into intarray and intagg
Hello.

Here are these functions, with detailed documentation: http://en.dklab.ru/lib/dklab_postgresql_patch/

intagg.int_array_append_aggregate(int[]): fast merging of many arrays into one large array
intarray._int_group_count_sort(int[], bool): frequency-based sorting
intarray.bidx(int[], int): binary search in a sorted array

These have been tested for about a year on a real PostgreSQL cluster (10 machines, Slony replication) under heavy load (millions of requests). No crash or memory problem was detected during that year, so I suppose these functions are well-tested.

What do you think about that?
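For readers skimming the thread, here is a rough Python sketch of the behavior the three proposed functions are described as providing. This is only an illustration of the semantics: the actual patch implements them in C inside the intagg/intarray contrib modules, and the tie-breaking order of the frequency sort and the 0-based/-1 return convention of bidx below are my assumptions, not taken from the patch.

```python
from bisect import bisect_left
from collections import Counter

def int_array_append_aggregate(arrays):
    # Like intagg.int_array_append_aggregate(int[]): concatenate many
    # int[] values into one large flat array (NULL inputs contribute nothing).
    out = []
    for a in arrays:
        if a is not None:
            out.extend(a)
    return out

def group_count_sort(arr, ascending=False):
    # Like intarray._int_group_count_sort(int[], bool): return the distinct
    # elements ordered by how often they occur (tie-break by value is assumed).
    counts = Counter(arr)
    sign = 1 if ascending else -1
    return sorted(counts, key=lambda x: (sign * counts[x], x))

def bidx(arr, value):
    # Like intarray.bidx(int[], int): binary search in a sorted array.
    # Here: 0-based index of the value, or -1 when absent (convention assumed).
    i = bisect_left(arr, value)
    return i if i < len(arr) and arr[i] == value else -1
```

The point of bidx over a linear `value = ANY(arr)` scan is the usual one: on a sorted array the search is O(log n) instead of O(n), which matters for the large aggregated arrays that int_array_append_aggregate produces.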